Duplicate same-module type definitions (self-recursive newtype + real defn) when generating against GitHub's OpenAPI spec — 11 collisions, 493 compile errors #1011

@z23cc

Summary

When running progenitor (which uses typify internally) against GitHub's published OpenAPI 3.0.3 spec (9.1 MB, 1183 operations, 5193 generated types), typify emits two top-level type definitions with the same name in the same module for 11 of the 5193 types. The first definition is invariably a self-recursive newtype that wraps Option<Self>, which would give the type infinite size. Compilation fails with 493 errors in the generated lib, all traceable to these 11 duplicate-name collisions.

pub struct AutoMerge(pub ::std::option::Option<AutoMerge>);   // ← bad placeholder, self-recursive
// ...60 lines later, same module...
pub struct AutoMerge {                                          // ← real definition
    pub commit_message: ::std::string::String,
    pub commit_title: ::std::string::String,
    pub enabled_by: SimpleUser,
    pub merge_method: AutoMergeMergeMethod,
}

The first form is impossible (would have infinite size), so the placeholder shouldn't exist at all.
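To illustrate why the placeholder form cannot compile, here is a minimal sketch. The names Placeholder, Boxed, depth and boxed_size are hypothetical and are not taken from the generated code. Option<T> stores T inline, so a struct that contains Option of itself has no finite layout (E0072); the recursion only works through a pointer such as Box.

```rust
// Hypothetical stand-in for the generated placeholder. Uncommenting it fails
// with E0072, because Option<Placeholder> stores a Placeholder inline:
// pub struct Placeholder(pub Option<Placeholder>);

/// A recursive newtype compiles only when the recursion goes through a pointer.
pub struct Boxed(pub Option<Box<Boxed>>);

/// Counts how many nested `Some` layers a value has.
pub fn depth(v: &Boxed) -> usize {
    let mut n = 0;
    let mut cur = v;
    while let Some(inner) = &cur.0 {
        n += 1;
        cur = inner.as_ref();
    }
    n
}

/// Size of the boxed form, which is finite because only a pointer is stored.
pub fn boxed_size() -> usize {
    std::mem::size_of::<Boxed>()
}
```

This is only meant to show the layout problem; it does not suggest that typify should emit a boxed placeholder.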

Error histogram (cargo check on the generated lib)

Errors  Code   Meaning
111     E0119  Conflicting trait impls (Debug, Display, Serialize, Deserialize)
92      E0599  No associated item (enum variant lookups against the shadowing struct)
83      E0592  Duplicate definitions with name builder (each duplicate has its own builder)
63      E0560  Struct has no field X (call-sites assume the first defn's fields)
42      E0609  No field X on type &T (same root)
19      E0428  Name defined multiple times
4       E0072  Recursive type has infinite size (the struct Foo(Option<Foo>) newtype)
3       E0391  Cycle detected when computing layout
493            Total
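For anyone who wants to regenerate a histogram like this, a minimal sketch follows. It assumes the captured cargo check output uses rustc's usual human-readable format, where each coded diagnostic starts with error[E....]; the function name error_histogram is hypothetical.

```rust
use std::collections::BTreeMap;

/// Tallies rustc error codes found in text such as captured `cargo check` stderr.
/// Assumption: each coded diagnostic begins with `error[E....]`.
pub fn error_histogram(output: &str) -> BTreeMap<String, usize> {
    let mut counts = BTreeMap::new();
    for line in output.lines() {
        // Lines without a bracketed code (e.g. "error: aborting ...") are skipped.
        if let Some(rest) = line.trim_start().strip_prefix("error[") {
            if let Some(end) = rest.find(']') {
                *counts.entry(rest[..end].to_string()).or_insert(0) += 1;
            }
        }
    }
    counts
}
```

Feeding it the saved stderr of cargo check yields one count per error code, sorted by code.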

The 11 duplicated types

All are defined exactly twice in mod types {} in the generated lib:

AuthorAssociation  AutoMerge        CodespaceMachine  IssueComment
IssueField         IssueType        LicenseSimple     Milestone
NullableSecretScanningFirstDetectedLocation
ProjectsV2StatusUpdate              SimpleCommit      TeamSimple

Cross-referencing against the input spec, the 11 split into two patterns:

Pattern A: schema has top-level nullable: true (4 of 11)

auto-merge, author-association, issue-field, issue-type are each declared once in components.schemas with nullable: true at the schema root, e.g.

components:
  schemas:
    auto-merge:
      type: object
      nullable: true
      properties:
        commit_message: { type: string }
        commit_title:   { type: string }
        enabled_by:     { $ref: '#/components/schemas/simple-user' }
        merge_method:   { type: string, enum: [merge, squash, rebase] }
      required: [enabled_by, merge_method, commit_title, commit_message]

Pattern B: nullable-X sibling schema with identical structure (7 of 11)

The spec defines both foo and nullable-foo as separate component schemas with identical bodies. Example: codespace-machine and nullable-codespace-machine are byte-for-byte the same except for the key. Affected types in this pattern: codespace-machine, issue-comment, license-simple, milestone, projects-v2-status-update, simple-commit, team-simple, plus nullable-secret-scanning-first-detected-location which exists only in the nullable form.
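When bisecting, it may help to list the Pattern B candidates mechanically. The sketch below is a hypothetical helper (nullable_siblings is not part of progenitor or typify) that takes the keys of components.schemas and returns each foo that also has a nullable-foo sibling. It compares names only and does not check that the bodies are identical.

```rust
use std::collections::BTreeSet;

/// Returns each `foo` that also has a `nullable-foo` sibling among the schema keys.
/// Keys that only exist in the nullable form are not reported.
pub fn nullable_siblings<'a>(names: &[&'a str]) -> Vec<&'a str> {
    let all: BTreeSet<&str> = names.iter().copied().collect();
    names
        .iter()
        .copied()
        .filter(|n| !n.starts_with("nullable-"))
        .filter(|n| all.contains(format!("nullable-{}", n).as_str()))
        .collect()
}
```

The result keeps the input order, so it can be diffed against the list of colliding type names.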

Reproduction

Spec: https://raw.githubusercontent.com/github/rest-api-description/main/descriptions/api.github.com/api.github.com.yaml

Generator: progenitor 0.14, typify-impl 0.6.2 via Generator::generate_tokens (no CLI feature involved).

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Parse the GitHub spec from disk.
    let raw = std::fs::read_to_string("api.github.com.yaml")?;
    let spec: openapiv3::OpenAPI = serde_yaml::from_str(&raw)?;
    let mut s = progenitor::GenerationSettings::default();
    s.with_interface(progenitor::InterfaceStyle::Builder);
    // Generate the token stream, then pretty-print it to out.rs.
    let tokens = progenitor::Generator::new(&s).generate_tokens(&spec)?;
    let formatted = prettyplease::unparse(&syn::parse2(tokens)?);
    std::fs::write("out.rs", formatted)?;
    Ok(())
}

Then:

grep -oE "^    pub (struct|enum) [A-Za-z0-9_]+" out.rs | awk '{print $3}' | sort | uniq -d

returns the 11 names above.
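The same check can be done on type names from Rust, which is handy inside a test harness. This is a minimal text-based sketch, not a real parser: it assumes each definition starts a line with pub struct or pub enum (as prettyplease output does) and ignores indentation, so it does not distinguish modules. The function name duplicate_type_names is hypothetical.

```rust
use std::collections::BTreeMap;

/// Returns the names declared more than once by `pub struct` / `pub enum` lines.
pub fn duplicate_type_names(source: &str) -> Vec<String> {
    let mut counts: BTreeMap<String, usize> = BTreeMap::new();
    for line in source.lines() {
        let line = line.trim_start();
        let rest = line
            .strip_prefix("pub struct ")
            .or_else(|| line.strip_prefix("pub enum "));
        if let Some(rest) = rest {
            // The name ends at the first character that cannot be part of an identifier.
            let name: String = rest
                .chars()
                .take_while(|c| c.is_alphanumeric() || *c == '_')
                .collect();
            if !name.is_empty() {
                *counts.entry(name).or_insert(0) += 1;
            }
        }
    }
    counts
        .into_iter()
        .filter(|(_, n)| *n > 1)
        .map(|(name, _)| name)
        .collect()
}
```

Calling it on the contents of out.rs returns the duplicated names in sorted order.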

What I tried for minimization (and failed)

I built three reduced specs covering the two patterns:

  1. A schema with top-level nullable: true, referenced once.
  2. A foo + nullable-foo sibling pair, referenced once each.
  3. A nullable: true schema referenced through allOf.

In all three, typify's renamer correctly disambiguates by suffixing the inner type (e.g. pub struct AutoMerge(Option<AutoMergeInner>); pub struct AutoMergeInner { ... }) — the bug does not reproduce on a single-spec, two-or-three-schema reduction. The trigger seems to require many cross-cutting $refs and/or the renamer's seen-names state interacting across multiple processing contexts. I'd value pointers from a maintainer on which axes to bisect; if useful I can run instrumented experiments and report back.

Discovered via

pp, an OpenAPI → installable Rust CLI generator I'm building on top of progenitor. pp does spec normalization before handing the spec to progenitor (dedups media types, drops colliding enum values / property names, strips unsupported schema types, etc.), so the spec going into progenitor is clean. Generation itself succeeds without panic; only the produced Rust source fails to compile.

Why this matters

Past the well-known "needs a downgrade for OpenAPI 3.1" hurdle, this is the most common blocker I've seen when pointing progenitor at large real-world specs that use a "nullable variant" naming convention — a pattern common in vendor-published specs.

Happy to share the full generated out.rs (43 MB), a Cargo project that reproduces, or run any diagnostic patch a maintainer wants.
