[Refactor] Move _init_load_spec from model __init__ to TrainEngine.build_model#1676

Open
HIT-cwh wants to merge 1 commit into InternLM:main from HIT-cwh:fix_init_load_spec

HIT-cwh (Collaborator) commented Apr 14, 2026

No description provided.

…ild_model

Because build_model constructs the model on the meta device, _init_load_spec
must run after model construction rather than inside __init__.
This commit moves the call into build_model, which now recursively
initializes load specs for all BaseModel submodules after meta-device
construction and before fully_shard.
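
The ordering described in the commit message can be illustrated with a small, framework-free sketch. It is not the code from this PR: `Module`, `Encoder`, `Composite`, `fake_fully_shard`, and the contents of the load spec are hypothetical stand-ins, and only the names `BaseModel`, `_init_load_spec`, and `build_model` come from the commit message. The sketch assumes that load specs are set per module and that the sharding step expects them to exist already.

```python
class Module:
    """Hypothetical stand-in for a torch.nn.Module-like container."""

    def __init__(self):
        self._children = {}

    def add_child(self, name, child):
        self._children[name] = child

    def modules(self):
        # Yield this module first, then every descendant, depth-first.
        yield self
        for child in self._children.values():
            yield from child.modules()


class BaseModel(Module):
    """Stand-in for a model class that owns a load spec."""

    def __init__(self):
        super().__init__()
        # __init__ no longer builds the load spec; build_model does it later.
        self.load_spec = None

    def _init_load_spec(self):
        # Placeholder body: the real method's contents are not shown in the PR.
        self.load_spec = {"owner": type(self).__name__}


class Encoder(BaseModel):
    pass


class Composite(BaseModel):
    def __init__(self):
        super().__init__()
        self.add_child("encoder", Encoder())
        self.add_child("head", Module())  # not a BaseModel, so it is skipped


shard_log = []


def fake_fully_shard(model):
    # Stand-in for fully_shard: record whether every BaseModel already had
    # its load spec by the time sharding started.
    ready = all(
        m.load_spec is not None
        for m in model.modules()
        if isinstance(m, BaseModel)
    )
    shard_log.append(ready)
    return model


def build_model(model_cls):
    # Step 1: construct the model (on the meta device in the PR).
    model = model_cls()
    # Step 2: initialize load specs for the model and all BaseModel submodules.
    for module in model.modules():
        if isinstance(module, BaseModel):
            module._init_load_spec()
    # Step 3: shard only after the specs exist.
    return fake_fully_shard(model)


model = build_model(Composite)
```

Walking `modules()` in `build_model` covers the top-level model and every nested `BaseModel` in one pass, while plain submodules such as `head` are left untouched.
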
HIT-cwh requested a review from HAOCHENYE on April 14, 2026 11:34