KeyError: 'default' for discrete diffusion language model LLaDA2 #13357

@ksasi

Description

Describe the bug

Hi,

The following code block from the documentation (https://huggingface.co/docs/diffusers/main/api/pipelines/llada2#diffusers.LLaDA2PipelineOutput) raises a `KeyError` when loading the model:

model_id = "inclusionAI/LLaDA2.1-mini"

model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, dtype=torch.bfloat16, device_map="auto"
)

Reproduction

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig

from diffusers import BlockRefinementScheduler, LLaDA2Pipeline

model_id = "inclusionAI/LLaDA2.1-mini"

model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, dtype=torch.bfloat16, device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
scheduler = BlockRefinementScheduler()
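For context, the failure happens inside the remote modeling code, where the config's `rope_type` (here `"default"`) is looked up in a dict of rotary-embedding init functions; the `KeyError` means that registry has no `"default"` entry, which typically points to a version mismatch between the remote code and the installed transformers. The pattern can be sketched with hypothetical names (the real registry is `ROPE_INIT_FUNCTIONS` in `modeling_llada2_moe.py`):

```python
# Minimal sketch of the registry lookup that fails (names and entries are
# hypothetical; the real registry lives in the remote modeling code).
ROPE_INIT_FUNCTIONS = {
    "linear": lambda config: "linear rope init",
    "dynamic": lambda config: "dynamic rope init",
    # note: no "default" entry -- mirrors the failing environment
}

def resolve_rope_init(rope_type):
    """Look up the rope init function; fail with a clearer message on a miss."""
    try:
        return ROPE_INIT_FUNCTIONS[rope_type]
    except KeyError:
        raise KeyError(
            f"rope_type={rope_type!r} not in registry "
            f"(available: {sorted(ROPE_INIT_FUNCTIONS)}); "
            "likely a transformers/remote-code version mismatch"
        ) from None
```

Printing the registry keys and the config's `rope_type` in the failing environment would confirm whether `"default"` is genuinely missing.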

Logs

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/tmp/ipykernel_1800/3647447919.py in <cell line: 0>()
      7 model_id = "inclusionAI/LLaDA2.1-mini"
      8 
----> 9 model = AutoModelForCausalLM.from_pretrained(
     10     model_id, trust_remote_code=True, dtype=torch.bfloat16, device_map="auto"
     11 )

4 frames
/usr/local/lib/python3.12/dist-packages/transformers/models/auto/auto_factory.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    363                 model_class.register_for_auto_class(auto_class=cls)
    364             model_class = add_generation_mixin_to_remote_model(model_class)
--> 365             return model_class.from_pretrained(
    366                 pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
    367             )

/usr/local/lib/python3.12/dist-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, weights_only, *model_args, **kwargs)
   4070         with ContextManagers(model_init_context):
   4071             # Let's make sure we don't run the init function of buffer modules
-> 4072             model = cls(config, *model_args, **model_kwargs)
   4073 
   4074             if hf_quantizer is not None:  # replace module with quantized modules (does not touch weights)

~/.cache/huggingface/modules/transformers_modules/inclusionAI/LLaDA2_dot_1_hyphen_mini/f21be037104f6e044e1a86b6d8864a6b85cc868e/modeling_llada2_moe.py in __init__(self, config)
    960     def __init__(self, config: LLaDA2MoeConfig):
    961         super().__init__(config)
--> 962         self.model = LLaDA2MoeModel(config)
    963         self.vocab_size = config.vocab_size
    964         self.lm_head = nn.Linear(config.hidden_size, config.vocab_size, bias=False)

~/.cache/huggingface/modules/transformers_modules/inclusionAI/LLaDA2_dot_1_hyphen_mini/f21be037104f6e044e1a86b6d8864a6b85cc868e/modeling_llada2_moe.py in __init__(self, config)
    781         self._use_flex_attention = config._attn_implementation == "flex_attention"
    782         self.norm = LLaDA2MoeRMSNorm(config.hidden_size, eps=config.rms_norm_eps)
--> 783         self.rotary_emb = LLaDA2MoeRotaryEmbedding(config=config)
    784         self.gradient_checkpointing = False
    785         # Initialize weights and apply final processing

~/.cache/huggingface/modules/transformers_modules/inclusionAI/LLaDA2_dot_1_hyphen_mini/f21be037104f6e044e1a86b6d8864a6b85cc868e/modeling_llada2_moe.py in __init__(self, config, device)
    106 
    107         self.config = config
--> 108         self.rope_init_fn = ROPE_INIT_FUNCTIONS[self.rope_type]
    109 
    110         inv_freq, self.attention_scaling = self.rope_init_fn(self.config, device)

KeyError: 'default'

System Info

diffusers 0.38.0.dev0 (installed from git+https://github.com/huggingface/diffusers@f2be8bd6b3dc4035bd989dc467f15d86bf3c9c12)

Who can help?

No response
