High Initial Loss When Fine-Tuning Gemma Model #38

@zhangzc21

Dear authors,

Thanks for your great work!

I am currently trying to fine-tune the Gemma-7B model using PiSSA, but I am encountering an issue where the initial loss and grad norm are extremely high.

This doesn't seem to be caused by the PiSSA algorithm, since fine-tuning Gemma-7B with LoRA shows a similar problem.

Have you encountered this problem, or do you have any ideas on how to solve it? Thanks a lot!
