Reducing inference timings for SD2.1 Base model #64

@pratos

I managed to shave a few seconds off SD2.1 inference time at both 512x512 (50 steps) and 768x768 (50 steps).

Using just a few additions:

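import torch
from diffusers import StableDiffusionPipeline

# Let cuDNN autotune convolution algorithms for the input sizes it sees
torch.backends.cudnn.benchmark = True
# Allow TF32 matmuls (takes effect on Ampere and newer GPUs)
torch.backends.cuda.matmul.allow_tf32 = True

# MODEL_ID and MODEL_CACHE are assumed to be defined already
pipe = StableDiffusionPipeline.from_pretrained(
    MODEL_ID,
    cache_dir=MODEL_CACHE,
    local_files_only=True,
)
pipe = pipe.to("cuda")

# Memory-efficient attention (requires the xformers package)
pipe.enable_xformers_memory_efficient_attention()
# Decode batched latents in slices instead of all at once
pipe.enable_vae_slicing()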

Output quality didn't suffer because of this; I'm still getting crisp images. How do I create a PR to add these? And are there any tests around this?
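For anyone who wants to reproduce the comparison, here is a rough sketch of how timings like these could be measured. It is not the script used for the numbers in this issue; `time_call` and its parameters are hypothetical names made up for illustration. The warm-up calls matter because `torch.backends.cudnn.benchmark = True` makes the first call slower while cuDNN autotunes, and the optional `sync` hook is there because CUDA work is asynchronous.

```python
import statistics
import time


def time_call(fn, warmup=1, runs=3, sync=None):
    """Return (mean, median) wall-clock seconds of fn() over `runs` timed calls.

    `sync` is an optional zero-argument callable invoked after each call,
    e.g. torch.cuda.synchronize, so that queued GPU work is included.
    """
    def synced_call():
        fn()
        if sync is not None:
            sync()

    # Untimed warm-up calls absorb one-off costs such as cuDNN autotuning.
    for _ in range(warmup):
        synced_call()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        synced_call()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.median(samples)
```

With the pipeline above it could be called as `time_call(lambda: pipe("a prompt", num_inference_steps=50), sync=torch.cuda.synchronize)`, once with and once without the extra settings, keeping the prompt, resolution and step count identical between runs.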

Here are the inferences:
