Cosmos Transfer2.5 Auto-Regressive Inference Pipeline #13114
miguelmartin75 wants to merge 3 commits into huggingface:main
yiyixuxu left a comment
thanks for the PR!
my main question: would it make sense to make this pipeline strictly ControlNet-focused? looking at the pipeline code, that would simplify it quite a bit
    self,
    image: PipelineImageInput | None = None,
    video: List[PipelineImageInput] | None = None,
    controls: Optional[PipelineImageInput | List[PipelineImageInput]] = None,
maybe we should make this pipeline strictly about ControlNet (i.e. make `controls` required rather than optional) and then remove the `image` and `video` arguments? this is how the other ControlNet pipelines behave anyway
if users want to run without ControlNet, they can switch to the base pipeline
    else:
        width = int((height + 16) * (frame.shape[2] / frame.shape[1]))  # NOTE: assuming C H W

    if num_latent_conditional_frames is not None and num_conditional_frames is not None:
any reason we need two arguments here? is it possible to keep only `num_conditional_frames`?
this is done so the user can provide either one. the official GitHub repo uses `num_conditional_frames` for transfer but `num_latent_conditional_frames` for predict, so I figured it would be best to support both
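For reference, a rough sketch of how the two arguments could be reconciled into a single value. This is not the code in the PR: the helper name `resolve_latent_conditional_frames`, the default temporal compression factor of 4, and the `(n - 1) // factor + 1` pixel-to-latent mapping are all assumptions made for illustration.

```python
from typing import Optional


def resolve_latent_conditional_frames(
    num_conditional_frames: Optional[int] = None,
    num_latent_conditional_frames: Optional[int] = None,
    temporal_compression: int = 4,  # ASSUMPTION: VAE temporal compression factor
) -> int:
    """Hypothetical helper: return the number of latent conditional frames.

    Accepts either a pixel-space count or a latent-space count, never both.
    """
    if num_conditional_frames is not None and num_latent_conditional_frames is not None:
        raise ValueError(
            "Provide only one of `num_conditional_frames` or "
            "`num_latent_conditional_frames`, not both."
        )
    if num_latent_conditional_frames is not None:
        # Already expressed in latent frames, use as-is.
        return num_latent_conditional_frames
    if not num_conditional_frames:
        # Neither argument given (or zero pixel frames): no conditioning.
        return 0
    # ASSUMPTION: the first frame maps to its own latent frame, and every
    # further group of `temporal_compression` frames maps to one more.
    return (num_conditional_frames - 1) // temporal_compression + 1
```

With a helper like this, the rest of the pipeline would only ever deal with the latent count, and the mutual-exclusion check in the diff above would live in one place.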
What does this PR do?
This builds off #13066 by adding auto-regressive inference for Cosmos Transfer2.5. This pipeline does not require the controlnet or controls to be input. From the documentation:
Who can review?