Support plain-text lists in data_process #31

@linengcs

While using the model, I found that code like the following does not seem to be supported and raises an error:

inp = model.data_process(
    text=batch,
    q_or_c="q",
    task_instruction="Retrieve the target image that best meets these criteria: ",
)

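In other words, a plain-text list is not supported: the original data_process only creates a placeholder when text is None (images only), not when images is None (text only). After I changed data_process to the version below, text-only lists are processed correctly. If the bge-vl-mllm model really does not support this at the moment, I hope it can be updated, since handling inputs this way is more flexible.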

def data_process(self, images=None, text=None, q_or_c=None, task_instruction=None):
    # Requires `from PIL import Image` at module level.
    # Batch mode is decided by whichever input is provided (images take precedence).
    if images is not None:
        _is_list = isinstance(images, list)
    elif text is not None:
        _is_list = isinstance(text, list)
    else:
        raise ValueError("images and text cannot be both None.")

    assert q_or_c in ["query", "candidate", "q", "c"]

    if not _is_list:
        text_input = [self.prepare_text_input(images, text, q_or_c, task_instruction)]

        processed_images = None
        if images is not None:
            processed_images = [Image.open(images).resize((512, 512)).convert("RGB")]
    else:
        # If only one of the lists is provided, create a placeholder list for the other
        if text is None:
            text = [None] * len(images)
        elif images is None:
            images = [None] * len(text)

        text_input = [
            self.prepare_text_input(_image, _text, q_or_c, task_instruction)
            for _image, _text in zip(images, text)
        ]

        # Filter out None placeholders before opening images;
        # a text-only batch passes images=None to the processor.
        valid_images = [_image for _image in images if _image is not None]
        processed_images = None
        if valid_images:
            processed_images = [
                Image.open(_image).resize((512, 512)).convert("RGB")
                for _image in valid_images
            ]

    inputs = self.processor(images=processed_images, text=text_input, return_tensors="pt", padding=True)
    return inputs
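
The placeholder step in the patch above can be checked on its own, without loading the model or any images. The following is only a minimal sketch: `pair_inputs` is a hypothetical helper name, not part of the repository, and the explicit length check is an added assumption (the patch relies on `zip`, which silently truncates to the shorter list).

```python
def pair_inputs(images=None, text=None):
    """Hypothetical helper: align images and text into (image, text) pairs."""
    if images is None and text is None:
        raise ValueError("images and text cannot be both None.")

    # Same rule as the patch: images decide batch mode when given, else text.
    reference = images if images is not None else text
    if not isinstance(reference, list):
        return [(images, text)]

    # Fill the missing side with None placeholders of matching length.
    if images is None:
        images = [None] * len(text)
    elif text is None:
        text = [None] * len(images)

    # Assumption beyond the patch: reject mismatched batches instead of truncating.
    if not isinstance(images, list) or not isinstance(text, list):
        raise TypeError("images and text must both be lists in batch mode.")
    if len(images) != len(text):
        raise ValueError("images and text must have the same length.")

    return list(zip(images, text))


if __name__ == "__main__":
    # Text-only batch: every image slot is a None placeholder.
    print(pair_inputs(text=["a red car", "a blue bike"]))
```

Each pair would then be passed to `prepare_text_input`, and only the non-None images would be opened, as in the patch.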
