Skip to content

Passing files directly into Grobid without downloading #47

@matthieu-perso

Description

@matthieu-perso

Hey Grobid team,

Thanks again for these incredible tools. I've been testing out the Python client - and encountered an issue when passing a PDF as an argument both while using the CLI and Python. I didn't receive any output.

Sample code below

grobid_client --input ./resource/my.PDF --output ./out processFulltextDocument

I realized while debugging that L122 of the grobid_client.py file implies passing in a directory and not the file itself as in the below request.

grobid_client --input ./resource/mypdfdir --output ./out processFulltextDocument

On GCP, I was trying to pass files directly in Grobid without downloading them - which I would have to do with the current setup. Anyway to stream PDFs in Grobid ? Or to send them as file objects ? If not, I'll try to see if I can pull something off quickly and test it.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions