Implement streamdown plugin ReadDocxFile for ReadFile #47

@gravity-api

Description

Create a streamdown plugin ReadDocxFile that extracts text from .docx files. This plugin is executed via the ReadFile meta plugin when FileFormat=Docx.

Inputs

Name Type Mandatory Description
Path String Expression
Url String Expression
Base64 String Expression

Outputs (for meta normalization)

Output Key       Description
Text             Extracted text
ParagraphsCount  Optional count of parsed paragraphs

Implementation Notes

  • Resolve bytes from source:
    • Base64 → bytes
    • Path → File.ReadAllBytes
    • Url → HTTP GET bytes
  • Parse using OpenXML SDK (DocumentFormat.OpenXml)
  • Preserve order; join paragraphs with double newlines for readability.
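
The notes above could be sketched roughly as follows. This is a minimal illustration and not the plugin itself: the class name DocxTextReader, its method names, the Base64 → Path → Url precedence, and the dictionary used to carry the outputs are all assumptions made for the example, and the wiring into the ReadFile meta plugin is left out because the issue does not describe that contract. It assumes the DocumentFormat.OpenXml NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper: the name and shape are illustrative only,
// not part of any existing plugin contract.
public static class DocxTextReader
{
    private static readonly HttpClient Http = new HttpClient();

    // Resolves the raw .docx bytes from one of the three inputs.
    // Assumption: Base64 wins over Path, which wins over Url;
    // the issue does not define a precedence.
    public static async Task<byte[]> ResolveBytesAsync(string base64, string path, string url)
    {
        if (!string.IsNullOrWhiteSpace(base64))
        {
            return Convert.FromBase64String(base64);
        }
        if (!string.IsNullOrWhiteSpace(path))
        {
            return File.ReadAllBytes(path);
        }
        if (!string.IsNullOrWhiteSpace(url))
        {
            return await Http.GetByteArrayAsync(url);
        }
        throw new ArgumentException("One of Base64, Path, or Url must be provided.");
    }

    // Opens the document read-only and collects paragraph text in document order.
    // Descendants<Paragraph>() also visits paragraphs nested in tables.
    // Empty paragraphs are kept, so they count toward ParagraphsCount.
    public static Dictionary<string, object> Extract(byte[] bytes)
    {
        using var stream = new MemoryStream(bytes);
        using var document = WordprocessingDocument.Open(stream, false);

        var body = document.MainDocumentPart?.Document?.Body;
        var paragraphs = body == null
            ? new List<string>()
            : body.Descendants<Paragraph>().Select(p => p.InnerText).ToList();

        return new Dictionary<string, object>
        {
            // Paragraphs are joined with double newlines, as the notes suggest.
            ["Text"] = string.Join("\n\n", paragraphs),
            ["ParagraphsCount"] = paragraphs.Count
        };
    }
}
```

A caller would await ResolveBytesAsync with whichever input is set and pass the result to Extract. Note that InnerText concatenates the text of every descendant element, so tabs and line breaks inside a paragraph are not rendered; a real implementation may want to walk the runs explicitly if that matters.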

Labels

enhancement (New feature or request)
