Author
@microsoft-github-policy-service agree
I've implemented support for converting old .ppt files (Microsoft Office 97-2003 PowerPoint format) to Markdown, resolving Issue #161.
Changes Made:
Created _ppt_converter.py
- New PptConverter class that detects .ppt files by extension and MIME type
- Converts .ppt → .pptx using the LibreOffice command-line tool in headless mode
- Reuses the existing PptxConverter for Markdown extraction
- Handles a missing LibreOffice installation gracefully, with clear error messages
- Supports all features: text, tables, images, notes
Updated converters/__init__.py
- Added the PptConverter import and added it to the __all__ exports
Updated _markitdown.py
- Imported PptConverter
- Registered it in the enable_builtins() method
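For reviewers, the detection rule described above (extension or MIME type) could look roughly like the sketch below. This is not the code from _ppt_converter.py: the function name `looks_like_ppt` and the constants are hypothetical, and only the `.ppt` extension and the `application/vnd.ms-powerpoint` MIME type come from this description.

```python
from typing import Optional

# Values quoted in the PR description.
PPT_EXTENSION = ".ppt"
PPT_MIME_TYPE = "application/vnd.ms-powerpoint"


def looks_like_ppt(extension: Optional[str], mimetype: Optional[str]) -> bool:
    """Hypothetical helper: True if the extension or MIME type indicates legacy .ppt."""
    ext = (extension or "").strip().lower()
    if ext == PPT_EXTENSION:
        return True
    # Drop MIME parameters such as "; charset=binary" before comparing exactly,
    # so that related types (for example macro-enabled formats) do not match.
    mime = (mimetype or "").split(";")[0].strip().lower()
    return mime == PPT_MIME_TYPE
```

An exact comparison is used rather than a prefix match so that .pptx files and other PowerPoint variants still fall through to their own converters.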
How It Works:
- Detects .ppt files by extension (.ppt) and MIME type (application/vnd.ms-powerpoint)
- Uses LibreOffice's headless conversion to transform .ppt → .pptx
- Leverages the existing, well-tested PptxConverter for Markdown extraction
- Includes error handling for a missing LibreOffice installation and for conversion failures
- Follows the same pattern as XlsConverter (for old .xls files) already in the codebase
Requirements:
- LibreOffice must be installed (available via the libreoffice command)
- No new Python dependencies required
The implementation is production-ready and maintains consistency with the existing codebase patterns.
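To make the conversion step easier to review, here is a minimal sketch of a headless LibreOffice call with the two failure modes mentioned above. It is an illustration under assumptions, not the code in this PR: the helper names, the `LibreOfficeNotFoundError` class, the `soffice` fallback, and the timeout value are all hypothetical. The `--headless`, `--convert-to`, and `--outdir` options are standard LibreOffice command-line flags.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Callable, List, Optional


class LibreOfficeNotFoundError(RuntimeError):
    """Hypothetical error raised when no LibreOffice executable is on PATH."""


def find_libreoffice(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    # "libreoffice" is the command named in the PR description;
    # "soffice" is an assumed fallback name used by some installations.
    for name in ("libreoffice", "soffice"):
        path = which(name)
        if path:
            return path
    raise LibreOfficeNotFoundError(
        "LibreOffice is required to convert .ppt files; install it and "
        "make sure the 'libreoffice' command is on PATH."
    )


def build_convert_command(exe: str, src: Path, outdir: Path) -> List[str]:
    # Headless conversion of a single file into outdir as .pptx.
    return [exe, "--headless", "--convert-to", "pptx",
            "--outdir", str(outdir), str(src)]


def convert_ppt_to_pptx(src: Path, outdir: Path, timeout: float = 120.0) -> Path:
    """Convert src (.ppt) to .pptx in outdir and return the new file's path."""
    exe = find_libreoffice()
    result = subprocess.run(
        build_convert_command(exe, src, outdir),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    converted = outdir / (src.stem + ".pptx")
    # Check for the output file as well as the exit code before reporting success.
    if result.returncode != 0 or not converted.exists():
        raise RuntimeError(
            f"LibreOffice failed to convert {src.name}: {result.stderr.strip()}"
        )
    return converted
```

In this sketch the returned .pptx path would then be handed to the existing PptxConverter, typically from inside a temporary directory that is cleaned up afterwards.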