## Dimensions

1. Complexity of PDF structure => Single PDF, 2 Delinked PDFs, Linked PDFs, PDFs with Images, PDFs with Tables.
2. Complexity of Questions => Fact Parrot, Neural Coref, Reasoning, Calculations.
3. Complexity of Ingestion of PDF - Reading, Chunking in the simplest way
4. Complexity of Retrieval - Embeddings only, Rank re-eval, Hybrid, ColBERT, ...
5. Automation of 1 and 2

## Evaluation

1. User Feedback
2. Maximum Content => Breadth

## Plan

1. Start with one PDF and then add more
2. Ingestion - manual first and then automated
3. Stages to production
   - PM Testing
   - User Testing
   - Production
   1. Simpler PDF getting ingested => Working for a