Chat with PDF
Ask any document a question. Even the scanned ones nobody can search.
§01 · pipeline
§02 · problem
Most PDF tools stop at keyword search, which is useless for scanned documents and cannot understand context. Engineers, researchers, and analysts waste hours skimming long documents to find a single passage.
§03 · approach
An AI-powered document Q&A system. Upload any PDF — scanned or digital — and ask questions in natural language. The system extracts text via OCR, chunks and embeds it, then retrieves relevant passages to generate accurate, contextual answers.
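The retrieval step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's actual code: `cosine` and `top_k` are hypothetical helper names, the vectors are assumed to come from whatever embedding model the pipeline uses, and the toy 3-dimensional vectors are made up for the example.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (chunk_index, score) pairs for the k chunks most similar to the query."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up embeddings standing in for three stored chunks and one question.
chunk_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
query_vec = [1.0, 0.1, 0.0]

best = top_k(query_vec, chunk_vecs, k=2)
print([index for index, _ in best])
```

The text of the highest-scoring chunks would then be handed to the language model as context for the answer. A brute-force scan like this is only meant to show the idea; a vector index does the same ranking without comparing against every chunk.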
§04 · decisions
What was chosen.
What was rejected.
Convex gives real-time reactivity for free. Chat messages appear instantly without polling, and schema changes deploy without migrations. Vendor lock-in is the cost; shipping the full backend in days instead of weeks is the gain.
Tesseract chokes on multi-column layouts, tables, and handwriting. Document AI handles all three at 95%+ accuracy. ~$0.01/page is the price; making 40% of real-world PDFs actually queryable is the value.
Keeping vector search inside Convex means fewer moving parts, a single data layer, and simpler deployment. It works at the current scale (150+ docs); the migration to pgvector or a dedicated index becomes worth it around 10K docs, not before.
§05 · tradeoffs
What this costs.
- t/01
Chunk size: 512 tokens with a 50-token overlap. At 256 tokens, chunks were too fragmented and lost surrounding context. At 1024 tokens, retrieval precision dropped. Of the three, 512 was the only size that preserved both.
- t/02
OCR adds 2–3s per page. The cost of making 40% of uploaded PDFs queryable is paid once at upload, not on every query.
- t/03
Convex's vector search adds ~100ms over a dedicated index. Acceptable for a conversational interface; users expect a brief pause between question and answer.
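The chunking setting in t/01 (512 tokens, 50-token overlap) can be sketched as a sliding window. This is a rough illustration rather than the project's implementation: `chunk_tokens` is a hypothetical name, and the tokens here are placeholder strings, since the tokenizer actually used is not stated.

```python
from typing import List


def chunk_tokens(tokens: List[str], size: int = 512, overlap: int = 50) -> List[List[str]]:
    """Split tokens into windows of `size`, each sharing `overlap` tokens with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # 462 with the defaults
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        # Stop once a window has reached the end, so no chunk is pure overlap.
        if start + size >= len(tokens):
            break
    return chunks


# Placeholder tokens; a real pipeline would use the embedding model's tokenizer.
tokens = [f"tok{i}" for i in range(1000)]
chunks = chunk_tokens(tokens)
print([len(chunk) for chunk in chunks])
```

With these defaults each window starts 462 tokens after the previous one, so the last 50 tokens of one chunk reappear at the start of the next. That repetition is what keeps a sentence that straddles a boundary retrievable from at least one chunk.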
§06 · impact
What this returned.
§07 · stack