Chat with PDF

Ask any document a question. Even the scanned ones nobody can search.

year · 2025
role · designed, built, shipped
tags · AI, Full-stack
read · ~1 min
live

§01 · pipeline

upload → ocr → chunk → embed → store → retrieve → answer

§02 · problem

Most PDF tools stop at keyword search, which fails on scanned documents with no text layer and cannot interpret context. Engineers, researchers, and analysts waste hours skimming long documents to find a single passage.

§03 · approach

An AI-powered document Q&A system. Upload any PDF — scanned or digital — and ask questions in natural language. The system extracts text via OCR, chunks and embeds it, then retrieves relevant passages to generate accurate, contextual answers.
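The retrieval step can be illustrated with a minimal sketch: rank stored chunks by cosine similarity to the question's embedding and keep the top few. This is a brute-force stand-in, not the project's actual Convex vector search; the `Chunk` shape, `retrieveTopK`, and the toy 3-dimensional vectors are all assumptions made for illustration.

```typescript
// Hypothetical shape for a stored chunk; field names are assumptions.
interface Chunk {
  text: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector has no direction; treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query embedding, best first.
function retrieveTopK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Toy 3-dimensional vectors stand in for real embeddings.
const sampleChunks: Chunk[] = [
  { text: "Invoices are due within 30 days.", embedding: [1, 0, 0] },
  { text: "The warranty covers two years.", embedding: [0, 1, 0] },
  { text: "Late invoices incur a 2% fee.", embedding: [0.9, 0.1, 0] },
];
const topChunks = retrieveTopK([1, 0, 0], sampleChunks, 2);
console.log(topChunks.map((c) => c.text));
```

The retrieved passages would then be passed to the language model as context for the answer. A real index avoids scoring every chunk, which is why this linear scan only suits small collections.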

§04 · decisions

What was chosen.
What was rejected.

d/01
chosen · Convex (real-time DB + functions)
rejected · REST + Postgres + manual websockets

Convex gives real-time reactivity for free. Chat messages appear instantly without polling, and schema changes deploy without migrations. Vendor lock-in is the cost; shipping the full backend in days instead of weeks is the gain.

d/02
chosen · Google Document AI
rejected · Tesseract OCR

Tesseract chokes on multi-column layouts, tables, and handwriting. Document AI handles all three at 95%+ accuracy. ~$0.01/page is the price; making 40% of real-world PDFs actually queryable is the value.
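To make the per-page price concrete, here is a small estimator built from the figures quoted in this write-up: ~$0.01 per page, and the 2–3s of OCR time per page noted under tradeoffs. The function name, the `OcrEstimate` shape, and the assumption that pages are processed one after another are illustrative, not taken from the project.

```typescript
// Figures quoted in this write-up; treat them as approximate.
const OCR_COST_PER_PAGE_USD = 0.01;
const OCR_SECONDS_PER_PAGE = { min: 2, max: 3 };

interface OcrEstimate {
  costUsd: number;
  minSeconds: number;
  maxSeconds: number;
}

// Estimate the one-time OCR cost and processing time for a document,
// assuming pages are processed sequentially (an assumption, not a measurement).
function estimateOcr(pageCount: number): OcrEstimate {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new Error("pageCount must be a non-negative integer");
  }
  return {
    // Work in whole cents so the total is not skewed by floating-point error.
    costUsd: (pageCount * Math.round(OCR_COST_PER_PAGE_USD * 100)) / 100,
    minSeconds: pageCount * OCR_SECONDS_PER_PAGE.min,
    maxSeconds: pageCount * OCR_SECONDS_PER_PAGE.max,
  };
}

const reportEstimate = estimateOcr(120);
console.log(reportEstimate);
// { costUsd: 1.2, minSeconds: 240, maxSeconds: 360 }
```

Under these assumptions a 120-page scan costs about $1.20 and takes four to six minutes, paid once at upload.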

d/03
chosen · Embeddings stored in Convex
rejected · Pinecone or Weaviate

Fewer moving parts. Single data layer. Simpler deployment. Works at current scale (150+ docs); the migration to pgvector or a dedicated index becomes worth it around 10K docs, not before.

§05 · tradeoffs

What this costs.

  • t/01

    Chunk size: 512 tokens with 50-token overlap. 256 was too fragmented and lost surrounding context. 1024 dropped retrieval precision. 512 was the only size that preserved both.

  • t/02

    OCR adds 2–3s per page. The cost of making 40% of uploaded PDFs queryable is paid once at upload, not on every query.

  • t/03

    Convex's vector search adds ~100ms over a dedicated index. Acceptable for a conversational interface; users expect a brief pause between question and answer.
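The chunking in t/01 can be sketched as a sliding window: 512 tokens per chunk, with each chunk repeating the last 50 tokens of the one before it. This is a minimal illustration, not the project's code; `chunkTokens` is a hypothetical name, and the synthetic tokens stand in for whatever tokenizer the real pipeline uses.

```typescript
// Split a token sequence into windows of `size` tokens, where each window
// repeats the last `overlap` tokens of the previous one.
function chunkTokens<T>(tokens: T[], size = 512, overlap = 50): T[][] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("require size > 0 and 0 <= overlap < size");
  }
  const step = size - overlap;
  const result: T[][] = [];
  for (let start = 0; start < tokens.length; start += step) {
    result.push(tokens.slice(start, start + size));
    // Stop once a window reaches the end, so no trailing chunk is pure overlap.
    if (start + size >= tokens.length) break;
  }
  return result;
}

// Assumption: synthetic tokens stand in for real tokenizer output.
const sampleTokens = Array.from({ length: 1200 }, (_, i) => `w${i}`);
const sampleWindows = chunkTokens(sampleTokens);
console.log(sampleWindows.map((w) => w.length));
// [512, 512, 276]
```

With these defaults each window starts 462 tokens after the previous one, so a sentence that straddles a boundary appears whole in at least one chunk as long as it is shorter than the overlap.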

§06 · impact

What this returned.

150+ documents indexed
100+ active users
<3s average answer time

§07 · stack

Next.js · Convex · Google Document AI · Gemini Embedding 2
last edited · 2026-04-19