User Story
As Maya, in order to get extractions that reference relevant regulations and form instructions, I want an extraction variant that retrieves policy context via RAG before extracting fields.
Context
The homework repo has a working ChromaDB + sentence-transformers RAG implementation. This story brings retrieval-augmented generation into the extraction pipeline: embed form instructions/CFR sections, retrieve relevant context for the uploaded PDF, and include it in the extraction prompt.
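The embed, retrieve, and prompt steps described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the homework repo's implementation: a toy hashed bag-of-words embedding stands in for sentence-transformers, an in-memory array stands in for ChromaDB, and every name here (PolicyChunk, embed, retrieve, buildExtractionPrompt) as well as the corpus text is hypothetical.

```typescript
// Illustrative sketch only: all names and corpus text below are hypothetical.
interface PolicyChunk {
  id: string;
  text: string;
}

interface ScoredChunk {
  chunk: PolicyChunk;
  score: number;
}

const DIMS = 256;

// Toy embedding: hash each token into one of DIMS buckets and count hits.
// A real pipeline would call an embedding model here instead.
function embed(text: string): number[] {
  const vec: number[] = [];
  for (let i = 0; i < DIMS; i++) vec.push(0);
  const tokens = text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
  for (const token of tokens) {
    let hash = 0;
    for (let i = 0; i < token.length; i++) {
      hash = (hash * 31 + token.charCodeAt(i)) % DIMS;
    }
    vec[hash] = (vec[hash] ?? 0) + 1;
  }
  return vec;
}

// Cosine similarity between two vectors of equal length (0 if either is all zeros).
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Brute-force retrieval: score every chunk against the query, keep the top K.
// A vector store would replace this linear scan.
function retrieve(query: string, corpus: PolicyChunk[], topK: number): ScoredChunk[] {
  const queryVec = embed(query);
  return corpus
    .map((chunk) => ({ chunk, score: cosine(queryVec, embed(chunk.text)) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Prepend the retrieved policy context to the extraction prompt.
function buildExtractionPrompt(documentText: string, context: ScoredChunk[]): string {
  const contextBlock = context.map((c) => `[${c.chunk.id}] ${c.chunk.text}`).join("\n");
  return `Policy context:\n${contextBlock}\n\nDocument:\n${documentText}\n\nExtract the form fields as JSON.`;
}

// Made-up placeholder corpus, not real form instructions.
const corpus: PolicyChunk[] = [
  { id: "dob", text: "Enter the applicant date of birth in month day year order." },
  { id: "fee", text: "Include the filing fee payment with the signed form." },
  { id: "addr", text: "Provide the current mailing address including postal code." },
];

const hits = retrieve("applicant date of birth", corpus, 1);
console.log(buildExtractionPrompt("Applicant date of birth: (text from the uploaded PDF)", hits));
```

Keeping retrieval behind a single function like retrieve means the toy pieces can be swapped for the real embedding model and vector store without touching the prompt-building step, which should make the RAG variant easier to compare against the baseline.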
Acceptance Criteria
- RAG retrieval primitive (embeddings + vector store) integrated
- Policy corpus seeded (form instructions for the 3 evaluation fixtures)
- New variant extraction/sonnet-with-rag registered
- Evaluation run comparing RAG variant against baseline Sonnet
- Catalog page with findings and course connection
- Roadmap updated
Definition of Done
- Tests pass (bun run check)
- Evaluation complete with LLM judge
- Catalog page documents approach, metrics, and course connection
A digital services project by Flexion