# PDF Field Extraction: Amazon Nova Pro
*Selectable in Settings → Variants → Extraction.*
## Approach
Uses the same free-JSON extraction prompt as the baseline Sonnet variant, but with Amazon's Nova Pro multimodal model via AWS Bedrock. Nova Pro supports native PDF input and costs roughly a quarter of Sonnet's price.
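The call shape can be sketched as a Bedrock Converse request with a native PDF document block. The model ID, prompt wording, and inference settings below are illustrative assumptions, not the project's actual configuration:

```python
# Sketch of a Nova Pro extraction request via the Bedrock Converse API.
# Prompt text and settings are assumptions, not the project's real values.
EXTRACTION_PROMPT = (
    "List every individual form field in this PDF as free JSON, e.g. "
    '[{"name": "firstName", "type": "text", "group": "...", "sensitivity": "..."}]'
)

def build_request(pdf_bytes: bytes) -> dict:
    """Assemble the Converse-API payload for one PDF document."""
    return {
        "modelId": "amazon.nova-pro-v1:0",
        "messages": [{
            "role": "user",
            "content": [
                # Nova Pro accepts the PDF natively as a document block.
                {"document": {"format": "pdf", "name": "form",
                              "source": {"bytes": pdf_bytes}}},
                {"text": EXTRACTION_PROMPT},
            ],
        }],
        "inferenceConfig": {"maxTokens": 4096, "temperature": 0.0},
    }

# Sending it requires AWS credentials:
#   import boto3
#   client = boto3.client("bedrock-runtime")
#   resp = client.converse(**build_request(open("form.pdf", "rb").read()))
#   raw_json = resp["output"]["message"]["content"][0]["text"]
```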
## Metrics (Deterministic Scorer)
| Metric | Nova Pro | Baseline Sonnet | Delta |
|---|---|---|---|
| Field Recall | 0.6% | 62.1% | -61.5pp |
| Field Precision | 4.0% | 78.9% | -74.9pp |
| Type Accuracy | 100.0% | 97.0% | +3.0pp |
| Group Accuracy | 50.0% | 31.4% | +18.6pp |
| Sensitivity Accuracy | 100.0% | 27.3% | +72.7pp |
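The deterministic scorer itself isn't reproduced in this doc; a minimal sketch, assuming exact field-name matching (the real scorer may normalize or fuzzy-match), shows why section-level output scores near zero on recall and precision:

```python
# Minimal sketch of a deterministic field scorer; assumes exact
# field-name matching, which may be stricter than the real scorer.
def score_fields(predicted: list[str], expected: list[str]) -> dict:
    pred, gold = set(predicted), set(expected)
    matched = pred & gold
    return {
        "recall": len(matched) / len(gold) if gold else 0.0,
        "precision": len(matched) / len(pred) if pred else 0.0,
    }

gold = ["firstName", "lastName", "emailAddress", "phone"]
nova = ["contactInformation", "familyInformation"]  # section-level labels
print(score_fields(nova, gold))  # no names match: recall 0.0, precision 0.0
```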
## Findings
**Nova Pro fails at field-level extraction.** Despite achieving 97-100% on the homework's tool-calling interview task, Nova Pro cannot perform PDF field extraction at a useful level. It produces section-level summaries (e.g., "contactInformation", "familyInformation") rather than individual fields (e.g., "firstName", "lastName", "emailAddress").

**The task complexity gap is larger than expected.** The homework tested Nova Pro on a 10-field interview spec where it scored 97%. PDF extraction requires identifying 30-140 individual fields from visual document layout, a fundamentally different and harder task than following a pre-defined field list. This confirms the homework's 15-field ceiling applies even more strongly to open-ended extraction (vs. tool-calling with a known schema).

**The 100% type/sensitivity/group scores are vacuously true.** With only one matched field across all fixtures, the accuracy metrics are meaningless: they represent 1/1 = 100% on a single data point.

**Prompt optimization likely won't fix this.** The homework showed that prompt strategy helps small models follow instructions (hybrid prompt: 89% → 100% on Mistral 8B). But Nova Pro's failure mode isn't instruction-following; it's a capability gap in document understanding. The model can read the PDF but cannot decompose it into granular fields.
## Cost Comparison
| Model | Input $/1K tok | Output $/1K tok | Field Recall | Viable? |
|---|---|---|---|---|
| Nova Pro | $0.0008 | $0.0032 | 0.6% | No |
| Haiku 4.5 | $0.0008 | $0.004 | ~45% | Marginal |
| Sonnet 4 | $0.003 | $0.015 | 62.1% | Yes |
| Opus 4.6 | $0.015 | $0.075 | ~72% | Yes (best) |
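A back-of-envelope per-document cost follows directly from the table's rates. The token counts below are illustrative assumptions, not measured values:

```python
# $/1K-token rates from the cost table above: (input, output).
RATES = {
    "nova-pro":  (0.0008, 0.0032),
    "haiku-4.5": (0.0008, 0.004),
    "sonnet-4":  (0.003, 0.015),
    "opus-4.6":  (0.015, 0.075),
}

def doc_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars for one extraction call."""
    r_in, r_out = RATES[model]
    return (input_tokens / 1000) * r_in + (output_tokens / 1000) * r_out

# Assuming ~10K input tokens (a multi-page PDF) and ~2K output tokens:
for model in RATES:
    print(f"{model}: ${doc_cost(model, 10_000, 2_000):.4f}")
```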
**Conclusion:** For PDF field extraction, Claude models remain necessary. The cost floor is Haiku at $0.0008/1K input tokens. Non-Claude models that work well for simpler tasks (tool-calling, classification) do not transfer to complex document understanding.
## Course Connection
This result directly validates Assignment 10’s key finding: model selection dominates prompt engineering. The homework’s cost-performance frontier ($0.003/interview at 100% for Llama 4 Scout) applies specifically to tasks within the model’s capability range. PDF extraction is outside that range for all tested non-Claude models.
The production implication: cost optimization for extraction should focus on Haiku (the cheapest Claude model) or on prompt techniques that improve Sonnet's recall (few-shot examples, prompt optimization), rather than on switching to non-Claude models.
A digital services project by Flexion