Use a two-stage qualification approach to balance cost and accuracy, followed by a personalization pass for qualified leads.
Model Comparison
| Model | Cost/1K tokens | Accuracy | Best For |
|---|---|---|---|
| GPT-5-mini | $0.00015 | 70-80% | Initial filtering, high volume |
| GPT-5 | $0.005 | 90-95% | Final qualification, personalization |
| Claude 3 Haiku | $0.00025 | 75-85% | Balanced cost/accuracy |
| Claude 3.5 Sonnet | $0.003 | 92-97% | Complex qualification |
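To make the cost trade-off concrete, here is a rough sketch of the arithmetic using the per-1K-token prices from the table. The lead count, tokens per lead, and Stage 1 pass rate are illustrative assumptions, not measured values. Substitute your own numbers.

```python
# Prices per 1K tokens, copied from the comparison table.
PRICE_PER_1K = {
    "GPT-5-mini": 0.00015,
    "GPT-5": 0.005,
    "Claude 3 Haiku": 0.00025,
    "Claude 3.5 Sonnet": 0.003,
}

# Assumed workload (hypothetical figures for illustration only).
LEADS = 10_000          # leads to qualify
TOKENS_PER_LEAD = 500   # prompt + completion tokens per lead
PASS_RATE = 0.3         # share of leads that survive the cheap filter


def cost(model: str, leads: int, tokens_per_lead: int) -> float:
    """Dollar cost of running `leads` leads through `model`."""
    return leads * tokens_per_lead / 1000 * PRICE_PER_1K[model]


# Option A: send every lead straight to the expensive model.
single_pass = cost("GPT-5", LEADS, TOKENS_PER_LEAD)

# Option B: cheap filter on everything, expensive model on survivors only.
survivors = round(LEADS * PASS_RATE)
staged = cost("GPT-5-mini", LEADS, TOKENS_PER_LEAD) + cost(
    "GPT-5", survivors, TOKENS_PER_LEAD
)

print(f"single pass: ${single_pass:.2f}")
print(f"staged:      ${staged:.2f}")
```

Under these assumptions the staged run costs roughly a third of the single pass. The saving shrinks as the pass rate rises, so it is worth measuring the pass rate on a sample before committing.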
Recommended Strategy
Stage 1 - Fast Filter (High Volume)
- Model: GPT-5-mini
- Batch size: 100
- Goal: Quick yes/no filtering
Stage 2 - Deep Qualification (Borderline Cases)
- Model: GPT-5
- Batch size: 25
- Filter: Only re-check leads where stage1_score >= 0.5
Stage 3 - Personalization (Qualified Leads)
- Model: GPT-5
- Batch size: 10
- Goal: First line generation, custom messaging
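The three stages above can be sketched as a small routing function. This is a minimal sketch, not a definitive implementation: `score` and `personalize` are hypothetical callables standing in for whatever API client you use, and the Stage 2 cutoff (`STAGE2_THRESHOLD`) is an assumption, since only the Stage 1 cutoff of 0.5 is specified here.

```python
from typing import Callable, Iterator

# Stage settings taken from the strategy above.
STAGE1 = {"model": "GPT-5-mini", "batch_size": 100}
STAGE2 = {"model": "GPT-5", "batch_size": 25}
STAGE3 = {"model": "GPT-5", "batch_size": 10}

STAGE1_THRESHOLD = 0.5  # from the strategy: re-check where stage1_score >= 0.5
STAGE2_THRESHOLD = 0.5  # assumption: the cutoff for "qualified" is not specified

# Hypothetical signature: (model name, batch of leads) -> one result per lead.
ModelCall = Callable[[str, list[str]], list]


def batches(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_stage(leads: list[str], stage: dict, call: ModelCall) -> list:
    """Run every lead through one stage, batch by batch, preserving order."""
    results = []
    for batch in batches(leads, stage["batch_size"]):
        results.extend(call(stage["model"], batch))
    return results


def pipeline(leads: list[str], score: ModelCall, personalize: ModelCall) -> dict:
    """Filter cheaply, re-check survivors, then personalize qualified leads."""
    stage1_scores = run_stage(leads, STAGE1, score)
    survivors = [l for l, s in zip(leads, stage1_scores) if s >= STAGE1_THRESHOLD]

    stage2_scores = run_stage(survivors, STAGE2, score)
    qualified = [l for l, s in zip(survivors, stage2_scores) if s >= STAGE2_THRESHOLD]

    first_lines = run_stage(qualified, STAGE3, personalize)
    return dict(zip(qualified, first_lines))
```

Passing the model calls in as arguments keeps the routing logic testable with stub functions before any real API spend.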
Key Insight
Paying for a more expensive model is often a better use of resources than spending hours tweaking the prompt for a cheaper one.
GPT-5 often gets the right answer on the first attempt where GPT-5-mini fails.
AI Generated, February 2026