desirelovell
AI cost projection & token optimization workspace
Token Market Rates: Updated
Task Estimator
Input tokens: 60.0K
Output tokens: 8.0K
Cheapest
Gemini 3.1 Flash-Lite
$0.027
Google · per task
Most Capable
Claude Sonnet 4.6
ROI 9.7
Anthropic · $0.30
Sweet Spot (Best ROI)
Gemini 3.1 Flash-Lite
97/100
Google · $0.027
Cost Projection
[Chart: cost projection for Sweet Spot, Cheapest, and Most Capable]
Predictive Matrix
| Model | Provider | Context | Input | Output | Total | Efficiency (/100) |
|---|---|---|---|---|---|---|
| Gemini 3.1 Flash-Lite (Sweet Spot) | Google | 1M | $0.015 | $0.012 | $0.027 | 97 |
| DeepSeek R1 | DeepSeek | 128K | $0.033 | $0.0175 | $0.0505 | 95 |
| GPT-5.4 Mini | OpenAI | 128K | $0.045 | $0.036 | $0.081 | 91 |
| Gemini 2.5 Pro | Google | 1M | $0.075 | $0.08 | $0.155 | 82 |
| Mistral Large 2 | Mistral | 128K | $0.12 | $0.048 | $0.168 | 80 |
| Gemini 3.1 Pro (Preview) | Google | 10M | $0.12 | $0.096 | $0.216 | 76 |
| GPT-4o | OpenAI | 128K | $0.15 | $0.08 | $0.23 | 73 |
| GPT-5.4 | OpenAI | 128K | $0.15 | $0.12 | $0.27 | 70 |
| Claude Sonnet 4.6 | Anthropic | 1M | $0.18 | $0.12 | $0.30 | 68 |
| Claude Opus 4.8 | Anthropic | 1M | $0.30 | $0.20 | $0.50 | 43 |
| GPT-5.5 (Flagship) | OpenAI | 1M | $0.30 | $0.24 | $0.54 | 40 |
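The matrix can be rescaled to other task sizes with a few lines of Python. This is a minimal sketch, not the workspace's own calculation: it assumes the 60.0K and 8.0K figures in the Task Estimator are input and output token counts, that pricing is linear in tokens, and the names `TASK_COSTS`, `rates_per_million`, and `project_cost` are hypothetical. Only three rows are copied in for brevity.

```python
# Assumption: the Task Estimator's 60.0K / 8.0K are input / output token counts.
BASE_INPUT_TOKENS = 60_000
BASE_OUTPUT_TOKENS = 8_000

# (input cost, output cost) in USD for the base task, copied from the matrix.
TASK_COSTS = {
    "Gemini 3.1 Flash-Lite": (0.015, 0.012),
    "DeepSeek R1": (0.033, 0.0175),
    "Claude Sonnet 4.6": (0.18, 0.12),
}


def rates_per_million(model: str) -> tuple[float, float]:
    """Back out implied USD-per-million-token rates from the base task costs."""
    input_cost, output_cost = TASK_COSTS[model]
    return (
        input_cost / BASE_INPUT_TOKENS * 1_000_000,
        output_cost / BASE_OUTPUT_TOKENS * 1_000_000,
    )


def project_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Project the cost of a task of a different size, assuming linear pricing."""
    input_rate, output_rate = rates_per_million(model)
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    for name in TASK_COSTS:
        print(name, round(project_cost(name, 120_000, 4_000), 4))
```

Under these assumptions, the implied rates for Claude Sonnet 4.6 come out at roughly $3 per million input tokens and $15 per million output tokens.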
Spend & Routing Ontology
How a complex task flows through a Mixture-of-Experts pipeline to maximize ROI.
| Stage | Detail | Percent |
|---|---|---|
| User Request | Whisper / Text input | 100% |
| Router Layer | MoE classification | 96% |
| Cheap Token Model | Gemini 3.1 Flash-Lite · embeds & fast pass | 92% |
| Expensive Token Model | Claude Sonnet 4.6 · reasoning & final pass | 88% |
| High ROI Output | Routed via Gemini 3.1 Flash-Lite | 94% |
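The routing flow above can be sketched as a small function that always runs the cheap model for the fast pass and escalates the final pass only when a request looks reasoning-heavy. Everything here is illustrative: the keyword heuristic, the `REASONING_HINTS` list, the `max_cheap_words` threshold, and the `Route` type are hypothetical stand-ins for whatever classifier the router layer actually uses.

```python
from dataclasses import dataclass

CHEAP_MODEL = "Gemini 3.1 Flash-Lite"
EXPENSIVE_MODEL = "Claude Sonnet 4.6"

# Hypothetical keyword heuristic standing in for the router's classifier.
REASONING_HINTS = ("prove", "debug", "analyze", "derive", "trade-off")


@dataclass(frozen=True)
class Route:
    fast_pass: str   # model used for embeddings and the first pass
    final_pass: str  # model used for the final answer


def route(request: str, max_cheap_words: int = 200) -> Route:
    """Pick models for a request; escalate only when it looks reasoning-heavy."""
    text = request.lower()
    looks_hard = any(hint in text for hint in REASONING_HINTS)
    is_long = len(text.split()) > max_cheap_words
    final = EXPENSIVE_MODEL if (looks_hard or is_long) else CHEAP_MODEL
    return Route(fast_pass=CHEAP_MODEL, final_pass=final)


if __name__ == "__main__":
    print(route("Summarize this email in two lines"))
    print(route("Debug this stack trace and explain the root cause"))
```

A real router would likely replace the keyword check with a learned classifier, but the cost logic stays the same: the expensive model is only paid for when the cheap pass is unlikely to suffice.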
System Prompt Optimizer
Tuned for Gemini 3.1 Flash-Lite. Omitting conversational filler saves an average of 18% on output tokens: roughly 1,440 of this task's 8.0K output tokens, or about $0.002 at the $0.012 output cost.
You are Gemini 3.1 Flash-Lite. Optimize for token efficiency:
- Omit conversational filler, greetings, and restatements of the question.
- Return only the requested content; no preamble or sign-off.
- Use compact formatting (tables/JSON) over prose when structured data is requested.
- Reason internally; output only conclusions unless reasoning is explicitly asked for.
- Cap responses to the minimum tokens needed to be correct and complete.
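The savings estimate for this prompt is simple arithmetic, sketched below. It assumes the 18% average applies uniformly to this task's 8.0K output tokens and that output cost scales linearly with tokens; the function name `filler_savings` is hypothetical.

```python
OUTPUT_TOKENS = 8_000      # assumed output size from the Task Estimator
OUTPUT_COST_USD = 0.012    # Gemini 3.1 Flash-Lite output cost for this task
SAVINGS_FRACTION = 0.18    # average output-token reduction quoted above


def filler_savings(
    output_tokens: int, output_cost_usd: float, fraction: float
) -> tuple[int, float]:
    """Return (tokens saved, USD saved) when a fraction of output tokens is cut."""
    tokens_saved = round(output_tokens * fraction)
    usd_saved = output_cost_usd * fraction
    return tokens_saved, usd_saved


if __name__ == "__main__":
    tokens, usd = filler_savings(OUTPUT_TOKENS, OUTPUT_COST_USD, SAVINGS_FRACTION)
    print(f"{tokens} tokens saved, ${usd:.5f} saved")
```

The dollar amount is a fraction of a cent per task, which is why it rounds to $0.00 at two decimal places; the saving only becomes visible across many tasks.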