Available Models
Browse LLMs available through the coding plan API, organized by plan tier.
Overview
The coding plan provides access to a curated set of LLMs for chat, code generation, and completion. Models are organized by plan tier, and each tier unlocks the set of models listed under it below. Use the API to list the models currently available.
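As a rough sketch of that listing call from Python, the snippet below uses only the standard library and the models endpoint shown in the "List available models" section of this page. The response shape is an assumption: this page does not document it, so `extract_model_ids` guesses at a `{"data": [{"id": ...}]}` body and may need adjusting. The function names are illustrative, not part of any official client.

```python
import json
import urllib.request

# Endpoint taken from the "List available models" section of this page.
MODELS_URL = "https://code.in2peta.com/v1/coding/models"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build an authenticated GET request for the model list."""
    return urllib.request.Request(
        MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def extract_model_ids(payload: dict) -> list[str]:
    """Pull model ids out of a decoded response body.

    ASSUMPTION: the body looks like {"data": [{"id": "..."}, ...]}.
    The real schema is not documented on this page.
    """
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(api_key: str) -> list[str]:
    """Send the request and return model ids (performs network I/O)."""
    with urllib.request.urlopen(build_models_request(api_key), timeout=30) as resp:
        return extract_model_ids(json.load(resp))
```

Calling `list_models("<your API key>")` would perform the request. The two helper functions can be exercised without network access.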
Free plan
- DeepSeek V4 Flash
Go plan
- DeepSeek V4 Flash
- Gemma 4 31B IT FP8
- OpenAI GPT-OSS-20B
- OpenAI GPT-OSS-120B
- Qwen3-Coder-Next FP8
- Qwen3.5-122B-A10B FP8
- Qwen3.6-35B-A3B FP8
Pro plan
- DeepSeek V4 Flash
- OpenAI GPT-OSS-20B
- OpenAI GPT-OSS-120B
- Qwen3-Coder-Next FP8
- Qwen3.5-122B-A10B FP8
- Qwen3.6-35B-A3B FP8
- DeepSeek V4 Pro
- GLM 5.2
- Kimi 2.7
- Minimax M3
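
For client-side checks, the plan lists above could be captured in a small lookup table. This is only a sketch: the plan keys (`"free"`, `"go"`, `"pro"`) are labels chosen for illustration, the entries are display names rather than API model ids, and the lists are a snapshot that can go stale as the tiers are updated.

```python
# Snapshot of the plan lists on this page. Keys are illustrative labels and
# values are display names, NOT necessarily the ids the API expects.
PLAN_MODELS: dict[str, frozenset[str]] = {
    "free": frozenset({"DeepSeek V4 Flash"}),
    "go": frozenset({
        "DeepSeek V4 Flash",
        "Gemma 4 31B IT FP8",
        "OpenAI GPT-OSS-20B",
        "OpenAI GPT-OSS-120B",
        "Qwen3-Coder-Next FP8",
        "Qwen3.5-122B-A10B FP8",
        "Qwen3.6-35B-A3B FP8",
    }),
    "pro": frozenset({
        "DeepSeek V4 Flash",
        "OpenAI GPT-OSS-20B",
        "OpenAI GPT-OSS-120B",
        "Qwen3-Coder-Next FP8",
        "Qwen3.5-122B-A10B FP8",
        "Qwen3.6-35B-A3B FP8",
        "DeepSeek V4 Pro",
        "GLM 5.2",
        "Kimi 2.7",
        "Minimax M3",
    }),
}


def plans_offering(model: str) -> list[str]:
    """Return the plans (in free, go, pro order) whose list includes `model`."""
    return [plan for plan, models in PLAN_MODELS.items() if model in models]
```

The models endpoint remains the authoritative source; a table like this is mainly useful for quick local validation.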
Model recommendations
We recommend using DeepSeek V4 Flash and Tier 1 models for the best balance of performance, cost, and efficiency.
Tier 2 models consume approximately 2–4× more usage than Tier 1 models.
Tier 3 models consume approximately 8–10× more usage than Tier 1 models.
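To make these ratios concrete, the sketch below converts a budget expressed in Tier 1 requests into an approximate range of Tier 2 or Tier 3 requests. It assumes the quoted ranges can be read as plain multipliers on the cost of a Tier 1 request; this page does not define the unit of usage, so treat the numbers as rough guidance only.

```python
# Approximate usage multipliers relative to Tier 1, as (low, high) bounds.
# The ranges come from the text above; reading them as plain multipliers
# on a Tier 1 request is an ASSUMPTION of this sketch.
USAGE_MULTIPLIER: dict[int, tuple[int, int]] = {
    1: (1, 1),
    2: (2, 4),
    3: (8, 10),
}


def equivalent_requests(tier1_requests: int, tier: int) -> tuple[int, int]:
    """Return (fewest, most) requests of `tier` that fit in a budget
    which would cover `tier1_requests` Tier 1 requests."""
    low, high = USAGE_MULTIPLIER[tier]
    return tier1_requests // high, tier1_requests // low


for tier in (1, 2, 3):
    fewest, most = equivalent_requests(1000, tier)
    print(f"Tier {tier}: roughly {fewest} to {most} requests")
```

Under these assumptions, a budget that covers 1,000 Tier 1 requests covers roughly 250 to 500 Tier 2 requests or 100 to 125 Tier 3 requests.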
To ensure the best user experience, we continuously update the models available in each tier based on performance, reliability, availability, and the latest model releases. As newer and better-performing models become available, they may be added to or replace existing models within the respective tiers.
Choosing Tier 1 models whenever possible helps maximize your available usage while maintaining excellent performance for most workloads.
Fireworks pricing (current)
The current Fireworks serverless pricing for these models is approximately:
| Model | Input ($/1M tokens) | Output ($/1M tokens) |
|---|---|---|
| DeepSeek V4 Flash | $0.14 | $0.28 |
| GPT-OSS-20B | $0.07 | $0.30 |
| Gemma 4 31B IT | $0.14 | $0.40 |
| GPT-OSS-120B | $0.15 | $0.60 |
| DeepSeek V4 Pro | $1.74 | $3.48 |
| GLM 5.2 | $1.40 | $4.40 |
| Kimi 2.7 | $0.95 | $4.00 |
| Minimax M3 | $0.30 | $1.20 |
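
As an illustration of how the per-token prices in the table translate into a dollar figure, the sketch below estimates the cost of a workload from its input and output token counts. The prices are copied from the table and are approximate Fireworks serverless list prices; the sketch does not model how the coding plan itself meters usage, and `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# (input, output) price in USD per 1M tokens, copied from the table above.
PRICE_PER_MILLION: dict[str, tuple[Decimal, Decimal]] = {
    "DeepSeek V4 Flash": (Decimal("0.14"), Decimal("0.28")),
    "GPT-OSS-20B": (Decimal("0.07"), Decimal("0.30")),
    "Gemma 4 31B IT": (Decimal("0.14"), Decimal("0.40")),
    "GPT-OSS-120B": (Decimal("0.15"), Decimal("0.60")),
    "DeepSeek V4 Pro": (Decimal("1.74"), Decimal("3.48")),
    "GLM 5.2": (Decimal("1.40"), Decimal("4.40")),
    "Kimi 2.7": (Decimal("0.95"), Decimal("4.00")),
    "Minimax M3": (Decimal("0.30"), Decimal("1.20")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a workload at the listed per-token prices."""
    in_price, out_price = PRICE_PER_MILLION[model]
    total = in_price * input_tokens + out_price * output_tokens
    return total / Decimal(1_000_000)
```

For example, `estimate_cost("GPT-OSS-120B", 200_000, 50_000)` evaluates to 0.06 USD: 200,000 input tokens at $0.15 per million plus 50,000 output tokens at $0.60 per million.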
Custom deployments
The Qwen FP8 models (Qwen3-Coder-Next FP8, Qwen3.5-122B-A10B FP8, Qwen3.6-35B-A3B FP8) are custom InferX deployments and are not currently listed in the public Fireworks serverless catalog.
List available models
curl https://code.in2peta.com/v1/coding/models -H "Authorization: Bearer ***"
Model selection
When sending a completion request, specify the model in the request body. Use "auto" to let the smart router pick the best model for your prompt.
Usage
curl -X POST https://code.in2peta.com/v1/coding/chat/completions -H "Authorization: Bearer ***" -H "Content-Type: application/json" -d '{
  "model": "auto",
  "messages": [{"role": "user", "content": "Hello"}]
}'
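The same request could be issued from Python. The sketch below uses only the standard library and mirrors the request body from the curl command, defaulting the model to "auto". The helper names are illustrative, and because this page does not describe the response format, `send_chat` simply returns the decoded JSON without interpreting it.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
CHAT_URL = "https://code.in2peta.com/v1/coding/chat/completions"


def build_chat_request(
    api_key: str, prompt: str, model: str = "auto"
) -> urllib.request.Request:
    """Build a POST request with a single user message.

    model="auto" leaves the choice of model to the smart router.
    """
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat(api_key: str, prompt: str, model: str = "auto") -> dict:
    """Send the request and return the decoded JSON body (network I/O).

    The response schema is not documented here, so it is returned as-is.
    """
    request = build_chat_request(api_key, prompt, model)
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.load(resp)
```

Passing an explicit model instead of "auto" only changes the `model` field. The exact id strings the API accepts should be taken from the models endpoint rather than from the display names on this page.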