
Available Models

Browse LLMs available through the coding plan API, organized by plan tier.

Overview

The coding plan provides access to a curated set of LLMs for chat, code generation, and completion. Models are grouped into tiers, and each plan unlocks one or more of those tiers, as listed below. Use the API to list the models that are currently available.

Free plan

Tier 1
  • DeepSeek V4 Flash

Go plan

Tier 1
  • DeepSeek V4 Flash
  • Gemma 4 31B IT FP8
  • OpenAI GPT-OSS-20B
Tier 2
  • OpenAI GPT-OSS-120B
  • Qwen3-Coder-Next FP8
  • Qwen3.5-122B-A10B FP8
  • Qwen3.6-35B-A3B FP8

Pro plan

Tier 1
  • DeepSeek V4 Flash
  • OpenAI GPT-OSS-20B
Tier 2
  • OpenAI GPT-OSS-120B
  • Qwen3-Coder-Next FP8
  • Qwen3.5-122B-A10B FP8
  • Qwen3.6-35B-A3B FP8
Tier 3
  • DeepSeek V4 Pro
  • GLM 5.2
  • Kimi 2.7
  • Minimax M3

Model recommendations

We recommend DeepSeek V4 Flash and the other Tier 1 models for the best balance of performance, cost, and efficiency.

Tier 2 models consume approximately 2–4× the usage of Tier 1 models.

Tier 3 models consume approximately 8–10× the usage of Tier 1 models.

To ensure the best user experience, we continuously update the models available in each tier based on performance, reliability, availability, and the latest model releases. As newer and better-performing models become available, they may be added to or replace existing models within the respective tiers.

Choosing Tier 1 models whenever possible helps maximize your available usage while maintaining excellent performance for most workloads.
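
As a rough illustration of the multipliers above, the sketch below estimates how far a fixed usage budget stretches on each tier. It assumes usage can be expressed as a count of "Tier 1 equivalent" requests; the page does not say how usage is actually metered, so `tier1_budget` and the function name are hypothetical.

```python
# Approximate usage multipliers relative to Tier 1, taken from the ranges above.
# Each entry is (lowest multiplier, highest multiplier).
TIER_MULTIPLIERS = {
    1: (1.0, 1.0),
    2: (2.0, 4.0),
    3: (8.0, 10.0),
}


def estimated_requests(tier1_budget: int, tier: int) -> tuple[int, int]:
    """Return (pessimistic, optimistic) request counts for a tier.

    tier1_budget is a hypothetical number of Tier 1 requests a plan allows;
    the real metering unit is an assumption of this sketch.
    """
    low, high = TIER_MULTIPLIERS[tier]
    # The highest multiplier gives the fewest requests, the lowest gives the most.
    return int(tier1_budget // high), int(tier1_budget // low)


if __name__ == "__main__":
    for tier in (1, 2, 3):
        print(f"Tier {tier}: {estimated_requests(1000, tier)}")
```

Under these assumptions, a budget that covers 1000 Tier 1 requests covers roughly 250 to 500 Tier 2 requests or 100 to 125 Tier 3 requests.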

Fireworks Pricing (Current)

The current Fireworks serverless pricing for these models is approximately:

Fireworks serverless pricing table

Model               Input ($/1M tokens)   Output ($/1M tokens)
DeepSeek V4 Flash   $0.14                 $0.28
GPT-OSS-20B         $0.07                 $0.30
Gemma 4 31B IT      $0.14                 $0.40
GPT-OSS-120B        $0.15                 $0.60
DeepSeek V4 Pro     $1.74                 $3.48
GLM 5.2             $1.40                 $4.40
Kimi 2.7            $0.95                 $4.00
Minimax M3          $0.30                 $1.20
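
To make the per-token prices concrete, here is a minimal sketch that converts the table above into a per-request cost. The prices are copied from the table and are approximate; the dictionary keys are display names used for illustration, not confirmed API model identifiers, and `request_cost` is a hypothetical helper.

```python
# USD per 1M tokens as (input, output), copied from the pricing table above.
# Keys are display names, not necessarily the IDs the API expects.
PRICES = {
    "DeepSeek V4 Flash": (0.14, 0.28),
    "GPT-OSS-20B": (0.07, 0.30),
    "Gemma 4 31B IT": (0.14, 0.40),
    "GPT-OSS-120B": (0.15, 0.60),
    "DeepSeek V4 Pro": (1.74, 3.48),
    "GLM 5.2": (1.40, 4.40),
    "Kimi 2.7": (0.95, 4.00),
    "Minimax M3": (0.30, 1.20),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Approximate serverless cost in USD for one request."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Compare every model on the same hypothetical request size.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 2_000, 500):.6f}")
```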

Custom deployments

The Qwen FP8 models (Qwen3-Coder-Next FP8, Qwen3.5-122B-A10B FP8, Qwen3.6-35B-A3B FP8) are custom InferX deployments and are not currently listed in the public Fireworks serverless catalog.

List available models

curl https://code.in2peta.com/v1/coding/models \
  -H "Authorization: Bearer ***"
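
The same request can be sketched in Python using only the standard library. The endpoint and the Bearer header come from the curl command above; the function names are hypothetical, and because the response schema is not documented here, the sketch returns the parsed JSON as-is rather than assuming any field names.

```python
import json
import urllib.request

MODELS_URL = "https://code.in2peta.com/v1/coding/models"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build the GET request for the model list without sending it."""
    return urllib.request.Request(
        MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
    )


def list_models(api_key: str, timeout: float = 30.0):
    """Send the request and return the decoded JSON body unchanged."""
    request = build_models_request(api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `list_models(api_key)` with a valid key performs the network request; reading the key from an environment variable rather than hard-coding it is a reasonable default.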

Model selection

When sending a completion request, specify the model in the request body. Use "auto" to let the smart router pick the best model for your prompt.

Usage

curl -X POST https://code.in2peta.com/v1/coding/chat/completions \
  -H "Authorization: Bearer ***" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
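
For callers who prefer Python over curl, the request above might look like the following sketch. The URL, the `model` and `messages` fields, and the `"auto"` value are taken from this page; the function names and the `Content-Type` header are assumptions, and the response is returned as parsed JSON because its structure is not described here.

```python
import json
import urllib.request

CHAT_URL = "https://code.in2peta.com/v1/coding/chat/completions"


def build_chat_request(
    api_key: str, prompt: str, model: str = "auto"
) -> urllib.request.Request:
    """Build the POST request for a single-turn chat completion."""
    payload = {
        "model": model,  # "auto" lets the smart router choose the model
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed to be required
        },
        method="POST",
    )


def chat(api_key: str, prompt: str, model: str = "auto", timeout: float = 60.0):
    """Send the request and return the decoded JSON body unchanged."""
    request = build_chat_request(api_key, prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Passing an explicit model name instead of `"auto"` pins the request to that model; the exact identifier strings should be taken from the model list endpoint rather than from the display names on this page.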