Available Models

Browse LLMs available through the coding plan API, organized by plan tier.

Overview

The coding plan provides access to a curated set of LLMs for chat, code generation, and completion. Models are grouped into tiers, and each plan unlocks a different set of tiers: Free includes Tier 1 only, Go includes Tiers 1 and 2, and Pro includes Tiers 1 through 3. Use the API to list the currently available models.

Free plan

Tier 1

DeepSeek V4 Flash

Go plan

Tier 1

DeepSeek V4 Flash
Gemma 4 31B IT FP8
OpenAI GPT-OSS-20B

Tier 2

OpenAI GPT-OSS-120B
Qwen3-Coder-Next FP8
Qwen3.5-122B-A10B FP8
Qwen3.6-35B-A3B FP8

Pro plan

Tier 1

DeepSeek V4 Flash
OpenAI GPT-OSS-20B

Tier 2

OpenAI GPT-OSS-120B
Qwen3-Coder-Next FP8
Qwen3.5-122B-A10B FP8
Qwen3.6-35B-A3B FP8

Tier 3

DeepSeek V4 Pro
GLM 5.2
Kimi 2.7
Minimax M3
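
The plan and tier lists above can be captured as a small lookup table, which makes it easy to check whether a model is included in a plan before sending a request. This is a minimal sketch: the names are the display names from this page, not confirmed API model identifiers, and `PLAN_TIERS` and `tier_for` are hypothetical helpers. The lists may also change over time, so the models endpoint remains the authoritative source.

```python
from typing import Optional

# Plan -> tier -> model display names, copied from the lists on this page.
# Assumption: these are display names, not confirmed API model identifiers.
TIER_2 = [
    "OpenAI GPT-OSS-120B",
    "Qwen3-Coder-Next FP8",
    "Qwen3.5-122B-A10B FP8",
    "Qwen3.6-35B-A3B FP8",
]

PLAN_TIERS = {
    "free": {1: ["DeepSeek V4 Flash"]},
    "go": {
        1: ["DeepSeek V4 Flash", "Gemma 4 31B IT FP8", "OpenAI GPT-OSS-20B"],
        2: TIER_2,
    },
    "pro": {
        1: ["DeepSeek V4 Flash", "OpenAI GPT-OSS-20B"],
        2: TIER_2,
        3: ["DeepSeek V4 Pro", "GLM 5.2", "Kimi 2.7", "Minimax M3"],
    },
}


def tier_for(plan: str, model: str) -> Optional[int]:
    """Return the tier of `model` on `plan`, or None if the plan lacks it."""
    for tier, models in PLAN_TIERS[plan].items():
        if model in models:
            return tier
    return None
```

For example, `tier_for("go", "Qwen3-Coder-Next FP8")` returns `2`, while `tier_for("free", "GLM 5.2")` returns `None`.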

Model recommendations

We recommend DeepSeek V4 Flash and the other Tier 1 models for the best balance of performance, cost, and efficiency.

Tier 2 models consume approximately 2–4× more usage than Tier 1 models.

Tier 3 models consume approximately 8–10× more usage than Tier 1 models.

To ensure the best user experience, we continuously update the models available in each tier based on performance, reliability, availability, and the latest model releases. As newer and better-performing models become available, they may be added to or replace existing models within the respective tiers.

Choosing Tier 1 models whenever possible helps maximize your available usage while maintaining excellent performance for most workloads.
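
To make the multipliers above concrete, the following sketch converts an amount of work into its approximate Tier 1 equivalent. It assumes usage scales linearly with the stated ranges; the page does not define the usage unit, so `units` is an abstract quantity and `tier1_equivalent_usage` is a hypothetical helper.

```python
# Approximate usage multipliers relative to Tier 1, as (low, high) ranges
# taken from the figures stated on this page.
TIER_USAGE_MULTIPLIER = {
    1: (1.0, 1.0),
    2: (2.0, 4.0),
    3: (8.0, 10.0),
}


def tier1_equivalent_usage(tier: int, units: float) -> tuple:
    """Return the (low, high) Tier 1 equivalent of `units` spent on `tier`.

    Assumption: usage scales linearly with the multiplier range.
    """
    low, high = TIER_USAGE_MULTIPLIER[tier]
    return (units * low, units * high)
```

Under these assumptions, 100 units of Tier 3 work consume roughly as much as 800 to 1000 units of Tier 1 work.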

Fireworks Pricing (Current)

The current Fireworks serverless pricing for these models is approximately:

Fireworks serverless pricing table

Model	Input ($/1M tokens)	Output ($/1M tokens)
DeepSeek V4 Flash	$0.14	$0.28
GPT-OSS-20B	$0.07	$0.30
Gemma 4 31B IT	$0.14	$0.40
GPT-OSS-120B	$0.15	$0.60
DeepSeek V4 Pro	$1.74	$3.48
GLM 5.2	$1.40	$4.40
Kimi 2.7	$0.95	$4.00
Minimax M3	$0.30	$1.20
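
As a rough illustration of how the per-token prices in the table combine, the sketch below estimates the Fireworks serverless cost of a single request. The prices are the approximate figures from the table; `PRICES` and `estimate_cost` are hypothetical names, and the result is not necessarily what the coding plan itself deducts from your usage.

```python
# Approximate Fireworks serverless prices in USD per 1M tokens,
# as (input, output), copied from the table on this page.
PRICES = {
    "DeepSeek V4 Flash": (0.14, 0.28),
    "GPT-OSS-20B": (0.07, 0.30),
    "Gemma 4 31B IT": (0.14, 0.40),
    "GPT-OSS-120B": (0.15, 0.60),
    "DeepSeek V4 Pro": (1.74, 3.48),
    "GLM 5.2": (1.40, 4.40),
    "Kimi 2.7": (0.95, 4.00),
    "Minimax M3": (0.30, 1.20),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000
```

For instance, a request with 1M input tokens and 1M output tokens on DeepSeek V4 Flash comes to about $0.42, versus about $5.22 on DeepSeek V4 Pro.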

Custom deployments

The Qwen FP8 models (Qwen3-Coder-Next FP8, Qwen3.5-122B-A10B FP8, Qwen3.6-35B-A3B FP8) are custom InferX deployments and are not currently listed in the public Fireworks serverless catalog.

List available models

curl https://code.in2peta.com/v1/coding/models \
  -H "Authorization: Bearer ***"
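
The same request can be made from Python using only the standard library. This is a sketch under assumptions: the endpoint is assumed to return JSON, and because the response schema is not documented on this page, the body is returned as-is. The function names are hypothetical.

```python
import json
import urllib.request

MODELS_URL = "https://code.in2peta.com/v1/coding/models"


def build_list_models_request(api_key: str) -> urllib.request.Request:
    """Build the GET request equivalent to the curl command."""
    return urllib.request.Request(
        MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def list_models(api_key: str):
    """Send the request and return the decoded JSON body.

    Assumption: the endpoint returns JSON. The response schema is not
    documented here, so the result is returned without further parsing.
    """
    request = build_list_models_request(api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

A typical call would be `list_models(os.environ["CODING_PLAN_API_KEY"])`, where the environment variable name is only an illustration.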

Model selection

When sending a completion request, specify the model in the request body. Use "auto" to let the smart router pick the best model for your prompt.

Usage

curl -X POST https://code.in2peta.com/v1/coding/chat/completions \
  -H "Authorization: Bearer ***" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
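
A Python equivalent of the curl request, again using only the standard library, might look like the sketch below. It assumes the endpoint accepts and returns JSON; the response schema is not described on this page, so the parsed body is returned unchanged. `build_chat_request` and `chat` are hypothetical helper names, and `"auto"` is the only model value this page confirms.

```python
import json
import urllib.request

CHAT_URL = "https://code.in2peta.com/v1/coding/chat/completions"


def build_chat_request(
    api_key: str, prompt: str, model: str = "auto"
) -> urllib.request.Request:
    """Build the POST request with a single user message."""
    body = {
        "model": model,  # "auto" lets the smart router pick the model
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def chat(api_key: str, prompt: str, model: str = "auto"):
    """Send the request and return the decoded JSON body.

    Assumption: the endpoint returns JSON; its schema is not documented here.
    """
    request = build_chat_request(api_key, prompt, model)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect or log before any network call is made.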
