
Overview

GPU-powered inference for image, video, OCR, TTS, and ASR models.

Overview

The Inference Platform provides API access to production-ready ML models for image generation, video generation, OCR, text-to-speech, and speech recognition. Models run on GPU infrastructure, and each inference is billed in credits.

Available capabilities

Text-to-Image — generate images from text prompts
Image-to-Image — edit and transform existing images
Text-to-Video — generate video from text descriptions
OCR — extract text from images and documents
Text-to-Speech — generate natural speech from text
Speech Recognition — transcribe audio to text

How it works

1. Browse the model catalog for available models
2. Call the invoke API with a model slug and input
3. Platform runs inference on GPU infrastructure
4. Results are returned synchronously or via webhook
5. Credits are deducted based on model pricing
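
The steps above can be sketched as a single request to the invoke API. This is a minimal illustration, not the platform's documented interface: the base URL, the `/invoke` path, the bearer-token header, and the `model`, `input`, and `webhook_url` field names are all assumptions made for the example. Check the Serverless Inference reference for the real names.

```python
import json
import urllib.request

# Hypothetical base URL: the source does not state the real endpoint.
BASE_URL = "https://api.example.com/v1"


def build_invoke_request(model_slug, model_input, api_key, webhook_url=None):
    """Build (but do not send) a POST request for the invoke API.

    All field names and the auth scheme here are assumed for illustration.
    """
    payload = {"model": model_slug, "input": model_input}
    if webhook_url is not None:
        # Assumption: supplying a webhook URL requests delivery via webhook
        # instead of a synchronous response (step 4).
        payload["webhook_url"] = webhook_url
    return urllib.request.Request(
        url=f"{BASE_URL}/invoke",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def invoke(request, timeout=60):
    """Send the request and decode a JSON body (the synchronous case)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # "example-text-to-image" is a placeholder slug, not a real catalog entry.
    req = build_invoke_request(
        "example-text-to-image",
        {"prompt": "a red bicycle"},
        api_key="YOUR_API_KEY",
    )
    # Inspect the request without sending it over the network.
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Building the request separately from sending it keeps the payload easy to inspect and test. Calling `invoke(req)` would perform the actual network call against whatever endpoint the platform really exposes.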

Related pages: Model Catalog, Serverless Inference, Inference Credits
