Overview

GPU-powered inference for image, video, OCR, TTS, and ASR models.

The Inference Platform provides API access to production-ready ML models for image generation, video generation, OCR, text-to-speech, and speech recognition. Models run on GPU infrastructure, and each inference is billed in credits.

Available capabilities

  • Text-to-Image — generate images from text prompts
  • Image-to-Image — edit and transform existing images
  • Text-to-Video — generate video from text descriptions
  • OCR — extract text from images and documents
  • Text-to-Speech — generate natural speech from text
  • Speech Recognition — transcribe audio to text

How it works

  1. Browse the model catalog for available models
  2. Call the invoke API with a model slug and input
  3. Platform runs inference on GPU infrastructure
  4. Results are returned synchronously or via webhook
  5. Credits are deducted based on model pricing
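
The steps above can be sketched in Python. This is a minimal illustration only: the base URL, the `/invoke` path, the JSON field names (`model`, `input`, `webhook_url`), the bearer-token header, and the example slug are all assumptions made for the example, not the platform's documented API. The sketch builds the request without sending it and shows the credit arithmetic from step 5.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

# Hypothetical base URL; the real endpoint is not given in this overview.
BASE_URL = "https://api.example.com/v1"


def build_invoke_request(
    model_slug: str,
    model_input: Dict[str, Any],
    api_key: str,
    webhook_url: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the invoke API.

    All field names here are assumed for illustration.
    """
    body: Dict[str, Any] = {"model": model_slug, "input": model_input}
    if webhook_url is not None:
        # Assumed field: ask for the result to be delivered via webhook
        # instead of in the synchronous response.
        body["webhook_url"] = webhook_url
    return urllib.request.Request(
        url=f"{BASE_URL}/invoke",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def remaining_credits(balance: int, model_price: int) -> int:
    """Deduct the price of one inference from a credit balance."""
    if model_price > balance:
        raise ValueError("insufficient credits")
    return balance - model_price


# Example: a text-to-image call with a made-up model slug.
request = build_invoke_request(
    "example-text-to-image",
    {"prompt": "a lighthouse at dusk"},
    api_key="YOUR_API_KEY",
)
```

To actually send the request, it could be passed to `urllib.request.urlopen(request)`; the response shape depends on the platform and is not shown here.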