Video Models

Generate video from text and image inputs.

Overview

Video models generate short-form video from text descriptions or reference images. Available models are based on the LTX architecture.

Slug	Capability	Pricing
ltx-2-distilled	Text-to-Video	10 credits/second
ltx-2-3-distilled	Text/Image-to-Video	10 credits/second
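
As a rough illustration of how the table can be used, the sketch below estimates the credit cost of a clip and picks the models that accept a given input type. It assumes that "credits/second" means credits per second of generated video and that billing scales linearly with duration; neither is stated above. The names `estimate_credits` and `models_for` are hypothetical helpers, not part of any platform SDK.

```python
# Rates and capabilities copied from the table above.
# ASSUMPTION: the rate applies per second of generated video, billed linearly.
CREDITS_PER_SECOND = {
    "ltx-2-distilled": 10,
    "ltx-2-3-distilled": 10,
}

# Input types each model accepts, per the Capability column.
INPUT_TYPES = {
    "ltx-2-distilled": {"text"},
    "ltx-2-3-distilled": {"text", "image"},
}


def estimate_credits(slug: str, seconds: float) -> float:
    """Estimate the credit cost of a clip (hypothetical helper)."""
    if slug not in CREDITS_PER_SECOND:
        raise ValueError(f"unknown model slug: {slug!r}")
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return CREDITS_PER_SECOND[slug] * seconds


def models_for(input_type: str) -> list[str]:
    """Return the slugs that accept the given input type ('text' or 'image')."""
    return [slug for slug, kinds in INPUT_TYPES.items() if input_type in kinds]


if __name__ == "__main__":
    print(estimate_credits("ltx-2-distilled", 5))
    print(models_for("image"))
```

Under these assumptions, a 5-second clip from either model would cost 50 credits, and only `ltx-2-3-distilled` is returned for image input.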