LongCat-Video is an AI video generation model developed by Meituan, with 13.6 billion parameters. It generates minutes-long videos at 720p resolution and 30 frames per second. A single unified architecture handles Text-to-Video, Image-to-Video, and Video-Continuation, so no task-specific models are needed. The model is designed to avoid color drift and quality degradation over long sequences, which makes it well suited to generating long, smooth videos.
Key Features:
Unified architecture supporting Text-to-Video, Image-to-Video, and Video-Continuation tasks in one model.
Ability to generate minutes-long videos with stable quality and temporal coherence.
Efficient inference: a coarse-to-fine generation strategy and Block Sparse Attention speed up 720p, 30 fps video generation.
Multi-reward reinforcement learning with Group Relative Policy Optimization (GRPO), balancing text alignment, motion realism, and visual fidelity.
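To make the multi-reward GRPO idea above more concrete, here is a minimal sketch of the group-relative advantage step that gives GRPO its name: several outputs are sampled for one prompt, each receives a reward, and rewards are normalized within that group. This is a generic illustration, not LongCat-Video's training code. The equal weights, the weighted-sum combination, the function names, and the example scores are all assumptions made for illustration.

```python
from statistics import mean, pstdev

# Hypothetical weights: the three criteria come from the feature list,
# but how they are actually combined is an assumption here.
WEIGHTS = {"text_alignment": 1.0, "motion_realism": 1.0, "visual_fidelity": 1.0}


def combined_reward(scores, weights=WEIGHTS):
    """Weighted sum of the per-criterion reward scores for one sample."""
    return sum(weights[name] * scores[name] for name in weights)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up scores for three videos sampled from the same prompt.
group = [
    {"text_alignment": 0.9, "motion_realism": 0.6, "visual_fidelity": 0.8},
    {"text_alignment": 0.7, "motion_realism": 0.7, "visual_fidelity": 0.7},
    {"text_alignment": 0.4, "motion_realism": 0.5, "visual_fidelity": 0.6},
]

rewards = [combined_reward(s) for s in group]
advantages = group_relative_advantages(rewards)

for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.2f}  advantage={advantage:+.2f}")
```

Samples that score above the group average get a positive advantage and are reinforced, while those below get a negative one. Because the baseline is the group mean, no separate value network is needed.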
Use Cases:
Creating dynamic videos from static images for marketing, social media, or storytelling.
Generating narrative videos or product demonstration clips that run for several minutes.
Extending existing video clips smoothly, maintaining consistent colors and motion.