Seedance 2.0 AI Video Generator - Text to Video with Sound

Create AI videos with Seedance 2.0 that feel closer to finished scenes.
Multi-shot storytelling, character consistency, native audio, and higher-end output for creators who care about final quality.

Start with a simple workflow. Generate your first clip in minutes.

How Seedance 2.0 Works

Seedance 2.0 is ByteDance's latest video generation model, built on a unified multimodal architecture that accepts text, images, video clips, and audio files as input — up to 12 reference files at once. It generates video at up to 2K resolution and up to 15 seconds long, with multiple camera cuts in a single pass, maintaining character identity across every shot through an internal reference-locking system.
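The input limits above can be sketched as a small validation step. This is an illustrative helper, not part of any official Seedance SDK; the function name and constants are assumptions, with only the limits themselves (12 reference files, 15-second clips) taken from the description above.

```python
# Illustrative sketch (hypothetical helper, not an official API): check a
# generation request against the limits stated above.
ALLOWED_KINDS = {"text", "image", "video", "audio"}  # accepted input types
MAX_REFERENCE_FILES = 12   # up to 12 reference files at once
MAX_DURATION_SECONDS = 15  # clips up to 15 seconds

def validate_request(reference_files, duration_seconds):
    """reference_files is a list of (kind, path) tuples."""
    if len(reference_files) > MAX_REFERENCE_FILES:
        raise ValueError(f"at most {MAX_REFERENCE_FILES} reference files")
    for kind, _path in reference_files:
        if kind not in ALLOWED_KINDS:
            raise ValueError(f"unsupported input type: {kind}")
    if not 0 < duration_seconds <= MAX_DURATION_SECONDS:
        raise ValueError("duration must be between 1 and 15 seconds")
    return True
```

A request with two references and a 10-second duration passes; a 13th reference file or a 20-second clip raises a `ValueError`.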

Text to Video

Write a scene in plain language — set the mood, describe the camera angle, specify how many characters appear and what they're doing. The model processes complex multi-part prompts (up to 2,500 characters) with multiple subjects, emotional tone, and environmental detail, then renders a video that matches your intent.
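One way to keep a multi-part prompt under the 2,500-character ceiling is to assemble per-shot descriptions programmatically. The sketch below is a hypothetical helper, not a platform feature; only the character limit comes from the text above.

```python
# Hypothetical helper: join per-shot descriptions (mood, camera angle,
# subjects, action) into one prompt and enforce the 2,500-character limit.
MAX_PROMPT_CHARS = 2500

def build_prompt(shots):
    prompt = " ".join(f"Shot {i + 1}: {desc}" for i, desc in enumerate(shots))
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    return prompt
```

For example, `build_prompt(["A rainy street at night, slow dolly-in.", "Close-up of the hero, neon reflections."])` yields a single two-shot prompt well under the limit.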

Image to Video

Upload a photo or illustration and the model determines how the scene should move — hair blowing, water rippling, a character turning their head. It preserves fine details like skin texture, jewelry, and fabric patterns. You can provide two reference images to define the start and end states, or use the Omnipotent Reference system to upload up to 9 images and 3 video clips for full creative control.
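The reference options above amount to three modes: a single image, two images defining start and end states, or the Omnipotent Reference set of up to 9 images and 3 video clips. The classifier below is an illustrative sketch with assumed names; only the counts come from the description above.

```python
# Hypothetical helper illustrating the image-to-video reference modes
# described above. Mode names and function name are illustrative only.
def reference_mode(images, clips=()):
    if len(clips) == 0 and len(images) == 1:
        return "single image"
    if len(clips) == 0 and len(images) == 2:
        return "start-end frames"
    if len(images) <= 9 and len(clips) <= 3:
        return "omnipotent reference"
    raise ValueError("at most 9 reference images and 3 video clips")
```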

Built-in Audio Generation

Seedance 2.0 generates audio and video simultaneously through its Dual-Branch Diffusion Transformer — not as a post-processing step. Characters speak with phoneme-level lip-sync in 8+ languages, footsteps land on beat, doors creak when they open, and background ambience matches the setting. You can also upload your own audio files to sync video content to specific beats or dialogue.
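Syncing cuts to an uploaded track reduces to simple beat arithmetic: at a given BPM, a beat lands every 60/BPM seconds. The sketch below shows that math for planning cut points inside a 15-second clip; it is an illustrative calculation, not a Seedance function.

```python
# Illustrative beat math (not a Seedance API): timestamps where shot cuts
# could land on the beat within a clip, given the track's tempo.
def beat_timestamps(bpm, clip_seconds=15, every_n_beats=4):
    interval = 60.0 / bpm * every_n_beats  # seconds between cut points
    t, cuts = interval, []
    while t < clip_seconds:
        cuts.append(round(t, 2))
        t += interval
    return cuts
```

At 120 BPM with a cut every 4 beats, cut points fall every 2 seconds: 2.0, 4.0, 6.0, and so on up to 14.0.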

Physics-Aware Motion

ByteDance incorporated physics-aware training that penalizes impossible motion during generation. Cloth drapes and wrinkles naturally, water splashes with the right weight, collisions have impact, and characters shift their balance when they walk. From a subtle eyebrow raise to a full martial arts fight scene with debris, motion stays physically grounded.

Gallery

See What Seedance Can Do

Every clip below was generated by Seedance models — no post-editing, no compositing. Click any thumbnail to play.

K-Pop Music Video
Motorcycle Portrait at Dusk
Anime Street Fighter
Supercar Ridge Jump
Snowy Pine Forest at Dusk
Classic Martial Arts Training
Sci-Fi Hero Lab
Gymnast on Balance Beam
Campus Romance
Dark Fantasy Monster Battle
Kaiju Cat vs Godzilla

Where Seedance 2.0 Pulls Ahead

We run 9 AI models on one platform — 5 for video, 4 for images. Here's why Seedance 2.0 is the flagship.

Most AI video tools cap out at 720p or 1080p and produce a single continuous shot. Seedance 2.0 renders at up to 2K resolution and generates multi-shot sequences up to 15 seconds with natural camera cuts — each shot can have independently specified framing, camera movement, and action. The output looks like an edited sequence, not a raw generation.

Not Just Seedance — 9 AI Models, One Dashboard

We also run Kling 3.0 (native 4K/60fps), Kling 2.6 (audio-video sync at 1080p), Kling 2.5 Turbo Pro (fastest generation), and four Nano Banana image models (text-to-image up to 4K). Pick the right model for each job.

Kling 3.0 — Native 4K at 60fps

Kuaishou's flagship model. Native 4K resolution at 60 frames per second with Visual Chain-of-Thought reasoning, multi-shot storyboarding with up to 6 camera cuts, and voice binding for multi-character scenes. The highest resolution AI video model available.

Kling 2.6 — First Kling Model with Audio

Kuaishou's first audio-video co-generation model. 1080p at 48fps with simultaneous generation of speech, dialogue, singing, rap, and ambient sound effects. Supports bilingual dialogue in English and Chinese, with motion reference input up to 30 seconds.

Kling 2.5 Turbo Pro — Speed-First Generation

3x faster than standard models. Negative prompt control to exclude unwanted elements, CFG scale adjustment for prompt adherence tuning, and 1080p output at 30-48fps. Ideal for rapid iteration and quick previews when you need volume over polish.

Nano Banana 2 — Google's Latest Image Model

Google DeepMind's Gemini 3.1 Flash Image model. Generates images from 512px to 4K, supports up to 14 reference images, and includes Google Search grounding for contextually accurate results. Ranked #1 on the Artificial Analysis Image Arena at launch.

Nano Banana Pro — Studio-Grade Image Generation

Built on Gemini 3 Pro with advanced reasoning ("Thinking") for complex multi-subject scenes. 94% text rendering accuracy for signage, logos, and UI mockups. Supports up to 8 reference images at resolutions up to 4K.

Nano Banana Edit — AI Image Editing

Semantic-aware image transformation with up to 10 reference images. Style transfer, object modification, and creative remixing while preserving the elements you want to keep. Works as image-to-image rather than text-to-image.
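Choosing among the video models above comes down to which spec you need most. The picker below encodes one possible priority order (resolution, then speed, then audio) using only the specs listed in this section; it is an assumed decision sketch, not a platform feature.

```python
# Illustrative model picker (assumed priority order, not a platform API),
# based on the specs listed above for each video model.
def pick_video_model(need_4k=False, need_speed=False, need_audio=False):
    if need_4k:
        return "Kling 3.0"            # native 4K at 60fps
    if need_speed:
        return "Kling 2.5 Turbo Pro"  # fastest generation
    if need_audio:
        return "Seedance 2.0"         # native audio-video co-generation
    return "Seedance 2.0"             # flagship default
```

Kling 2.6 also co-generates audio at 1080p; the single-answer return here is a simplification for illustration.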

Start Generating with Seedance 2.0

9 AI models for video and image generation. Seedance, Kling, and Nano Banana — all on one platform.