Explore our comprehensive video suite, featuring a range of powerful models designed for video creation, animation, and transformation. Each entry lists the tool name, the underlying model it corresponds to, and a short description based on its dedicated docs page.
Turn a single image (plus optional prompt) into a dynamic 5‑second video. Supports looping, High/Low motion, Raw mode for tighter prompt control, 4-up previews, upscaling, and extension.
Kling-powered text/image-to-video generation with Standard/Pro modes, multiple model versions (v1.5–v2.1 Master), frames-to-video, multi-image “Elements,” and image “Effects,” plus optional audio.
Text-to-video and image-to-video generation focused on quick setup: set duration and motion strength, generate and manage outputs with upscaling and prompt optimization.
LTX Video model for T2V/I2V/Extend/Multi modes with controllable resolution, AR, steps, prompt expansion, and seeds; ideal for cinematic shots and structured camera direction prompting.
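The LTX Video model is also available as open weights; a minimal text-to-video sketch, assuming the Hugging Face diffusers `LTXPipeline` and the public `Lightricks/LTX-Video` checkpoint (this illustrates the same resolution/steps/seed controls, not this tool's own API):

```python
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

# Assumed open-weights checkpoint; not the hosted tool's endpoint.
pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
pipe.to("cuda")

video = pipe(
    prompt="Slow dolly-in on a rain-soaked neon street at night, cinematic",
    width=704, height=480,            # resolution / aspect ratio controls
    num_frames=121,                   # clip length (~5 s at 24 fps)
    num_inference_steps=40,           # steps
    generator=torch.Generator("cuda").manual_seed(42),  # reproducible seed
).frames[0]

export_to_video(video, "ltx_shot.mp4", fps=24)
```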
CogVideo-powered T2V/I2V/V2V with guidance/steps, video size presets, RIFE interpolation, FPS control, strength for V2V, LoRAs, and seeding; excels with cinematic camera/lighting/atmosphere prompts.
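The open CogVideoX release exposes the same guidance/steps/seed knobs; a minimal sketch, assuming the diffusers `CogVideoXPipeline` and the public `THUDM/CogVideoX-5b` checkpoint:

```python
import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
pipe.to("cuda")

video = pipe(
    prompt="Golden-hour drone shot over a misty pine forest, volumetric light",
    guidance_scale=6.0,       # guidance
    num_inference_steps=50,   # steps
    num_frames=49,            # CogVideoX's default clip length
    generator=torch.Generator("cuda").manual_seed(7),  # seed
).frames[0]

# Native output is 8 fps; pair with RIFE interpolation for smoother motion.
export_to_video(video, "cogvideo_shot.mp4", fps=8)
```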
Create videos from text/images with PixVerse. Multiple workflows (Standard, Frame-to-Frame, Lip Sync, Extend, Restyle), camera moves, styles, auto sound, and voice options for flexible storytelling.
Hunyuan-powered suite for T2V, I2V, V2V, LoRA training/use, Portrait driving, and Custom consistency; supports Pro mode, seed control, frames, AR, and steps. Strong focus on subject, motion, camera, lighting, and composition prompting.
T2V, I2V, and Reference-to-Video with high fidelity and precise prompt/camera control. Excels at multi-agent interactions, complex action sequencing, and diverse styles (photoreal to felt/clay) with smooth, stable motion.
Google Veo T2V/I2V with model selection (Veo 2, Veo 3, Veo 3 Fast), AR (Veo 2), resolution (Veo 3 up to 1080p), duration (Veo 2), optional audio (Veo 3), seeds, and advanced prompt frameworks including JSON prompting.
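For JSON prompting, one common shape is a structured object with shot, subject, and atmosphere fields; the keys below are an illustrative community convention, not an official Veo schema:

```python
import json

# Illustrative structured prompt; field names are assumptions.
prompt = {
    "shot": {"type": "dolly-in", "lens": "35mm", "framing": "medium close-up"},
    "subject": "an elderly clockmaker inspecting a brass gear",
    "scene": "cluttered workshop, dust motes in a shaft of window light",
    "lighting": "warm practical lamps, soft contrast",
    "motion": "slow, deliberate hand movements",
    "audio": "faint ticking, room tone",  # Veo 3 supports generated audio
}

print(json.dumps(prompt, indent=2))  # paste the JSON into the prompt field
```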
Image-to-video, reference-to-video (up to 3 refs), frames mode (first/last), and effects with movement amplitude control, AR options, and Standard (720p)/Premium (1080p) quality; duration up to ~4 seconds.
Image-to-video and text-to-video with strong “camera language” prompting, motion cues, lighting/atmosphere, and cinematic techniques. Offers “Enhance Prompt” for easier creative direction and dynamic 5-second shots.
Animate still images with prompt-guided motion using controls like Motion Bucket ID, Cond Aug, guidance scale, and seed. Can also generate motion from text alone for quick animated clips.
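These controls mirror the open Stable Video Diffusion parameters; a minimal image-to-video sketch, assuming the diffusers `StableVideoDiffusionPipeline` (a reference for what the knobs do, not this tool's API):

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16, variant="fp16",
)
pipe.to("cuda")

image = load_image("still.png").resize((1024, 576))
frames = pipe(
    image,
    motion_bucket_id=127,      # higher values produce more motion
    noise_aug_strength=0.02,   # "Cond Aug": noise added to the conditioning image
    decode_chunk_size=8,       # trade VRAM for decode speed
    generator=torch.Generator("cuda").manual_seed(123),  # seed
).frames[0]

export_to_video(frames, "animated.mp4", fps=7)
```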
Text-only prompting for animation with style/model selection, motion models, CFG, inference steps, seed, and duration—focused on turning descriptive prompts directly into motion.
Generate 6‑second clips from detailed prompts or image+text, with S2V for character-focused sequences and Director variants for fine camera control; accepts long, detailed prompts and processes generations reliably through a single queue.
Video-to-video style transfer that maintains structure/motion while changing visual style. Control inference steps, guidance, processed seconds, FPS, and seed for consistent stylization across edits.
Autoregressive image-to-video with progressive, frame-by-frame generation, latent previews, anti-drift bi-directional sampling, and prompt-based animation for consistent, long sequences.
Map facial expressions/movement from a source video onto a target portrait image. Simple two-input workflow (video+image) with best practices for alignment, framing, and lighting.
Audio-driven talking portrait generation from a single image and audio. Includes enhancers (e.g., GFPGAN), preprocessing options, and “only animate face” for focused, realistic speech animation.
Advanced portrait animation driven by audio (up to ~60s), with emotional expression mapping and aspect ratio options—focused on lifelike speaking avatars with rich expressivity.
Lip-resync existing videos to new audio with stable/beta model options, multiple output resolutions/ARs, and guidance for best results (clean audio, clear front-facing video).
Create realistic talking head videos from a single portrait and an audio track up to 5 minutes. Add an optional guidance prompt to steer emotion and expressiveness. Delivers high‑quality, accurate lip‑sync with natural facial movements, and supports aspect ratios 1:1, 16:9, and 9:16.
Upload an image and audio (≤30s) to produce a realistic lip-synced video. Powered by OmniHuman, it is designed for high-fidelity mouth movements synced to speech.
Upload a talking-head video and an audio file to replace the soundtrack and re-sync visible lip movements to the new audio—simple, fast dubbing for quick turnarounds.
Swap a target face into a short video (≤20s) from a single photo. Quick setup and automatic processing for memes, experiments, and cinematic face replacements.
Blend four images into a single morphed compilation, guided by a base style image and prompt. Control style strength, modes (small/medium/upscaled/interpolated), checkpoints (realistic/3D/anime), AR, and ControlNet.
Automatic subtitle generation for short videos (≤60s) with customizable position, font, size, background opacity, stroke width, and perfect timing sync for accessibility and SEO.
⏩ Interpolate Video — AI Frame Interpolation (RIFE)
Increase FPS with one or two recursive interpolation passes, each pass doubling the frame rate (e.g., 24→48, then 48→96) to produce smoother motion and reduce stutter; fast processing and clear guidance for content types and quality tradeoffs.
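Because each pass doubles the rate, the output FPS follows a simple power-of-two rule; a quick sketch (the function name is illustrative):

```python
def interpolated_fps(base_fps: float, passes: int) -> float:
    """Each RIFE pass doubles the frame rate by synthesizing one
    intermediate frame between every pair of original frames."""
    return base_fps * (2 ** passes)

print(interpolated_fps(24, 1))  # 48.0 (one pass)
print(interpolated_fps(24, 2))  # 96.0 (two recursive passes)
```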
📈 Video Upscaler — AI Super‑Resolution (Standard/Anime)
Upscale short videos (≤5s) to higher resolutions (including 4K) with specialized models for real-life and anime/cartoon content. Consistent processing time across resolutions.
Automatically remove video backgrounds to isolate subjects for compositing—best results with solid, monotone backgrounds; simple upload-to-output workflow.
Convert real-life MP4 videos into stylized cartoons. Adjust frame rate and resolution, upload, and let the engine transform footage into playful, shareable animated looks.
Subscription feature that unlocks unlimited generations specifically on MotionCraft Ultra (MiniMax) and VidCraft Ultra (Runway). Ideal for rapid iteration and large creative workloads; not a generative model itself, but an access tier enabling unrestricted use of these two flagship engines.