FramePack - Practical Video Diffusion on Consumer GPUs

Image-to-5-Seconds (30fps, 150 frames)

All results were computed on an RTX 3060 6GB laptop GPU with the 13B HY variant. (Videos are compressed with H.264 at CRF 18 to fit in GitHub repos.)

Image-to-60-Seconds (30fps, 1800 frames)

All results were computed on an RTX 3060 6GB laptop GPU with the 13B HY variant. (Videos are compressed with H.264 at CRF 18 to fit in GitHub repos.)

Why choose FramePack?

Low VRAM Requirements

Generate high-quality videos with just 6GB of VRAM, making it accessible for laptops and budget GPUs.

Full-Length 30 FPS Videos

Create smooth, professional-quality videos at 30 frames per second, up to 60 seconds long.

Open Source & Extensible

Built for researchers and developers with a modular Python codebase that's easy to extend.

Who is FramePack for?

Content Creators

Bring still images to life using FramePack for YouTube, TikTok, and marketing videos in minutes.

Researchers & Devs

Prototype new video diffusion ideas quickly by extending FramePack's modular Python codebase.

Studios & Agencies

Use FramePack to iterate storyboards and dynamic ads without costly render farms.

FramePack features you'll love

Optimized Performance

About 2.5 seconds per frame on an RTX 4090 (about 1.5 seconds with optimizations), with speed on laptop GPUs scaling with the hardware.
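
To put the per-frame figures in context, here is a rough back-of-the-envelope sketch. It assumes generation time scales linearly with frame count and ignores model loading and encoding overhead; the function names are illustrative and not part of FramePack.

```python
def frame_count(seconds: float, fps: int = 30) -> int:
    """Number of frames in a clip of the given length."""
    return round(seconds * fps)


def estimated_minutes(seconds: float, sec_per_frame: float, fps: int = 30) -> float:
    """Rough generation time in minutes, assuming a constant cost per frame."""
    return frame_count(seconds, fps) * sec_per_frame / 60


if __name__ == "__main__":
    # 5-second clip at the quoted RTX 4090 speed of 2.5 s per frame
    print(frame_count(5), "frames,", estimated_minutes(5, 2.5), "minutes")
    # 60-second clip at the quoted optimized speed of 1.5 s per frame
    print(frame_count(60), "frames,", estimated_minutes(60, 1.5), "minutes")
```

Under these assumptions, a 5-second clip (150 frames) takes roughly 6 minutes on an RTX 4090, and a 60-second clip (1800 frames) takes roughly 45 minutes with optimizations. Slower GPUs will take longer.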

High-Quality Output

Generate professional-grade videos with consistent style and smooth transitions.

Cross-Platform Support

Fully supported on Linux, with a Windows one-click package coming soon.

Active Community

Join a growing community of creators and researchers pushing the boundaries of AI video generation.

What FramePack users say

Reddit user @ai_enthusiast

"Finally, video diffusion that runs on my 8 GB laptop GPU! FramePack feels as snappy as Stable Diffusion for images."

FramePack FAQ

Installation

We recommend using a dedicated Python 3.10 environment, kept separate from your system Python.

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
pip install -r requirements.txt

To start the GUI, run:

python demo_gradio.py

The script accepts options such as --share, --port, and --server.

The software supports PyTorch attention, xformers, flash-attn, and sage-attention. By default, it uses PyTorch attention. The other attention kernels are optional; install them only if you are comfortable doing so.

For example, to install sage-attention (Linux):

pip install sageattention==1.0.6

However, we strongly recommend trying FramePack without sage-attention first, because it changes the results, although only slightly.
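
If you are unsure which optional attention kernels are present in your environment, a small check like the following may help. This is a minimal sketch, not part of FramePack: it only tests whether each package can be found, and the import names (xformers, flash_attn, sageattention) are assumed from the usual PyPI packages.

```python
import importlib.util

# Display name -> import name. The import names are assumptions based on
# the commonly published packages, not something FramePack defines.
OPTIONAL_KERNELS = {
    "xformers": "xformers",
    "flash-attn": "flash_attn",
    "sage-attention": "sageattention",
}


def available_attention_kernels() -> dict:
    """Report which optional attention packages are installed.

    Only checks that the package can be located; it does not import it
    or verify that it works with your GPU.
    """
    return {
        label: importlib.util.find_spec(module) is not None
        for label, module in OPTIONAL_KERNELS.items()
    }


if __name__ == "__main__":
    for label, installed in available_attention_kernels().items():
        print(f"{label}: {'installed' if installed else 'not installed'}")
```

If none are installed, nothing needs to change: PyTorch attention is the default.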

Prompting Guidelines

Effective prompting is essential for achieving optimal results with FramePack. The following guidelines will help you craft prompts that generate high-quality video animations from your images.

Recommended Prompt Template

For consistent results, we recommend using the following ChatGPT instruction template to generate motion-focused prompts:

You are an assistant that writes short, motion-focused prompts for animating images. When the user sends an image, respond with a single, concise prompt describing visual motion (such as human activity, moving objects, or camera movements). Focus only on how the scene could come alive and become dynamic using brief phrases. Larger and more dynamic motions (like dancing, jumping, running, etc.) are preferred over smaller or more subtle ones (like standing still, sitting, etc.). Describe subject, then motion, then other things. For example: "The girl dances gracefully, with clear movements, full of charm." If there is something that can dance (like a man, girl, robot, etc.), then prefer to describe it as dancing. Stay in a loop: one image in, one motion prompt out. Do not explain, ask questions, or generate multiple options.

Example Output

When you provide an image to ChatGPT with the above instructions, you'll receive a concise motion prompt such as:

"The man dances powerfully, striking sharp poses and gliding smoothly across the reflective floor."

Writing Your Own Prompts

You can also craft your own prompts following these principles:

Keep prompts concise and focused on motion
Structure as: subject → motion → additional details
Prioritize dynamic movements over static poses
Be specific about the quality of movement (gracefully, powerfully, etc.)

Examples of effective custom prompts:

"The girl dances gracefully, with clear movements, full of charm."

"The man dances powerfully, with clear movements, full of energy."
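
The subject → motion → additional details structure can also be assembled programmatically, for example when animating a batch of images. The helper below is a hypothetical sketch (build_motion_prompt is not a FramePack function); it simply joins the parts in the recommended order.

```python
def build_motion_prompt(subject: str, motion: str, details=()) -> str:
    """Compose a prompt as: subject, then motion, then additional details."""
    parts = [f"{subject.strip()} {motion.strip()}"]
    # Skip empty detail strings so no stray commas appear
    parts.extend(d.strip() for d in details if d.strip())
    return ", ".join(parts).rstrip(".") + "."


if __name__ == "__main__":
    print(build_motion_prompt(
        "The girl",
        "dances gracefully",
        ["with clear movements", "full of charm"],
    ))
```

With the arguments shown, this reproduces the first example prompt above.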
