DynamiCrafter (576x1024) (text-)Image-to-Video/Image Animation Model Card

DynamiCrafter (576x1024) (Text-)Image-to-Video is a video diffusion model that
takes a still image as the conditioning image and a text prompt describing the
desired dynamics, and generates a video from them.

Model Details

Model Description

DynamiCrafter, a (Text-)Image-to-Video/Image Animation approach, aims to generate
short video clips (~2 seconds) from a conditioning image and text prompt.

This model was trained to generate 16 video frames at a resolution of 576x1024
given a context frame of the same resolution.
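Because the context frame is expected at the training resolution, an input image of a different size has to be brought to 576x1024 first. The sketch below shows one common way to do that, a scale-then-center-crop, using only integer arithmetic. It is an illustration, not the repository's preprocessing: the function name `resize_and_crop_box` is hypothetical, and reading "576x1024" as height x width is an assumption.

```python
# Assumption: "576x1024" is read as height x width (576 tall, 1024 wide).
TARGET_H, TARGET_W = 576, 1024


def resize_and_crop_box(src_w, src_h, dst_w=TARGET_W, dst_h=TARGET_H):
    """Plan a scale-then-center-crop from (src_w, src_h) to (dst_w, dst_h).

    Returns (scaled_w, scaled_h, (left, top, right, bottom)), where the box
    is expressed in the coordinates of the scaled image.
    Hypothetical helper for illustration only.
    """
    if src_w <= 0 or src_h <= 0:
        raise ValueError("source dimensions must be positive")

    # Pick the larger of the two scale factors so the scaled image covers
    # the target in both dimensions. Cross-multiplying avoids float rounding.
    if dst_w * src_h >= dst_h * src_w:
        # Width is the binding dimension: match it exactly, round height up.
        scaled_w = dst_w
        scaled_h = -(-src_h * dst_w // src_w)  # ceiling division
    else:
        # Height is the binding dimension: match it exactly, round width up.
        scaled_h = dst_h
        scaled_w = -(-src_w * dst_h // src_h)  # ceiling division

    # Center the crop window inside the scaled image.
    left = (scaled_w - dst_w) // 2
    top = (scaled_h - dst_h) // 2
    return scaled_w, scaled_h, (left, top, left + dst_w, top + dst_h)


if __name__ == "__main__":
    # A 16:9 source already matches the target aspect ratio, so nothing is cropped.
    print(resize_and_crop_box(1920, 1080))
    # A square source is scaled to 1024x1024 and cropped vertically.
    print(resize_and_crop_box(1000, 1000))
```

The returned sizes and box can be passed to whatever image library is in use for the actual resize and crop. Center cropping discards content near the edges of images whose aspect ratio differs from 16:9, so padding may be preferable for some inputs.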

  • Developed by: CUHK & Tencent AI Lab
  • Funded by: CUHK & Tencent AI Lab
  • Model type: Generative (text-)image-to-video model
  • Finetuned from model: DynamiCrafter (320x512)

Model Sources

For research purposes, we recommend our GitHub repository (https://github.com/Doubiiu/DynamiCrafter),
which includes the detailed implementation.

Uses

Limitations

  • The generated videos are relatively short (2 seconds, FPS=8).
  • The model cannot render legible text.
  • Faces and people in general may not be generated properly.
  • The autoencoding part of the model is lossy, resulting in slight flickering artifacts.
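The 2-second figure in the first limitation follows from the 16 generated frames played back at 8 FPS. A minimal sketch of that arithmetic is given here; the helper names are hypothetical and are not part of the DynamiCrafter code base.

```python
NUM_FRAMES = 16  # frames generated per clip, as stated in the model description
FPS = 8          # playback rate, as stated in the limitations


def clip_duration_seconds(num_frames=NUM_FRAMES, fps=FPS):
    """Clip length in seconds for a given frame count and playback rate."""
    return num_frames / fps


def frame_timestamps(num_frames=NUM_FRAMES, fps=FPS):
    """Start time of each frame in seconds, beginning at 0."""
    return [i / fps for i in range(num_frames)]


if __name__ == "__main__":
    print(clip_duration_seconds())   # 2.0
    print(frame_timestamps()[-1])    # 1.875, the start time of the last frame
```

Playing the same 16 frames at a different rate changes only the clip length, not the amount of motion the model produced.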

How to Get Started with the Model

Check out https://github.com/Doubiiu/DynamiCrafter
