---
title: AI
app_file: app.py
sdk: gradio
sdk_version: 4.44.1
---
TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio-Motion Embedding and Diffusion Interpolation
# News
Contributors are welcome! Feel free to submit pull requests.
- **[2024/10]** Try [TANGO](https://huggingface.co/spaces/H-Liu1997/TANGO) on Hugging Face Spaces!
- **[2024/10]** Code for creating the gesture graph is available.
# Results Videos
# Demo Video (on YouTube)
# 📝 Release Plans
- [ ] Training code for AuMoClip and ACInterp
- [ ] Inference code for ACInterp
- [ ] Processed YouTube Business Video data (very small, around 15 minutes)
- [x] Scripts for creating gesture graph
- [x] Inference code with AuMoClip and pretrained weights
# ⚒️ Installation
## Clone the repository
```shell
git clone https://github.com/CyberAgentAILab/TANGO.git
cd TANGO
git clone https://github.com/justinjohn0306/Wav2Lip.git
git clone https://github.com/dajes/frame-interpolation-pytorch.git
```
## Build Environment
We recommend Python `==3.9.20` and CUDA `==11.8`. Then build the environment as follows:
```shell
# [Optional] Create a virtual env
conda create -n tango python==3.9.20
conda activate tango
# Install with pip:
pip install -r ./pre-requirements.txt
pip install -r ./requirements.txt
```
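After installation, it can help to confirm that the active interpreter matches the recommended versions. The following is a minimal sketch, not part of TANGO: `python_matches` and `installed_cuda_version` are hypothetical helper names, and the check assumes that PyTorch is installed by the requirements files (if it is not, the CUDA check simply reports `None`).

```python
import sys

RECOMMENDED_PYTHON = (3, 9)   # major.minor of the recommended 3.9.20
RECOMMENDED_CUDA = "11.8"     # recommended CUDA version


def python_matches(version_info=sys.version_info, recommended=RECOMMENDED_PYTHON):
    """Return True if the major.minor Python version equals the recommended one."""
    return tuple(version_info[:2]) == recommended


def installed_cuda_version():
    """Return the CUDA version PyTorch was built with, or None if PyTorch
    is missing or is a CPU-only build."""
    try:
        import torch  # assumption: PyTorch comes from the requirements files
    except ImportError:
        return None
    return torch.version.cuda


if __name__ == "__main__":
    print("Python:", sys.version.split()[0],
          "(matches)" if python_matches() else "(differs from recommendation)")
    cuda = installed_cuda_version()
    print("CUDA seen by PyTorch:", cuda,
          "(matches)" if cuda == RECOMMENDED_CUDA else "(differs or unavailable)")
```

A mismatch does not necessarily mean the code will fail; it only flags a setup that differs from the recommended one.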
# 🚀 Training and Inference
## Inference
Run the inference script below from the `/TANGO/` directory. It takes around 3 minutes to generate two 8-second videos. You can inspect the results by watching the videos directly, or by loading the resulting `.npz` files into Blender with our Blender add-on from [EMAGE](https://github.com/PantoMatrix/PantoMatrix).
_Necessary checkpoints and pre-computed graphs will be automatically downloaded during the first run. Please ensure that at least 35GB of disk space is available._
```shell
python app.py
```
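Since the first run downloads checkpoints and pre-computed graphs, you may want to check free disk space beforehand. The following is a minimal sketch under two assumptions: that the 35GB figure is counted in units of 10^9 bytes, and that the downloads land on the same filesystem as the current directory. `free_space_gb` and `has_free_space` are hypothetical helpers, not part of TANGO.

```python
import shutil

REQUIRED_GB = 35  # figure quoted above; 1 GB = 10**9 bytes is an assumption


def free_space_gb(path="."):
    """Return free space, in GB (10**9 bytes), on the filesystem holding `path`."""
    return shutil.disk_usage(path).free / 1e9


def has_free_space(path=".", required_gb=REQUIRED_GB):
    """Return True if at least `required_gb` GB are free at `path`."""
    return free_space_gb(path) >= required_gb


if __name__ == "__main__":
    free = free_space_gb(".")
    status = "OK" if has_free_space(".") else "insufficient"
    print(f"Free space: {free:.1f} GB ({status}, need {REQUIRED_GB} GB)")
```

If the download cache lives on a different drive, pass that location as `path` instead of the current directory.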
### Create the graph for a custom character
```shell
python create_graph.py
```
# Copyright Information
We thank the open-source projects [Wav2Lip](https://github.com/Rudrabha/Wav2Lip), [FiLM](https://github.com/caffeinism/FiLM-pytorch), and [SMPLerX](https://github.com/caizhongang/SMPLer-X).
Check out our previous work on co-speech 3D motion generation: DisCo, BEAT, and EMAGE.
This project is for research or education purposes only, and is not freely available for commercial use or redistribution. The script is available only under the terms of the [Attribution-NonCommercial 4.0 International](https://creativecommons.org/licenses/by-nc/4.0/legalcode) (CC BY-NC 4.0) license.