---
license: apache-2.0
base_model: Wan-AI/Wan2.2-I2V-A14B
tags:
- image-to-video
- diffusion
- video-generation
- turbodiffusion
- wan2.2
pipeline_tag: image-to-video
---

<p align="center">
    <img src="assets/TurboDiffusion_Logo.png" width="300"/>
</p>

# TurboWan2.2-I2V-A14B-720P

- This Hugging Face repo contains the `TurboWan2.2-I2V-A14B-720P` model.

- On an RTX 5090 or similar GPUs, please use the quantized versions (`TurboWan2.2-I2V-A14B-high-720P-quant` and `TurboWan2.2-I2V-A14B-low-720P-quant`). For GPUs with more than 40 GB of memory, we recommend the non-quantized versions (`TurboWan2.2-I2V-A14B-high-720P` and `TurboWan2.2-I2V-A14B-low-720P`); see the download sketch after this list.

- For usage instructions, please see **https://github.com/thu-ml/TurboDiffusion**

- Paper: [TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times](https://arxiv.org/pdf/2512.16093)
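
As a rough illustration of the selection rule above, the sketch below picks the high-/low-noise checkpoint pair from the detected GPU memory and fetches it with `huggingface_hub.snapshot_download`. This is not the official loader (see the GitHub repo above for actual usage instructions), and the `<org>` Hub prefix is a placeholder for wherever these checkpoints are actually hosted.

```python
# Minimal sketch of the variant-selection rule above. NOT the official loader;
# see https://github.com/thu-ml/TurboDiffusion for real usage. The "<org>"
# Hub prefix is a placeholder -- substitute the actual repo paths.
import torch
from huggingface_hub import snapshot_download

def pick_variants() -> list[str]:
    """Choose the checkpoint pair based on available GPU memory."""
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    if total_gb > 40:
        # GPUs with more than 40 GB: non-quantized high/low-noise experts
        return ["TurboWan2.2-I2V-A14B-high-720P",
                "TurboWan2.2-I2V-A14B-low-720P"]
    # RTX 5090-class GPUs: quantized high/low-noise experts
    return ["TurboWan2.2-I2V-A14B-high-720P-quant",
            "TurboWan2.2-I2V-A14B-low-720P-quant"]

if torch.cuda.is_available():
    for name in pick_variants():
        local_dir = snapshot_download(repo_id=f"<org>/{name}")  # placeholder org
        print(f"{name} downloaded to {local_dir}")
```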


# Citation
```
@article{zhang2025turbodiffusion,
  title={TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times},
  author={Zhang, Jintao and Zheng, Kaiwen and Jiang, Kai and Wang, Haoxu and Stoica, Ion and Gonzalez, Joseph E and Chen, Jianfei and Zhu, Jun},
  journal={arXiv preprint arXiv:2512.16093},
  year={2025}
}

@software{turbodiffusion2025,
  title={TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times},
  author={The TurboDiffusion Team},
  url={https://github.com/thu-ml/TurboDiffusion},
  year={2025}
}

@inproceedings{zhang2025sageattention,
  title={SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration}, 
  author={Zhang, Jintao and Wei, Jia and Zhang, Pengle and Zhu, Jun and Chen, Jianfei},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2025}
}

@article{zhang2025sla,
  title={SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention},
  author={Zhang, Jintao and Wang, Haoxu and Jiang, Kai and Yang, Shuo and Zheng, Kaiwen and Xi, Haocheng and Wang, Ziteng and Zhu, Hongzhou and Zhao, Min and Stoica, Ion and others},
  journal={arXiv preprint arXiv:2509.24006},
  year={2025}
}

@article{zheng2025rcm,
  title={Large Scale Diffusion Distillation via Score-Regularized Continuous-Time Consistency},
  author={Zheng, Kaiwen and Wang, Yuji and Ma, Qianli and Chen, Huayu and Zhang, Jintao and Balaji, Yogesh and Chen, Jianfei and Liu, Ming-Yu and Zhu, Jun and Zhang, Qinsheng},
  journal={arXiv preprint arXiv:2510.08431},
  year={2025}
}

@inproceedings{zhang2024sageattention2,
  title={SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization},
  author={Zhang, Jintao and Huang, Haofeng and Zhang, Pengle and Wei, Jia and Zhu, Jun and Chen, Jianfei},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2025}
}

@article{zhang2025sageattention2++,
  title={SageAttention2++: A More Efficient Implementation of SageAttention2},
  author={Zhang, Jintao and Xu, Xiaoming and Wei, Jia and Huang, Haofeng and Zhang, Pengle and Xiang, Chendong and Zhu, Jun and Chen, Jianfei},
  journal={arXiv preprint arXiv:2505.21136},
  year={2025}
}

@article{zhang2025sageattention3,
  title={SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training},
  author={Zhang, Jintao and Wei, Jia and Zhang, Pengle and Xu, Xiaoming and Huang, Haofeng and Wang, Haoxu and Jiang, Kai and Zhu, Jun and Chen, Jianfei},
  journal={arXiv preprint arXiv:2505.11594},
  year={2025}
}
```