TaylorJi/Diff-V2M

TaylorJi committed on Nov 11, 2025

Commit c517793 (verified)

1 Parent(s): 8d47204

Update README.md

Files changed (1)

README.md CHANGED

@@ -1,3 +1,31 @@
----
-license: mit
----

+---
+license: mit
+---
+# Diff-V2M: A Hierarchical Conditional Diffusion Model with Explicit Rhythmic Modeling for Video-to-Music Generation
+<!-- Provide a quick summary of what the model is/does. -->
+Here are the training checkpoints for **[Diff-V2M (AAAI'26)](https://arxiv.org/abs/2312.10307)**.
+## Overview
+Diff-V2M is a hierarchical diffusion model with explicit rhythmic modeling and multi-view feature conditioning, achieving state-of-the-art results in video-to-music generation.
+<img src="model.png" width="770" height="500" alt="model"/>
+## Model Sources
+<!-- Provide the basic links for the model. -->
+- **Repository:** https://github.com/Tayjsl97/Diff-V2M
+- **Demo:** [demo page](https://tayjsl97.github.io/Diff-V2M-Demo)
+## Citation
+If you use our models in your research, please cite the paper as follows:
+```bib
+@inproceedings{ji2026diff,
+  title={Diff-V2M: A Hierarchical Conditional Diffusion Model with Explicit Rhythmic Modeling for Video-to-Music Generation},
+  author={Ji, Shulei and Wang, Zihao and Yu, Jiaxing and Yang, Xiangyuan and Li, Shuyu and Wu, Songruoyao and Zhang, Kejun},
+  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
+  year={2026}
+}