lyhisme committed on Feb 24

Commit

3a8b6f7

verified ·

1 Parent(s): 98341cb

Update README.md

Files changed (1)

README.md CHANGED

@@ -9,6 +9,8 @@ pinned: false
 # ASID-Caption
 We build **ASID-Caption**, a data-and-model suite for **fine-grained audiovisual video understanding**.
 Our goal is to move beyond “one video → one generic caption” by providing **attribute-structured supervision** and **quality-verified annotations**, enabling models to produce **more complete, more controllable, and more temporally consistent** descriptions that cover both **visual content** and **audio cues**.

 # ASID-Caption
+[[🏠 Homepage](https://asid-caption.github.io/)] [[📖 Arxiv Paper](https://arxiv.org/pdf/2602.13013)] [[🤗 Models & Datasets](https://huggingface.co/AudioVisual-Caption)] [[💻 Code](https://github.com/HVision-NKU/ASID-Caption)]
 We build **ASID-Caption**, a data-and-model suite for **fine-grained audiovisual video understanding**.
 Our goal is to move beyond “one video → one generic caption” by providing **attribute-structured supervision** and **quality-verified annotations**, enabling models to produce **more complete, more controllable, and more temporally consistent** descriptions that cover both **visual content** and **audio cues**.