zeqianli/HowToStep-NSVA


zeqianli committed on Jan 31, 2024

Commit 39b5592 · verified · 1 Parent(s): 5a86c8f

Create README.md

Files changed (1)

README.md +9 -0

README.md ADDED

	@@ -0,0 +1,9 @@

+# Narrations / Steps to Video Aligner (NSVA)
+NSVA is a lightweight Transformer-based architecture in which narrations or procedural steps serve as queries that iteratively attend to the video features and output either the alignability of each text or its optimal temporal window.
+[[project page]](https://lzq5.github.io/Video-Text-Alignment/)
+[[arXiv]](https://arxiv.org/abs/2312.14055)
+[[GitHub]](https://github.com/Lzq5/Video-Text-Alignment)
+We provide pre-trained models for HTM-Align and HT-Step. You can use these two models to reproduce our results by following our [[code]](https://github.com/Lzq5/Video-Text-Alignment).
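
The README describes text queries that iteratively attend to video features. As a rough illustration of that idea only, the sketch below runs a few rounds of single-head cross-attention in plain Python and then scores each refined query against every video timestep. It is not the authors' implementation: the function names (`attend`, `align`), the absence of learned weights, the residual update, and the use of cosine similarity as a stand-in for alignability are all assumptions made for illustration. The actual model and checkpoints are in the linked GitHub repository.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def attend(queries, video, num_layers=2):
    """Hypothetical helper: each text query repeatedly attends to all video features."""
    dim = len(video[0])
    scale = 1.0 / math.sqrt(dim)
    for _ in range(num_layers):
        updated = []
        for q in queries:
            # Attention weights of this query over the video timesteps.
            weights = softmax([dot(q, v) * scale for v in video])
            # Weighted sum of video features, added back as a residual.
            context = [sum(w * v[k] for w, v in zip(weights, video)) for k in range(dim)]
            updated.append([qk + ck for qk, ck in zip(q, context)])
        queries = updated
    return queries


def align(queries, video, num_layers=2):
    """Hypothetical helper: return the similarity matrix, best timestep, and score per query."""
    refined = attend(queries, video, num_layers)
    sim = [[cosine(q, v) for v in video] for q in refined]
    best = [max(range(len(row)), key=row.__getitem__) for row in sim]
    score = [row[t] for row, t in zip(sim, best)]  # assumed proxy for alignability
    return sim, best, score


# Toy data: 6 video timesteps and 2 text queries, all 8-dimensional.
random.seed(0)
video = [[random.gauss(0, 1) for _ in range(8)] for _ in range(6)]
text = [[random.gauss(0, 1) for _ in range(8)] for _ in range(2)]
sim, best, score = align(text, video)
```

Here `best[i]` is the timestep most similar to query `i`, and `score[i]` could be thresholded to decide whether the text is alignable at all. A trained model would replace the raw dot products with learned projections.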