Video-Foley

Official model checkpoints for "Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound".

Refer to the paper for further details.

Video2RMS

Files: checkpoint_000500_Video2RMS.pt (weights), opts.yml (config)
Training data: Greatest Hits (audio-visual, about 6 hours)

RMS2Sound (including RMS-ControlNet)

File: ControlNetstep300000.pth (weights)
Training data: FreeSound (audio only, about 6,000 hours, as included in WavCaps)
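
As the stage names suggest, an RMS curve is the intermediate signal that links the two stages: Video2RMS outputs it and RMS2Sound consumes it. For readers unfamiliar with the term, the following is a minimal sketch of a frame-wise RMS envelope computed from a mono waveform. The helper `rms_envelope` and its `frame_length` and `hop_length` defaults are hypothetical and chosen only for illustration; they are not taken from this repository or from the paper's configuration.

```python
import math


def rms_envelope(samples, frame_length=512, hop_length=128):
    """Return one root-mean-square value per frame of a mono signal.

    NOTE: frame_length and hop_length are illustrative assumptions,
    not the settings used by Video-Foley.
    """
    if frame_length <= 0 or hop_length <= 0:
        raise ValueError("frame_length and hop_length must be positive")

    envelope = []
    # Slide a window over the signal; a signal shorter than one frame
    # is treated as a single (short) frame.
    last_start = max(len(samples) - frame_length, 0)
    for start in range(0, last_start + 1, hop_length):
        frame = samples[start:start + frame_length]
        if not frame:  # empty input yields an empty envelope
            break
        mean_square = sum(x * x for x in frame) / len(frame)
        envelope.append(math.sqrt(mean_square))
    return envelope


# Toy signal: four samples of silence followed by four samples at 0.5.
toy_signal = [0.0] * 4 + [0.5] * 4
toy_envelope = rms_envelope(toy_signal, frame_length=4, hop_length=4)
print(toy_envelope)  # [0.0, 0.5]
```

The envelope is low during silence and rises when the sound starts, which is why a curve of this kind can serve as a temporal event condition for when and how strongly a sound occurs.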

Citation

@article{video-foley,
  title={Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound},
  author={Lee, Junwon and Im, Jaekwon and Kim, Dabin and Nam, Juhan},
  journal={arXiv preprint arXiv:2408.11915},
  year={2024}
}
