Xuenan Xu

wsntxxn

https://wsntxxn.github.io

AI & ML interests

Text to Speech Synthesis, Text to Music Synthesis, Singing Voice Synthesis

Recent Activity

new activity 15 days ago

wsntxxn/cnn8rnn-audioset-sed:Request of the Training Repository

new activity 16 days ago

wsntxxn/cnn8rnn-audioset-sed:Request of the Training Repository

new activity 2 months ago

wsntxxn/cnn8rnn-audioset-sed:Adding `safetensors` variant of this model

Organizations

None yet

Papers 10

arxiv:2407.14329

arxiv:2407.02869

arxiv:2407.02857

arxiv:2406.08052

Spaces 2

MM StoryAgent

Efficient Audio Captioning

Models 7

wsntxxn/cnn8rnn-audioset-sed

Audio Classification • Updated Dec 30, 2024 • 643 • 3

wsntxxn/cnn14rnn-tempgru-audiocaps-captioning

Feature Extraction • Updated Dec 27, 2024 • 259 • 1

wsntxxn/effb2-trm-audiocaps-captioning

Feature Extraction • Updated Dec 20, 2024 • 69 • 1

wsntxxn/effb2-trm-clotho-captioning

Feature Extraction • Updated Dec 17, 2024 • 189 • 1

wsntxxn/cnn8rnn-w2vmean-audiocaps-grounding

Audio Classification • Updated Aug 19, 2024 • 244 • 2

wsntxxn/audiocaps-simple-tokenizer

Updated Jun 19, 2024

wsntxxn/clotho-simple-tokenizer

Updated Jun 19, 2024

Datasets

None public yet