Collections

Collections including paper arxiv:2402.16153

ChatMusician: Understanding and Generating Music Intrinsically with LLM

Paper • 2402.16153 • Published Feb 25 • 55
GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation

Paper • 2403.14621 • Published Mar 21 • 14
Garment3DGen: 3D Garment Stylization and Texture Generation

Paper • 2403.18816 • Published Mar 27 • 19

ChatMusician: Understanding and Generating Music Intrinsically with LLM

Paper • 2402.16153 • Published Feb 25 • 55
microsoft/speecht5_tts

Text-to-Speech • Updated Nov 8, 2023 • 214k • 557
facebook/covost2

Updated Jan 18 • 251 • 16
facebook/detr-resnet-50

Object Detection • Updated Apr 10 • 465k • 567

ChatMusician: Understanding and Generating Music Intrinsically with LLM

Paper • 2402.16153 • Published Feb 25 • 55

Datasets, Benchmark and Models of ChatMusician: Understanding and Generating Music Intrinsically with LLM

ChatMusician: Understanding and Generating Music Intrinsically with LLM

Paper • 2402.16153 • Published Feb 25 • 55
ChatMusician

Space • Sleeping • 16
m-a-p/ChatMusician

Text Generation • Updated Apr 8 • 1.7k • 106
m-a-p/ChatMusician-Base

Text Generation • Updated Mar 20 • 21 • 10

stabilityai/stable-code-3b

Text Generation • Updated Apr 12 • 19k • 619
ChatMusician: Understanding and Generating Music Intrinsically with LLM

Paper • 2402.16153 • Published Feb 25 • 55

ChatMusician: Understanding and Generating Music Intrinsically with LLM

Paper • 2402.16153 • Published Feb 25 • 55

Models - Multimodal

AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling

Paper • 2402.12226 • Published Feb 19 • 37
M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition

Paper • 2401.11649 • Published Jan 22 • 3
Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition

Paper • 2402.15504 • Published Feb 23 • 19
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

Paper • 2402.17485 • Published Feb 27 • 183

A Novel 1D State Space for Efficient Music Rhythmic Analysis

Paper • 2111.00704 • Published Nov 1, 2021
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit

Paper • 2312.09911 • Published Dec 15, 2023 • 52
Music Style Transfer with Time-Varying Inversion of Diffusion Models

Paper • 2402.13763 • Published Feb 21 • 9
ChatMusician: Understanding and Generating Music Intrinsically with LLM

Paper • 2402.16153 • Published Feb 25 • 55

Speech and Audio

facebook/wav2vec2-base-960h

Automatic Speech Recognition • Updated Nov 14, 2022 • 2.95M • 254
ChatMusician: Understanding and Generating Music Intrinsically with LLM

Paper • 2402.16153 • Published Feb 25 • 55

MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models

Paper • 2402.06178 • Published Feb 9 • 12
DITTO: Diffusion Inference-Time T-Optimization for Music Generation

Paper • 2401.12179 • Published Jan 22 • 18
Fast Timing-Conditioned Latent Audio Diffusion

Paper • 2402.04825 • Published Feb 7 • 7
Brain2Music: Reconstructing Music from Human Brain Activity

Paper • 2307.11078 • Published Jul 20, 2023 • 39
