TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting • Paper • 2503.17032 • Published 13 days ago • 22
Infinite Mobility: Scalable High-Fidelity Synthesis of Articulated Objects via Procedural Generation • Paper • 2503.13424 • Published 17 days ago • 28
Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech Representations • Paper • 2503.06273 • Published 26 days ago • 5
QE4PE: Word-level Quality Estimation for Human Post-Editing • Paper • 2503.03044 • Published 30 days ago • 6
LONGCODEU: Benchmarking Long-Context Language Models on Long Code Understanding • Paper • 2503.04359 • Published 28 days ago • 6
R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning • Paper • 2503.05379 • Published 27 days ago • 33
EuroBERT: Scaling Multilingual Encoders for European Languages • Paper • 2503.05500 • Published 27 days ago • 74
Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation • Paper • 2503.01370 • Published Mar 3 • 14
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM • Paper • 2503.04724 • Published 28 days ago • 68
Persian Text Datasets • Collection • Collection of some good Persian datasets • 21 items • Updated 28 days ago • 3
Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities • Paper • 2503.03983 • Published 29 days ago • 22