Collections
Discover the best community collections!
Collections including paper arxiv:2311.00945
-
Random Field Augmentations for Self-Supervised Representation Learning
Paper • 2311.03629 • Published • 6 -
Levels of AGI: Operationalizing Progress on the Path to AGI
Paper • 2311.02462 • Published • 30 -
Idempotent Generative Network
Paper • 2311.01462 • Published • 22 -
E3 TTS: Easy End-to-End Diffusion-based Text to Speech
Paper • 2311.00945 • Published • 11
-
Matryoshka Diffusion Models
Paper • 2310.15111 • Published • 39 -
Data Filtering Networks
Paper • 2309.17425 • Published • 6 -
FlashDecoding++: Faster Large Language Model Inference on GPUs
Paper • 2311.01282 • Published • 30 -
E3 TTS: Easy End-to-End Diffusion-based Text to Speech
Paper • 2311.00945 • Published • 11
-
Chain-of-Verification Reduces Hallucination in Large Language Models
Paper • 2309.11495 • Published • 37 -
Adapting Large Language Models via Reading Comprehension
Paper • 2309.09530 • Published • 69 -
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Paper • 2309.09400 • Published • 77 -
Language Modeling Is Compression
Paper • 2309.10668 • Published • 79
-
InstructDiffusion: A Generalist Modeling Interface for Vision Tasks
Paper • 2309.03895 • Published • 11 -
ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning
Paper • 2309.16650 • Published • 7 -
CCEdit: Creative and Controllable Video Editing via Diffusion Models
Paper • 2309.16496 • Published • 7 -
FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling
Paper • 2310.15169 • Published • 8
-
Large-Scale Automatic Audiobook Creation
Paper • 2309.03926 • Published • 52 -
FoleyGen: Visually-Guided Audio Generation
Paper • 2309.10537 • Published • 6 -
MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models
Paper • 2310.11954 • Published • 24 -
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Paper • 2310.00704 • Published • 16
-
Matcha-TTS: A fast TTS architecture with conditional flow matching
Paper • 2309.03199 • Published • 10 -
E3 TTS: Easy End-to-End Diffusion-based Text to Speech
Paper • 2311.00945 • Published • 11 -
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
Paper • 2311.00430 • Published • 53 -
coqui/XTTS-v2
Text-to-Speech • Updated • 304k • 1.25k