Collections
Collections including paper arxiv:2402.12226
-
FaceStudio: Put Your Face Everywhere in Seconds
Paper • 2312.02663 • Published • 28
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers
Paper • 2401.08740 • Published • 10
DiffusionGPT: LLM-Driven Text-to-Image Generation System
Paper • 2401.10061 • Published • 26
MobileDiffusion: Subsecond Text-to-Image Generation on Mobile Devices
Paper • 2311.16567 • Published • 21
-
VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence
Paper • 2312.02087 • Published • 20
FaceStudio: Put Your Face Everywhere in Seconds
Paper • 2312.02663 • Published • 28
Orthogonal Adaptation for Modular Customization of Diffusion Models
Paper • 2312.02432 • Published • 12
ReconFusion: 3D Reconstruction with Diffusion Priors
Paper • 2312.02981 • Published • 8
-
Exponentially Faster Language Modelling
Paper • 2311.10770 • Published • 117
stabilityai/stable-video-diffusion-img2vid-xt
Image-to-Video • Updated • 109k • 2.29k
LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes
Paper • 2311.13384 • Published • 48
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis
Paper • 2311.12454 • Published • 27
-
Kosmos-2.5: A Multimodal Literate Model
Paper • 2309.11419 • Published • 49
Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities
Paper • 2311.05698 • Published • 6
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Paper • 2311.06242 • Published • 24
PolyMaX: General Dense Prediction with Mask Transformer
Paper • 2311.05770 • Published • 6
-
Chain-of-Verification Reduces Hallucination in Large Language Models
Paper • 2309.11495 • Published • 37
Adapting Large Language Models via Reading Comprehension
Paper • 2309.09530 • Published • 70
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Paper • 2309.09400 • Published • 77
Language Modeling Is Compression
Paper • 2309.10668 • Published • 80
-
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Paper • 2309.04662 • Published • 21
Neurons in Large Language Models: Dead, N-gram, Positional
Paper • 2309.04827 • Published • 16
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
Paper • 2309.05516 • Published • 8
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
Paper • 2309.03907 • Published • 6
-
Text2Control3D: Controllable 3D Avatar Generation in Neural Radiance Fields using Geometry-Guided Text-to-Image Diffusion Model
Paper • 2309.03550 • Published • 11
Memory Augmented Language Models through Mixture of Word Experts
Paper • 2311.10768 • Published • 16
GAIA: a benchmark for General AI Assistants
Paper • 2311.12983 • Published • 174
GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
Paper • 2311.12631 • Published • 12