The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 566
Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data Paper • 2107.10833 • Published Jul 22, 2021 • 2
Petals: Collaborative Inference and Fine-tuning of Large Models Paper • 2209.01188 • Published Sep 2, 2022 • 2
YaRN: Efficient Context Window Extension of Large Language Models Paper • 2309.00071 • Published Aug 31, 2023 • 58
4D Gaussian Splatting for Real-Time Dynamic Scene Rendering Paper • 2310.08528 • Published Oct 12, 2023 • 2
The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A" Paper • 2309.12288 • Published Sep 21, 2023 • 3
Exploring the MIT Mathematics and EECS Curriculum Using Large Language Models Paper • 2306.08997 • Published Jun 15, 2023 • 10
AstroLLaMA: Towards Specialized Foundation Models in Astronomy Paper • 2309.06126 • Published Sep 12, 2023 • 16
Llama 2: Open Foundation and Fine-Tuned Chat Models Paper • 2307.09288 • Published Jul 18, 2023 • 235
Flacuna: Unleashing the Problem Solving Power of Vicuna using FLAN Fine-Tuning Paper • 2307.02053 • Published Jul 5, 2023 • 23
RepoFusion: Training Code Models to Understand Your Repository Paper • 2306.10998 • Published Jun 19, 2023 • 13
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness Paper • 2205.14135 • Published May 27, 2022 • 8
GPT-Fathom: Benchmarking Large Language Models to Decipher the Evolutionary Path towards GPT-4 and Beyond Paper • 2309.16583 • Published Sep 28, 2023 • 12
Retentive Network: A Successor to Transformer for Large Language Models Paper • 2307.08621 • Published Jul 17, 2023 • 167
Direct Preference Optimization: Your Language Model is Secretly a Reward Model Paper • 2305.18290 • Published May 29, 2023 • 37
The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision) Paper • 2309.17421 • Published Sep 29, 2023 • 4
Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models Paper • 2310.06117 • Published Oct 9, 2023 • 3
GPT4AIGChip: Towards Next-Generation AI Accelerator Design Automation via Large Language Models Paper • 2309.10730 • Published Sep 19, 2023 • 2
SALMONN: Towards Generic Hearing Abilities for Large Language Models Paper • 2310.13289 • Published Oct 20, 2023 • 16
Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis Paper • 2308.09713 • Published Aug 18, 2023 • 2
HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion Paper • 2310.08579 • Published Oct 12, 2023 • 14
GaussianDreamer: Fast Generation from Text to 3D Gaussian Splatting with Point Cloud Priors Paper • 2310.08529 • Published Oct 12, 2023 • 16
AniPortraitGAN: Animatable 3D Portrait Generation from 2D Image Collections Paper • 2309.02186 • Published Sep 5, 2023 • 19
Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation Paper • 2310.08541 • Published Oct 12, 2023 • 17
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation Paper • 2309.15818 • Published Sep 27, 2023 • 18
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis Paper • 2310.00426 • Published Sep 30, 2023 • 60
Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion Paper • 2301.11757 • Published Jan 27, 2023 • 3
An Emulator for Fine-Tuning Large Language Models using Small Language Models Paper • 2310.12962 • Published Oct 19, 2023 • 13
Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning Paper • 2310.06694 • Published Oct 10, 2023 • 3
CodeFusion: A Pre-trained Diffusion Model for Code Generation Paper • 2310.17680 • Published Oct 26, 2023 • 68
Multimodal ChatGPT for Medical Applications: an Experimental Study of GPT-4V Paper • 2310.19061 • Published Oct 29, 2023 • 8
TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise Paper • 2310.19019 • Published Oct 29, 2023 • 9
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models Paper • 2309.12284 • Published Sep 21, 2023 • 16
ChatCoder: Chat-based Refine Requirement Improves LLMs' Code Generation Paper • 2311.00272 • Published Nov 1, 2023 • 8
LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing Paper • 2311.00571 • Published Nov 1, 2023 • 39
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning Paper • 2307.04725 • Published Jul 10, 2023 • 63
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks Paper • 2005.11401 • Published May 22, 2020 • 11
Levels of AGI: Operationalizing Progress on the Path to AGI Paper • 2311.02462 • Published Nov 4, 2023 • 30
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation Paper • 2305.01210 • Published May 2, 2023 • 4
GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs Paper • 2311.04901 • Published Nov 8, 2023 • 6
GPT4All: An Ecosystem of Open Source Compressed Language Models Paper • 2311.04931 • Published Nov 6, 2023 • 20
S-LoRA: Serving Thousands of Concurrent LoRA Adapters Paper • 2311.03285 • Published Nov 6, 2023 • 27
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents Paper • 2311.05437 • Published Nov 9, 2023 • 40
Can ChatGPT Assess Human Personalities? A General Evaluation Framework Paper • 2303.01248 • Published Mar 1, 2023 • 1
Deep Unlearning via Randomized Conditionally Independent Hessians Paper • 2204.07655 • Published Apr 15, 2022 • 1
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data Paper • 2309.11235 • Published Sep 20, 2023 • 15
Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation Paper • 1811.09393 • Published Nov 23, 2018 • 1
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis Paper • 2307.01952 • Published Jul 4, 2023 • 74
HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution Paper • 2306.15794 • Published Jun 27, 2023 • 16
ChatDoctor: A Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge Paper • 2303.14070 • Published Mar 24, 2023 • 8
Orca 2: Teaching Small Language Models How to Reason Paper • 2311.11045 • Published Nov 18, 2023 • 69
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection Paper • 2311.10122 • Published Nov 16, 2023 • 25
ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs Paper • 2311.13600 • Published Nov 22, 2023 • 41
Scalable Extraction of Training Data from (Production) Language Models Paper • 2311.17035 • Published Nov 28, 2023 • 4
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback Paper • 2309.00267 • Published Sep 1, 2023 • 45
MEDITRON-70B: Scaling Medical Pretraining for Large Language Models Paper • 2311.16079 • Published Nov 27, 2023 • 18
Tree of Attacks: Jailbreaking Black-Box LLMs Automatically Paper • 2312.02119 • Published Dec 4, 2023 • 1
Hyena Hierarchy: Towards Larger Convolutional Language Models Paper • 2302.10866 • Published Feb 21, 2023 • 6
Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models Paper • 2312.04724 • Published Dec 7, 2023 • 18
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity Paper • 2101.03961 • Published Jan 11, 2021 • 13
Silkie: Preference Distillation for Large Visual Language Models Paper • 2312.10665 • Published Dec 17, 2023 • 10
Osprey: Pixel Understanding with Visual Instruction Tuning Paper • 2312.10032 • Published Dec 15, 2023 • 4
Gemini: A Family of Highly Capable Multimodal Models Paper • 2312.11805 • Published Dec 19, 2023 • 44
Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code Paper • 2308.03109 • Published Aug 6, 2023 • 1
LM-Cocktail: Resilient Tuning of Language Models via Model Merging Paper • 2311.13534 • Published Nov 22, 2023 • 3
Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4 Paper • 2312.16171 • Published Dec 26, 2023 • 30
DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 173
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism Paper • 2401.02954 • Published Jan 5, 2024 • 38
Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition Paper • 2305.05084 • Published May 8, 2023 • 1
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning Paper • 2401.01325 • Published Jan 2, 2024 • 24
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia Paper • 2305.14292 • Published May 23, 2023 • 1
I am a Strange Dataset: Metalinguistic Tests for Language Models Paper • 2401.05300 • Published Jan 10, 2024 • 4
Lumiere: A Space-Time Diffusion Model for Video Generation Paper • 2401.12945 • Published Jan 23, 2024 • 82
MobileDiffusion: Subsecond Text-to-Image Generation on Mobile Devices Paper • 2311.16567 • Published Nov 28, 2023 • 21
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning Paper • 2402.04833 • Published Feb 7, 2024 • 6
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models Paper • 2402.07033 • Published Feb 10, 2024 • 16
GraphCast: Learning skillful medium-range global weather forecasting Paper • 2212.12794 • Published Dec 24, 2022 • 1
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset Paper • 2402.10176 • Published Feb 15, 2024 • 33
GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering Paper • 2402.10128 • Published Feb 15, 2024 • 14
SDXL-Lightning: Progressive Adversarial Diffusion Distillation Paper • 2402.13929 • Published Feb 21, 2024 • 24
Neural Circuit Diagrams: Robust Diagrams for the Communication, Implementation, and Analysis of Deep Learning Architectures Paper • 2402.05424 • Published Feb 8, 2024 • 17
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models Paper • 2402.17177 • Published Feb 27, 2024 • 87
Adapting Large Language Models via Reading Comprehension Paper • 2309.09530 • Published Sep 18, 2023 • 69
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation Paper • 2403.16990 • Published Mar 25, 2024 • 24
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time Paper • 2404.10667 • Published Apr 16, 2024 • 12
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published Apr 10, 2024 • 92
FIFO-Diffusion: Generating Infinite Videos from Text without Training Paper • 2405.11473 • Published 3 days ago • 37