paper2read (a collection from Chuanming's Collections)
- A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions (arXiv:2312.08578, 15 upvotes)
- ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks (arXiv:2312.08583, 9 upvotes)
- Vision-Language Models as a Source of Rewards (arXiv:2312.09187, 10 upvotes)
- StemGen: A music generation model that listens (arXiv:2312.08723, 45 upvotes)
- Pearl: A Production-ready Reinforcement Learning Agent (arXiv:2312.03814, 14 upvotes)
- TinySAM: Pushing the Envelope for Efficient Segment Anything Model (arXiv:2312.13789, 13 upvotes)
- PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation (arXiv:2312.17276, 14 upvotes)
- Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (arXiv:2204.05862, 2 upvotes)
- Improving Text Embeddings with Large Language Models (arXiv:2401.00368, 72 upvotes)
- DocLLM: A layout-aware generative language model for multimodal document understanding (arXiv:2401.00908, 171 upvotes)
- Understanding LLMs: A Comprehensive Overview from Training to Inference (arXiv:2401.02038, 58 upvotes)
- A Rank Stabilization Scaling Factor for Fine-Tuning with LoRA (arXiv:2312.03732, 4 upvotes)
- Zephyr: Direct Distillation of LM Alignment (arXiv:2310.16944, 116 upvotes)
- MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts (arXiv:2401.04081, 68 upvotes)
- Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon (arXiv:2401.03462, 25 upvotes)
- MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation (arXiv:2401.04468, 46 upvotes)
- Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models (arXiv:2401.04658, 23 upvotes)
- Masked Audio Generation using a Single Non-Autoregressive Transformer (arXiv:2401.04577, 37 upvotes)
- Tuning LLMs with Contrastive Alignment Instructions for Machine Translation in Unseen, Low-resource Languages (arXiv:2401.05811, 5 upvotes)
- Self-Instruct: Aligning Language Model with Self Generated Instructions (arXiv:2212.10560, 5 upvotes)
- DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference (arXiv:2401.08671, 12 upvotes)
- Scalable Pre-training of Large Autoregressive Image Models (arXiv:2401.08541, 34 upvotes)
- DiffusionGPT: LLM-Driven Text-to-Image Generation System (arXiv:2401.10061, 26 upvotes)
- Self-Rewarding Language Models (arXiv:2401.10020, 134 upvotes)
- Zero Bubble Pipeline Parallelism (arXiv:2401.10241, 19 upvotes)
- Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads (arXiv:2401.10774, 49 upvotes)
- Lost in the Middle: How Language Models Use Long Contexts (arXiv:2307.03172, 31 upvotes)
- AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents (arXiv:2401.12963, 11 upvotes)
- Lumiere: A Space-Time Diffusion Model for Video Generation (arXiv:2401.12945, 82 upvotes)
- MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI (arXiv:2311.16502, 33 upvotes)
- Proactive Detection of Voice Cloning with Localized Watermarking (arXiv:2401.17264, 15 upvotes)
- LongAlign: A Recipe for Long Context Alignment of Large Language Models (arXiv:2401.18058, 21 upvotes)
- MobileDiffusion: Subsecond Text-to-Image Generation on Mobile Devices (arXiv:2311.16567, 21 upvotes)
- A Long Way to Go: Investigating Length Correlations in RLHF (arXiv:2310.03716, 9 upvotes)
- Efficient Exploration for LLMs (arXiv:2402.00396, 18 upvotes)
- Transforming and Combining Rewards for Aligning Large Language Models (arXiv:2402.00742, 10 upvotes)
- MoE-LLaVA: Mixture of Experts for Large Vision-Language Models (arXiv:2401.15947, 46 upvotes)
- OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset (arXiv:2402.10176, 32 upvotes)
- DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models (arXiv:2309.14509, 16 upvotes)
- MambaByte: Token-free Selective State Space Model (arXiv:2401.13660, 47 upvotes)
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters (arXiv:2311.03285, 27 upvotes)
- LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models (arXiv:2309.12307, 82 upvotes)
- NExT-GPT: Any-to-Any Multimodal LLM (arXiv:2309.05519, 74 upvotes)
- Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception (arXiv:2401.16158, 15 upvotes)
- Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM (arXiv:2403.07816, 37 upvotes)
- MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training (arXiv:2403.09611, 119 upvotes)
- Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent (arXiv:2402.09844, 16 upvotes)
- Llama 2: Open Foundation and Fine-Tuned Chat Models (arXiv:2307.09288, 233 upvotes)
- OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework (arXiv:2404.14619, 105 upvotes)
- Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding (arXiv:2404.16710, 46 upvotes)