Collections
Collections including paper arxiv:2404.16811
- Rho-1: Not All Tokens Are What You Need
  Paper • 2404.07965 • Published • 87
- VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
  Paper • 2404.10667 • Published • 18
- Instruction-tuned Language Models are Better Knowledge Learners
  Paper • 2402.12847 • Published • 25
- DoRA: Weight-Decomposed Low-Rank Adaptation
  Paper • 2402.09353 • Published • 26

- Is Cosine-Similarity of Embeddings Really About Similarity?
  Paper • 2403.05440 • Published • 3
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning
  Paper • 2402.16829 • Published
- Make Your LLM Fully Utilize the Context
  Paper • 2404.16811 • Published • 53
- KAN: Kolmogorov-Arnold Networks
  Paper • 2404.19756 • Published • 108

- MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
  Paper • 2403.09611 • Published • 125
- Evolutionary Optimization of Model Merging Recipes
  Paper • 2403.13187 • Published • 50
- MobileVLM V2: Faster and Stronger Baseline for Vision Language Model
  Paper • 2402.03766 • Published • 12
- LLM Agent Operating System
  Paper • 2403.16971 • Published • 65

- Simple and Scalable Strategies to Continually Pre-train Large Language Models
  Paper • 2403.08763 • Published • 49
- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • 2403.19887 • Published • 104
- Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs
  Paper • 2403.20041 • Published • 34
- Advancing LLM Reasoning Generalists with Preference Trees
  Paper • 2404.02078 • Published • 44

- Training-Free Long-Context Scaling of Large Language Models
  Paper • 2402.17463 • Published • 19
- Evaluating Very Long-Term Conversational Memory of LLM Agents
  Paper • 2402.17753 • Published • 18
- Resonance RoPE: Improving Context Length Generalization of Large Language Models
  Paper • 2403.00071 • Published • 22
- BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences
  Paper • 2403.09347 • Published • 20

- InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory
  Paper • 2402.04617 • Published • 4
- BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences
  Paper • 2403.09347 • Published • 20
- Resonance RoPE: Improving Context Length Generalization of Large Language Models
  Paper • 2403.00071 • Published • 22
- Training-Free Long-Context Scaling of Large Language Models
  Paper • 2402.17463 • Published • 19

- Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon
  Paper • 2401.03462 • Published • 27
- MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers
  Paper • 2305.07185 • Published • 9
- YaRN: Efficient Context Window Extension of Large Language Models
  Paper • 2309.00071 • Published • 65
- Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache
  Paper • 2401.02669 • Published • 14

- VideoBooth: Diffusion-based Video Generation with Image Prompts
  Paper • 2312.00777 • Published • 21
- MotionCtrl: A Unified and Flexible Motion Controller for Video Generation
  Paper • 2312.03641 • Published • 20
- GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation
  Paper • 2312.04557 • Published • 12
- DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
  Paper • 2312.04433 • Published • 9

- Statistical Rejection Sampling Improves Preference Optimization
  Paper • 2309.06657 • Published • 13
- In-Context Learning Creates Task Vectors
  Paper • 2310.15916 • Published • 42
- Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
  Paper • 2404.08801 • Published • 64
- Make Your LLM Fully Utilize the Context
  Paper • 2404.16811 • Published • 53