LLMs achieve adult human performance on higher-order theory of mind tasks Paper • 2405.18870 • Published 3 days ago • 12
Xwin-LM: Strong and Scalable Alignment Practice for LLMs Paper • 2405.20335 • Published 2 days ago • 10
MotionLLM: Understanding Human Behaviors from Human Motions and Videos Paper • 2405.20340 • Published 2 days ago • 11
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts Paper • 2405.19893 • Published 2 days ago • 11
M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models Paper • 2405.15638 • Published 8 days ago • 1
Transformers Can Do Arithmetic with the Right Embeddings Paper • 2405.17399 • Published 5 days ago • 44
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training Paper • 2405.15319 • Published 8 days ago • 19
Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach Paper • 2405.15613 • Published 8 days ago • 11
Aya 23: Open Weight Releases to Further Multilingual Progress Paper • 2405.15032 • Published 9 days ago • 21
ReVideo: Remake a Video with Motion and Content Control Paper • 2405.13865 • Published 10 days ago • 19
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts Paper • 2405.11273 • Published 14 days ago • 15
LANISTR: Multimodal Learning from Structured and Unstructured Data Paper • 2305.16556 • Published May 26, 2023 • 2
Diffusion for World Modeling: Visual Details Matter in Atari Paper • 2405.12399 • Published 12 days ago • 25
FIFO-Diffusion: Generating Infinite Videos from Text without Training Paper • 2405.11473 • Published 13 days ago • 50
Observational Scaling Laws and the Predictability of Language Model Performance Paper • 2405.10938 • Published 15 days ago • 10
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Paper • 2405.12130 • Published 12 days ago • 41
Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity Paper • 2403.14403 • Published Mar 21 • 6
Many-Shot In-Context Learning in Multimodal Foundation Models Paper • 2405.09798 • Published 17 days ago • 25
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection Paper • 2310.11511 • Published Oct 17, 2023 • 63
Chameleon: Mixed-Modal Early-Fusion Foundation Models Paper • 2405.09818 • Published 17 days ago • 95
CLIP with Quality Captions: A Strong Pretraining for Vision Tasks Paper • 2405.08911 • Published 18 days ago • 1
Compositional Text-to-Image Generation with Dense Blob Representations Paper • 2405.08246 • Published 19 days ago • 11
A Careful Examination of Large Language Model Performance on Grade School Arithmetic Paper • 2405.00332 • Published May 1 • 24
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? Paper • 2405.05904 • Published 23 days ago • 5
Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation Paper • 2404.19752 • Published Apr 30 • 19
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published Apr 30 • 61
Proving Test Set Contamination in Black Box Language Models Paper • 2310.17623 • Published Oct 26, 2023 • 1
Is ImageNet worth 1 video? Learning strong image encoders from 1 long unlabelled video Paper • 2310.08584 • Published Oct 12, 2023 • 2
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory Paper • 2310.17884 • Published Oct 27, 2023 • 1
WildChat: 1M ChatGPT Interaction Logs in the Wild Paper • 2405.01470 • Published about 1 month ago • 53
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published Apr 29 • 115
FLAME: Factuality-Aware Alignment for Large Language Models Paper • 2405.01525 • Published 30 days ago • 21
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published 30 days ago • 102
InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation Paper • 2404.19427 • Published Apr 30 • 65
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models Paper • 2404.18796 • Published Apr 29 • 63
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites Paper • 2404.16821 • Published Apr 25 • 49
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22 • 122
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models Paper • 2404.14507 • Published Apr 22 • 21
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions Paper • 2404.13208 • Published Apr 19 • 37
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study Paper • 2404.14047 • Published Apr 22 • 37
SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation Paper • 2404.14396 • Published Apr 22 • 17
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22 • 238
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models Paper • 2404.13013 • Published Apr 19 • 26
TextSquare: Scaling up Text-Centric Visual Instruction Tuning Paper • 2404.12803 • Published Apr 19 • 27