Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published 25 days ago • 230
view article Article The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare 28 days ago • 64
Uni-SMART: Universal Science Multimodal Analysis and Research Transformer Paper • 2403.10301 • Published Mar 15 • 50
Larimar: Large Language Models with Episodic Memory Control Paper • 2403.11901 • Published Mar 18 • 30
PERL: Parameter Efficient Reinforcement Learning from Human Feedback Paper • 2403.10704 • Published Mar 15 • 54
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14 • 119
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM Paper • 2403.07816 • Published Mar 12 • 37
MoAI: Mixture of All Intelligence for Large Language and Vision Models Paper • 2403.07508 • Published Mar 12 • 69
Gemma: Open Models Based on Gemini Research and Technology Paper • 2403.08295 • Published Mar 13 • 42
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 131
GPTVQ: The Blessing of Dimensionality for LLM Quantization Paper • 2402.15319 • Published Feb 23 • 19
Do Large Language Models Latently Perform Multi-Hop Reasoning? Paper • 2402.16837 • Published Feb 26 • 24
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding Paper • 2402.16671 • Published Feb 26 • 26
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs Paper • 2402.15627 • Published Feb 23 • 31
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method Paper • 2402.17193 • Published Feb 27 • 23
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 566
Simple linear attention language models balance the recall-throughput tradeoff Paper • 2402.18668 • Published Feb 28 • 17
Resonance RoPE: Improving Context Length Generalization of Large Language Models Paper • 2403.00071 • Published Feb 29 • 18
Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters Paper • 2403.02677 • Published Mar 5 • 16
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect Paper • 2403.03853 • Published Mar 6 • 61
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6 • 172
Common 7B Language Models Already Possess Strong Math Capabilities Paper • 2403.04706 • Published Mar 7 • 16
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context Paper • 2403.05530 • Published Mar 8 • 49
Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU Paper • 2403.06504 • Published Mar 11 • 52
Life Science, Health and Medical Datasets for ML Collection A collection of datasets for Medical Domain • 3 items • Updated Feb 22 • 1
Gemini Goes to Med School: Exploring the Capabilities of Multimodal Large Language Models on Medical Challenge Problems & Hallucinations Paper • 2402.07023 • Published Feb 10 • 3
OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models Paper • 2308.01390 • Published Aug 2, 2023 • 30
Med-HALT: Medical Domain Hallucination Test for Large Language Models Paper • 2307.15343 • Published Jul 28, 2023 • 2