Llama3-8B-1.58 Collection A trio of powerful models: fine-tuned from Llama3-8b-Instruct, with BitNet architecture! • 3 items • Updated Sep 14 • 12
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization Paper • 2405.15071 • Published May 23 • 37
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated 12 days ago • 142
LLaVA++ (LLaMA-3 and Phi-3-Mini) Collection Extending Visual Capabilities of LLaVA with LLaMA-3 and Phi-3 • 11 items • Updated Jun 11 • 23
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published Apr 10 • 104
Voyager: An Open-Ended Embodied Agent with Large Language Models Paper • 2305.16291 • Published May 25, 2023 • 9
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge Paper • 2206.08853 • Published Jun 17, 2022 • 1
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 19 days ago • 696
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing Paper • 2404.12253 • Published Apr 18 • 53
Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels Paper • 2312.17090 • Published Dec 28, 2023 • 4
Generative AI Beyond LLMs: System Implications of Multi-Modal Generation Paper • 2312.14385 • Published Dec 22, 2023 • 5
TinyGSM: achieving >80% on GSM8k with small language models Paper • 2312.09241 • Published Dec 14, 2023 • 37