OpenCoder Collection OpenCoder is an open and reproducible code LLM family which matches the performance of top-tier code LLMs. • 8 items • Updated Nov 23 • 78
To Code, or Not To Code? Exploring Impact of Code in Pre-training Paper • 2408.10914 • Published Aug 20 • 41
OLMo Suite Collection Artifacts for the first set of OLMo models. • 18 items • Updated 29 days ago • 69
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated Nov 2 • 160
The Prompt Report: A Systematic Survey of Prompting Techniques Paper • 2406.06608 • Published Jun 6 • 56
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated 13 days ago • 142
Granite Code Models Collection A series of code models trained by IBM licensed under Apache 2.0 license. We release both the base pretrained and instruct models. • 23 items • Updated 8 days ago • 180
Releasing Swift Transformers: Run On-Device LLMs in Apple Devices Article • Aug 8, 2023 • 26
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14 • 125
Flamingo: a Visual Language Model for Few-Shot Learning Paper • 2204.14198 • Published Apr 29, 2022 • 14
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated 13 days ago • 327
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement Paper • 2402.07456 • Published Feb 12 • 41
LoRA: Low-Rank Adaptation of Large Language Models Paper • 2106.09685 • Published Jun 17, 2021 • 30
Specialized Language Models with Cheap Inference from Limited Domain Data Paper • 2402.01093 • Published Feb 2 • 45
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling Paper • 2401.16380 • Published Jan 29 • 48
AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn Paper • 2306.08640 • Published Jun 14, 2023 • 26