Lincoln Gachagua

Whiteshadow12

Lwachira

Whiteshadow12's activity

upvoted a collection about 24 hours ago

DeepSeekCoder-V2

4 items • Updated 4 days ago • 28

upvoted a paper 1 day ago

GEB-1.3B: Open Lightweight Large Language Model

Paper • 2406.09900 • Published 4 days ago • 10

upvoted 2 papers 3 days ago

The Prompt Report: A Systematic Survey of Prompting Techniques

Paper • 2406.06608 • Published 12 days ago • 35

CRAG -- Comprehensive RAG Benchmark

Paper • 2406.04744 • Published 11 days ago • 31

upvoted a collection 4 days ago

SSMs

A collection of Mamba-2-based research models with 8B parameters trained on 3.5T tokens for comparison with Transformers. • 5 items • Updated 4 days ago • 10

upvoted a collection 6 days ago

Gemma Reupload

6 items • Updated 7 days ago • 3

upvoted a collection 7 days ago

RecurrentGemma Release

8 items • Updated 7 days ago • 36

upvoted a collection 13 days ago

GLM-4

GLM-4 Open Models • 4 items • Updated 13 days ago • 75

upvoted a collection 26 days ago

C4AI Aya 23

Aya 23 is an open weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. • 3 items • Updated 26 days ago • 34

upvoted a collection 28 days ago

Phi-3

Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 22 items • Updated 19 days ago • 336

upvoted a collection 29 days ago

Yi 1.5 GGUFs

Collection of Yi 1.5 GGUFs made with gguf-my-repo • 15 items • Updated 29 days ago • 4

upvoted 4 collections about 1 month ago

Yi-1.5 (2024/05)

10 items • Updated 30 days ago • 80

MAmmoTH2

Scaling up instruction data from the web to build better LLMs • 11 items • Updated 23 days ago • 7

Searching for Better ViT Baselines

Exploring ViT hparams and model shapes for the GPU poor (between tiny and base). • 19 items • Updated 6 days ago • 10

PaliGemma Release

Pretrained and mix checkpoints for PaliGemma • 11 items • Updated May 17 • 115

upvoted 2 collections 2 months ago

Meta Llama 3

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Apr 18 • 594

[lecture artifacts] aligning open language models

artifacts referenced in the talk timeline! Slides: https://docs.google.com/presentation/d/1quMyI4BAx4rvcDfk8jjv063bmHg4RxZd9mhQloXpMn0/edit?usp=sharin • 63 items • Updated Apr 17 • 49

upvoted a paper 3 months ago

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2 • 102

upvoted a collection 3 months ago

Zeroshot Classifiers

These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer/better. • 11 items • Updated Apr 3 • 81

upvoted a paper 4 months ago

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Paper • 2402.14905 • Published Feb 22 • 81