ybelkada (Younes Belkada)

upvoted a collection 8 days ago

BitNet

Collection

🔥BitNet family of large language models (1-bit LLMs). • 6 items • Updated 5 days ago • 28

upvoted an article 2 months ago

Article

The Open Arabic LLM Leaderboard 2

Feb 10

• 31

upvoted a collection 4 months ago

Falcon3

Collection

Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated Feb 13 • 86

upvoted a paper 6 months ago

Falcon Mamba: The First Competitive Attention-free 7B Language Model

Paper • 2410.05355 • Published Oct 7, 2024 • 36

upvoted an article 8 months ago

Article

Welcome FalconMamba: The first strong attention-free 7B model

Aug 12, 2024

• 110

upvoted a collection 8 months ago

FalconMamba 7B

Collection

This collection features the FalconMamba 7B base model, the instruction-tuned version, their 4-bit and GGUF variants, and the demo. • 15 items • Updated Feb 13 • 34

upvoted a collection 10 months ago

4M Models

Collection

Multimodal models from https://4m.epfl.ch/ • 17 items • Updated Mar 7 • 31

upvoted a paper 10 months ago

Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning

Paper • 2303.02861 • Published Mar 6, 2023 • 2

upvoted a paper 11 months ago

XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model

Paper • 2406.04904 • Published Jun 7, 2024 • 9

upvoted a collection 11 months ago

AQLM+PV

Collection

Official AQLM quantizations for "PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression": https://arxiv.org/abs/2405.14852 • 26 items • Updated Feb 28 • 21

upvoted a paper 11 months ago

Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations

Paper • 2405.18392 • Published May 28, 2024 • 12

upvoted 2 articles 12 months ago

Article

Overview of natively supported quantization schemes in 🤗 Transformers

Sep 12, 2023

• 12

Article

Mixture of Experts Explained

Dec 11, 2023

• 570

upvoted a collection 12 months ago

Meta Llama 3

Collection

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 749

upvoted 3 papers 12 months ago

Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

upvoted a collection about 1 year ago

Pile-T5

Collection

T5 trained on the Pile with Llama Tokenizer • 4 items • Updated Feb 26 • 17

upvoted a paper about 1 year ago

ORPO: Monolithic Preference Optimization without Reference Model

Paper • 2403.07691 • Published Mar 12, 2024 • 65

upvoted an article about 1 year ago

Article

Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA

May 24, 2023

• 140