Thomas Wolf PRO

thomwolf

https://thomwolf.io

Thom_wolf

thomwolf

AI & ML interests

NLP and open-source :-)

Articles

Organizations

thomwolf's activity

upvoted a collection 3 days ago

Meta Llama 3

Collection

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated about 1 month ago • 522

upvoted a paper 4 days ago

OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents

Paper • 2306.16527 • Published Jun 21, 2023 • 42

upvoted an article 4 days ago

Article

Introducing IDEFICS: An Open Reproduction of State-of-the-art Visual Language Model

Aug 22, 2023

• 9

upvoted an article 11 days ago

Article

Improving Prompt Consistency with Structured Generations

19 days ago

• 41

upvoted a paper 13 days ago

The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models

Paper • 2404.05904 • Published Apr 8 • 1

upvoted 2 papers 26 days ago

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Paper • 2402.09844 • Published Feb 15 • 18

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published 27 days ago • 230

upvoted a paper 27 days ago

A Generalist Agent

Paper • 2205.06175 • Published May 12, 2022 • 2

upvoted an article about 1 month ago

Article

Welcome Llama 3 - Meta's new open LLM

Apr 18

• 238

upvoted a paper about 1 month ago

ORPO: Monolithic Preference Optimization without Reference Model

Paper • 2403.07691 • Published Mar 12 • 54

upvoted 9 articles about 1 month ago

Article

Public Policy at Hugging Face

Apr 8

• 16

Article

Orchestration of Experts: The First-Principle Multi-Model System

Apr 16

• 8

Article

Total noob’s intro to Hugging Face Transformers

Mar 22

• 19

Article

Pollen-Vision: Unified interface for Zero-Shot vision models in robotics

Mar 25

• 6

Article

Custom architectures with HuggingFace 🤗

27 days ago

• 20

Article

Open Source All About Data Processing, Dataverse

Apr 4

• 2

Article

quanto: a pytorch quantization toolkit

Mar 18

• 13

Article

Hugging Face partners with Wiz Research to Improve AI Security

Apr 4

• 10

Article

The LASER technique: Evaluating SVD compression

Apr 4

• 6

upvoted 2 papers about 2 months ago

SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning

Paper • 2401.16013 • Published Jan 29 • 17

QuRating: Selecting High-Quality Data for Training Language Models

Paper • 2402.09739 • Published Feb 15 • 3

upvoted 3 papers 2 months ago

Simple linear attention language models balance the recall-throughput tradeoff

Paper • 2402.18668 • Published Feb 28 • 17

Yi: Open Foundation Models by 01.AI

Paper • 2403.04652 • Published Mar 7 • 58

A Survey on Data Selection for Language Models

Paper • 2402.16827 • Published Feb 26 • 3

upvoted a paper 3 months ago

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29 • 123

upvoted a collection 3 months ago

Gemma release

Collection

Groups the Gemma models released by the Google team. • 40 items • Updated 4 days ago • 303

upvoted 3 papers 3 months ago

Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning

Paper • 2402.03046 • Published Feb 5 • 4

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5 • 61

SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30 • 22

upvoted a paper 4 months ago

Mixtral of Experts

Paper • 2401.04088 • Published Jan 8 • 152

upvoted 2 collections 5 months ago

Leaderboards and benchmarks ✨

Collection

Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 61 items • Updated 5 days ago • 59

Paloma

Collection

Dataset and baseline models for Paloma, a benchmark of language model fit to 585 textual domains • 8 items • Updated Feb 1 • 13

upvoted 2 papers 5 months ago

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Paper • 2312.11514 • Published Dec 12, 2023 • 253

Paloma: A Benchmark for Evaluating Language Model Fit

Paper • 2312.10523 • Published Dec 16, 2023 • 11

upvoted 2 collections 5 months ago

Journal Club

Collection

Candidate papers to read in the H4 journal club • 54 items • Updated 27 days ago • 23

Recent models: last 100 repos, sorted by creation date

Collection

The last 100 repos I have created. Sorted by creation date descending, so the most recently created repos appear at the top. • 121 items • Updated Jan 31 • 446

upvoted 2 papers 5 months ago

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

Paper • 2311.16502 • Published Nov 27, 2023 • 33

Scaling Data-Constrained Language Models

Paper • 2305.16264 • Published May 25, 2023 • 16

upvoted 5 papers 6 months ago

The Falcon Series of Open Language Models

Paper • 2311.16867 • Published Nov 28, 2023 • 11

GPQA: A Graduate-Level Google-Proof Q&A Benchmark

Paper • 2311.12022 • Published Nov 20, 2023 • 22

Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2

Paper • 2311.10702 • Published Nov 17, 2023 • 17

System 2 Attention (is something you might need too)

Paper • 2311.11829 • Published Nov 20, 2023 • 38

GAIA: a benchmark for General AI Assistants

Paper • 2311.12983 • Published Nov 21, 2023 • 171

upvoted a collection 6 months ago

Top 10% instruction tuning datasets

Collection

Collects datasets with 'instruction' in the name and more than 1 download and in the top 10% for the number of likes • 13 items • Updated Sep 25, 2023 • 6

upvoted a paper 6 months ago

Orca 2: Teaching Small Language Models How to Reason

Paper • 2311.11045 • Published Nov 18, 2023 • 68

upvoted a collection 6 months ago

GAIA release

Collection

Gather the items of the GAIA release • 4 items • Updated Nov 23, 2023 • 17

upvoted a collection 7 months ago

zephyr story

Collection

sources mentioned by hf.co/thomwolf tweet: x.com/Thom_Wolf/status/1720503998518640703 • 8 items • Updated Jan 24 • 15

upvoted a paper 7 months ago

Zephyr: Direct Distillation of LM Alignment

Paper • 2310.16944 • Published Oct 25, 2023 • 116

upvoted a collection 7 months ago

Historical - Spaces of the Week

Collection

All Spaces of the Week...from all weeks • 636 items • Updated Jan 17 • 19

upvoted 2 papers 7 months ago

BitNet: Scaling 1-bit Transformers for Large Language Models

Paper • 2310.11453 • Published Oct 17, 2023 • 94

Mistral 7B

Paper • 2310.06825 • Published Oct 10, 2023 • 42

upvoted a collection 7 months ago

LLM Leaderboard best models ❤️‍🔥

Collection

A daily uploaded list of models with best evaluations on the LLM leaderboard: • 70 items • Updated 2 days ago • 304

upvoted 3 papers 8 months ago

Language Modeling Is Compression

Paper • 2309.10668 • Published Sep 19, 2023 • 79

A Function Interpretation Benchmark for Evaluating Interpretability Methods

Paper • 2309.03886 • Published Sep 7, 2023 • 1

Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers

Paper • 2309.08532 • Published Sep 15, 2023 • 50

upvoted 2 papers 9 months ago

Efficient RLHF: Reducing the Memory Usage of PPO

Paper • 2309.00754 • Published Sep 1, 2023 • 13

One Wide Feedforward is All You Need

Paper • 2309.01826 • Published Sep 4, 2023 • 31

upvoted 2 papers 10 months ago

On the Origin of LLMs: An Evolutionary Tree and Graph for 15,821 Large Language Models

Paper • 2307.09793 • Published Jul 19, 2023 • 45

Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 235

upvoted a paper 11 months ago

How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources

Paper • 2306.04751 • Published Jun 7, 2023 • 4

Articles

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Constitutional AI with Open LLMs

Open LLM Leaderboard: DROP deep dive

What's going on with the Open LLM Leaderboard?

Can foundation models label data like humans?
