m-ric (Aymeric Roucher)

upvoted an article 4 days ago

Article

Benchmarking Text Generation Inference

5 days ago

• 17

upvoted a paper 5 days ago

LoRA Learns Less and Forgets Less

Paper • 2405.09673 • Published 18 days ago • 73

upvoted 3 papers 6 days ago

upvoted 2 papers 7 days ago

Executable Code Actions Elicit Better LLM Agents

Paper • 2402.01030 • Published Feb 1 • 20

Your Transformer is Secretly Linear

Paper • 2405.12250 • Published 14 days ago • 136

upvoted an article 10 days ago

Article

AI has a problem with objectifying women

10 days ago

• 52

upvoted 2 articles 12 days ago

Article

Let's talk about LLM evaluation

11 days ago

• 82

Article

Introducing Spaces Dev Mode for a seamless developer experience

13 days ago

• 10

upvoted a paper 12 days ago

Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy

Paper • 2305.15294 • Published May 24, 2023 • 1

upvoted an article 14 days ago

Article

PaliGemma – Google's Cutting-Edge Open Vision Language Model

20 days ago

• 132

upvoted a paper 18 days ago

MambaOut: Do We Really Need Mamba for Vision?

Paper • 2405.07992 • Published 21 days ago • 1

upvoted an article 19 days ago

Article

2024-04-22 - Hub Incident Post Mortem

17 days ago

• 15

upvoted a paper 19 days ago

No Language Left Behind: Scaling Human-Centered Machine Translation

Paper • 2207.04672 • Published Jul 11, 2022 • 1

upvoted 2 articles 20 days ago

Article

Hugging Face x LangChain : A new partner package in LangChain

20 days ago

• 71

Article

Energy Star Ratings for AI Models

25 days ago

• 15

upvoted a paper 20 days ago

Data Interpreter: An LLM Agent For Data Science

Paper • 2402.18679 • Published Feb 28 • 1

upvoted an article 20 days ago

Article

Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints

May 1

• 53

upvoted 2 articles 21 days ago

Article

Subscribe to Enterprise Hub with your AWS Account

25 days ago

• 6

Article

License to Call: Introducing Transformers Agents 2.0

21 days ago

• 92

upvoted 2 papers 27 days ago

HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering

Paper • 1809.09600 • Published Sep 25, 2018 • 2

Training Verifiers to Solve Math Word Problems

Paper • 2110.14168 • Published Oct 27, 2021 • 4

upvoted an article 28 days ago

Article

Vision Language Models Explained

Apr 11

• 92

upvoted a paper 28 days ago

What matters when building vision-language models?

Paper • 2405.02246 • Published about 1 month ago • 87

upvoted 3 papers 29 days ago

Fast Inference from Transformers via Speculative Decoding

Paper • 2211.17192 • Published Nov 30, 2022 • 3

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Paper • 2401.10774 • Published Jan 19 • 50

Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding

Paper • 2402.12374 • Published Feb 19 • 2

upvoted an article about 1 month ago

Article

Improving Prompt Consistency with Structured Generations

Apr 30

• 46

upvoted 4 papers about 1 month ago

Dynamic Typography: Bringing Words to Life

Paper • 2404.11614 • Published Apr 17 • 40

Chinchilla Scaling: A replication attempt

Paper • 2404.10102 • Published Apr 15 • 1

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22 • 239

AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation

Paper • 2404.12753 • Published Apr 19 • 38

upvoted an article about 1 month ago

Article

Introducing the Open Chain of Thought Leaderboard

Apr 23

• 20

upvoted a paper about 1 month ago

Training-Free Long-Context Scaling of Large Language Models

Paper • 2402.17463 • Published Feb 27 • 18

upvoted 3 articles about 1 month ago

Article

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Apr 22

• 73

Article

Synthetic data: save money, time and carbon with open source

Feb 16

• 29

Article

Welcome Llama 3 - Meta's new open LLM

Apr 18

• 245

upvoted 2 collections about 2 months ago

Meta Llama 3

Collection

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Apr 18 • 563

fuck quadratic attention

Collection

11 items • Updated Apr 24 • 19

upvoted a paper about 2 months ago

Hydragen: High-Throughput LLM Inference with Shared Prefixes

Paper • 2402.05099 • Published Feb 7 • 17

upvoted an article about 2 months ago

Article

Introducing Idefics2: A Powerful 8B Vision-Language Model for the community

Apr 15

• 134

upvoted 2 papers about 2 months ago

LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale

Paper • 2208.07339 • Published Aug 15, 2022 • 4

GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers

Paper • 2210.17323 • Published Oct 31, 2022 • 6

upvoted an article about 2 months ago

Article

It's raining diffusion personalization techniques☔️🎭🖼️

Apr 11

• 16

upvoted 2 papers about 2 months ago

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Paper • 2404.07143 • Published Apr 10 • 93

Rho-1: Not All Tokens Are What You Need

Paper • 2404.07965 • Published Apr 11 • 80

upvoted an article about 2 months ago

Article

Assisted Generation: a new direction toward low-latency text generation

May 11, 2023

• 9

upvoted 2 papers about 2 months ago

LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders

Paper • 2404.05961 • Published Apr 9 • 62

OmniFusion Technical Report

Paper • 2404.06212 • Published Apr 9 • 73

upvoted 2 articles about 2 months ago

Article

Public Policy at Hugging Face

Apr 8

• 17

Article

Outpainting II - Differential Diffusion

Apr 23

• 24

upvoted 3 papers about 2 months ago

Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 139

Textbooks Are All You Need II: phi-1.5 technical report

Paper • 2309.05463 • Published Sep 11, 2023 • 84

AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Paper • 2404.03648 • Published Apr 4 • 22

upvoted an article about 2 months ago

Article

Open-source LLMs as LangChain Agents

Jan 24

• 12

upvoted 3 papers about 2 months ago

More Agents Is All You Need

Paper • 2402.05120 • Published Feb 3 • 46

Mixtral of Experts

Paper • 2401.04088 • Published Jan 8 • 153

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2 • 102

upvoted a collection 2 months ago

🤖 Agents

Collection

16 items • Updated Apr 24 • 20

Aymeric Roucher

Articles

License to Call: Introducing Transformers Agents 2.0

Open-source LLMs as LangChain Agents
