mlabonne (Maxime Labonne)

replied to drwlf's post 24 days ago

That's super cool, congrats! :)

reacted to drwlf's post with ❤️🤗 24 days ago

Post

5411

Having an insanely good medical LLM is pointless if it won’t answer your questions!

So we’ve made 2 notebooks for abliterating any model in order to achieve a good model that will actually help you!

The notebooks are made using @mlabonne's abliteration logic and datasets!

Feel free to use them and happy training 😊

https://github.com/dralexlup/LLM-Abliteration

3 replies

·

replied to their post 27 days ago

Will do at some point, but I don't have time to write this down at the moment.

reacted to burtenshaw's post with 🚀❤️🤗 3 months ago

Post

3352

NEW UNIT in the Hugging Face Reasoning course. We dive deep into the algorithm behind DeepSeek R1 with an advanced and hands-on guide to interpreting GRPO.

🔗

reasoning-course

This unit is super useful if you’re tuning models with reinforcement learning. It will help with:

- interpreting loss and reward progression during training runs
- selecting effective parameters for training
- reviewing and defining effective reward functions
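
As a rough illustration of what sits behind those curves: GRPO, as described in the DeepSeek papers, scores each sampled completion relative to the other completions drawn for the same prompt. The sketch below shows that group-relative normalization in plain Python. The function name, the `eps` value, and the choice of population standard deviation are assumptions for illustration, not taken from the course unit.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize the rewards of one group of completions for a single prompt.

    Each advantage is (reward - group mean) / (group std + eps).
    `eps` is a hypothetical small constant that avoids division by zero.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy group: four completions for one prompt, two rewarded and two not.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])

# A group where every completion earns the same reward.
flat = group_relative_advantages([0.5, 0.5, 0.5])
```

If every completion in a group earns the same reward, all advantages are zero and that group contributes no learning signal, which is one thing to check when a reward curve looks flat.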

This unit also works up smoothly toward the existing practical exercises from @mlabonne and Unsloth.

📣 Shout out to @ShirinYamani who wrote the unit. Follow for more great content.

1 reply

·

posted an update 4 months ago

Post

16238

✂️ AutoAbliteration

I made a Colab notebook to automatically abliterate models.

It's quite general, so you can do interesting stuff like blocking a given language in the model outputs.

💻 Colab: https://colab.research.google.com/drive/1RmLv-pCMBBsQGXQIM8yF-OdCNyoylUR1?usp=sharing

1 reply

·

posted an update 4 months ago

Post

6347

✂️ Gemma 3 Abliterated

I noticed that Gemma 3 was much more resilient to refusal removal than other models like Qwen 2.5.

I experimented with different recipes and improved the abliteration technique I wrote about last year.

It's still experimental but the refusal rate is super low in my tests. Enjoy!

mlabonne/gemma-3-4b-it-abliterated
mlabonne/gemma-3-12b-it-abliterated
mlabonne/gemma-3-27b-it-abliterated
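
For readers new to the term, here is a minimal sketch of the basic idea usually meant by abliteration: estimate a "refusal direction" as the normalized difference between mean activations on two prompt sets, then project that direction out. This is not the improved recipe mentioned in the post. The function names and toy vectors are hypothetical, and real activations would come from a model's hidden states rather than hand-written lists.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along the difference of the two mean activations.

    Assumes the two means differ, otherwise the norm would be zero.
    """
    diff = [a - b for a, b in zip(mean_vector(harmful_acts),
                                  mean_vector(harmless_acts))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]


def ablate(vector, direction):
    """Remove the component of `vector` that lies along unit `direction`."""
    dot = sum(v * d for v, d in zip(vector, direction))
    return [v - dot * d for v, d in zip(vector, direction)]


# Toy 3-dimensional "activations" (made-up numbers for illustration only).
harmful = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]]
harmless = [[0.0, 2.0, 0.0], [0.0, 2.0, 0.0]]

direction = refusal_direction(harmful, harmless)
cleaned = ablate([5.0, 1.0, -2.0], direction)
```

After ablation the vector has no component left along the estimated direction, while its other components are unchanged.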

4 replies

·

reacted to burtenshaw's post with 🤗❤️ 4 months ago

Post

3938

I’m super excited to work with @mlabonne to build the first practical example in the reasoning course.

🔗

reasoning-course

Here's a quick walkthrough of the first drop of material that works toward the use case:

- First, a fundamental introduction to reinforcement learning, answering questions like ‘what is a reward?’ and ‘how do we create an environment for a language model?’

- Then it focuses on DeepSeek R1 by walking through the paper and highlighting key aspects. This is an old school way to learn ML topics, but it always works.

- Next, it takes you to Transformers Reinforcement Learning and demonstrates potential reward functions you could use. This is cool because it uses Marimo notebooks to visualise the reward.

- Finally, Maxime walks us through a real training notebook that uses GRPO to reduce generation length. I’m really into this because it works, and Maxime took the time to validate it and share assets and logging from his own runs for you to compare with.
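
To make "reward function" concrete, here is a hypothetical sketch of one that discourages long generations. It is not the reward used in Maxime's notebook. The name `length_reward`, the `target_len` parameter, and the whitespace token count are all assumptions. The signature (a list of completions in, a list of floats out, extra keyword arguments ignored) follows the shape that, to my understanding, TRL's GRPO trainer accepts for custom reward functions, and it assumes completions arrive as plain strings.

```python
def length_reward(completions, target_len=50, **kwargs):
    """Return one reward per completion; longer than target_len is penalized.

    Completions at or under the target get 0.0. Each extra token beyond the
    target lowers the reward by 1 / target_len.
    """
    rewards = []
    for text in completions:
        n_tokens = len(text.split())  # crude proxy for a real tokenizer
        overshoot = max(0, n_tokens - target_len)
        rewards.append(-overshoot / target_len)
    return rewards


short_and_long = length_reward(["a b c", "a " * 100], target_len=50)
```

A real run would count tokens with the model's tokenizer and would usually combine this with a correctness reward, so that the model is not pushed toward short but wrong answers.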

Maxime’s work and notebooks have been a major part of the open source community over the last few years. I, like everyone, have learnt so much from them.

reacted to sometimesanotion's post with 🚀 5 months ago

Post

3362

**Update** Either I had some wrong numbers plugged in to estimate benchmark numbers from comparator, or the benchmark changed. Virtuoso Small v2 at 41.07 average is still very impressive, especially for writing draft copy for business purposes, while Lamarck remains a chatty generalist-reasoning model.

I've felt confident that 14B Qwen finetunes and merges could break the 42.0 average, and Arcee **came close** with https://huggingface.co/arcee-ai/Virtuoso-Small-2. Congratulations to @arcee-ai !

Just two months ago, it was easy to think that 14B had plateaued, that you could have high IFEVAL or high MUSR/MATH/GPQA at 14B, but not both. That barrier is completely shattered. I see a pathway to even better, and Virtuoso Small 2 is a big part of why. Very impressive work. This community would expect no less from Arcee.

Just look at this graph! Keep in mind, my merges here build on the first Virtuoso Small, and *-DS merges build on DeepSeek R1. There are some impressive merges in the pipe!

5 replies

·

replied to m-ric's post 5 months ago

hahaha

reacted to m-ric's post with ❤️🔥👀 5 months ago

Post

3456

Today we make the biggest release in smolagents so far: 𝘄𝗲 𝗲𝗻𝗮𝗯𝗹𝗲 𝘃𝗶𝘀𝗶𝗼𝗻 𝗺𝗼𝗱𝗲𝗹𝘀, 𝘄𝗵𝗶𝗰𝗵 𝗮𝗹𝗹𝗼𝘄𝘀 𝘁𝗼 𝗯𝘂𝗶𝗹𝗱 𝗽𝗼𝘄𝗲𝗿𝗳𝘂𝗹 𝘄𝗲𝗯 𝗯𝗿𝗼𝘄𝘀𝗶𝗻𝗴 𝗮𝗴𝗲𝗻𝘁𝘀! 🥳

Our agents can now casually open up a web browser, and navigate on it by scrolling, clicking elements on the webpage, going back, just like a user would.

The demo below shows Claude-3.5-Sonnet browsing GitHub for the task: "Find how many commits the author of the current top trending repo did over last year."
Hi @mlabonne !

Go try it out, it's the most cracked agentic stuff I've seen in a while 🤯 (well, along with OpenAI's Operator who beat us by one day)

For more detail, read our announcement blog 👉 https://huggingface.co/blog/smolagents-can-see
The code for the web browser example is here 👉 https://github.com/huggingface/smolagents/blob/main/examples/vlm_web_browser.py

3 replies

·

posted an update 6 months ago

Post

6648

🆕 LLM Course 2025 edition!

I updated the LLM Scientist roadmap and added a ton of new information and references. It covers training, datasets, evaluation, quantization, and new trends like test-time compute scaling.

The LLM Course has been incredibly popular (41.3k stars!) and I've been touched to receive many, many messages about how it helped people in their careers.

I know how difficult this stuff can be, so I'm super proud of the impact it had. I want to keep updating it in 2025, especially with the LLM Engineer roadmap.

Thanks everyone, hope you'll enjoy it!

💻 LLM Course: https://huggingface.co/blog/mlabonne/llm-course

replied to CultriX's post 6 months ago

that looks great, well done!

reacted to CultriX's post with ❤️ 6 months ago

Post

2141

# Space for Multi-Agent Workflows using AutoGen

Hi all, I created this "AutoGen Multi-Agent Workflow" space that allows you to experiment with multi-agent workflows.

By default, it allows code generation with built-in quality control and automatic documentation generation. It achieves this by leveraging multiple AI agents working together to produce high-quality code snippets, ensuring they meet the specified requirements.

In addition to the default, the space allows users to set custom system messages for each assistant, potentially completely changing the workflow.

# Workflow Steps
1. User Input:
- The user defines a prompt, such as "Write a random password generator using python."
- Outcome: A clear task for the primary assistant to accomplish.

2. Primary Assistant Work:
- The primary assistant begins working on the provided prompt.
- It generates an initial code snippet based on the user's request.
- Outcome: An initial proposal for the requested code.

3. Critic Feedback:
- The critic reviews the generated code and either provides feedback or, if the output meets the criteria, broadcasts the APPROVED message.
- This process repeats until the output is APPROVED or 10 messages have been exchanged.
- Outcome: A revised Python function that incorporates the critic's feedback.

4. Documentation Generation:
- Once the code is approved, it is passed to a documentation assistant.
- The documentation assistant generates concise documentation for the final code.
- Outcome: Short documentation including function description, parameters, and return values.
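
The steps above can be sketched as a plain Python loop. This is not the Space's AutoGen code: the agents here are stub functions, and every name (`run_workflow`, `stub_primary`, and so on) is hypothetical. The sketch also assumes that documentation is skipped when the 10-message cap is reached without approval, which the post does not specify.

```python
MAX_MESSAGES = 10  # cap on exchanged messages, as stated in the post


def run_workflow(prompt, primary, critic, documenter, max_messages=MAX_MESSAGES):
    """Primary drafts, critic reviews until APPROVED or the cap, then document."""
    transcript = []
    code = primary(prompt, None)  # step 2: initial proposal
    transcript.append(("primary", code))
    approved = False
    while len(transcript) < max_messages:
        feedback = critic(code)  # step 3: review
        transcript.append(("critic", feedback))
        if feedback.strip() == "APPROVED":
            approved = True
            break
        if len(transcript) >= max_messages:
            break
        code = primary(prompt, feedback)  # revise using the feedback
        transcript.append(("primary", code))
    # Step 4: document only approved code (an assumption of this sketch).
    docs = documenter(code) if approved else None
    return code, docs, transcript


# Stub agents standing in for LLM-backed assistants.
def stub_primary(prompt, feedback):
    return "def f(): pass" if feedback is None else "def f(): return 1"


def stub_critic(code):
    return "APPROVED" if "return" in code else "Please return a value."


def stub_documenter(code):
    return "f(): returns 1."


final_code, docs, transcript = run_workflow(
    "Write a function.", stub_primary, stub_critic, stub_documenter
)
```

In a real setup, each stub would wrap a model call with its own system message, which is where the Space's customizable system messages would plug in.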

Enjoy!
CultriX/AutoGen-MultiAgent-Example

4 replies

·

reacted to burtenshaw's post with ❤️ 7 months ago

Post

2831

For anyone looking to boost their LLM fine-tuning and alignment skills this December: we're running a free and open course called smol course. It’s not big like Li Yin and @mlabonne, it’s just smol.

👷 It focuses on practical use cases, so if you’re working on something, bring it along.

👯‍♀️ It’s peer reviewed and open so you can discuss and get feedback.

🤘 If you’re already a smol pro, feel free to drop a star or issue.

>> Part 1 starts now, and it’s on instruction tuning!

https://github.com/huggingface/smol-course

Maxime Labonne PRO

AI & ML interests

Recent Activity

Organizations

mlabonne's activity