Omar Sanseviero


AI & ML interests

Llamas, model merging, massive ASR for data collection, 3D ML, on-device ML, quantization, model judging, ML in browser, healthcare applications, education, intersection of art and ML.๐Ÿฆ™

Blog posts


Posts 5

view post
Diaries of Open Source. Part 1.

What a week! Here are some of the exciting Open Source releases of the week!

1. BigCode releases The Stack v2 and StarCoder 2
Resources in
Collection: bigcode/starcoder2-65de6da6e87db3383572be1a

2. Playground v2.5, a very powerful new text-to-image model
Model: playgroundai/playground-v2.5-1024px-aesthetic
Demo: playgroundai/playground-v2.5

3.Evo: DNA foundation models
Models: togethercomputer/evo-1-131k-base

4. OpenHermesPreferences: a dataset of ~1 million AI Preferences argilla/OpenHermesPreferences

5. SpeechBrain 1.0: a toolkit with hundreds of recipes and pretrained models for audio-related tasks, such as speech recognition, diarization, and enhancement. New major release!
HF repos:

6. Tower: a suite of Llama-based multilingual translation models Unbabel/tower-659eaedfe36e6dd29eb1805c

7. AllenAI releases OLMo-7B-Instruct

8. DIBT - An crowdsourced effort to human-rate prompts. Its 10k prompts dataset is released ttps://

9. ChatMusician: A Llama 2 fine-tuned model for music generation m-a-p/ChatMusician

9. ChatMusician: A Llama 2 fine-tuned model for music generation m-a-p/ChatMusician

10. Bonito, an model that converts data into synthetic instruction datasets
Model: BatsResearch/bonito-v1
Paper: 2402.18334
view post
Introducing: Zephyr Gemma!

The community has struggled to do a good preference-tune of Gemma, so the amazing @lewtun and @philschmid built an open-source recipe and trained a model to help people get started.

Model: HuggingFaceH4/zephyr-7b-gemma-v0.1
Demo: HuggingFaceH4/zephyr-7b-gemma-chat

Some interesting details
- Fine-tuned on DEITA and DPOed with Argilla DPO dataset
- Very strong MT Bench results (7.81), better than Zephyr Beta (mistral based) and Gemma Instruct
- Can run locally with tools such as llama.cpp on a Mac
- Not so good AGIEval results compared to mistral-based tunes
- All training code is open-sourced
- Trained for 105 minutes on 8x H100
- No system message

Big kudos to the team! Super exciting to see a good fine-tune for Gemma