Thomas Wolf


AI & ML interests

NLP and open-source :-)



Posts 5

Is it time for the open-source AI robots revolution 🚀?

With @haixuantao and @Leyo we’ve been playing with a low-cost DJI robot controlled by three local open-source AI models (Whisper, Idefics2, Parler-TTS, all Apache 2.0) and orchestrated by dora-rs.

Links to find all the hardware/software we used in the demo:
- robot control framework – dora-rs:
- speech-to-text model – whisper: openai/whisper-base
- vision-text model – Idefics2: HuggingFaceM4/idefics2-8b-AWQ
- text-to-speech model – ParlerTTS mini: parler-tts/parler_tts_mini_v0.1
- robot:
- code gist:
- Larger codebase: dora-rs/dora-idefics2
- laptop/pc: any with a recent GPU card (ours has an RTX 4090)
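
To make the data flow concrete, here is a minimal Python sketch of how the three stages chain together: speech-to-text, then vision-text, then text-to-speech. This is not the dora-rs API and not the code from the demo; `RobotPipeline` and the three `fake_*` stages are hypothetical placeholders standing in for Whisper, Idefics2 and Parler-TTS.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class RobotPipeline:
    """Hypothetical three-stage loop: listen, look, answer out loud."""
    speech_to_text: Callable[[bytes], str]      # stand-in for Whisper
    vision_text: Callable[[bytes, str], str]    # stand-in for Idefics2
    text_to_speech: Callable[[str], bytes]      # stand-in for Parler-TTS

    def step(self, audio: bytes, image: bytes) -> Tuple[str, bytes]:
        # 1. Transcribe the spoken instruction.
        instruction = self.speech_to_text(audio)
        # 2. Ask the vision-text model about the current camera frame.
        answer = self.vision_text(image, instruction)
        # 3. Turn the textual answer back into audio.
        return answer, self.text_to_speech(answer)


# Placeholder stages (assumptions for illustration, no real models loaded).
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_vlm(image: bytes, prompt: str) -> str:
    return f"saw {len(image)} bytes; you asked: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = RobotPipeline(fake_stt, fake_vlm, fake_tts)
answer, speech = pipeline.step(b"what is in front of you?", b"\x00\x01\x02")
```

In the real setup each stage would be a separate node wired together by the orchestrator rather than a direct function call, but the order of operations is the same.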

Very interesting model just released by MyShell: jetmoe/jetmoe-8b. It's an 8B-parameter MoE LLM with only 2.2B active parameters, so it's really efficient.

Main characteristics:
- impressive performance for its size (beating meta-llama/Llama-2-7b and huggyllama/llama-13b)
- combines Mixture of Attention heads (MoA) and Mixture of MLP Experts (MoE) – 8 experts, with 2 active for each token
- trained on a rather limited 1.25T tokens from publicly available datasets – the training recipe follows MiniCPM's two-phase training method => first time I've seen this for a 2B+ model
- $100k to train
- open weights - open sharing of recipes - open dataset - open code => ♡
- still interesting room to improve performance (if only by training longer)
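
For readers new to the "8 experts, 2 active per token" idea, here is a toy pure-Python sketch of top-2 routing. It is not JetMoE's code: the softmax over the selected logits is an assumption (the exact gating is not described here), the experts are trivial scaling functions, and the attention-head mixture (MoA) is left out.

```python
import math


def top_k_gate(logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalize their scores."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)              # subtract max for stability
    exps = [math.exp(logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, router_logits, experts, k=2):
    """Weighted sum of the k selected experts; the others are never evaluated."""
    out = 0.0
    for idx, weight in top_k_gate(router_logits, k):
        out += weight * experts[idx](x)
    return out


# 8 toy experts: expert i multiplies its input by (i + 1).
experts = [lambda x, s=s: s * x for s in range(1, 9)]

# Hypothetical router scores for one token.
router_logits = [0.1, 2.0, -1.0, 0.3, 2.0, 0.0, -0.5, 1.0]

routes = top_k_gate(router_logits)            # experts 1 and 4, equal weights
y = moe_forward(3.0, router_logits, experts)  # 0.5 * 6.0 + 0.5 * 15.0
```

Only 2 of the 8 experts run for each token, which is why the active parameter count is much smaller than the total.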

- report:
- model: jetmoe/jetmoe-8b
- code:

Note: I covered the MiniCPM schedule, Mixture-of-Experts (MoE), and many of the datasets used in this work in my recent little guide to building LLMs in 2024, so feel free to check it out if you want to learn more about these topics: