Ferdinand Mom

3outeille

https://3outeille.github.io/

AI & ML interests

None yet

3outeille's activity

upvoted a paper 8 days ago

Balancing Pipeline Parallelism with Vocabulary Parallelism

Paper • 2411.05288 • Published 15 days ago • 19

liked a model 19 days ago

meta-llama/Llama-2-7b

Text Generation • Updated Apr 17 • 4.16k

liked a model about 1 month ago

TinyLlama/TinyLlama-1.1B-Chat-v0.1

Text Generation • Updated Sep 26, 2023 • 4.35k • 53

updated a model 4 months ago

nanotron/bench_cluster_epfl

Updated Jul 12

updated 2 models 5 months ago

nanotron/test

Updated Jul 6

nanotron/old_bench

Updated Jul 6 • 2

updated a dataset 6 months ago

HuggingFaceBR4/fmom-debug-mmlu

Updated May 28 • 9

liked 2 datasets 7 months ago

HuggingFaceM4/OBELICS

Viewer • Updated Aug 22, 2023 • 276M • 14.6k • 141

HuggingFaceM4/the_cauldron

Viewer • Updated May 6 • 1.88M • 140k • 332

liked 4 models 7 months ago

upvoted a collection 7 months ago

Idefics2 🐶

Collection

Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6 • 88

Reacted to loubnabnl's post with ❤️🤗🤯 9 months ago

Post

⭐ Today we’re releasing The Stack v2 & StarCoder2: a series of 3B, 7B & 15B code generation models trained on 3.3 to 4.5 trillion tokens of code:

- StarCoder2-15B matches or outperforms CodeLlama 34B, and approaches DeepSeek-33B on multiple benchmarks.
- StarCoder2-3B outperforms StarCoderBase-15B and similar sized models.
- The Stack v2 is a 4x larger dataset than The Stack v1, resulting in 900B unique code tokens 🚀
As always, we released everything from models and datasets to curation code. Enjoy!

🔗 StarCoder2 collection: bigcode/starcoder2-65de6da6e87db3383572be1a
🔗 Paper: https://drive.google.com/file/d/17iGn3c-sYNiLyRSY-A85QOzgzGnGiVI3/view
🔗 BlogPost: https://huggingface.co/blog/starcoder2
🔗 Code Leaderboard: bigcode/bigcode-models-leaderboard

authored a paper over 1 year ago

RWKV: Reinventing RNNs for the Transformer Era

Paper • 2305.13048 • Published May 22, 2023 • 14