Stas Bekman

stas

https://stasosphere.com/machine-learning/

AI & ML interests

Toolmaker. Software creator, optimizer and harmonizer. Makes things work and fly at Contextual.AI. Training LLM/RAG/Generative AI/Machine Learning/Scalability.

Recent Activity

updated a model 20 days ago

stas/ml-engineering-book

posted an update about 2 months ago

updated a model 3 months ago

stas/ml-engineering-book

Posts 8

Post

2115

Do you want ArcticTraining at @SnowflakeDB to add the ability to post-train DeepSeek V3/R1 models with DPO using just a few GPU nodes?

Please vote here and tell others about it: https://github.com/snowflakedb/ArcticTraining/discussions/58

ArcticTraining is an open-source, easy-to-use post-training framework for NVIDIA GPUs, built on top of DeepSpeed.

Articles 6

Article

54

From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate

Papers 6

arxiv:2406.18820

arxiv:2401.14489

arxiv:2306.16527

arxiv:2211.05100

Models 9

stas/ml-engineering-book

Updated 20 days ago • 16

stas/tiny-random-llama-2

Text Generation • Updated Nov 14, 2023 • 202 • 40

stas/tiny-m2m_100

Text2Text Generation • Updated Apr 29, 2022 • 1.08k

stas/tr8b-104B-debug3

Updated Nov 29, 2021

stas/pegasus-cnn_dailymail-tiny-random

Text2Text Generation • Updated Jul 1, 2021 • 329

stas/mt5-tiny-random

Text2Text Generation • Updated Jun 23, 2021 • 22.8k • 2

stas/tiny-wmt19-en-de

Text2Text Generation • Updated May 3, 2021 • 335 • 1

stas/tiny-wmt19-en-ru

Text2Text Generation • Updated May 3, 2021 • 5.02k

stas/t5-very-small-random

Text2Text Generation • Updated Apr 21, 2021 • 11

Datasets 8

stas/openwebtext-synthetic-testing

Updated Nov 14, 2023 • 31 • 4

stas/oscar-en-10k

Viewer • Updated Oct 19, 2022 • 10k • 184 • 2

stas/c4-en-10k

Viewer • Updated Oct 19, 2022 • 10k • 594 • 4

stas/general-pmd-synthetic-testing

Updated Oct 18, 2022 • 27

stas/cm4-synthetic-testing

Updated Oct 18, 2022 • 18

stas/openwebtext-10k

Viewer • Updated Sep 15, 2021 • 10k • 3.77k • 29

stas/wmt14-en-de-pre-processed

Viewer • Updated Feb 16, 2021 • 4.55M • 140 • 3

stas/wmt16-en-ro-pre-processed

Viewer • Updated Feb 16, 2021 • 614k • 126