Kevin Wu

Flagpoles00

AI & ML interests

Always ready to learn!

Recent Activity

liked a model about 2 months ago

zed-industries/zeta

liked a Space about 2 months ago

fantaxy/adult-novel

liked a model about 2 months ago

AlejandroOlmedo/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math-8bit-mlx

Flagpoles00's activity

liked a model about 2 months ago

zed-industries/zeta

Updated Feb 27 • 3.25k • 256

liked a Space about 2 months ago

107

Graphic Novel- NSFW Adult

🖊

Create stunning graphic novels effortlessly with AI

liked a model about 2 months ago

AlejandroOlmedo/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math-8bit-mlx

Text Generation • Updated Feb 23 • 57 • 3

reacted to Jaward's post with 👍 about 2 months ago

Post

3896

Finally, here it is: a faster, custom, scalable GRPO trainer for smaller models with < 500M params. It can train on a CPU with 8 GB of RAM and also supports GPU for sanity's sake (including support for vLLM + flash attention). It uses SmolLM2-135M/360M-Instruct as the ref & base models. Experience your own “aha” moment 🐳 on 8 GB of RAM.
Code: https://github.com/Jaykef/ai-algorithms/blob/main/smollm2_360M_135M_grpo_gsm8k.ipynb

2 replies

liked 8 models about 2 months ago