
chenk-ai



chenk-ai's activity

replied to burtenshaw's post 23 days ago

In my experience, the GRPO implementation in TRL is very memory-hungry. There are already several alternative implementations that seem much faster and more lightweight. Unsloth, for example, advertises a 10x reduction in memory use, which is remarkable. Can we expect something similar for the TRL implementation in the near future?

I have combined the RL gym library with GRPO here to see whether a small model can be taught to drive a taxi. Even for the 1.5B model, this already took around 70 GB of memory.

BTW: Could the RL gym library also be helpful for building new or better reasoning models (and new benchmarks)?

https://github.com/chenkel-data/grpo-taxi
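
For readers unfamiliar with GRPO, here is a minimal sketch of its central step: several completions are sampled for the same prompt, each is scored, and every reward is normalized against its own group. This is not the code from the repository linked above. The function name, the `eps` value, and the reward numbers are hypothetical, and the use of the population standard deviation is an assumption (implementations differ on this detail).

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize each reward against the mean and spread of its own group.

    Hypothetical helper for illustration; `eps` avoids division by zero
    when every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std: an assumption, see lead-in
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up episode returns for four completions sampled for one taxi state.
rewards = [-10.0, -1.0, 20.0, -1.0]
advantages = group_relative_advantages(rewards)

for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:6.1f}  advantage={advantage:+.3f}")
```

Because every prompt needs a whole group of sampled completions before any advantage can be computed, the group size is one plausible contributor to the memory footprint described above.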

upvoted an article 28 days ago
Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

382
upvoted an article 2 months ago
Article

Open-R1: a fully open reproduction of DeepSeek-R1

839