A.Genchev
AGenchev
·
AI & ML interests
None yet
Recent Activity
new activity
2 days ago
hantian/yolo-doclaynet:Good job !
upvoted
a
collection
18 days ago
olmOCR
reacted
to
burtenshaw's
post
with 👍
20 days ago
Here’s a notebook to make Gemma reason with GRPO & TRL. I made this whilst prepping the next unit of the reasoning course:
In this notebooks I combine together google’s model with some community tooling
- First, I load the model from the Hugging Face hub with transformers’s latest release for Gemma 3
- I use PEFT and bitsandbytes to get it running on Colab
- Then, I took Will Browns processing and reward functions to make reasoning chains from GSM8k
- Finally, I used TRL’s GRPOTrainer to train the model
Next step is to bring Unsloth AI in, then ship it in the reasoning course. Links to notebook below.
https://colab.research.google.com/drive/1Vkl69ytCS3bvOtV9_stRETMthlQXR4wX?usp=sharing
Organizations
None yet
models
None public yet
datasets
None public yet