Transformers
Safetensors
English
unsloth
lora
  • The model was trained on the "mlabonne/orpo-dpo-mix-40k" dataset.
  • The instructions were rendered as images, and the model must follow the user's instructions from those images.
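
The card does not describe how the instructions were turned into images, so the following is only a minimal sketch of one way this could be done, assuming Pillow is installed. The function name `render_instruction`, the canvas width, margins, and wrapping width are all hypothetical choices for illustration, not the settings used for this model.

```python
import textwrap

from PIL import Image, ImageDraw, ImageFont


def render_instruction(text, width=512, margin=16, line_height=14, chars_per_line=60):
    """Render an instruction string as black text on a white RGB image.

    All layout values here are illustrative assumptions, not the
    values used to build this model's training data.
    """
    font = ImageFont.load_default()
    # Wrap the instruction so each line fits the canvas; keep one empty
    # line for empty input so the image still has a valid height.
    lines = textwrap.wrap(text, width=chars_per_line) or [""]
    height = 2 * margin + line_height * len(lines)

    image = Image.new("RGB", (width, height), "white")
    draw = ImageDraw.Draw(image)
    for i, line in enumerate(lines):
        draw.text((margin, margin + i * line_height), line, fill="black", font=font)
    return image


if __name__ == "__main__":
    img = render_instruction("Summarize the following paragraph in one sentence.")
    img.save("instruction.png")
```

The resulting image could then be passed to a vision-capable model in place of the text prompt. A real pipeline would likely need a larger font and a fixed resolution matched to the model's image preprocessor.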

[Figure: GRPO metrics, steps 1-20 (metricas_grpo_pasos_1-20)]


Model tree for tepirale/gemma4_E2B_grpo_lora

Adapter
(33)
this model

Dataset used to train tepirale/gemma4_E2B_grpo_lora