verl GRPO trained model at step 50
metadata
base_model: thejaminator/checkpoints_multiple_datasets_layer_1_decoder-fixed
library_name: peft
tags:
  - lora
  - peft
pipeline_tag: text-generation
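
The metadata above describes a LoRA adapter meant to be applied on top of the listed base model with the `peft` library. A minimal loading sketch follows, under stated assumptions: `ADAPTER_ID` is a hypothetical placeholder (this card does not state the adapter's repository ID), the base model is assumed to load as a causal language model through `transformers`, and the helper names `load_adapter` and `generate_sample` are illustrative only.

```python
# Base model ID taken from the `base_model` field of the metadata.
BASE_MODEL_ID = "thejaminator/checkpoints_multiple_datasets_layer_1_decoder-fixed"

# Hypothetical placeholder: replace with the actual repository ID of this adapter.
ADAPTER_ID = "your-username/your-adapter-repo"


def load_adapter(adapter_id: str = ADAPTER_ID, base_model_id: str = BASE_MODEL_ID):
    """Load the base model and attach the LoRA adapter weights to it."""
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies to be installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    # Wrap the base model with the adapter stored at `adapter_id`.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return model, tokenizer


def generate_sample(prompt: str, max_new_tokens: int = 32) -> str:
    """Generate a short continuation for `prompt` using the adapted model."""
    model, tokenizer = load_adapter()
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_sample("Hello")` downloads both the base model and the adapter, so it needs network access and enough memory for the base model. Whether the base model requires extra loading arguments is not stated in this card.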