Model

  • Developed by: DARJYO
  • Model type: Fine-tuned language model
  • Fine-tuned model: persadian_14B-GRPO
  • Base architecture: Transformer-based (Phi-4)

This model was fine-tuned on task-specific datasets using Unsloth and Hugging Face's TRL library. It is based on the unsloth/phi-4 model and uses reinforcement learning (GRPO) for improved performance.
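The exact training recipe is not published here; the following is a minimal sketch of how a GRPO fine-tune of unsloth/phi-4 can be set up with Unsloth and TRL. The dataset (trl-lib/tldr), the length-based reward function, and all hyperparameters are illustrative placeholders, not the values used for this model.

```python
# Minimal GRPO fine-tuning sketch with Unsloth + TRL.
# Dataset, reward function, and hyperparameters are illustrative only.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Load the base model with Unsloth (4-bit so a 14B model fits on one GPU).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/phi-4",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small set of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Any prompt dataset works; TRL's docs use trl-lib/tldr as a toy example.
dataset = load_dataset("trl-lib/tldr", split="train")

# Placeholder reward: prefer completions close to 20 characters.
# A real run would use task-specific rewards (e.g. correctness checks).
def reward_len(completions, **kwargs):
    return [-abs(20 - len(completion)) for completion in completions]

trainer = GRPOTrainer(
    model=model,
    processing_class=tokenizer,
    reward_funcs=reward_len,
    args=GRPOConfig(
        output_dir="persadian_14B-GRPO",
        per_device_train_batch_size=4,
        num_generations=4,
    ),
    train_dataset=dataset,
)
trainer.train()
```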


Model tree for DARJYO/persadian_14B-GRPO

  • Base model: microsoft/phi-4
  • Fine-tuned: unsloth/phi-4
  • Quantized (24 models): this model
  • Quantizations of this model: 1 model
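For completeness, a minimal inference sketch is shown below. It assumes the repository publishes standard Transformers-compatible weights; if only GGUF quantizations are available, a llama.cpp-compatible runtime would be needed instead. The prompt and generation settings are illustrative.

```python
# Minimal inference sketch, assuming Transformers-compatible weights are
# published in the repo; prompt and generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DARJYO/persadian_14B-GRPO"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain GRPO in one paragraph."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=200)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```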