mistral-7b-neuralhermes-2.5-dpo

mistral-7b-neuralhermes-2.5-dpo is a version of teknium/OpenHermes-2.5-Mistral-7B fine-tuned with Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs dataset.

LoRA

  • r: 16
  • LoRA alpha: 16
  • LoRA dropout: 0.05
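
As a rough illustration of what these LoRA values imply, the sketch below computes the conventional LoRA scaling factor (alpha / r) and the number of trainable parameters an adapter adds to one weight matrix. The `LoraSettings` class is a hypothetical helper, not part of any library, and the 4096 x 4096 projection is only an assumed example shape (4096 is the Mistral-7B hidden size); the card does not say which modules were adapted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hypothetical container for the LoRA values listed in the card."""
    r: int = 16
    alpha: int = 16
    dropout: float = 0.05

    @property
    def scaling(self) -> float:
        # Conventional LoRA scaling applied to the low-rank update: alpha / r.
        return self.alpha / self.r

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # A (r x d_in) plus B (d_out x r) trainable weights for one matrix.
        return self.r * (d_in + d_out)


settings = LoraSettings()
print("scaling:", settings.scaling)
# Assumed example shape: one 4096 x 4096 projection.
print("adapter params:", settings.adapter_params(4096, 4096))
```

With r equal to alpha, the low-rank update is applied at a scale of 1.0, so the adapter output is neither amplified nor damped relative to the base weights.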

Training arguments

  • Batch size: 4
  • Gradient accumulation steps: 4
  • Optimizer: paged_adamw_32bit
  • Max steps: 100
  • Learning rate: 5e-05
  • Learning rate scheduler type: cosine
  • Beta: 0.1
  • Max prompt length: 1024
  • Max length: 1536
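
The arithmetic behind these arguments can be sketched in plain Python. The block below derives the number of preference pairs per optimizer update, a cosine learning-rate curve, and a sigmoid-style DPO loss for a single pair using the listed beta. Several things here are assumptions rather than facts from the card: a single training device, zero warmup steps (none are listed), and the standard sigmoid form of the DPO loss. All function names are illustrative.

```python
import math

# Values taken from the training arguments above.
BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 4
MAX_STEPS = 100
PEAK_LR = 5e-5
BETA = 0.1


def pairs_per_update(num_devices: int = 1) -> int:
    """Preference pairs consumed per optimizer step (single device assumed)."""
    return BATCH_SIZE * GRAD_ACCUM_STEPS * num_devices


def cosine_lr(step: int, warmup_steps: int = 0) -> float:
    """Cosine decay from PEAK_LR to 0 over MAX_STEPS (warmup is an assumption)."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / max(1, MAX_STEPS - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


def dpo_loss(policy_chosen: float, policy_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = BETA) -> float:
    """Sigmoid DPO loss for one pair, given summed sequence log-probabilities."""
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    x = beta * margin
    # -log(sigmoid(x)), written in a numerically stable form.
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


print("pairs per update:", pairs_per_update())
print("pairs over the run:", pairs_per_update() * MAX_STEPS)
print("lr at steps 0, 50, 100:", cosine_lr(0), cosine_lr(50), cosine_lr(100))
print("loss when policy equals reference:", dpo_loss(0.0, 0.0, 0.0, 0.0))
```

Under the single-device assumption, 100 steps of 16 pairs each means the run sees at most 1,600 preference pairs. When the policy and reference models agree, the loss equals log 2, and it falls as the policy raises the chosen response's log-probability relative to the rejected one.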

Model size

  • Parameters: 7.24B
  • Tensor type: FP16
  • Format: Safetensors