mistral-7b-neuralhermes-2.5-dpo

mistral-7b-neuralhermes-2.5-dpo is a version of teknium/OpenHermes-2.5-Mistral-7B fine-tuned with Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs dataset.

LoRA

r: 16
LoRA alpha: 16
LoRA dropout: 0.05
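
As a rough illustration of what these settings imply, the sketch below computes the standard LoRA scaling factor (alpha / r) and the number of trainable parameters one adapter adds to a single weight matrix. The 4096 x 4096 shape is an assumption (the size of a Mistral-7B q_proj matrix); the card does not say which modules were adapted, and the function names are hypothetical.

```python
# LoRA settings taken from the card.
LORA_R = 16
LORA_ALPHA = 16
LORA_DROPOUT = 0.05


def lora_scaling(alpha: int, r: int) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / r."""
    return alpha / r


def lora_trainable_params(d_out: int, d_in: int, r: int) -> int:
    """Parameters in one adapter: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r


# Assumption: a 4096 x 4096 projection; the adapted modules are not listed in the card.
scaling = lora_scaling(LORA_ALPHA, LORA_R)
adapter_params = lora_trainable_params(4096, 4096, LORA_R)
full_params = 4096 * 4096

print(f"scaling = {scaling}")
print(f"adapter params = {adapter_params} "
      f"({adapter_params / full_params:.2%} of the full matrix)")
```

With alpha equal to r, the scaling factor is 1, so the low-rank update is added to the frozen weights unscaled.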

Training arguments

Batch size: 4
Gradient accumulation steps: 4
Optimizer: paged_adamw_32bit
Max steps: 100
Learning rate: 5e-05
Learning rate scheduler type: cosine
Beta: 0.1
Max prompt length: 1024
Max length: 1536
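
To make these numbers concrete, here is a minimal sketch of three quantities they determine: the effective batch size, the cosine learning-rate schedule, and the DPO loss for one preference pair. It assumes a single device and no warmup steps (neither is stated in the card), uses the common sigmoid form of the DPO loss, and the function names and example log-probabilities are hypothetical.

```python
import math

# Values taken from the card.
BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 4
MAX_STEPS = 100
LEARNING_RATE = 5e-5
BETA = 0.1


def effective_batch_size(batch_size: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Examples contributing to each optimizer step."""
    return batch_size * grad_accum_steps * num_devices


def cosine_lr(step: int, max_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Cosine decay from base_lr to 0, with optional linear warmup."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float, beta: float) -> float:
    """Sigmoid DPO loss for one preference pair, from sequence log-probabilities."""
    margin = (policy_chosen_logp - ref_chosen_logp) - (policy_rejected_logp - ref_rejected_logp)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    return math.log1p(math.exp(-beta * margin))


# Assumption: one device, so 4 * 4 examples per optimizer step.
per_step = effective_batch_size(BATCH_SIZE, GRAD_ACCUM_STEPS)
print(f"effective batch size: {per_step}")
print(f"examples seen in {MAX_STEPS} steps: {per_step * MAX_STEPS}")

for step in (0, 25, 50, 75, 100):
    print(f"step {step:3d}: lr = {cosine_lr(step, MAX_STEPS, LEARNING_RATE):.2e}")

# Hypothetical log-probabilities, for illustration only.
print(f"DPO loss: {dpo_loss(-12.0, -15.0, -13.0, -14.0, BETA):.4f}")
```

Beta controls how strongly the policy is penalized for drifting from the reference model: when the policy and reference assign identical log-probabilities, the margin is zero and the loss equals log 2 regardless of beta.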

Safetensors

Model size: 7.24B params
Tensor type: FP16
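
The parameter count and tensor type above give a rough lower bound on the memory needed just to hold the weights. The sketch below does that arithmetic; it ignores activations, the KV cache, and framework overhead, so actual usage will be higher. The function name is hypothetical.

```python
# Values taken from the card.
NUM_PARAMS = 7.24e9   # 7.24B params
BYTES_PER_PARAM = 2   # FP16 stores each value in 2 bytes


def weight_memory_bytes(num_params: float, bytes_per_param: int) -> float:
    """Bytes needed to store the weights alone."""
    return num_params * bytes_per_param


total_bytes = weight_memory_bytes(NUM_PARAMS, BYTES_PER_PARAM)
print(f"{total_bytes / 1e9:.2f} GB ({total_bytes / 2**30:.2f} GiB) for weights alone")
```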

Model tree for CorticalStack/mistral-7b-neuralhermes-2.5-dpo

Base model: mistralai/Mistral-7B-v0.1
Finetuned: teknium/OpenHermes-2.5-Mistral-7B
Finetuned: this model