MuntasirHossain/Orpo-Mistral-7B-v0.3-peft-adapter

Generated from Trainer

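This model is a PEFT adapter for mistralai/Mistral-7B-v0.3, fine-tuned on a dataset that the card does not specify. The model name and the logged metrics (log odds ratio, NLL loss) suggest ORPO preference training. It achieves the following results on the evaluation set, which match the last logged checkpoint (step 248):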

Loss: 0.6129
Rewards/chosen: -0.0574
Rewards/rejected: -0.0794
Rewards/accuracies: 0.6429
Rewards/margins: 0.0221
Logps/rejected: -0.7943
Logps/chosen: -0.5738
Logits/rejected: -3.2681
Logits/chosen: -3.2859
Nll Loss: 0.5465
Log Odds Ratio: -0.5869
Log Odds Chosen: 0.4083
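
The reported numbers can be checked against each other. The sketch below recomputes the reward margin from the chosen and rejected rewards and derives the scale factor between rewards and log-probabilities. It assumes the rewards are a constant multiple (ORPO's beta) of the mean log-probabilities, which is consistent with the values above but is not stated in the card; the variable names are illustrative.

```python
# Evaluation metrics as reported in the card (final logged checkpoint).
logps_chosen = -0.5738
logps_rejected = -0.7943
rewards_chosen = -0.0574
rewards_rejected = -0.0794
reported_margin = 0.0221

# Margin recomputed from the rounded rewards; a small rounding gap is expected.
margin = rewards_chosen - rewards_rejected

# Implied scale between rewards and log-probabilities.
# Assumption: this scale is the ORPO beta; the card does not report beta.
implied_beta_chosen = rewards_chosen / logps_chosen
implied_beta_rejected = rewards_rejected / logps_rejected

print(f"margin: {margin:.4f} (reported: {reported_margin})")
print(f"implied beta: {implied_beta_chosen:.4f} (chosen), "
      f"{implied_beta_rejected:.4f} (rejected)")
```

Both ratios come out close to 0.1, so a beta of roughly 0.1 would explain the reported rewards. The aggregate loss and log-odds values are averaged per example, so they cannot be reproduced exactly from the averaged log-probabilities alone.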

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 8e-06
train_batch_size: 4
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 8
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 10
num_epochs: 1
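
As a rough illustration of how these settings interact, the sketch below computes the effective batch size and a linear warmup/decay learning-rate curve. It is a minimal re-implementation under stated assumptions, not the Trainer's own code: `num_devices = 1` is inferred from 4 x 2 = 8, and `total_steps = 310` is a placeholder roughly consistent with the epoch fractions logged in the training results table, not a value given in the card.

```python
def linear_schedule_lr(step, base_lr=8e-6, warmup_steps=10, total_steps=310):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero.

    total_steps=310 is an assumed placeholder; the card does not report it.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


train_batch_size = 4
gradient_accumulation_steps = 2
num_devices = 1  # assumption: inferred from total_train_batch_size == 8

# Examples contributing to each optimizer update.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

for step in (0, 5, 10, 160, 310):
    print(step, linear_schedule_lr(step))
```

With only 10 warmup steps, the learning rate reaches its peak of 8e-06 almost immediately and then decays for the rest of the single epoch.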

Training results

Training Loss	Epoch	Step	Validation Loss	Rewards/chosen	Rewards/rejected	Rewards/accuracies	Rewards/margins	Logps/rejected	Logps/chosen	Logits/rejected	Logits/chosen	Nll Loss	Log Odds Ratio	Log Odds Chosen
0.7914	0.2003	62	0.6937	-0.0709	-0.0851	0.6071	0.0142	-0.8514	-0.7090	-3.2582	-3.2848	0.6333	-0.6178	0.2402
0.802	0.4006	124	0.6338	-0.0607	-0.0781	0.6429	0.0174	-0.7809	-0.6070	-3.2744	-3.2972	0.5693	-0.5966	0.3287
0.8605	0.6010	186	0.6204	-0.0586	-0.0799	0.6071	0.0213	-0.7990	-0.5863	-3.2692	-3.2895	0.5538	-0.5899	0.3927
0.7359	0.8013	248	0.6129	-0.0574	-0.0794	0.6429	0.0221	-0.7943	-0.5738	-3.2681	-3.2859	0.5465	-0.5869	0.4083
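
The epoch and step columns allow a rough estimate of the training set size, which the card otherwise leaves unspecified. The sketch below is a back-of-the-envelope calculation only: it assumes the total train batch size of 8 from the hyperparameters and ignores rounding in the logged epoch fractions and any partial final batch.

```python
# (epoch, step) pairs copied from the training results table.
logged = [(0.2003, 62), (0.4006, 124), (0.6010, 186), (0.8013, 248)]
total_train_batch_size = 8  # from the hyperparameters section

# Each row gives an independent estimate of optimizer steps per epoch.
estimates = [step / epoch for epoch, step in logged]
steps_per_epoch = sum(estimates) / len(estimates)

# Approximate number of training examples seen in one epoch.
approx_examples = steps_per_epoch * total_train_batch_size

print(f"steps per epoch: about {steps_per_epoch:.1f}")
print(f"training examples: about {approx_examples:.0f}")
```

The four rows agree on roughly 310 optimizer steps per epoch, which points to a training set on the order of 2,500 preference pairs. Evaluation was logged every 62 steps, so the table stops at step 248 and the final part of the epoch has no validation entry.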

Framework versions

PEFT 0.11.1
Transformers 4.41.1
Pytorch 2.3.0+cu121
Datasets 2.19.1
Tokenizers 0.19.1
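
With the libraries above installed, an adapter of this kind is typically loaded on top of its base model. The following is a minimal sketch, assuming the repository contains a standard PEFT adapter and that the base model is accessible to the caller (it may require accepting a license and authenticating with the Hub). The function name `load_adapter_model` is hypothetical; the card itself gives no usage instructions.

```python
BASE_MODEL_ID = "mistralai/Mistral-7B-v0.3"
ADAPTER_ID = "MuntasirHossain/Orpo-Mistral-7B-v0.3-peft-adapter"


def load_adapter_model(base_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Load the base model and attach the PEFT adapter.

    Calling this downloads the full 7B base weights plus the adapter.
    """
    # Imports are deferred so this file can be imported without the heavy libraries.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")
    # Wrap the base model with the adapter weights from the Hub.
    model = PeftModel.from_pretrained(base, adapter_id)
    return model, tokenizer
```

Calling `model, tokenizer = load_adapter_model()` would then return a model ready for generation. Since the card does not document a prompt format or intended use, any generation settings are left to the reader.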
