Edit model card

NeuralReyna-Mini-1.8B-v0.2

Reyna image

Description

Taken aloobun/Reyna-Mini-1.8B-v0.2 and further fine-tuned it using DPO using the Intel/orca_dpo_pairs dataset.

This model has capabilities in coding, math, science, roleplay, and function calling.

This model was trained on OpenAI's ChatML prompt format.

Evaluation

AGIEval: image/png

GPT4ALL:

Tasks Version Filter n-shot Metric Value Stderr
arc_challenge 1 none 0 acc 0.3208 ± 0.0136
none 0 acc_norm 0.3336 ± 0.0138
arc_easy 1 none 0 acc 0.6035 ± 0.0100
none 0 acc_norm 0.5833 ± 0.0101
boolq 2 none 0 acc 0.6526 ± 0.0083
hellaswag 1 none 0 acc 0.4556 ± 0.0050
none 0 acc_norm 0.6076 ± 0.0049
openbookqa 1 none 0 acc 0.2600 ± 0.0196
none 0 acc_norm 0.3460 ± 0.0213
piqa 1 none 0 acc 0.7236 ± 0.0104
none 0 acc_norm 0.7307 ± 0.0104
winogrande 1 none 0 acc 0.6062 ± 0.0137

Disclaimer

This model may have overfitted to the DPO training data, and may not perform well.

Contributions

Thanks to @aloobun and @Locutusque for their contributions to this model.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric Value
Avg. 44.85
AI2 Reasoning Challenge (25-Shot) 37.80
HellaSwag (10-Shot) 60.51
MMLU (5-Shot) 45.04
TruthfulQA (0-shot) 37.75
Winogrande (5-shot) 60.93
GSM8k (5-shot) 27.07
Downloads last month
389
Safetensors
Model size
1.84B params
Tensor type
FP16
·
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Datasets used to train M4-ai/NeuralReyna-Mini-1.8B-v0.2

Evaluation results