andysalerno
/

mistral-sft-v3

Text Generation

text-generation-inference

Model card Files Files and versions Community

This is mistralai/Mistral-7B-v0.1, but with the special tokens added for ChatML, and then lightly finetuned with sft using a ChatML formatted dataset: andysalerno/ansalern-nectar-inputoutput

The training was very light, so while this model correctly follows ChatML formatting, it is not intended to be a chat model.

Rather, it is intended to be a base for further fine-tuning models that will use ChatML.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	60.93
AI2 Reasoning Challenge (25-Shot)	61.35
HellaSwag (10-Shot)	82.23
MMLU (5-Shot)	63.40
TruthfulQA (0-shot)	48.49
Winogrande (5-shot)	77.66
GSM8k (5-shot)	32.45

Downloads last month: 12

Safetensors

Model size

7.24B params

Tensor type

BF16

·

Inference Providers NEW

Text Generation

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for andysalerno/mistral-sft-v3

Base model

mistralai/Mistral-7B-v0.1

Finetuned

(872)

this model

Adapters

Dataset used to train andysalerno/mistral-sft-v3

Evaluation results

normalized accuracy on AI2 Reasoning Challenge (25-Shot)
test set Open LLM Leaderboard

61.350
normalized accuracy on HellaSwag (10-Shot)
validation set Open LLM Leaderboard

82.230
accuracy on MMLU (5-Shot)
test set Open LLM Leaderboard

63.400
mc2 on TruthfulQA (0-shot)
validation set Open LLM Leaderboard

48.490
accuracy on Winogrande (5-shot)
validation set Open LLM Leaderboard

77.660
accuracy on GSM8k (5-shot)
test set Open LLM Leaderboard

32.450

View on Papers With Code