jd0g/Mistral-7B-NLI-v0.2

Generated from Trainer

mistral-7b-nli_cot

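This model is a fine-tuned version of TheBloke/Mistral-7B-v0.1-GPTQ. The training dataset is not specified (the auto-generated card records it as None). It achieves the following results on the evaluation set:

Loss: nan (the validation loss is nan from step 1347 onward; see Training results)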

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.004
train_batch_size: 32
eval_batch_size: 64
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 2
num_epochs: 11
mixed_precision_training: Native AMP
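
The batch-size and scheduler settings above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the Trainer's own code: it recomputes total_train_batch_size and evaluates a linear warmup-then-decay schedule of the kind selected by lr_scheduler_type: linear. The total of 1639 optimizer steps is taken from the last row of the training results, a single device is assumed because the card does not state a device count, and `effective_batch_size` and `linear_lr` are hypothetical helper names.

```python
# Values copied from the hyperparameter list on this card.
LEARNING_RATE = 0.004
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 2
TOTAL_STEPS = 1639  # assumption: the final step in the training results table
NUM_DEVICES = 1     # assumption: the card does not state the device count


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Number of examples that contribute to one optimizer update."""
    return per_device * accum_steps * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))  # 128
    for step in (0, 1, 2, 820, 1639):
        print(step, linear_lr(step))
```

With only 2 warmup steps, the schedule reaches the full learning rate of 0.004 almost immediately and then decays linearly for the rest of the run.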

Training results

Training Loss	Epoch	Step	Validation Loss
0.6298	0.9950	149	0.4956
0.4848	1.9967	299	0.4855
1.4397	2.9983	449	2.3408
1.4527	4.0	599	1.1570
1.0505	4.9950	748	1.0305
0.8713	5.9967	898	0.7930
0.7679	6.9983	1048	0.7487
0.7289	8.0	1198	0.7110
69.2312	8.9950	1347	nan
300.5902	9.9967	1497	nan
635.9469	10.9449	1639	nan
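
The table shows a loss spike around epoch 3 and non-finite validation loss from epoch 9 onward. As a minimal sketch of how such a log might be inspected, the snippet below finds the row with the lowest finite validation loss and the first row where the validation loss is no longer finite. The rows are copied from the table; `best_checkpoint` and `first_divergence` are hypothetical helper names, and nothing here implies that a checkpoint was actually saved at those steps.

```python
import math

# Rows copied from the training results table:
# (training_loss, epoch, step, validation_loss)
ROWS = [
    (0.6298, 0.9950, 149, 0.4956),
    (0.4848, 1.9967, 299, 0.4855),
    (1.4397, 2.9983, 449, 2.3408),
    (1.4527, 4.0, 599, 1.1570),
    (1.0505, 4.9950, 748, 1.0305),
    (0.8713, 5.9967, 898, 0.7930),
    (0.7679, 6.9983, 1048, 0.7487),
    (0.7289, 8.0, 1198, 0.7110),
    (69.2312, 8.9950, 1347, math.nan),
    (300.5902, 9.9967, 1497, math.nan),
    (635.9469, 10.9449, 1639, math.nan),
]


def best_checkpoint(rows):
    """Return the row with the lowest finite validation loss."""
    finite = [row for row in rows if math.isfinite(row[3])]
    return min(finite, key=lambda row: row[3])


def first_divergence(rows):
    """Return the first row whose validation loss is nan or inf, else None."""
    for row in rows:
        if not math.isfinite(row[3]):
            return row
    return None


best = best_checkpoint(ROWS)
diverged = first_divergence(ROWS)
print(f"lowest validation loss {best[3]} at step {best[2]}")
print(f"first non-finite validation loss at step {diverged[2]}")
```

On this log the lowest validation loss is 0.4855 at step 299, and the first non-finite value appears at step 1347, so the final evaluation loss of nan reflects the diverged end of the run rather than its best point.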

Framework versions

PEFT 0.10.0
Transformers 4.40.1
Pytorch 2.0.1+cu118
Datasets 2.19.0
Tokenizers 0.19.1

Model tree for jd0g/Mistral-7B-NLI-v0.2

Base model: mistralai/Mistral-7B-v0.1
Quantized: TheBloke/Mistral-7B-v0.1-GPTQ
Adapter (27): this model