
sft

This model is a fine-tuned version of Meta-Llama-3-8B-Instruct, trained on the alpaca_zh_demo, identity, and teachers_exam_local datasets. It achieves the following results on the evaluation set:

  • Loss: 1.5271

Model description

This model is a PEFT (adapter-based) fine-tune of Meta-Llama-3-8B-Instruct.

Intended uses & limitations

There are no usage restrictions; anyone may use this model.
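
A minimal inference sketch, assuming the adapter is published as shileii/Teachers_Exam_LLaMA_8B and that the meta-llama/Meta-Llama-3-8B-Instruct base weights are available locally or via the Hugging Face Hub:

```python
# Sketch: load the base model, attach this PEFT adapter, and generate.
# The repo ids below are assumptions based on this card, not verified paths.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "shileii/Teachers_Exam_LLaMA_8B"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the adapter
model.eval()

# Llama-3-Instruct expects its chat template; apply it before generating.
# Example question: "Which of the following is a teacher's right?"
messages = [{"role": "user", "content": "下列哪项属于教师的权利?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        inputs, max_new_tokens=256, pad_token_id=tokenizer.eos_token_id
    )
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```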

Training and evaluation data

Training used high-quality teachers' exam data. The data cover single-choice, multiple-choice, and other question types.
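
The dataset names alpaca_zh_demo and identity match LLaMA-Factory's built-in demo datasets, which use the alpaca JSON format (instruction / input / output fields). The schema of teachers_exam_local is not published with this card; the record below is a hypothetical illustration of what a single-choice exam item could look like in that format:

```python
# Hypothetical alpaca-format record for a single-choice teachers' exam item.
# The real schema of teachers_exam_local is not included with this card.
example = {
    # "Which of the following is a statutory right of teachers?"
    "instruction": "下列哪一项属于教师的法定权利?请选出正确选项。",
    # Options: A. attend further training  B. corporal punishment
    #          C. cancel classes at will   D. disclose student privacy
    "input": "A. 参加进修培训  B. 体罚学生  C. 随意停课  D. 泄露学生隐私",
    "output": "A. 参加进修培训",
}
```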

LLM benchmark evaluation (MMLU)

  • Average: 66.89
  • STEM: 57.52
  • Social Sciences: 76.28
  • Humanities: 62.32
  • Other: 73.32
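
The card does not say which tool produced these MMLU numbers. One common way to reproduce such a run for a PEFT adapter is EleutherAI's lm-evaluation-harness; the sketch below is one plausible setup, not the original one, and the few-shot count and batch size are illustrative:

```python
# Sketch of an MMLU run with EleutherAI's lm-evaluation-harness (lm_eval).
# Repo ids, few-shot count, and batch size are assumptions, not the
# configuration that produced the scores above.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=meta-llama/Meta-Llama-3-8B-Instruct,"
        "peft=shileii/Teachers_Exam_LLaMA_8B,"
        "dtype=bfloat16"
    ),
    tasks=["mmlu"],
    num_fewshot=5,  # MMLU is commonly reported 5-shot
    batch_size=8,
)
print(results["results"]["mmlu"])
```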

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch of the same settings follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 8.0
  • mixed_precision_training: Native AMP
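
A minimal sketch of these settings expressed as Hugging Face TrainingArguments. Mapping "Native AMP" to bf16 is an assumption (the run may have used fp16 instead), and the output_dir name is illustrative:

```python
# Sketch: the hyperparameters above as Hugging Face TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="sft",                # illustrative; matches the card title
    learning_rate=5e-5,
    per_device_train_batch_size=2,   # train_batch_size: 2
    per_device_eval_batch_size=1,    # eval_batch_size: 1
    seed=42,
    gradient_accumulation_steps=8,   # total train batch = 2 * 8 = 16
    lr_scheduler_type="cosine",
    warmup_steps=100,
    num_train_epochs=8.0,
    bf16=True,                       # assumed form of "Native AMP"
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```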

Training results

Training Loss  Epoch   Step  Validation Loss
2.2507         0.1895    50  1.9127
1.8486         0.3790   100  1.6928
1.7402         0.5685   150  1.6263
1.6577         0.7579   200  1.5895
1.6681         0.9474   250  1.5667
1.5953         1.1369   300  1.5586
1.5308         1.3264   350  1.5557
1.5432         1.5159   400  1.5500
1.5724         1.7054   450  1.5392
1.5135         1.8948   500  1.5271
1.4324         2.0843   550  1.5466
1.3993         2.2738   600  1.5391
1.4099         2.4633   650  1.5434
1.3764         2.6528   700  1.5400
1.3219         2.8423   750  1.5354
1.3678         3.0317   800  1.5719
1.2630         3.2212   850  1.5781
1.2280         3.4107   900  1.5834
1.2743         3.6002   950  1.5766
1.2456         3.7897  1000  1.5617
1.2192         3.9792  1050  1.5626
1.0889         4.1686  1100  1.6138
1.1560         4.3581  1150  1.6190
1.1111         4.5476  1200  1.6066
1.1222         4.7371  1250  1.6185
1.1102         4.9266  1300  1.6020
1.0420         5.1161  1350  1.6649
0.9666         5.3055  1400  1.6663
1.0506         5.4950  1450  1.6709
1.0350         5.6845  1500  1.6592
1.0121         5.8740  1550  1.6589
0.9680         6.0635  1600  1.7109
0.9422         6.2530  1650  1.7100
0.9571         6.4424  1700  1.7004
0.9546         6.6319  1750  1.6982
0.9965         6.8214  1800  1.7010
0.9433         7.0109  1850  1.7062
0.9193         7.2004  1900  1.7224
0.8900         7.3899  1950  1.7259
0.9010         7.5793  2000  1.7271
0.9101         7.7688  2050  1.7280
0.9108         7.9583  2100  1.7280

Validation loss reaches its minimum of 1.5271 at step 500 (epoch 1.89) and climbs in later epochs, so the reported evaluation result corresponds to that early checkpoint rather than the final one.

Framework versions

  • PEFT 0.12.0
  • Transformers 4.43.4
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1