---
license: mit
base_model: mistralai/Mistral-7B-v0.1
tags:
- generated_from_trainer
model-index:
- name: supercot-lora
  results: []
---

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
# mistral-v0.1-supercot-lora

This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the [supercot](https://huggingface.co/datasets/kaiokendev/SuperCOT-dataset) dataset.
It achieves the following results on the evaluation set:
- Loss: 0.9790

## Model description

SuperCOT is a LoRA trained to make Mistral follow prompts for LangChain better by infusing chain-of-thought data, code explanations and instructions, code snippets, logical deductions, and Alpaca GPT-4 prompts. It uses a mixture of the following datasets; a sketch of how to load the adapter follows the list:

https://huggingface.co/datasets/QingyiSi/Alpaca-CoT
- Chain of thought QED
- Chain of thought Aqua
- CodeAlpaca
  
https://huggingface.co/datasets/neulab/conala
- Code snippets
  
https://huggingface.co/datasets/yahma/alpaca-cleaned
- Alpaca GPT4
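
A minimal usage sketch with `transformers` + `peft` is shown below. The adapter repo id and the Alpaca-style prompt format are assumptions for illustration, not part of this card:

```python
# A minimal, hedged sketch of loading the adapter with transformers + peft.
# The adapter repo id below is a placeholder; substitute the actual hub id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-v0.1"
adapter_id = "your-namespace/mistral-v0.1-supercot-lora"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)

# The adapter was trained on Alpaca-style instruction data, so an
# Alpaca-format prompt (an assumption here) is a reasonable starting point.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain, step by step, how a binary search works.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```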

## Intended uses & limitations

The model will show biases similar to those exhibited by the base model. It is not intended for supplying factual information or advice in any form.

## Training and evaluation data

[kaiokendev/SuperCOT-dataset](https://huggingface.co/datasets/kaiokendev/SuperCOT-dataset)
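
To inspect the training data locally, something like the following should work with the `datasets` library (a hedged sketch; the split name and the hub repo's file layout are assumptions):

```python
# Load and peek at the SuperCOT dataset from the Hugging Face Hub,
# assuming its data files can be auto-detected by load_dataset.
from datasets import load_dataset

ds = load_dataset("kaiokendev/SuperCOT-dataset", split="train")
print(ds)      # dataset features and row count
print(ds[0])   # one instruction/response example
```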

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an equivalent `TrainingArguments` sketch follows the list):
- learning_rate: 0.0002
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 3
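
For reference, here is a hedged `TrainingArguments` sketch mirroring these values; the actual run was configured through Axolotl, and `output_dir` is a placeholder:

```python
# A sketch of the equivalent transformers TrainingArguments; this mirrors the
# listed values rather than reproducing the exact Axolotl config.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-v0.1-supercot-lora",  # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=4,  # 2 per device x 4 steps = total batch size 8
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=10,
    num_train_epochs=3,
)
```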

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.7661        | 0.06  | 20   | 1.5173          |
| 0.7681        | 0.12  | 40   | 1.2323          |
| 0.6647        | 0.18  | 60   | 1.1306          |
| 0.6742        | 0.24  | 80   | 1.0847          |
| 0.6995        | 0.3   | 100  | 1.0573          |
| 0.6883        | 0.36  | 120  | 1.0412          |
| 0.6437        | 0.42  | 140  | 1.0375          |
| 0.6331        | 0.48  | 160  | 1.0186          |
| 0.6686        | 0.54  | 180  | 1.0153          |
| 0.6767        | 0.6   | 200  | 1.0042          |
| 0.7037        | 0.66  | 220  | 1.0023          |
| 0.6994        | 0.72  | 240  | 1.0014          |
| 0.7012        | 0.78  | 260  | 0.9996          |
| 0.6599        | 0.84  | 280  | 0.9926          |
| 0.6401        | 0.9   | 300  | 0.9913          |
| 0.6665        | 0.96  | 320  | 0.9910          |
| 0.5771        | 1.02  | 340  | 0.9907          |
| 0.6286        | 1.08  | 360  | 0.9830          |
| 0.6064        | 1.14  | 380  | 0.9865          |
| 0.5976        | 1.19  | 400  | 0.9802          |
| 0.5512        | 1.25  | 420  | 0.9817          |
| 0.6333        | 1.31  | 440  | 0.9810          |
| 0.5883        | 1.37  | 460  | 0.9817          |
| 0.5822        | 1.43  | 480  | 0.9783          |
| 0.5878        | 1.49  | 500  | 0.9757          |
| 0.5951        | 1.55  | 520  | 0.9753          |
| 0.6466        | 1.61  | 540  | 0.9719          |
| 0.6246        | 1.67  | 560  | 0.9681          |
| 0.627         | 1.73  | 580  | 0.9705          |
| 0.6214        | 1.79  | 600  | 0.9691          |
| 0.6558        | 1.85  | 620  | 0.9709          |
| 0.5736        | 1.91  | 640  | 0.9674          |
| 0.6188        | 1.97  | 660  | 0.9674          |
| 0.5293        | 2.03  | 680  | 0.9742          |
| 0.5463        | 2.09  | 700  | 0.9766          |
| 0.5184        | 2.15  | 720  | 0.9776          |
| 0.5349        | 2.21  | 740  | 0.9783          |
| 0.5536        | 2.27  | 760  | 0.9794          |
| 0.5016        | 2.33  | 780  | 0.9822          |
| 0.5075        | 2.39  | 800  | 0.9795          |
| 0.5529        | 2.45  | 820  | 0.9808          |
| 0.5168        | 2.51  | 840  | 0.9784          |
| 0.5416        | 2.57  | 860  | 0.9793          |
| 0.4845        | 2.63  | 880  | 0.9804          |
| 0.5487        | 2.69  | 900  | 0.9801          |
| 0.5313        | 2.75  | 920  | 0.9797          |
| 0.5449        | 2.81  | 940  | 0.9790          |
| 0.5303        | 2.87  | 960  | 0.9795          |
| 0.5599        | 2.93  | 980  | 0.9795          |
| 0.544         | 2.99  | 1000 | 0.9790          |


### Framework versions

- Transformers 4.34.0.dev0
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.14.0

### Citations

Alpaca COT datasets
```
@misc{alpaca-cot,
  author = {Qingyi Si and Zheng Lin},
  school = {Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China},
  title = {Alpaca-CoT: An Instruction Fine-Tuning Platform with Instruction Data Collection and Unified Large Language Models Interface},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/PhoebusSi/alpaca-CoT}},
}
```
Stanford Alpaca
```
@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
```
Google FLAN
```
@inproceedings{weifinetuned,
  title={Finetuned Language Models are Zero-Shot Learners},
  author={Wei, Jason and Bosma, Maarten and Zhao, Vincent and Guu, Kelvin and Yu, Adams Wei and Lester, Brian and Du, Nan and Dai, Andrew M and Le, Quoc V},
  booktitle={International Conference on Learning Representations},
  year={2022}
}
```