---
license: mit
base_model: mistralai/Mistral-7B-v0.1
tags:
- generated_from_trainer
model-index:
- name: supercot-lora
  results: []
---

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
# mistral-v0.1-supercot-lora

This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the [supercot](https://huggingface.co/datasets/kaiokendev/SuperCOT-dataset) dataset.
It achieves the following results on the evaluation set:
- Loss: 0.9790

## Model description

SuperCOT is a LoRA trained to make Mistral follow prompts for LangChain better by infusing chain-of-thought data, code explanations and instructions, code snippets, logical deductions, and Alpaca GPT-4 prompts. It uses a mixture of the following datasets; a sketch of how to load the adapter follows the list:

https://huggingface.co/datasets/QingyiSi/Alpaca-CoT
- Chain of thought QED
- Chain of thought Aqua
- CodeAlpaca
  
https://huggingface.co/datasets/neulab/conala
- Code snippets
  
https://huggingface.co/datasets/yahma/alpaca-cleaned
- Alpaca GPT4
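
A minimal usage sketch with `transformers` + `peft` is shown below. The adapter repo id and the Alpaca-style prompt format are assumptions for illustration, not part of this card:

```python
# A minimal, hedged sketch of loading the adapter with transformers + peft.
# The adapter repo id below is a placeholder; substitute the actual hub id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-v0.1"
adapter_id = "your-namespace/mistral-v0.1-supercot-lora"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)

# The adapter was trained on Alpaca-style instruction data, so an
# Alpaca-format prompt (an assumption here) is a reasonable starting point.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain, step by step, how a binary search works.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```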

## Intended uses & limitations

The model will show biases similar to those exhibited by the base model. It is not intended for supplying factual information or advice in any form.

## Training and evaluation data

[kaiokendev/SuperCOT-dataset](https://huggingface.co/datasets/kaiokendev/SuperCOT-dataset)
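
To inspect the training data locally, something like the following should work with the `datasets` library (a hedged sketch; the split name and the hub repo's file layout are assumptions):

```python
# Load and peek at the SuperCOT dataset from the Hugging Face Hub,
# assuming its data files can be auto-detected by load_dataset.
from datasets import load_dataset

ds = load_dataset("kaiokendev/SuperCOT-dataset", split="train")
print(ds)      # dataset features and row count
print(ds[0])   # one instruction/response example
```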

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an equivalent `TrainingArguments` sketch follows the list):
- learning_rate: 0.0002
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 3
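
For reference, here is a hedged `TrainingArguments` sketch mirroring these values; the actual run was configured through Axolotl, and `output_dir` is a placeholder:

```python
# A sketch of the equivalent transformers TrainingArguments; this mirrors the
# listed values rather than reproducing the exact Axolotl config.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-v0.1-supercot-lora",  # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=4,  # 2 per device x 4 steps = total batch size 8
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=10,
    num_train_epochs=3,
)
```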

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.7661        | 0.06  | 20   | 1.5173          |
| 0.7681        | 0.12  | 40   | 1.2323          |
| 0.6647        | 0.18  | 60   | 1.1306          |
| 0.6742        | 0.24  | 80   | 1.0847          |
| 0.6995        | 0.3   | 100  | 1.0573          |
| 0.6883        | 0.36  | 120  | 1.0412          |
| 0.6437        | 0.42  | 140  | 1.0375          |
| 0.6331        | 0.48  | 160  | 1.0186          |
| 0.6686        | 0.54  | 180  | 1.0153          |
| 0.6767        | 0.6   | 200  | 1.0042          |
| 0.7037        | 0.66  | 220  | 1.0023          |
| 0.6994        | 0.72  | 240  | 1.0014          |
| 0.7012        | 0.78  | 260  | 0.9996          |
| 0.6599        | 0.84  | 280  | 0.9926          |
| 0.6401        | 0.9   | 300  | 0.9913          |
| 0.6665        | 0.96  | 320  | 0.9910          |
| 0.5771        | 1.02  | 340  | 0.9907          |
| 0.6286        | 1.08  | 360  | 0.9830          |
| 0.6064        | 1.14  | 380  | 0.9865          |
| 0.5976        | 1.19  | 400  | 0.9802          |
| 0.5512        | 1.25  | 420  | 0.9817          |
| 0.6333        | 1.31  | 440  | 0.9810          |
| 0.5883        | 1.37  | 460  | 0.9817          |
| 0.5822        | 1.43  | 480  | 0.9783          |
| 0.5878        | 1.49  | 500  | 0.9757          |
| 0.5951        | 1.55  | 520  | 0.9753          |
| 0.6466        | 1.61  | 540  | 0.9719          |
| 0.6246        | 1.67  | 560  | 0.9681          |
| 0.627         | 1.73  | 580  | 0.9705          |
| 0.6214        | 1.79  | 600  | 0.9691          |
| 0.6558        | 1.85  | 620  | 0.9709          |
| 0.5736        | 1.91  | 640  | 0.9674          |
| 0.6188        | 1.97  | 660  | 0.9674          |
| 0.5293        | 2.03  | 680  | 0.9742          |
| 0.5463        | 2.09  | 700  | 0.9766          |
| 0.5184        | 2.15  | 720  | 0.9776          |
| 0.5349        | 2.21  | 740  | 0.9783          |
| 0.5536        | 2.27  | 760  | 0.9794          |
| 0.5016        | 2.33  | 780  | 0.9822          |
| 0.5075        | 2.39  | 800  | 0.9795          |
| 0.5529        | 2.45  | 820  | 0.9808          |
| 0.5168        | 2.51  | 840  | 0.9784          |
| 0.5416        | 2.57  | 860  | 0.9793          |
| 0.4845        | 2.63  | 880  | 0.9804          |
| 0.5487        | 2.69  | 900  | 0.9801          |
| 0.5313        | 2.75  | 920  | 0.9797          |
| 0.5449        | 2.81  | 940  | 0.9790          |
| 0.5303        | 2.87  | 960  | 0.9795          |
| 0.5599        | 2.93  | 980  | 0.9795          |
| 0.544         | 2.99  | 1000 | 0.9790          |


### Framework versions

- Transformers 4.34.0.dev0
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.14.0

### Citations

Alpaca COT datasets
```
@misc{alpaca-cot,
  author = {Qingyi Si and Zheng Lin},
  school = {Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China},
  title = {Alpaca-CoT: An Instruction Fine-Tuning Platform with Instruction Data Collection and Unified Large Language Models Interface},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/PhoebusSi/alpaca-CoT}},
}
```
Stanford Alpaca
```
@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
```
Google FLAN
```
@inproceedings{weifinetuned,
  title={Finetuned Language Models are Zero-Shot Learners},
  author={Wei, Jason and Bosma, Maarten and Zhao, Vincent and Guu, Kelvin and Yu, Adams Wei and Lester, Brian and Du, Nan and Dai, Andrew M and Le, Quoc V},
  booktitle={International Conference on Learning Representations},
  year={2022}
}
```