---
tags:
- generated_from_trainer
model-index:
- name: vicuna_13b_stage1
  results: []
---

[Built with Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)

# vicuna_13b_stage1

This model was trained from scratch on an unspecified dataset (recorded as `None` by the training script). It achieves the following results on the evaluation set:
- Loss: 1.2017

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 2 (per device)
- eval_batch_size: 2 (per device)
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 4
- total_train_batch_size: 16 (2 per device × 2 devices × 4 accumulation steps)
- total_eval_batch_size: 4 (2 per device × 2 devices)
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 40
- num_epochs: 1

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.9535        | 0.02  | 40   | 1.9456          |
| 1.8556        | 0.04  | 80   | 1.7714          |
| 1.791         | 0.06  | 120  | 1.7425          |
| 1.6622        | 0.08  | 160  | 1.7164          |
| 1.8169        | 0.1   | 200  | 1.7154          |
| 1.7356        | 0.12  | 240  | 1.7026          |
| 1.6051        | 0.14  | 280  | 1.7104          |
| 1.7925        | 0.16  | 320  | 1.7127          |
| 1.8257        | 0.18  | 360  | 1.7055          |
| 1.7057        | 0.2   | 400  | 1.6906          |
| 1.9282        | 0.22  | 440  | 1.6746          |
| 1.668         | 0.24  | 480  | 1.7052          |
| 1.6273        | 0.26  | 520  | 1.6620          |
| 1.6136        | 0.28  | 560  | 1.6616          |
| 1.4754        | 0.3   | 600  | 1.6389          |
| 1.4024        | 0.32  | 640  | 1.6038          |
| 1.6773        | 0.34  | 680  | 1.5743          |
| 1.6008        | 0.36  | 720  | 1.5607          |
| 1.568         | 0.39  | 760  | 1.5236          |
| 1.4922        | 0.41  | 800  | 1.5158          |
| 1.4667        | 0.43  | 840  | 1.4938          |
| 1.5653        | 0.45  | 880  | 1.4692          |
| 1.331         | 0.47  | 920  | 1.4581          |
| 1.4019        | 0.49  | 960  | 1.4290          |
| 1.4925        | 0.51  | 1000 | 1.4087          |
| 1.4772        | 0.53  | 1040 | 1.3961          |
| 1.4728        | 0.55  | 1080 | 1.3817          |
| 1.4555        | 0.57  | 1120 | 1.3559          |
| 1.5487        | 0.59  | 1160 | 1.3399          |
| 1.3888        | 0.61  | 1200 | 1.3212          |
| 1.2544        | 0.63  | 1240 | 1.3099          |
| 1.2657        | 0.65  | 1280 | 1.2972          |
| 1.3641        | 0.67  | 1320 | 1.2815          |
| 1.2915        | 0.69  | 1360 | 1.2687          |
| 1.4182        | 0.71  | 1400 | 1.2541          |
| 1.2515        | 0.73  | 1440 | 1.2427          |
| 1.2287        | 0.75  | 1480 | 1.2352          |
| 1.1886        | 0.77  | 1520 | 1.2285          |
| 1.2651        | 0.79  | 1560 | 1.2219          |
| 1.3341        | 0.81  | 1600 | 1.2145          |
| 1.2357        | 0.83  | 1640 | 1.2107          |
| 1.0767        | 0.85  | 1680 | 1.2080          |
| 1.2158        | 0.87  | 1720 | 1.2051          |
| 1.2042        | 0.89  | 1760 | 1.2034          |
| 1.1887        | 0.91  | 1800 | 1.2023          |
| 1.2662        | 0.93  | 1840 | 1.2018          |
| 1.1866        | 0.95  | 1880 | 1.2017          |
| 1.1798        | 0.97  | 1920 | 1.2017          |
| 1.336         | 0.99  | 1960 | 1.2017          |

### Framework versions

- Transformers 4.34.1
- PyTorch 2.3.1+cu121
- Datasets 2.14.7
- Tokenizers 0.14.1
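The hyperparameters above map naturally onto the Hugging Face `TrainingArguments` API. The sketch below is illustrative only, not the original run's code: `output_dir` is a placeholder, and the `evaluation_strategy`/`eval_steps` values are inferred from the every-40-steps validation log in the results table.

```python
# A minimal sketch of how the listed hyperparameters translate into
# transformers.TrainingArguments (Transformers 4.34.x).
# output_dir and the evaluation settings are assumptions, not taken
# from the original training script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="vicuna_13b_stage1",    # hypothetical output path
    learning_rate=5e-4,
    per_device_train_batch_size=2,     # train_batch_size above is per device
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=4,     # 2 per device x 2 GPUs x 4 = 16 effective
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_steps=40,
    adam_beta1=0.9,                    # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="steps",       # inferred: validation logged every 40 steps
    eval_steps=40,
)
```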
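Until the card's usage section is filled in, here is a minimal, hedged loading sketch. It assumes the checkpoint is a standard causal language model loadable via `AutoModelForCausalLM`; the repo id is a placeholder, not the model's real path.

```python
# A minimal usage sketch, assuming a standard causal LM checkpoint.
# "your-org/vicuna_13b_stage1" is hypothetical; replace with the real repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-org/vicuna_13b_stage1"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # a 13B model generally needs half precision to fit
    device_map="auto",          # requires the accelerate package
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```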