xaviviro committed
Commit 117d6f8
1 Parent(s): 6913489

Update README.md

Files changed (1)
  1. README.md +7 -67
README.md CHANGED
@@ -7,77 +7,17 @@ base_model: teknium/OpenHermes-2.5-Mistral-7B
  model-index:
  - name: openhermes-mistral-cat_out
    results: []
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

- [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
- # openhermes-mistral-cat_out
-
- This model is a fine-tuned version of [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 1.6083
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.0002
- - train_batch_size: 2
- - eval_batch_size: 2
- - seed: 42
- - gradient_accumulation_steps: 4
- - total_train_batch_size: 8
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 14
- - num_epochs: 2
- - mixed_precision_training: Native AMP
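The hyperparameters removed above describe a standard warmup-plus-cosine schedule with gradient accumulation (effective batch size 8 = 2 × 4). As an illustration only, not the original training code (the Axolotl badge suggests the run was driven by Axolotl's Trainer), an equivalent optimizer and schedule could be set up roughly like this, with the step count taken from the results table that follows:

```python
# Illustrative sketch only; `model` is a stand-in, not the original LoRA-wrapped model.
import torch
from transformers import get_cosine_schedule_with_warmup

model = torch.nn.Linear(8, 8)  # placeholder model

optimizer = torch.optim.Adam(
    model.parameters(),
    lr=2e-4,             # learning_rate: 0.0002
    betas=(0.9, 0.999),  # optimizer: Adam with betas=(0.9,0.999)
    eps=1e-8,            # and epsilon=1e-08
)

# Effective batch size 8 = train_batch_size 2 * gradient_accumulation_steps 4.
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=14,     # lr_scheduler_warmup_steps
    num_training_steps=120,  # roughly 2 epochs; the logged run reached step 120 at epoch 1.96
)
```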
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:-----:|:----:|:---------------:|
- | 1.6981 | 1.01 | 61 | 1.6273 |
- | 1.6355 | 1.96 | 120 | 1.6083 |
-
-
- ### Framework versions
-
- - Transformers 4.36.2
- - Pytorch 2.1.2+cu121
- - Datasets 2.15.0
- - Tokenizers 0.15.0
- ## Training procedure
-
-
- The following `bitsandbytes` quantization config was used during training:
- - quant_method: bitsandbytes
- - load_in_8bit: False
- - load_in_4bit: True
- - llm_int8_threshold: 6.0
- - llm_int8_skip_modules: None
- - llm_int8_enable_fp32_cpu_offload: False
- - llm_int8_has_fp16_weight: False
- - bnb_4bit_quant_type: nf4
- - bnb_4bit_use_double_quant: True
- - bnb_4bit_compute_dtype: float16
-
- ### Framework versions
-
-
- - PEFT 0.6.0
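The removed card also documented the 4-bit setup used for QLoRA-style training. As a sketch (assuming the `transformers` bitsandbytes integration rather than the exact Axolotl invocation), those values map onto a `BitsAndBytesConfig` like this:

```python
# Sketch of the equivalent 4-bit quantization config; values mirror the card above.
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # load_in_4bit: True
    bnb_4bit_quant_type="nf4",             # bnb_4bit_quant_type: nf4
    bnb_4bit_use_double_quant=True,        # bnb_4bit_use_double_quant: True
    bnb_4bit_compute_dtype=torch.float16,  # bnb_4bit_compute_dtype: float16
)
# The llm_int8_* entries in the card are bitsandbytes defaults for the 8-bit
# path and do not affect 4-bit loading.
```

Such a config would be passed as `quantization_config=bnb_config` to `AutoModelForCausalLM.from_pretrained` when reproducing the 4-bit base-model load.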
 
  model-index:
  - name: openhermes-mistral-cat_out
    results: []
+ datasets:
+ - xaviviro/oasst2_ca_gpt
+ language:
+ - ca
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

+ # OpenHermes-2.5-Mistral-7B-Catala-LoRA
+
+ This model is a fine-tuned version of [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) on the [xaviviro/oasst2_ca_gpt](https://huggingface.co/datasets/xaviviro/oasst2_ca_gpt) dataset.
+
+ [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
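Since the new card does not yet show usage, here is a minimal loading sketch for the Catalan LoRA adapter. This is assumed usage, not taken from the card, and the adapter repo id is a placeholder for this repository's Hub id:

```python
# Minimal inference-loading sketch (assumed usage; ADAPTER_REPO is a placeholder).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "teknium/OpenHermes-2.5-Mistral-7B"
ADAPTER_REPO = "<this-repo-id>"  # placeholder: the Hub id of this LoRA adapter

tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, ADAPTER_REPO)  # attach the Catalan LoRA adapter
```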