buzzcraft committed
Commit 3d8ca67 • 1 Parent(s): aaa98a9

Update README.md

Files changed (1)
README.md +35 -97
README.md CHANGED
@@ -1,112 +1,50 @@
  ---
  base_model: meta-llama/Llama-2-7b-chat-hf
  tags:
- - generated_from_trainer
- model-index:
- - name: llama-le-out
-   results: []
- ---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
- # llama-le-out
-
- This model is a fine-tuned version of [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.6239
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.0002
- - train_batch_size: 4
- - eval_batch_size: 4
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 4
- - gradient_accumulation_steps: 2
- - total_train_batch_size: 32
- - total_eval_batch_size: 16
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 10
- - num_epochs: 3
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:-----:|:----:|:---------------:|
- | 0.9364 | 0.06 | 100 | 0.8000 |
- | 0.809 | 0.12 | 200 | 0.7724 |
- | 0.8695 | 0.18 | 300 | 0.7571 |
- | 0.7512 | 0.24 | 400 | 0.7406 |
- | 0.8266 | 0.3 | 500 | 0.7327 |
- | 0.7898 | 0.35 | 600 | 0.7238 |
- | 0.9163 | 0.41 | 700 | 0.7135 |
- | 0.6955 | 0.47 | 800 | 0.7025 |
- | 0.7887 | 0.53 | 900 | 0.7009 |
- | 0.7361 | 0.59 | 1000 | 0.6911 |
- | 0.7736 | 0.65 | 1100 | 0.6897 |
- | 0.7135 | 0.71 | 1200 | 0.6859 |
- | 0.8138 | 0.77 | 1300 | 0.6788 |
- | 0.7172 | 0.83 | 1400 | 0.6720 |
- | 0.7387 | 0.89 | 1500 | 0.6695 |
- | 0.7042 | 0.95 | 1600 | 0.6688 |
- | 0.7231 | 1.0 | 1700 | 0.6652 |
- | 0.7136 | 1.06 | 1800 | 0.6626 |
- | 0.694 | 1.12 | 1900 | 0.6583 |
- | 0.7401 | 1.18 | 2000 | 0.6551 |
- | 0.63 | 1.24 | 2100 | 0.6519 |
- | 0.6506 | 1.3 | 2200 | 0.6478 |
- | 0.7436 | 1.36 | 2300 | 0.6457 |
- | 0.5903 | 1.42 | 2400 | 0.6452 |
- | 0.6861 | 1.48 | 2500 | 0.6399 |
- | 0.6576 | 1.54 | 2600 | 0.6412 |
- | 0.6327 | 1.59 | 2700 | 0.6357 |
- | 0.6634 | 1.65 | 2800 | 0.6378 |
- | 0.6419 | 1.71 | 2900 | 0.6349 |
- | 0.6573 | 1.77 | 3000 | 0.6344 |
- | 0.7052 | 1.83 | 3100 | 0.6327 |
- | 0.6438 | 1.89 | 3200 | 0.6292 |
- | 0.713 | 1.95 | 3300 | 0.6283 |
- | 0.6357 | 2.01 | 3400 | 0.6293 |
- | 0.5736 | 2.07 | 3500 | 0.6302 |
- | 0.591 | 2.13 | 3600 | 0.6307 |
- | 0.6995 | 2.19 | 3700 | 0.6295 |
- | 0.6708 | 2.24 | 3800 | 0.6277 |
- | 0.6329 | 2.3 | 3900 | 0.6262 |
- | 0.6138 | 2.36 | 4000 | 0.6271 |
- | 0.6316 | 2.42 | 4100 | 0.6266 |
- | 0.6022 | 2.48 | 4200 | 0.6260 |
- | 0.7221 | 2.54 | 4300 | 0.6252 |
- | 0.6943 | 2.6 | 4400 | 0.6256 |
- | 0.6616 | 2.66 | 4500 | 0.6246 |
- | 0.6185 | 2.72 | 4600 | 0.6247 |
- | 0.6417 | 2.78 | 4700 | 0.6239 |
- | 0.6238 | 2.84 | 4800 | 0.6237 |
- | 0.6024 | 2.89 | 4900 | 0.6236 |
- | 0.6059 | 2.95 | 5000 | 0.6239 |
-
- ### Framework versions
-
- - Transformers 4.34.1
- - Pytorch 2.0.1+cu118
- - Datasets 2.14.6
- - Tokenizers 0.14.1
+ - mistral
+ - instruct
+ - finetune
+ language:
+ - no
+ license: cc-by-nc-sa-4.0
+ ---
+
+ # NorskGPT-Mistral-7b
+
+ This model is a Norwegian variant of Llama-2-7b-chat-hf, fine-tuned on a carefully selected mix of Norwegian instruction pairs. The model is tuned to understand and generate text in Norwegian.
+
+ ## Intended Use
+ This model is intended for personal and research use in Norwegian and can be used as an assistant-like chat model.
+
+ ## Prompt Template
+
+ ```
+ ### Instruction:
+ Summarize the following text.
+ ### Input:
+ Text to be summarized
+ ### Response:
+ ```
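+
+ ## Example Usage
+
+ A minimal sketch of running the template above with the Hugging Face Transformers library. The repository id below is a placeholder, and the generation settings are illustrative rather than tuned; adjust both for your setup.
+
+ ```
+ # Untested sketch: load the model, fill the prompt template, and generate.
+ # NOTE: "buzzcraft/llama-le-out" is a placeholder repo id (an assumption),
+ # and the sampling parameters are illustrative, not recommended values.
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "buzzcraft/llama-le-out"  # placeholder repo id
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ # device_map="auto" requires the accelerate package.
+ model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
+
+ # Fill in the prompt template from the section above.
+ prompt = (
+     "### Instruction:\n"
+     "Summarize the following text.\n"
+     "### Input:\n"
+     "Text to be summarized\n"
+     "### Response:\n"
+ )
+
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
+ # Print only the newly generated tokens, skipping the echoed prompt.
+ print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
+ ```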
+
+ ## Limitations
+ * This is an LLM, not a knowledge model. It cannot be expected to have more information about Norway than the base model.
+ * It will generally perform better on tasks that involve summarization, question answering, and chat than on tasks that require more knowledge about Norway or specific domains, or where the model can answer freely.
+ * The model is released as is, and would in most cases need prompt tuning to achieve optimal results.
+
+ ## License
+ [Attribution-NonCommercial-ShareAlike 4.0 International](https://creativecommons.org/licenses/by-nc-sa/4.0/)
+
+ You are free to:
+
+ Share — copy and redistribute the material in any medium or format
+ Adapt — remix, transform, and build upon the material
+
+ The licensor cannot revoke these freedoms as long as you follow the license terms.
+
+ Under the following terms:
+
+ Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
+ NonCommercial — You may not use the material for commercial purposes.
+ ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
+ No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.