chlee10 committed
Commit e838dfa • 1 Parent(s): e777de0

Update README.md

Files changed (1)
  1. README.md +60 -0
README.md CHANGED
---
pipeline_tag: text-generation
license: apache-2.0
language:
- en
tags:
- SOLAR-10.7B-v1.0
- Open-platypus-Commercial
base_model: upstage/SOLAR-10.7B-v1.0
datasets:
- kyujinpy/Open-platypus-Commercial
model-index:
- name: chlee10/T3Q-Platypus-SOLAR
  results: []
---

Update @ 2024.03.07

## T3Q-platypus-SOLAR-10.7B-v1.0

This model is a fine-tuned version of upstage/SOLAR-10.7B-v1.0, trained on the kyujinpy/Open-platypus-Commercial dataset.

**Model Developers** Chihoon Lee (chlee10), T3Q
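
For a quick check, the model can be loaded through the standard transformers text-generation API. The sketch below assumes the Hub id chlee10/T3Q-Platypus-SOLAR from the model-index above; the Alpaca-style prompt is an assumption, since the card does not document a prompt template.

```python
# Minimal inference sketch (assumed repo id; prompt format not documented here).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "chlee10/T3Q-Platypus-SOLAR"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 to fit the 10.7B model on a single large GPU
    device_map="auto",
)

prompt = "### Instruction:\nExplain LoRA fine-tuning in two sentences.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```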

## Training hyperparameters

The following hyperparameters were used during training:

```python
# Hyperparameters related to the dataset and number of training epochs
batch_size = 16
num_epochs = 1
micro_batch = 1
gradient_accumulation_steps = batch_size // micro_batch

# Hyperparameters for the training method
cutoff_len = 4096
lr_scheduler = 'cosine'
warmup_ratio = 0.06  # warmup_steps = 100
learning_rate = 4e-4
optimizer = 'adamw_torch'
weight_decay = 0.01
max_grad_norm = 1.0

# LoRA config
lora_r = 16
lora_alpha = 16
lora_dropout = 0.05
lora_target_modules = ["gate_proj", "down_proj", "up_proj"]

# Options for configuring the inputs produced by the tokenizer
train_on_inputs = False
add_eos_token = False

# NEFTune params
noise_alpha: int = 5
```
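
For context, the sketch below shows how these values could be wired into a LoRA fine-tuning setup with peft and transformers. It is an illustration under stated assumptions, not the authors' actual training script: the output_dir is hypothetical, and NEFTune via neftune_noise_alpha only exists in transformers releases newer than the 4.34.1 pinned below.

```python
# Illustrative mapping of the hyperparameters above onto peft/transformers.
# Not the authors' script; output_dir is hypothetical.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("upstage/SOLAR-10.7B-v1.0")

lora_config = LoraConfig(
    r=16,                                    # lora_r
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["gate_proj", "down_proj", "up_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()           # only the LoRA adapters are trainable

training_args = TrainingArguments(
    output_dir="./t3q-platypus-solar-lora",  # hypothetical
    per_device_train_batch_size=1,           # micro_batch
    gradient_accumulation_steps=16,          # batch_size // micro_batch
    num_train_epochs=1,
    learning_rate=4e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.06,
    optim="adamw_torch",
    weight_decay=0.01,
    max_grad_norm=1.0,
    # neftune_noise_alpha=5,                 # NEFTune; requires transformers >= 4.35
)
```

Note that the effective batch size is micro_batch × gradient_accumulation_steps = 1 × 16, which matches batch_size = 16 above.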

## Framework versions

- Transformers 4.34.1
- Pytorch 2.1.0+cu121
- Datasets 2.13.0
- Tokenizers 0.14.1
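
The pins can be verified at runtime with a quick check (a sketch; prefix matching is used because the torch build carries a +cu121 suffix):

```python
# Sanity-check the local environment against the pinned versions above.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.34.1",
    "torch": "2.1.0",        # local build may report 2.1.0+cu121
    "datasets": "2.13.0",
    "tokenizers": "0.14.1",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    have = installed[name]
    status = "OK" if have.startswith(want) else "MISMATCH"
    print(f"{name}: {have} (expected {want}) -> {status}")
```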