Commit 8cd2153 · Parent(s): cf0da05
Update README.md

README.md CHANGED
@@ -8,8 +8,50 @@ license: apache-2.0

Model:
```
falcon-7b:
  dtype: bf16
  log_dir: "falcon_log_7b"
  learning_rate: 1e-5
  model_name: "tiiuae/falcon-7b"
  deepspeed_config: configs/zero_config.json
  output_dir: falcon
  weight_decay: 0.0
  max_length: 2048
  warmup_steps: 20
  gradient_checkpointing: true
  gradient_accumulation_steps: 4
  per_device_train_batch_size: 4
  per_device_eval_batch_size: 8
  eval_steps: 100
  save_steps: 500
  save_strategy: steps
  num_train_epochs: 8
  save_total_limit: 4
  residual_dropout: 0.2
  residual_dropout_lima: true
```
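
As a quick sanity check on this config: the effective global batch size is the per-device batch size times the gradient accumulation steps times the number of GPUs. A minimal sketch (the GPU count is an assumption about the training machine, not something the config specifies):

```
# Effective global batch size implied by the config above.
per_device_train_batch_size = 4
gradient_accumulation_steps = 4
num_gpus = 8  # assumption: depends on the machine, not set in the config

effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch_size)  # 128 with 8 GPUs, 16 on a single GPU
```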

Dataset:
```
oasst-top1:
  # oasst_export: 11123 (100.00%)
  save_strategy: steps
  eval_steps: 80
  save_steps: 80
  datasets:
    - oasst_export:
        lang: "bg,ca,cs,da,de,en,es,fr,hr,hu,it,nl,pl,pt,ro,ru,sl,sr,sv,uk" # sft-8.0
        input_file_path: 2023-06-02_oasst_all_labels.jsonl.gz
        val_split: 0.05
        top_k: 1
```
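
Given the 11123 exported examples noted in the comment above, `val_split: 0.05` holds out roughly 5% of them for evaluation. A back-of-the-envelope check (the trainer's actual split may round or shuffle differently):

```
# Approximate train/validation sizes implied by the dataset config.
total_examples = 11123  # from the "# oasst_export: 11123 (100.00%)" comment
val_split = 0.05

val_size = int(total_examples * val_split)  # about 556 validation examples
train_size = total_examples - val_size      # about 10567 training examples
print(train_size, val_size)
```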

Train command:
```
deepspeed trainer_sft.py --configs defaults falcon-7b oasst-top1 --cache_dir <data_cache_dir> --output_dir <output_path> --deepspeed
```
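
The `--configs defaults falcon-7b oasst-top1` flag layers the named YAML sections (the `falcon-7b` and `oasst-top1` blocks above) on top of a `defaults` section. A hypothetical loader illustrating that pattern, not the trainer's actual implementation (the `configs/config.yaml` path is an assumption):

```
import yaml

def load_config(path, names):
    # Hypothetical sketch: merge named YAML sections left to right,
    # so later sections override earlier ones, mirroring
    # `--configs defaults falcon-7b oasst-top1`.
    with open(path) as f:
        sections = yaml.safe_load(f)
    merged = {}
    for name in names:
        merged.update(sections[name])
    return merged

# config = load_config("configs/config.yaml", ["defaults", "falcon-7b", "oasst-top1"])
```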

Export command:
```
python export_model.py --dtype bf16 --hf_repo_name OpenAssistant/falcon-7b-sft-top1 --trust_remote_code --auth_token <auth_token> <output_path> --max_shard_size 2GB
```
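
After export, the repo can be loaded with the standard `transformers` API. A minimal inference sketch; the `<|prompter|>`/`<|assistant|>` template follows the usual OpenAssistant SFT convention and is an assumption here, not something this README states:

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "OpenAssistant/falcon-7b-sft-top1"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,  # matches the bf16 export above
    trust_remote_code=True,      # falcon custom code, as in the export command
    device_map="auto",
)

# Assumption: OpenAssistant-style prompt template.
prompt = "<|prompter|>What is a falcon?<|endoftext|><|assistant|>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0]))
```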
|