Commit 480e62b by andreaskoepf
1 parent: 1f45cad

Update README.md

Files changed (1)
  1. README.md +56 -0
@@ -1,3 +1,59 @@
  ---
  license: apache-2.0
  ---
+ wandb: https://wandb.ai/open-assistant/supervised-finetuning/runs/kzy0gark
+
+
+ datasets:
+ ```
+ pretrain:
+   num_train_epochs: 1
+   weight_decay: 0.0
+   use_custom_sampler: true
+   sort_by_length: false
+   datasets:
+     - joke
+     - webgpt:
+         val_split: 0.1
+     - gpt4all:
+         val_split: 0.01
+     - alpaca:
+         val_split: 0.025
+     - code_alpaca:
+         val_split: 0.05
+     - minimath
+     - humaneval_mbpp_codegen_qa
+     - humaneval_mbpp_testgen_qa
+     - grade_school_math_instructions
+     - recipes
+     - cmu_wiki_qa
+     - oa_wiki_qa_bart_10000row
+     - prosocial_dialogue:
+         fraction: 0.1
+     - explain_prosocial:
+         fraction: 0.05
+     - oig_file:
+         source_url: https://huggingface.co/datasets/laion/OIG/resolve/main/unified_chip2.jsonl
+         max_count: 10000
+         min_length: 250
+         val_split: 0.1
+ ```
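
The `datasets` list above mixes bare names (such as `joke`) with single-key mappings that carry options (such as `webgpt` with `val_split: 0.1`). The following is a minimal sketch of how such a list could be normalized into name/options pairs, assuming the shapes a YAML parser would typically produce. The helper `normalize_entry` is hypothetical and is not taken from the project's training code, which may treat these entries differently.

```python
from __future__ import annotations

from typing import Any

# A subset of the `datasets` entries above, written as the Python objects a YAML
# parser would typically produce: bare names become strings, entries with
# options become single-key dicts.
DATASETS: list[Any] = [
    "joke",
    {"webgpt": {"val_split": 0.1}},
    {"gpt4all": {"val_split": 0.01}},
    "minimath",
    {"prosocial_dialogue": {"fraction": 0.1}},
]


def normalize_entry(entry: Any) -> tuple[str, dict[str, Any]]:
    """Return (dataset_name, options) for one list entry (hypothetical helper)."""
    if isinstance(entry, str):
        # Bare name: no per-dataset options were given.
        return entry, {}
    if isinstance(entry, dict) and len(entry) == 1:
        # Single-key mapping: the key is the name, the value holds the options.
        name, options = next(iter(entry.items()))
        return name, dict(options or {})
    raise ValueError(f"unsupported dataset entry: {entry!r}")


if __name__ == "__main__":
    for entry in DATASETS:
        name, options = normalize_entry(entry)
        print(name, options)
```

Normalizing up front keeps the later loading code free of `isinstance` checks; what each option (`val_split`, `fraction`, `max_count`) means is left to the real loader.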
+
+
+ pythia:
+ ```
+ pythia-6.9b-pretrain:
+   learning_rate: 6e-6
+   model_name: EleutherAI/pythia-6.9b-deduped
+   deepspeed_config: configs/zero3_config_pretrain.json
+   weight_decay: 0.0
+   max_length: 2048
+   use_flash_attention: true
+   warmup_steps: 20
+   gradient_checkpointing: false
+   gradient_accumulation_steps: 2
+   per_device_train_batch_size: 5
+   per_device_eval_batch_size: 8
+   num_train_epochs: 1
+   save_total_limit: 2
+ ```
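
The batch settings in the `pythia-6.9b-pretrain` block determine how many sequences contribute to each optimizer step. A rough sketch of that arithmetic follows. The device count is not stated in this README, so `NUM_DEVICES = 8` is purely an assumed value for illustration, and the token figure is an upper bound because sequences may be shorter than `max_length`.

```python
# Values copied from the pythia-6.9b-pretrain block above.
PER_DEVICE_TRAIN_BATCH_SIZE = 5
GRADIENT_ACCUMULATION_STEPS = 2
MAX_LENGTH = 2048

# Assumption: the number of GPUs is not given in the README.
NUM_DEVICES = 8


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int) -> int:
    """Sequences that contribute to one optimizer step under data parallelism."""
    return per_device * grad_accum * num_devices


sequences_per_step = effective_batch_size(
    PER_DEVICE_TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES
)
# Upper bound: every sequence padded or packed to the full max_length.
max_tokens_per_step = sequences_per_step * MAX_LENGTH

if __name__ == "__main__":
    print(f"sequences per optimizer step: {sequences_per_step}")
    print(f"max tokens per optimizer step: {max_tokens_per_step}")
```

With a different device count only `NUM_DEVICES` needs to change; the per-device and accumulation values come straight from the config.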