ZhangShenao committed
Commit 1445c42
1 parent: 2510381

Update README.md

Files changed (1): README.md (+15 −23)
README.md CHANGED
@@ -1,19 +1,15 @@
 ---
 license: mit
-base_model: ZhangShenao/0.001_2SELM_Zephyr_iter_2
 tags:
 - alignment-handbook
-- trl
 - dpo
-- generated_from_trainer
 - trl
-- dpo
-- generated_from_trainer
 datasets:
-- updated
-- original
 model-index:
-- name: 0.001_2SELM_Zephyr_iter_3
   results: []
 ---

@@ -22,26 +18,18 @@ should probably proofread and complete it, then remove this comment. -->

 # 0.001_2SELM_Zephyr_iter_3

-This model is a fine-tuned version of [ZhangShenao/0.001_2SELM_Zephyr_iter_2](https://huggingface.co/ZhangShenao/0.001_2SELM_Zephyr_iter_2) on the updated and the original datasets.

 ## Model description

-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
-## Training procedure

 ### Training hyperparameters

 The following hyperparameters were used during training:
-- learning_rate: 5e-07
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42

@@ -55,9 +43,13 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_ratio: 0.1
 - num_epochs: 1

-### Training results
-

 ### Framework versions
 
 ---
 license: mit
+base_model: ZhangShenao/SELM-Zephyr-7B-iter-1
 tags:
 - alignment-handbook
 - dpo
 - trl
+- selm
 datasets:
+- HuggingFaceH4/ultrafeedback_binarized
 model-index:
+- name: SELM-Zephyr-7B-iter-2
   results: []
 ---

 # 0.001_2SELM_Zephyr_iter_3

+This model is a fine-tuned version of [ZhangShenao/SELM-Zephyr-7B-iter-1](https://huggingface.co/ZhangShenao/SELM-Zephyr-7B-iter-1) using synthetic data based on the HuggingFaceH4/ultrafeedback_binarized dataset.

 ## Model description

+- Model type: A 7B parameter Zephyr-based Self-Exploring Language Model (SELM).
+- Language(s) (NLP): Primarily English
+- License: MIT

 ### Training hyperparameters

 The following hyperparameters were used during training:
+- alpha: 0.001
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42

 - lr_scheduler_warmup_ratio: 0.1
 - num_epochs: 1

+## Results
+
+| Model | AlpacaEval 2.0 (LC Win Rate) | MT-Bench (Average) |
+|-------|------------------------------|--------------------|
+| [SELM-Zephyr-7B-iter-3](https://huggingface.co/ZhangShenao/SELM-Zephyr-7B-iter-3) | 24.00 | 7.48 |
+| [SELM-Zephyr-7B-iter-2](https://huggingface.co/ZhangShenao/SELM-Zephyr-7B-iter-2) | 23.40 | 7.72 |
+| [SELM-Zephyr-7B-iter-1](https://huggingface.co/ZhangShenao/SELM-Zephyr-7B-iter-1) | 20.28 | - |

 ### Framework versions
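For orientation, the card tags the model with `dpo`. Below is a minimal pure-Python sketch of the standard per-example DPO loss, not code from this repository: `beta` is an illustrative temperature, and the SELM-specific self-exploration term controlled by the `alpha` hyperparameter above is deliberately not reproduced, since its exact form is defined in the SELM paper.

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.01):
    """Per-example Direct Preference Optimization loss.

    Each argument is the summed log-probability of the chosen/rejected
    response under the trained policy or the frozen reference model.
    """
    chosen_reward = beta * (policy_logp_chosen - ref_logp_chosen)
    rejected_reward = beta * (policy_logp_rejected - ref_logp_rejected)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)), computed as softplus(-margin) for stability
    return math.log1p(math.exp(-margin)) if margin > -30 else -margin

if __name__ == "__main__":
    # Policy prefers the chosen response more than the reference does,
    # so the margin is positive and the loss falls below log(2).
    print(dpo_loss(-10.0, -12.0, -11.0, -11.5, beta=0.1))
```

When the policy and reference agree exactly, the margin is zero and the loss equals log 2; training pushes the margin positive, driving the loss toward zero.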