WeightsnWizardry committed on
Commit d57f73c (1 parent: 73bed7a)

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -116,16 +116,16 @@ Samples from each of the datasets have been programmatically formatted to chat,
 | **Hyperparameter** | **Value** |
 |--------------------|------------|
 | Num Rollouts | 1024 |
-| PPO Epochs | 1 |
+| Policy Epochs | 1 |
 | Value Epochs | 1 |
 | KL Coef | 0.01 |
 | Gamma | 1.0 |
 | GAE Lambda | 0.95 |
-| Clip Range | 0.2 |
+| Clip Range Policy | 0.2 |
 | Clip Range Value | 0.2 |
 | Whiten Advantages | `true` |
 | Whiten Rewards | `false` |
-| Score on EOD | `true` |
+| Score on EOD | `true` |
 | Max Steps | 200 |
 | PPO steps/epoch | 1 |
 | Value steps/epoch | 8 |
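
For readers wondering why `Clip Range` was renamed to `Clip Range Policy`, here is a minimal sketch of how the two clip ranges are commonly used in PPO. The README does not name the training library, so the `PPOHyperparams` class, its field names, and both loss functions are hypothetical illustrations of the textbook formulation, not the authors' code. Only the numeric values come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PPOHyperparams:
    # Values copied from the updated table; field names are hypothetical.
    num_rollouts: int = 1024
    policy_epochs: int = 1
    value_epochs: int = 1
    kl_coef: float = 0.01
    gamma: float = 1.0
    gae_lambda: float = 0.95
    clip_range_policy: float = 0.2
    clip_range_value: float = 0.2
    whiten_advantages: bool = True
    whiten_rewards: bool = False
    score_on_eod: bool = True
    max_steps: int = 200
    ppo_steps_per_epoch: int = 1
    value_steps_per_epoch: int = 8


def clipped_policy_loss(ratio: float, advantage: float, clip: float) -> float:
    """Textbook PPO clipped surrogate for one sample, negated for minimization."""
    clipped_ratio = max(1.0 - clip, min(1.0 + clip, ratio))
    return -min(ratio * advantage, clipped_ratio * advantage)


def clipped_value_loss(value: float, old_value: float, ret: float, clip: float) -> float:
    """Common clipped value loss: the new estimate may move at most `clip` from the old one."""
    clipped_value = old_value + max(-clip, min(clip, value - old_value))
    return 0.5 * max((value - ret) ** 2, (clipped_value - ret) ** 2)


hp = PPOHyperparams()
policy_loss = clipped_policy_loss(ratio=1.5, advantage=1.0, clip=hp.clip_range_policy)
value_loss = clipped_value_loss(value=1.0, old_value=0.0, ret=1.0, clip=hp.clip_range_value)
```

The policy clip bounds the probability ratio between the new and old policy, while the value clip bounds how far the value estimate can move in one update. Naming them separately makes it clear which `0.2` applies to which loss.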