p1atdev committed

Commit fbdcb30
Parent(s): 8170002

Update README.md

Files changed (1):
  1. README.md +3 -3

README.md CHANGED
@@ -181,7 +181,7 @@ TODO
 #### Training Hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.00025
+- learning_rate: 0.0005
 - train_batch_size: 1024
 - eval_batch_size: 256
 - seed: 42
@@ -190,7 +190,7 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 1000
-- num_epochs: 4
+- num_epochs: 5
 
 ## Evaluation
 
@@ -204,7 +204,7 @@ The architecture of this model is [Mixtral](https://huggingface.co/docs/transfor
 
 ### Compute Infrastructure
 
-Private server.
+Server in a university laboratory
 
 #### Hardware
 
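As a hedged illustration (not taken from the model card's actual training code), the updated hyperparameters combine into a per-step learning rate with linear warmup followed by cosine decay. The `TOTAL_STEPS` value below is an assumption for illustration only; the card does not state the total step count.

```python
import math

LEARNING_RATE = 0.0005  # learning_rate raised from 0.00025 in this commit
WARMUP_STEPS = 1000     # lr_scheduler_warmup_steps
TOTAL_STEPS = 20_000    # assumed value, not stated in the card

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to the peak learning rate.
        return LEARNING_RATE * step / WARMUP_STEPS
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return 0.5 * LEARNING_RATE * (1.0 + math.cos(math.pi * progress))

print(lr_at(500))           # halfway through warmup: 0.00025
print(lr_at(WARMUP_STEPS))  # peak: 0.0005
```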