Locutusque committed on
Commit
e66b659
1 Parent(s): 8dd987b

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -38,7 +38,7 @@ Llama-3-Hercules-5.0-8B is well-suited to the following applications:
 - This model was trained on 8 kaggle TPUs, using torch xla SPMD for high MXU efficiency. There was no expense on my end (meaning you can reproduce this too!)
 - A learning rate of 2e-5 with the Adam optimizer. A linear scheduler was used, with an end factor of 0.005.
 - No mixed precision was used, with the default dtype being bfloat16.
-- A total batch size of 64 was used.
+- A total batch size of 128 was used.
 - Trained on all examples of Hercules-v5.0 for 2 epochs
 - No model parameters were frozen and no quantization was used.
 - This model was trained on OpenAI's ChatML prompt format. Because this model has function calling capabilities, the prompt format is slightly different, here's what it would look like: ```<|im_start|>system\n{message}<|im_end|>\n<|im_start|>user\n{user message}<|im_end|>\n<|im_start|>call\n{function call message}<|im_end|>\n<|im_start|>assistant\n{assistant message}</s>```
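The function-calling prompt format described in the README can be assembled programmatically. Below is a minimal sketch: the `build_prompt` helper and the sample messages are illustrative assumptions, not part of the repository; only the special-token layout (`<|im_start|>`, `<|im_end|>`, the `call`/`function` roles, and the trailing `</s>`) comes from the README itself.

```python
def build_prompt(system, user, call, function_response, assistant):
    """Assemble the ChatML-with-function-calling prompt layout
    described in the README (hypothetical helper for illustration)."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>call\n{call}<|im_end|>\n"
        f"<|im_start|>function\n{function_response}<|im_end|>\n"
        f"<|im_start|>assistant\n{assistant}</s>"
    )

# Example messages (made up for demonstration only)
prompt = build_prompt(
    system="You are a helpful assistant.",
    user="What is the weather in Paris?",
    call='{"name": "get_weather", "arguments": {"city": "Paris"}}',
    function_response='{"temperature_c": 18}',
    assistant="It is currently 18 degrees Celsius in Paris.",
)
print(prompt)
```

At inference time, the same layout would be produced by the model's chat template; the sketch above only makes the token boundaries explicit.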