JetBrains/CodeLlama-7B-Kexer

jdev8 committed on May 15, 2024

Commit a6f080d • 1 Parent(s): b703bd3

Update README.md (#1)

- Update README.md (704093fb3d36ad12d970386311c3473ba65d0ca5)

Co-authored-by: Anton Shapkin <jdev8@users.noreply.huggingface.co>

Files changed (1): README.md

@@ -10,6 +10,14 @@ This is a CodeLlama model fine-tuned on the Kotlin Exercises dataset.
 The model was trained on one A100 GPU with the following hyperparameters:
+| **Hyperparameter** | **Value** |
+|:------------------:|:---------:|
+| `warmup` | 10% |
+| `max_lr` | 1e-4 |
+| `scheduler` | linear |
+| `total_batch_size` | 256 (~130K tokens per step) |
 # Fine-tuning data
 For this model we used 15K examples from the Kotlin Exercises dataset. For more information about the dataset, follow the link.