Commit 048a29d
1 Parent(s): c4ebe02
Update README.md
README.md CHANGED
@@ -55,6 +55,11 @@ Fietje 2B instruct was finetuned from [the base model](https://huggingface.co/Br
 
 ## Training procedure
 
+I am thankful to the [Flemish Supercomputer Center](https://www.vscentrum.be/) (VSC) for providing the computational power to accomplish this project. Accounting for waiting for jobs, training took around a day on four nodes of 4x A100 80GB each (16 total). I cannot find the exact time anymore and I do not think that the runtime in `all_results.json` accounts for interrupted-and-continued jobs.
+
+Training was done with the wonderful [alignment-handbook](https://github.com/huggingface/alignment-handbook), using DeepSpeed as a back-end. Exact training recipes and SLURM script are given in the [Github repository](https://github.com/BramVanroy/fietje).
+
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training: