CaterinaLac committed on
Commit
0d6c33d
1 Parent(s): fee6c22

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -12,4 +12,6 @@ language:
   - fr
   ---
 
- This model is a Llama2-7B model finetuned on the union of ShareGPT, the exams dataset and Orca.
+ This model is a Llama2-7B model finetuned on the union of ShareGPT, the exams dataset and a subset of the Orca dataset.
+ The finetuning was performed with the [DeepSpeed Chat](https://github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-chat) toolkit (step 1, sft).
+ The model ran for three epochs before reaching a plateau on the validation dataset. We used a cosine scheduler, with an initial LR of 2e-5.
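
Note on the schedule mentioned in the added README text: the sketch below illustrates what a cosine learning-rate schedule starting at 2e-5 could look like over three epochs. It is not taken from the commit or from DeepSpeed Chat. The function name `cosine_lr`, the absence of warmup, the final learning rate of 0, and the value of `STEPS_PER_EPOCH` are all assumptions made for illustration; the README only states the scheduler type, the initial LR and the number of epochs.

```python
import math


def cosine_lr(step: int, total_steps: int,
              base_lr: float = 2e-5, min_lr: float = 0.0) -> float:
    """Learning rate at `step` under a single half-cosine decay.

    Assumptions (not stated in the README): no warmup phase and a
    decay down to `min_lr` = 0 at the last step.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp the step so values outside [0, total_steps] stay well defined.
    progress = min(max(step, 0), total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


EPOCHS = 3               # from the README: three epochs
STEPS_PER_EPOCH = 1000   # hypothetical; depends on dataset size and batch size
TOTAL_STEPS = EPOCHS * STEPS_PER_EPOCH

# Print the learning rate at each epoch boundary.
for epoch in range(EPOCHS + 1):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch}: lr = {cosine_lr(step, TOTAL_STEPS):.2e}")
```

Under these assumptions the learning rate starts at 2e-5, passes through half that value at the midpoint of training, and reaches 0 at the end of the third epoch. A real run may include warmup steps or a non-zero floor, which would change the curve.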