---
license: apache-2.0
datasets:
- CaterinaLac/sharegpt-deduplicated
- exams
- Open-Orca/OpenOrca
language:
- en
- zh
- ko
- ja
- fr
---

This model is a Llama2-7B model finetuned on the union of ShareGPT, the exams dataset, and a subset of the Orca dataset. The finetuning was performed with the [DeepSpeed Chat](https://github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-chat) toolkit (step 1, SFT). The model was trained for three epochs, stopping once the loss on the validation dataset reached a plateau. We used a cosine scheduler with an initial learning rate of 2e-5.
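
As a quick-start illustration, the snippet below shows how a model like this one can be loaded and queried with the Hugging Face `transformers` library. This is a minimal sketch: the repository ID is a placeholder (replace it with this repo's actual ID), and the generation parameters are illustrative defaults, not settings taken from the finetuning run.

```python
# Minimal usage sketch with Hugging Face transformers.
# NOTE: "your-org/llama2-7b-sft" is a placeholder, not this model's real ID.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/llama2-7b-sft"  # placeholder repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit a 7B model on one GPU
    device_map="auto",
)

prompt = "What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```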