Baby Llama is a 58M-parameter model, distilled from an ensemble consisting of LLaMA-360M and GPT2-705M, both trained on the `babylm_10M` dataset.

See the associated paper (arXiv number **TBA**) for a detailed discussion of the training procedure and of the model performance.

The training code is available at [https://github.com/timinar/BabyLlama](https://github.com/timinar/BabyLlama).
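For readers unfamiliar with ensemble distillation, a rough sketch of a typical objective is shown below. This is an illustration only, not the project's actual implementation (see the repository linked above): the temperature `T`, the mixing weight `alpha`, and the choice of averaging the two teachers' softened distributions are all assumptions here.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits_list, labels, T=2.0, alpha=0.5):
    """Illustrative ensemble-distillation loss: KL against the averaged
    teacher distribution, mixed with cross-entropy on the gold tokens.
    Hyperparameters (T, alpha) are placeholders, not the paper's values."""
    # Average the teachers' temperature-softened distributions
    # (here, the ensemble would be LLaMA-360M and GPT2-705M).
    teacher_probs = torch.stack(
        [F.softmax(t / T, dim=-1) for t in teacher_logits_list]
    ).mean(dim=0)
    # KL divergence between the student's softened distribution and the
    # averaged teacher distribution, rescaled by T^2 as is conventional.
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        teacher_probs,
        reduction="batchmean",
    ) * (T * T)
    # Standard cross-entropy on the ground-truth next tokens.
    ce = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)), labels.view(-1)
    )
    return alpha * kd + (1 - alpha) * ce
```

With two teachers, the student is pulled toward the mean of their predictive distributions rather than toward either teacher alone, which is the usual motivation for distilling from an ensemble.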

### Hyperparameters for the tasks that require fine-tuning