---
library_name: transformers
license: apache-2.0
---
# llama-161M
Trained on 100B tokens.
- 1e-3 learning rate
- 0.1 weight decay
- WSD scheduler with 10% decay (sketched below)
- 80% code, 10% natural language, 10% instruction data
- Dataset decontaminated against popular benchmarks following [bigcode](https://github.com/bigcode-project/bigcode-dataset/tree/main/decontamination)
- 8x RTX 3090s, ~110 hours
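
A minimal sketch of the WSD (warmup-stable-decay) schedule, assuming a short linear warmup and a linear decay to zero over the final 10% of steps; the warmup fraction and decay shape are assumptions, as the card only specifies the 10% decay phase.

```python
def wsd_lr(step: int, total_steps: int, peak_lr: float = 1e-3,
           warmup_frac: float = 0.01, decay_frac: float = 0.10) -> float:
    """Learning rate at `step` under a warmup-stable-decay schedule."""
    warmup_steps = int(total_steps * warmup_frac)   # assumed warmup length
    decay_start = int(total_steps * (1.0 - decay_frac))
    if step < warmup_steps:
        # Linear warmup from 0 up to the peak learning rate.
        return peak_lr * step / max(warmup_steps, 1)
    if step < decay_start:
        # Stable phase: hold the peak learning rate constant.
        return peak_lr
    # Decay phase: linear ramp down to 0 over the final 10% of training.
    return peak_lr * (total_steps - step) / max(total_steps - decay_start, 1)
```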
This is a *base* pretrained model and requires further fine-tuning to be useful.
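
A minimal usage sketch with `transformers`; the repo id below is a placeholder for this model's actual Hub id, and greedy decoding (`do_sample=False`) matches the evaluation setting reported below.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/llama-161M"  # placeholder: replace with this model's Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# As a base model, it continues text rather than following instructions.
inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```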
## Evaluation
| [openai/openai_humaneval](https://huggingface.co/datasets/openai/openai_humaneval) (greedy) | [mbpp](https://huggingface.co/datasets/google-research-datasets/mbpp) (greedy) |
| :------------------ | :------------- |
| 9.2% | 9.8% |