Commit 9ace186 by munish0838 (parent f6b0000): Create README.md
---
library_name: transformers
license: apache-2.0
base_model: abacaj/llama-161M-100B
pipeline_tag: text-generation
---

# QuantFactory/llama-161M-100B-GGUF

This is a quantized version of [abacaj/llama-161M-100B](https://huggingface.co/abacaj/llama-161M-100B), created using llama.cpp.
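A typical way to use a GGUF file like this one with llama.cpp (a sketch; the exact quant filename below is an assumption, so check the repository's file list for the variants actually published):

```shell
# Download one quantized file from the repo
# (the .gguf filename here is an assumption; check the repo's file list).
huggingface-cli download QuantFactory/llama-161M-100B-GGUF \
  llama-161M-100B.Q4_K_M.gguf --local-dir .

# Generate with llama.cpp's CLI; --temp 0 gives greedy decoding,
# matching how the benchmark scores below were measured.
./llama-cli -m llama-161M-100B.Q4_K_M.gguf -p "def fib(n):" -n 64 --temp 0
```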

# Model Description

Trained on 100B tokens.
- 1e-3 learning rate
- 0.1 weight decay
- WSD scheduler with 10% decay
- 80% code, 10% natural language, 10% instruction data
- Dataset decontaminated against popular benchmarks following [bigcode](https://github.com/bigcode-project/bigcode-dataset/tree/main/decontamination)
- 8x RTX 3090s, ~110 hours

This is a *base* pretrained model and requires further fine-tuning to be useful.
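For fine-tuning or full-precision evaluation, the upstream (non-quantized) checkpoint can be loaded with transformers roughly like this (a sketch; it downloads the base repo from the Hub):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the original base model this GGUF was quantized from.
tokenizer = AutoTokenizer.from_pretrained("abacaj/llama-161M-100B")
model = AutoModelForCausalLM.from_pretrained("abacaj/llama-161M-100B")

# Greedy generation (do_sample=False) from a code-style prompt,
# since the model was trained mostly on code.
inputs = tokenizer("def fib(n):", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```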

## Model Details

| [openai/openai_humaneval](https://huggingface.co/datasets/openai/openai_humaneval) (greedy) | [mbpp](https://huggingface.co/datasets/google-research-datasets/mbpp) (greedy) |
| :------------------ | :------------- |
| 9.2% | 9.8% |
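Both scores above were measured with greedy decoding: at each step the single highest-scoring token is taken, so generation is deterministic. A minimal sketch of the idea (plain Python, not the evaluation harness actually used):

```python
# Greedy decoding sketch: always pick the argmax token.
def greedy_step(logits):
    """Return the index of the highest-scoring token."""
    return max(range(len(logits)), key=logits.__getitem__)

def greedy_decode(step_fn, prompt, max_new_tokens):
    """Repeatedly append the argmax token produced by step_fn(tokens).

    step_fn maps the current token list to a logits list; in a real
    harness it would be a forward pass through the model.
    """
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(greedy_step(step_fn(tokens)))
    return tokens
```

Because there is no sampling, a greedy pass@1 score is reproducible run to run, which is why model cards often report it.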