munish0838 committed
Commit 9ace186
1 parent: f6b0000

Create README.md

Files changed (1)
  1. README.md +28 -0
README.md ADDED
@@ -0,0 +1,28 @@
+ ---
+ library_name: transformers
+ license: apache-2.0
+ base_model: abacaj/llama-161M-100B
+ pipeline_tag: text-generation
+ ---
+
+ # QuantFactory/llama-161M-100B-GGUF
+ This is a quantized version of [abacaj/llama-161M-100B](https://huggingface.co/abacaj/llama-161M-100B), created using llama.cpp.
+
+ # Model Description
+
+ Trained on 100B tokens.
+ - Learning rate: 1e-3
+ - Weight decay: 0.1
+ - WSD (warmup-stable-decay) scheduler with 10% decay
+ - Data mix: 80% code, 10% natural language (NL), 10% instruction data
+ - Dataset decontaminated against popular benchmarks following [bigcode](https://github.com/bigcode-project/bigcode-dataset/tree/main/decontamination)
+ - Hardware: 8x 3090 GPUs, ~110 hours
+
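For readers unfamiliar with the WSD schedule listed above, here is a minimal sketch of a warmup-stable-decay learning rate curve. It is not the training code used for this model. Only the 1e-3 peak and the 10% decay share come from the list above; the function name `wsd_lr`, the 1% warmup share, the linear ramp shapes, and the final learning rate of 0 are assumptions made for illustration.

```python
def wsd_lr(step, total_steps, peak_lr=1e-3, warmup_frac=0.01,
           decay_frac=0.10, min_lr=0.0):
    """Learning rate at `step` under a warmup-stable-decay (WSD) schedule.

    peak_lr and decay_frac mirror the values listed in the model card;
    warmup_frac, min_lr, and the linear ramps are illustrative assumptions.
    """
    if not 0 <= step <= total_steps:
        raise ValueError("step must lie in [0, total_steps]")

    warmup_steps = round(total_steps * warmup_frac)
    decay_steps = round(total_steps * decay_frac)
    decay_start = total_steps - decay_steps

    # Phase 1: linear warmup from 0 up to the peak learning rate.
    if step < warmup_steps:
        return peak_lr * step / warmup_steps

    # Phase 2: hold the peak learning rate constant.
    if step < decay_start or decay_steps == 0:
        return peak_lr

    # Phase 3: linear decay from the peak down to min_lr
    # over the final decay_frac of training.
    progress = (step - decay_start) / decay_steps
    return peak_lr + (min_lr - peak_lr) * progress


if __name__ == "__main__":
    # Hypothetical run length, chosen only to make the phases easy to read.
    for s in (0, 5, 10, 500, 900, 950, 1000):
        print(s, wsd_lr(s, total_steps=1000))
```

With a hypothetical 1,000-step run, the rate stays at the peak until step 900 and then falls over the last 100 steps, which is the "10% decay" portion.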
+
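As a rough sanity check on the training figures above, the token count and wall-clock time imply an approximate throughput. The sketch below assumes all 8 GPUs ran for the full ~110 hours and that "161M" in the model name is the parameter count; neither is stated explicitly in this card, so treat the outputs as estimates. The helper name `training_stats` is hypothetical.

```python
def training_stats(tokens, hours, gpus, params):
    """Derive rough throughput figures from headline training numbers."""
    seconds = hours * 3600
    tokens_per_second = tokens / seconds
    return {
        "tokens_per_second": tokens_per_second,
        "tokens_per_second_per_gpu": tokens_per_second / gpus,
        "tokens_per_param": tokens / params,
    }


# 100B tokens, ~110 hours, and 8 GPUs are taken from the model card;
# 161e6 parameters is an assumption based on the model name.
stats = training_stats(tokens=100e9, hours=110, gpus=8, params=161e6)

if __name__ == "__main__":
    for name, value in stats.items():
        print(f"{name}: {value:,.0f}")
```

Under these assumptions the run processed on the order of 250k tokens per second in total, and the model saw several hundred tokens per parameter.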
+ This is a *base* pretrained model and requires further fine-tuning to be useful.
+
+ ## Model Details
+
+ | [openai/openai_humaneval](https://huggingface.co/datasets/openai/openai_humaneval) (greedy) | [mbpp](https://huggingface.co/datasets/google-research-datasets/mbpp) (greedy) |
+ | :------------------ | :------------- |
+ | 9.2% | 9.8% |