rhysjones committed on
Commit
029b2e5
1 Parent(s): 7ecae3b

Update README.md

  1. README.md +17 -1
README.md CHANGED
@@ -19,6 +19,8 @@ Training took 20 hours on a single 4090 GPU, giving the following graphs:
 
  ![gpt2-124M-edu-fineweb-10B](https://huggingface.co/rhysjones/gpt2-124M-edu-fineweb-10B/resolve/main/graph.png)
 
+ ## Training
+
  The training parameters where:
  ```
  ./train_gpt2cu \
@@ -37,4 +39,18 @@ The training parameters where:
  -n 5000 \
  -v 250 -s 20000 \
  -h 1
- ```
+ ```
+
+ The model has had no further finetuning.
+
+ ## Evaluation
+ Evals using [Eleuther AI Harness](https://github.com/EleutherAI/lm-evaluation-harness/tree/b281b0921b636bc36ad05c0b0b0763bd6dd43463) gives:
+ | Eval Test | Score |
+ | --------- | ----- |
+ | arc_challenge (25 shot) | 24.83 |
+ | gsm8k (5 shot) | 0.00 |
+ | hellaswag (10 shot) | 32.52 |
+ | mmlu (5 shot) | 25.95 |
+ | truthfulqa (0 shot) | 42.45 |
+ | winogrande (5 shot) | 53.35 |
+ | **Overall Score** | **29.85** |
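
The README does not say how the Overall Score is derived. The numbers are consistent with an unweighted mean of the six task scores, which the sketch below checks. That reading is an assumption inferred from the table, not something the commit states, and the dictionary keys are just labels chosen for illustration.

```python
# Scores copied from the evaluation table added in this commit.
scores = {
    "arc_challenge": 24.83,  # 25 shot
    "gsm8k": 0.00,           # 5 shot
    "hellaswag": 32.52,      # 10 shot
    "mmlu": 25.95,           # 5 shot
    "truthfulqa": 42.45,     # 0 shot
    "winogrande": 53.35,     # 5 shot
}

# Assumption: "Overall Score" is the plain arithmetic mean of the six tasks.
overall = sum(scores.values()) / len(scores)

print(f"{overall:.2f}")  # prints 29.85
```

The six scores sum to 179.10, and 179.10 / 6 = 29.85, matching the reported value.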