bleysg committed on
Commit
045558d
1 Parent(s): 9ff32be

Update README.md

  1. README.md +25 -1
README.md CHANGED
@@ -11,10 +11,34 @@ pipeline_tag: text-generation
 
  Unreleased, untested, unfinished beta.
 
  # Training
 
- Trained on 8xA6000s for 3 epochs for 37.5h (12.5h/epoch) at a commodity cost of $240 ($80/epoch).
 
  [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
 
 
  Unreleased, untested, unfinished beta.
 
+ # Evaluations
+
+ We've only done limited testing so far. The epoch 4.5 checkpoint scores just above 5 on MT-Bench (better than Alpaca-13B, worse than Llama2-7b-chat), while preliminary benchmarks suggest that average performance peaked at roughly epoch 4.
+
+ MT-Bench epoch 4.5 result:
+ ```
+ Mode: single
+ Input file: data/mt_bench/model_judgment/gpt-4_single.jsonl
+
+ ########## First turn ##########
+                   score
+ model      turn
+ oo-phi-1_5 1     6.0375
+
+ ########## Second turn ##########
+                   score
+ model      turn
+ oo-phi-1_5 2      4.025
+
+ ########## Average ##########
+               score
+ model
+ oo-phi-1_5  5.03125
+ ```
 
  # Training
 
+ Trained on 8x A6000s for 5 epochs over roughly 62h (about 12.5h/epoch) at a commodity cost of roughly $390 (about $80/epoch).
 
  [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
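
The figures in this change can be sanity-checked with a few lines of arithmetic. The sketch below is not part of the commit; it assumes the reported MT-Bench average is the plain mean of the two per-turn scores, and that total training time and cost are simply the per-epoch figures multiplied by the epoch count. All variable names are illustrative.

```python
# Values copied from the updated README in this commit.
first_turn = 6.0375
second_turn = 4.025
reported_average = 5.03125

# Assumption: the average is the unweighted mean of the two turn scores.
mt_bench_average = (first_turn + second_turn) / 2

epochs = 5
hours_per_epoch = 12.5   # as quoted in the README
cost_per_epoch = 80      # USD, as quoted in the README

# Assumption: totals scale linearly with the number of epochs.
total_hours = epochs * hours_per_epoch
total_cost = epochs * cost_per_epoch

print(f"MT-Bench average: {mt_bench_average:.5f} (reported {reported_average})")
print(f"Training time from per-epoch figure: {total_hours}h")
print(f"Training cost from per-epoch figure: ${total_cost}")
```

Under these assumptions the mean of the turn scores matches the reported average, while the per-epoch figures multiply out to 62.5h and $400, slightly above the 62h and $390 quoted, which suggests the per-epoch numbers are rounded.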