Severian committed
Commit dd66d6a
1 Parent(s): 169463c

Update README.md

Files changed (1): README.md (+8 -48)
README.md CHANGED
@@ -24,54 +24,14 @@ pipeline_tag: text-generation
 
 ### Open-Hermes-2.0 (Only first 1500 examples): **[ 1530/125193 4:46:45 < 386:48:08, 0.09 it/s, Epoch 0.01/1]**
 
-```
-1483 5.986700
-1484 5.764100
-1485 5.887200
-1486 5.445200
-1487 6.086300
-1488 5.718300
-1489 5.670300
-1490 5.440900
-1491 4.945900
-1492 6.154700
-1493 5.624800
-1494 6.868100
-1495 5.627100
-1496 5.192700
-1497 5.826800
-1498 5.512200
-1499 5.869900
-1500 5.852300
-1501 5.574800
-1502 5.299200
-1503 5.631200
-1504 5.535600
-1505 5.626000
-1506 5.093300
-1507 5.278000
-1508 5.585400
-1509 5.318600
-1510 5.319200
-1511 5.513900
-1512 5.375400
-1513 5.460600
-1514 5.045300
-1515 6.013600
-1516 5.812300
-1517 5.707400
-1518 5.109800
-1519 5.212900
-1520 5.317200
-1521 5.935400
-1522 5.733900
-1523 5.866000
-1524 5.675400
-1525 5.580800
-1526 4.996900
-1527 5.666700
-1528 4.979900
-```
+**Notes:**
+
+- Tried 30+ combinations of hyperparameters; below are the best I could land on.
+
+- Loss hovered around ~5-6 no matter what I tried with the learning rate.
+
+- Couldn't increase the batch size due to Colab limitations, so the answer may lie in the right balance of learning rate and batch size.
 
 ### Hyperparameters
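The batch-size note in the diff above points at a common workaround when memory is capped: gradient accumulation, which trades extra optimizer steps for a larger effective batch, usually paired with a linearly scaled learning rate. A minimal sketch of that arithmetic; all numbers here are hypothetical illustrations, not values from this commit:

```python
def effective_batch_and_lr(micro_batch, accum_steps, base_lr, base_batch):
    """Effective batch size under gradient accumulation, plus the
    linear-scaling-rule learning rate (LR grows with batch size)."""
    effective_batch = micro_batch * accum_steps
    scaled_lr = base_lr * effective_batch / base_batch
    return effective_batch, scaled_lr

# A micro-batch of 4 accumulated over 8 steps behaves like batch 32;
# if 2e-4 was tuned for batch 8, the rule scales it up 4x.
eb, lr = effective_batch_and_lr(4, 8, 2e-4, 8)
print(eb, lr)  # 32 0.0008
```

In practice this maps to a trainer's gradient-accumulation setting (e.g. `gradient_accumulation_steps` in the Hugging Face `Trainer`), which lets a Colab-sized GPU approximate the larger batches the note speculates about.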