Myashka/gpt-imdb-hinge-beta_0.1

@@ -14,15 +14,16 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [lvwerra/gpt2-imdb](https://huggingface.co/lvwerra/gpt2-imdb) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.1683
-- Rewards/chosen: -2.7548
-- Rewards/rejected: -6.4029
-- Rewards/accuracies: 0.9271
-- Rewards/margins: 3.6481
-- Logps/rejected: -327.7139
-- Logps/chosen: -262.8134
-- Logits/rejected: -45.7026
-- Logits/chosen: -46.9404
 ## Model description
@@ -64,10 +65,10 @@ The following hyperparameters were used during training:
 | 0.2231        | 1.67  | 4000 | 0.1884          | -2.3281        | -5.6096          | 0.9208             | 3.2815          | -319.7815      | -258.5467    | -45.7720        | -46.8600      |
 | 0.2269        | 1.88  | 4500 | 0.1785          | -2.5145        | -6.0015          | 0.9292             | 3.4871          | -323.7006      | -260.4101    | -45.7220        | -46.8746      |
 | 0.1831        | 2.08  | 5000 | 0.1727          | -2.6850        | -6.2801          | 0.9312             | 3.5951          | -326.4862      | -262.1152    | -45.0514        | -46.1610      |
-| 0.0112        | 2.29  | 5500 | 0.1682          | -2.5613        | -6.0913          | 0.9312             | 3.5300          | -324.5987      | -260.8782    | -45.3410        | -46.5522      |
 | 0.1894        | 2.5   | 6000 | 0.1706          | -2.7334        | -6.3632          | 0.9271             | 3.6298          | -327.3174      | -262.5995    | -45.2020        | -46.4449      |
 | 0.13          | 2.71  | 6500 | 0.1685          | -2.7681        | -6.4203          | 0.9250             | 3.6522          | -327.8886      | -262.9462    | -45.5580        | -46.8017      |
-| 0.2717        | 2.92  | 7000 | **0.1683**          | -2.7548        | -6.4029          | 0.9271             | 3.6481          | -327.7139      | -262.8134    | -45.7026        | -46.9404      |
 ### Framework versions

 This model is a fine-tuned version of [lvwerra/gpt2-imdb](https://huggingface.co/lvwerra/gpt2-imdb) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Step: 5500
+- Loss: 0.1682
+- Rewards/chosen: -2.5613
+- Rewards/rejected: -6.0913
+- Rewards/accuracies: 0.9312
+- Rewards/margins: 3.5300
+- Logps/rejected: -324.5987
+- Logps/chosen: -260.8782
+- Logits/rejected: -45.3410
+- Logits/chosen: -46.5522
 ## Model description
 | 0.2231        | 1.67  | 4000 | 0.1884          | -2.3281        | -5.6096          | 0.9208             | 3.2815          | -319.7815      | -258.5467    | -45.7720        | -46.8600      |
 | 0.2269        | 1.88  | 4500 | 0.1785          | -2.5145        | -6.0015          | 0.9292             | 3.4871          | -323.7006      | -260.4101    | -45.7220        | -46.8746      |
 | 0.1831        | 2.08  | 5000 | 0.1727          | -2.6850        | -6.2801          | 0.9312             | 3.5951          | -326.4862      | -262.1152    | -45.0514        | -46.1610      |
+| 0.0112        | 2.29  | 5500 | **0.1682**          | -2.5613        | -6.0913          | 0.9312             | 3.5300          | -324.5987      | -260.8782    | -45.3410        | -46.5522      |
 | 0.1894        | 2.5   | 6000 | 0.1706          | -2.7334        | -6.3632          | 0.9271             | 3.6298          | -327.3174      | -262.5995    | -45.2020        | -46.4449      |
 | 0.13          | 2.71  | 6500 | 0.1685          | -2.7681        | -6.4203          | 0.9250             | 3.6522          | -327.8886      | -262.9462    | -45.5580        | -46.8017      |
+| 0.2717        | 2.92  | 7000 | 0.1683          | -2.7548        | -6.4029          | 0.9271             | 3.6481          | -327.7139      | -262.8134    | -45.7026        | -46.9404      |
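
The change above moves the reported evaluation results from step 7000 to step 5500, which has the lowest validation loss among the rows shown. The following is a minimal Python sketch of that selection, using only the rows visible in this diff (earlier rows of the table are not shown here, so they are not included). It also checks that each rewards/margins value equals rewards/chosen minus rewards/rejected up to rounding. The names `rows`, `best_checkpoint`, and `margin_matches` are illustrative assumptions and are not part of the original training code.

```python
# Rows copied from the table in this diff:
# (step, validation loss, rewards/chosen, rewards/rejected, rewards/margins)
rows = [
    (4000, 0.1884, -2.3281, -5.6096, 3.2815),
    (4500, 0.1785, -2.5145, -6.0015, 3.4871),
    (5000, 0.1727, -2.6850, -6.2801, 3.5951),
    (5500, 0.1682, -2.5613, -6.0913, 3.5300),
    (6000, 0.1706, -2.7334, -6.3632, 3.6298),
    (6500, 0.1685, -2.7681, -6.4203, 3.6522),
    (7000, 0.1683, -2.7548, -6.4029, 3.6481),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss (hypothetical helper)."""
    return min(rows, key=lambda row: row[1])


def margin_matches(row, tol=1e-3):
    """Check margins == chosen - rejected, allowing for 4-decimal rounding."""
    _, _, chosen, rejected, margin = row
    return abs((chosen - rejected) - margin) <= tol


best = best_checkpoint(rows)
print(f"best step: {best[0]}, validation loss: {best[1]}")
print("margins consistent:", all(margin_matches(row) for row in rows))
```

Selecting by validation loss is only one possible criterion. By rewards/margins, step 6500 (3.6522) would rank highest among these rows, so the choice depends on which metric the card is meant to report.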
 ### Framework versions