Update README.md
README.md
CHANGED
@@ -31,7 +31,7 @@ Even though this version of GPT-2 has been finely tuned and is quite simple, it
 ⚠️ Since the dataset used for this model is mostly composed of news articles, it is heavily biased towards generating news content. This bias may become apparent during the generation process.
 
 ## Training procedure
-The model was trained for
+The model was trained for 12+ hours on Kaggle GPUs.
 
 ## Usage Details
 
@@ -63,9 +63,9 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:-----:|:---------------:|
-| 2.
-| 1.
-| 1.
+| 2.0233 | 1.0 | 15323 | 2.3348 |
+| 1.6938 | 2.0 | 30646 | 1.8377 |
+| 1.4938 | 3.0 | 45969 | 1.6498 |
 
 
 ### Framework versions