armahlovis
committed on
Commit
•
a26e6ea
1
Parent(s):
6e403f8
Update README.md
README.md
CHANGED
@@ -35,6 +35,9 @@ The evaluation data set consist of The Underground Railroad, by William Still(40
@@ -48,6 +51,8 @@ The following hyperparameters were used during training:
## Training procedure

+After the corpus was assembled, the text was preprocessed to remove the extra text and license information added by Project Gutenberg. The corpus was also kept below 1,000,000 word tokens, and the number of epochs was set to 1, so that the model could be trained on the basic package provided by Google Colab.
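The preprocessing step could look roughly like the sketch below. It is not the author's script: the marker patterns are an assumption about how Project Gutenberg files usually delimit the book text, and the helper names `strip_gutenberg_boilerplate` and `cap_words` are hypothetical.

```python
import re

# Marker lines that Project Gutenberg files usually place around the book text.
# The exact wording varies between files, so these patterns are an assumption.
START_RE = re.compile(
    r"\*\*\*\s*START OF (?:THE|THIS) PROJECT GUTENBERG EBOOK[^\n]*", re.IGNORECASE
)
END_RE = re.compile(
    r"\*\*\*\s*END OF (?:THE|THIS) PROJECT GUTENBERG EBOOK[^\n]*", re.IGNORECASE
)

# Largest word count that still stays below 1,000,000.
MAX_WORDS = 999_999


def strip_gutenberg_boilerplate(raw: str) -> str:
    """Return the text between the START and END marker lines, if present."""
    start = START_RE.search(raw)
    end = END_RE.search(raw)
    begin = start.end() if start else 0
    stop = end.start() if end and end.start() >= begin else len(raw)
    return raw[begin:stop].strip()


def cap_words(text: str, max_words: int = MAX_WORDS) -> str:
    """Keep at most `max_words` whitespace-separated words.

    Note: this collapses all whitespace (including newlines) to single spaces.
    """
    words = text.split()
    return " ".join(words[:max_words])
```

Splitting on whitespace is only one reading of "word tokens"; if the limit was meant in tokenizer tokens, the cap would have to be applied after tokenization instead.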
+It was then tokenized using GPT2Tokenizer, and GPT-2 was afterwards fine-tuned on it.
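As a rough illustration of the tokenization step, the sketch below encodes the cleaned text with `GPT2Tokenizer` from the `transformers` library and splits the resulting IDs into fixed-length training blocks. The `"gpt2"` checkpoint name, the block size, and the helper `chunk_token_ids` are assumptions for illustration; the README does not state how the token sequence was batched, and the fine-tuning loop itself is omitted here.

```python
from typing import List


def tokenize_with_gpt2(text: str) -> List[int]:
    """Encode text to GPT-2 token IDs (downloads the tokenizer on first use)."""
    # Imported lazily so the chunking helper below works without transformers.
    from transformers import GPT2Tokenizer

    # "gpt2" is an assumed checkpoint name; the README only says GPT2.
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    return tokenizer(text)["input_ids"]


def chunk_token_ids(token_ids: List[int], block_size: int = 128) -> List[List[int]]:
    """Split a long ID sequence into equal-length blocks, dropping the short tail.

    The block size of 128 is a hypothetical default, not a value from the README.
    """
    usable = (len(token_ids) // block_size) * block_size
    return [token_ids[i:i + block_size] for i in range(0, usable, block_size)]
```

Equal-length blocks avoid the need for padding, which is convenient because the GPT-2 tokenizer does not define a padding token by default.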
+
### Training hyperparameters

The following hyperparameters were used during training:
### Training results

+By the end of training, the loss had decreased from 4.339300 to 3.582800. With more training epochs, this value could be reduced further.
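To put the two loss values in context, they can be converted to perplexity. This assumes the reported loss is the mean per-token cross-entropy in nats, which is typical for causal language model training but is not stated in the README; the helper names are hypothetical.

```python
import math

# Loss values reported in the training results.
LOSS_START = 4.339300
LOSS_END = 3.582800


def perplexity(cross_entropy_nats: float) -> float:
    """Perplexity for a mean per-token cross-entropy given in nats."""
    return math.exp(cross_entropy_nats)


def relative_reduction(before: float, after: float) -> float:
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


if __name__ == "__main__":
    print(f"perplexity at start: {perplexity(LOSS_START):.1f}")
    print(f"perplexity at end:   {perplexity(LOSS_END):.1f}")
    print(f"loss reduction:      {relative_reduction(LOSS_START, LOSS_END):.1%}")
```

Because perplexity is exponential in the loss, a modest drop in loss corresponds to a much larger drop in perplexity.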
+

### Framework versions