Commit 3ae43c2 (parent: 95c4ca3)
Update a link to source code
README.md (changed)
@@ -36,7 +36,7 @@ The model is trained on the train split for 10 epochs with batch size 2 and 1024
 Adam optimizer is used. The learning rate is linearly decreased from `1e-4` to `0`. A clip norm of `1.0` is also used.
 After finishing training, the training loss reached 3.23, while the validation loss reached 3.50.
 
-All the code to train the tokenizer and GPT-2 models is available in [a repository on GitHub](https://github.com/colorfulscoop/tfdlg/tree/
+All the code to train the tokenizer and GPT-2 models is available in [a repository on GitHub](https://github.com/colorfulscoop/tfdlg/tree/8d068f4cc3fac49555971ad8244a540587745d79/examples/transformers-gpt2-ja)
 
 ## Usage
 
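For reference, a minimal sketch of the optimizer setup described in the README context lines above, assuming TensorFlow/Keras. The actual tfdlg training script may be organized differently, and `num_train_steps` is an illustrative placeholder.

```python
import tensorflow as tf

# Assumption: total number of optimizer steps; a real script would
# derive this from steps_per_epoch * epochs (10 epochs, batch size 2).
num_train_steps = 10_000

# Learning rate linearly decreased from 1e-4 to 0 over training
# (PolynomialDecay with power=1.0 is a linear schedule).
lr_schedule = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=1e-4,
    decay_steps=num_train_steps,
    end_learning_rate=0.0,
    power=1.0,
)

# Adam with gradients clipped to norm 1.0, as stated in the README.
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule, clipnorm=1.0)
```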