raygx committed on
Commit e2d84b2
1 Parent(s): 5bf5ede

Training in progress epoch 0

Files changed (3)
  1. README.md +7 -16
  2. config.json +2 -2
  3. tf_model.h5 +2 -2
README.md CHANGED
@@ -2,20 +2,20 @@
  tags:
  - generated_from_keras_callback
  model-index:
- - name: Covid-News-Headline-Generator
  results: []
  ---
 
  <!-- This model card has been generated automatically according to the information Keras had access to. You should
  probably proofread and complete it, then remove this comment. -->
 
- # Covid-News-Headline-Generator
 
  This model was trained from scratch on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Train Loss: 6.2848
- - Validation Loss: 6.4997
- - Epoch: 9
 
  ## Model description
 
@@ -34,23 +34,14 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.001}
  - training_precision: float32
 
  ### Training results
 
  | Train Loss | Validation Loss | Epoch |
  |:----------:|:---------------:|:-----:|
- | 7.9127 | 7.4885 | 0 |
- | 7.2651 | 7.2848 | 1 |
- | 7.0688 | 7.1212 | 2 |
- | 6.9180 | 6.9791 | 3 |
- | 6.7836 | 6.8700 | 4 |
- | 6.6663 | 6.7885 | 5 |
- | 6.5613 | 6.7219 | 6 |
- | 6.4627 | 6.6263 | 7 |
- | 6.3710 | 6.5719 | 8 |
- | 6.2848 | 6.4997 | 9 |
 
 
  ### Framework versions
 
  tags:
  - generated_from_keras_callback
  model-index:
+ - name: raygx/Covid-News-Headline-Generator
  results: []
  ---
 
  <!-- This model card has been generated automatically according to the information Keras had access to. You should
  probably proofread and complete it, then remove this comment. -->
 
+ # raygx/Covid-News-Headline-Generator
 
  This model was trained from scratch on an unknown dataset.
  It achieves the following results on the evaluation set:
+ - Train Loss: 6.3859
+ - Validation Loss: 6.4563
+ - Epoch: 0
 
  ## Model description
 
 
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
+ - optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 1e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.001}
  - training_precision: float32
 
  ### Training results
 
  | Train Loss | Validation Loss | Epoch |
  |:----------:|:---------------:|:-----:|
+ | 6.3859 | 6.4563 | 0 |
 
 
  ### Framework versions
config.json CHANGED
@@ -1,5 +1,5 @@
  {
- "_name_or_path": "/kaggle/input/nepali-trained-gpt2-casual-language-model/gpt2NepaliCasualLM",
  "_num_labels": 1,
  "activation_function": "gelu_new",
  "architectures": [
@@ -26,7 +26,7 @@
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
- "n_head": 12,
  "n_inner": null,
  "n_layer": 6,
  "n_positions": 1024,
 
  {
+ "_name_or_path": "raygx/Covid-News-Headline-Generator",
  "_num_labels": 1,
  "activation_function": "gelu_new",
  "architectures": [
 
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
+ "n_head": 16,
  "n_inner": null,
  "n_layer": 6,
  "n_positions": 1024,
tf_model.h5 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ae99f8ee5d3673c0b5571fa808298f66f45bba504f04b57c50d1ca45b088554a
- size 357679600
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:ed93cf55bdf589a2f1f17cbe16ecfe16373f7e3d1104cae026c1c2dc693ad8ba
+ size 265515968
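
The `tf_model.h5` entry is a Git LFS pointer, so the diff shows only the object hash and byte size rather than the weights. As a rough sketch of how the two pointers could be compared, the snippet below parses the `key value` lines and reports the size change. The function name `parse_lfs_pointer` is hypothetical, and the parser assumes the simple three-line layout shown in this diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer made of 'key value' lines (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the old and new sides of the diff.
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:ae99f8ee5d3673c0b5571fa808298f66f45bba504f04b57c50d1ca45b088554a
size 357679600
"""
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:ed93cf55bdf589a2f1f17cbe16ecfe16373f7e3d1104cae026c1c2dc693ad8ba
size 265515968
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)
delta = new["size"] - old["size"]
print(f"size change: {delta} bytes")  # prints "size change: -92163632 bytes"
```

The diff itself does not say why the file shrank; the pointer only records that the stored object changed.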