nurcan committed on
Commit 3791634
1 Parent(s): 63f336e

Upload model

Files changed (3)
  1. README.md +4 -2
  2. config.json +1 -1
  3. tf_model.h5 +2 -2
README.md CHANGED
@@ -1,4 +1,6 @@
 ---
+license: mit
+base_model: gpt2
 tags:
 - generated_from_keras_callback
 model-index:
@@ -11,7 +13,7 @@ probably proofread and complete it, then remove this comment. -->
 
 # tdk-model
 
-This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
+This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
 It achieves the following results on the evaluation set:
 
 
@@ -32,7 +34,7 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'module': 'transformers.optimization_tf', 'class_name': 'WarmUp', 'config': {'initial_learning_rate': 5e-05, 'decay_schedule_fn': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-05, 'decay_steps': -1000, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'warmup_steps': 1000, 'power': 1.0, 'name': None}, 'registered_name': 'WarmUp'}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
+- optimizer: None
 - training_precision: float32
 
 ### Training results
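
Note on the removed optimizer entry: it describes a linear warmup (`WarmUp`, 1000 steps, peak 5e-05) wrapped around a `PolynomialDecay` schedule whose `decay_steps` is recorded as -1000. The sketch below is a plain-Python paraphrase of the usual warmup and polynomial-decay formulas, not the `transformers` or Keras source. The function name is hypothetical, and the clamping of the decay step with `min` is an assumption about how the non-cycling schedule treats its step counter. Under that assumption, a negative `decay_steps` would drive the rate to `end_learning_rate` as soon as warmup ends.

```python
def warmup_then_polynomial_decay(
    step,
    initial_lr=5e-05,      # 'initial_learning_rate' from the removed line
    warmup_steps=1000,     # 'warmup_steps' from the removed line
    decay_steps=-1000,     # 'decay_steps' exactly as recorded (negative)
    end_lr=0.0,            # 'end_learning_rate'
    power=1.0,             # 'power' (1.0 means linear)
):
    """Hypothetical helper: learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Warmup phase: scale the peak rate by the fraction of warmup done.
        return initial_lr * (step / warmup_steps) ** power
    # Decay phase: the inner schedule sees steps counted from the end of warmup.
    # Assumption: the step is clamped to decay_steps (no cycling).
    decay_step = min(step - warmup_steps, decay_steps)
    fraction = decay_step / decay_steps
    return (initial_lr - end_lr) * (1 - fraction) ** power + end_lr


if __name__ == "__main__":
    for s in (0, 500, 999, 1000, 1500):
        print(s, warmup_then_polynomial_decay(s))
```

Passing a positive value, for example `decay_steps=9000`, shows the more familiar shape: the rate starts at the peak when warmup ends and falls linearly toward `end_lr`.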
config.json CHANGED
@@ -1,4 +1,5 @@
 {
+  "_name_or_path": "gpt2",
   "activation_function": "gelu_new",
   "architectures": [
     "GPT2LMHeadModel"
@@ -31,7 +32,6 @@
       "max_length": 50
     }
   },
-  "torch_dtype": "float32",
   "transformers_version": "4.35.2",
   "use_cache": true,
   "vocab_size": 50257
tf_model.h5 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:db13a093c96b66532b6ae28d99b728da505c5991715b47cb7e20b1297a12e819
-size 7120
+oid sha256:d2103e994faecface37230ffa38af0b9e036ffad5693ef9d4065efd257d47e7f
+size 497935440
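
Note on the `tf_model.h5` change: the diff shows only the Git LFS pointer file, which records a spec version, a SHA-256 object ID, and the byte size of the real file. A minimal sketch for reading the two pointers follows. The helper name `parse_lfs_pointer` is hypothetical, and the parser assumes the simple `key value` line layout seen above.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Hypothetical helper: parse a Git LFS pointer into its fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash-algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the two sides of the diff.
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:db13a093c96b66532b6ae28d99b728da505c5991715b47cb7e20b1297a12e819
size 7120
"""

NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:d2103e994faecface37230ffa38af0b9e036ffad5693ef9d4065efd257d47e7f
size 497935440
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

if __name__ == "__main__":
    print(f"old: {old['size']} bytes")
    print(f"new: {new['size']} bytes ({new['size'] / 2**20:.1f} MiB)")
```

The new size is in the range one would expect for float32 weights of the base `gpt2` checkpoint, while 7120 bytes is far too small to hold them. That reading is an inference from the sizes alone, not something the commit states.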