arminmrm93 committed on
Commit fa3ef22
1 Parent(s): da8f2f2

Upload TFDistilBertForMultipleChoice

Files changed (3):
  1. README.md +6 -17
  2. config.json +14 -16
  3. tf_model.h5 +2 -2
README.md CHANGED
@@ -1,25 +1,21 @@
  ---
  license: apache-2.0
- base_model: bert-base-uncased
+ base_model: distilbert-base-uncased
  tags:
  - generated_from_keras_callback
  model-index:
- - name: arminmrm93/kaggle_qa_model
+ - name: kaggle_qa_model
  results: []
  ---

  <!-- This model card has been generated automatically according to the information Keras had access to. You should
  probably proofread and complete it, then remove this comment. -->

- # arminmrm93/kaggle_qa_model
+ # kaggle_qa_model

- This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on an unknown dataset.
+ This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Train Loss: 1.6112
- - Train Accuracy: 0.455
- - Validation Loss: 1.6094
- - Validation Accuracy: 0.4550
- - Epoch: 4
+

  ## Model description

@@ -38,18 +34,11 @@
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0006103840571684032, 'decay_steps': 100, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
+ - optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-05, 'decay_steps': 150, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
  - training_precision: float32

  ### Training results

- | Train Loss | Train Accuracy | Validation Loss | Validation Accuracy | Epoch |
- |:----------:|:--------------:|:---------------:|:-------------------:|:-----:|
- | 1.6365 | 0.465 | 1.6094 | 0.4650 | 0 |
- | 1.6167 | 0.455 | 1.6094 | 0.4550 | 1 |
- | 1.6238 | 0.455 | 1.6094 | 0.4550 | 2 |
- | 1.6132 | 0.455 | 1.6094 | 0.4550 | 3 |
- | 1.6112 | 0.455 | 1.6094 | 0.4550 | 4 |


  ### Framework versions
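The substantive change in the hyperparameters above is the learning-rate schedule: PolynomialDecay now starts at 5e-05 over 150 decay steps instead of roughly 6.1e-04 over 100. A minimal sketch of what that schedule computes (plain Python mirroring the Keras PolynomialDecay config with `power=1.0` and `cycle=False`; the function name is mine):

```python
def polynomial_decay_lr(step, initial_lr=5e-05, decay_steps=150,
                        end_lr=0.0, power=1.0):
    """Learning rate at a given step, per the PolynomialDecay config above."""
    # With cycle=False, the step is clamped at decay_steps.
    step = min(step, decay_steps)
    frac = 1.0 - step / decay_steps
    return (initial_lr - end_lr) * frac ** power + end_lr
```

With `power=1.0` this is a plain linear warm-down: the rate falls linearly from 5e-05 at step 0 to 0.0 at step 150 and stays there.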
config.json CHANGED
@@ -1,25 +1,23 @@
  {
- "_name_or_path": "bert-base-uncased",
+ "_name_or_path": "distilbert-base-uncased",
+ "activation": "gelu",
  "architectures": [
- "BertForMultipleChoice"
+ "DistilBertForMultipleChoice"
  ],
- "attention_probs_dropout_prob": 0.1,
- "classifier_dropout": null,
- "gradient_checkpointing": false,
- "hidden_act": "gelu",
- "hidden_dropout_prob": 0.1,
- "hidden_size": 768,
+ "attention_dropout": 0.1,
+ "dim": 768,
+ "dropout": 0.1,
+ "hidden_dim": 3072,
  "initializer_range": 0.02,
- "intermediate_size": 3072,
- "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
- "model_type": "bert",
- "num_attention_heads": 12,
- "num_hidden_layers": 12,
+ "model_type": "distilbert",
+ "n_heads": 12,
+ "n_layers": 6,
  "pad_token_id": 0,
- "position_embedding_type": "absolute",
+ "qa_dropout": 0.1,
+ "seq_classif_dropout": 0.2,
+ "sinusoidal_pos_embds": false,
+ "tie_weights_": true,
  "transformers_version": "4.34.0",
- "type_vocab_size": 2,
- "use_cache": true,
  "vocab_size": 30522
  }
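For context on the `DistilBertForMultipleChoice` architecture named in the config: multiple-choice models take input shaped `(batch, num_choices, seq_len)`, flatten the choices into the batch dimension so the encoder sees ordinary sequences, then reshape the per-choice scores back to `(batch, num_choices)`. A toy sketch of that reshaping (pure Python with hypothetical helper names, not the transformers implementation):

```python
def flatten_choices(batch):
    """(batch, num_choices, seq_len) -> (batch * num_choices, seq_len).

    `batch` is a list of examples, each a list of num_choices token-id
    sequences; the encoder then processes each choice as its own row.
    """
    return [seq for example in batch for seq in example]

def unflatten_logits(logits, num_choices):
    """(batch * num_choices,) per-choice scores -> (batch, num_choices)."""
    return [logits[i:i + num_choices] for i in range(0, len(logits), num_choices)]
```

A softmax over the last axis of the unflattened logits then gives one probability per answer choice.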
tf_model.h5 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:60ba0625f32dc0dd1a9d06f5d5ce40e5c02895c3edbc915876f56e34831e044e
- size 438203668
+ oid sha256:281541c4894b89802ce99618345b520737372c283926dfcdc50608c5899faf0f
+ size 267948736
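The checkpoint shrinks from ~438 MB (bert-base-uncased) to ~268 MB, which is consistent with DistilBERT's roughly 66M parameters stored at float32 (4 bytes each). A back-of-the-envelope count from the config.json fields above (the helper name is mine, and the multiple-choice head sizes are assumptions based on the usual pre-classifier plus 1-unit classifier layout):

```python
def distilbert_param_count(dim=768, hidden_dim=3072, n_layers=6,
                           vocab_size=30522, max_pos=512):
    """Rough parameter count from the distilbert config fields above.

    n_heads does not affect the total: head dimensions fold back into dim.
    """
    # Word + position embeddings, plus their LayerNorm (scale + bias).
    embeddings = vocab_size * dim + max_pos * dim + 2 * dim
    # q, k, v, and output projections, each with a bias.
    attention = 4 * (dim * dim + dim)
    # Two feed-forward projections, each with a bias.
    ffn = (dim * hidden_dim + hidden_dim) + (hidden_dim * dim + dim)
    # Two LayerNorms per transformer block.
    layer_norms = 2 * 2 * dim
    per_layer = attention + ffn + layer_norms
    return embeddings + n_layers * per_layer

# Assumed multiple-choice head: a dim x dim pre-classifier and a 1-unit
# classifier, both with biases.
head = (768 * 768 + 768) + (768 + 1)
total = distilbert_param_count() + head
```

At 4 bytes per parameter this lands within a fraction of a megabyte of the new 267,948,736-byte `tf_model.h5`, the remainder being HDF5 metadata.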