juandalibaba/my_awesome_wnut_model

Tags: Question Answering, Transformers, TensorFlow, distilbert, generated_from_keras_callback, Inference Endpoints

juandalibaba committed on Aug 24, 2023

Commit 63acbf9 (1 parent: de93609)

Training in progress epoch 0

Files changed (4)

README.md +7 -13
config.json +1 -31
tf_model.h5 +2 -2
tokenizer.json +12 -3

README.md CHANGED

@@ -15,13 +15,9 @@ probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Train Loss: 0.1022
-- Validation Loss: 0.2518
-- Train Precision: 0.6732
-- Train Recall: 0.4533
-- Train F1: 0.5418
-- Train Accuracy: 0.9482
-- Epoch: 2
 ## Model description
@@ -40,16 +36,14 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 636, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
 - training_precision: float32
 ### Training results
-| Train Loss | Validation Loss | Train Precision | Train Recall | Train F1 | Train Accuracy | Epoch |
-|:----------:|:---------------:|:---------------:|:------------:|:--------:|:--------------:|:-----:|
-| 0.1026     | 0.2518          | 0.6732          | 0.4533       | 0.5418   | 0.9482         | 0     |
-| 0.1016     | 0.2518          | 0.6732          | 0.4533       | 0.5418   | 0.9482         | 1     |
-| 0.1022     | 0.2518          | 0.6732          | 0.4533       | 0.5418   | 0.9482         | 2     |
 ### Framework versions

 This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Train Loss: 2.7876
+- Validation Loss: 1.9931
+- Epoch: 0
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 500, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
 - training_precision: float32
 ### Training results
+| Train Loss | Validation Loss | Epoch |
+|:----------:|:---------------:|:-----:|
+| 2.7876     | 1.9931          | 0     |
 ### Framework versions
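
The new optimizer entry in README.md configures a `PolynomialDecay` learning-rate schedule with `initial_learning_rate` 2e-05, `decay_steps` 500, `end_learning_rate` 0.0, `power` 1.0 and `cycle` False. With `power` 1.0 this amounts to a linear ramp from 2e-05 down to 0 over 500 steps. The following is a minimal pure-Python sketch of that formula as described in the Keras documentation for the non-cycling case; the function name `polynomial_decay` is hypothetical and is not part of this commit.

```python
def polynomial_decay(step, initial_lr=2e-05, decay_steps=500,
                     end_lr=0.0, power=1.0):
    """Non-cycling polynomial decay (illustrative re-implementation).

    Defaults are copied from the optimizer config in the README diff.
    """
    # Once decay_steps is reached, the rate stays at end_lr.
    step = min(step, decay_steps)
    remaining = 1.0 - step / decay_steps
    return (initial_lr - end_lr) * remaining ** power + end_lr


# Sample the schedule at the start, midpoint, end, and past the end.
for s in (0, 250, 500, 1000):
    print(s, polynomial_decay(s))
```

Compared with the previous config, only `decay_steps` changes in the schedule itself (636 to 500); the initial rate, end rate and power are the same.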

config.json CHANGED

@@ -2,43 +2,13 @@
   "_name_or_path": "distilbert-base-uncased",
   "activation": "gelu",
   "architectures": [
-    "DistilBertForTokenClassification"
   ],
   "attention_dropout": 0.1,
   "dim": 768,
   "dropout": 0.1,
   "hidden_dim": 3072,
-  "id2label": {
-    "0": "O",
-    "1": "B-corporation",
-    "2": "I-corporation",
-    "3": "B-creative-work",
-    "4": "I-creative-work",
-    "5": "B-group",
-    "6": "I-group",
-    "7": "B-location",
-    "8": "I-location",
-    "9": "B-person",
-    "10": "I-person",
-    "11": "B-product",
-    "12": "I-product"
-  },
   "initializer_range": 0.02,
-  "label2id": {
-    "B-corporation": 1,
-    "B-creative-work": 3,
-    "B-group": 5,
-    "B-location": 7,
-    "B-person": 9,
-    "B-product": 11,
-    "I-corporation": 2,
-    "I-creative-work": 4,
-    "I-group": 6,
-    "I-location": 8,
-    "I-person": 10,
-    "I-product": 12,
-    "O": 0
-  },
   "max_position_embeddings": 512,
   "model_type": "distilbert",
   "n_heads": 12,

   "_name_or_path": "distilbert-base-uncased",
   "activation": "gelu",
   "architectures": [
+    "DistilBertForQuestionAnswering"
   ],
   "attention_dropout": 0.1,
   "dim": 768,
   "dropout": 0.1,
   "hidden_dim": 3072,
   "initializer_range": 0.02,
   "max_position_embeddings": 512,
   "model_type": "distilbert",
   "n_heads": 12,

tf_model.h5 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9a71559104acdc481e131da99aff0a91cb22bb6985b8c9f08bc285de89c4bc29
-size 265618704

 version https://git-lfs.github.com/spec/v1
+oid sha256:3a27ec4db071c288fbd3aeebac8c58fab03e71dc3f133e9cb5f94cb0dfd09efa
+size 265583592

tokenizer.json CHANGED

@@ -2,11 +2,20 @@
   "version": "1.0",
   "truncation": {
     "direction": "Right",
-    "max_length": 512,
-    "strategy": "LongestFirst",
     "stride": 0
   },
-  "padding": null,
   "added_tokens": [
     {
       "id": 0,

   "version": "1.0",
   "truncation": {
     "direction": "Right",
+    "max_length": 384,
+    "strategy": "OnlySecond",
     "stride": 0
   },
+  "padding": {
+    "strategy": {
+      "Fixed": 384
+    },
+    "direction": "Right",
+    "pad_to_multiple_of": null,
+    "pad_id": 0,
+    "pad_type_id": 0,
+    "pad_token": "[PAD]"
+  },
   "added_tokens": [
     {
       "id": 0,
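
The tokenizer.json change switches truncation to `OnlySecond` with `max_length` 384 and adds fixed padding to 384 with `pad_id` 0. For a question/context pair this means only the second sequence (the context) is shortened, from the right, and every encoded example ends up exactly 384 tokens long. The sketch below imitates that behavior on plain lists of token ids. It is an illustration, not the tokenizers library itself: `encode_pair` is a hypothetical helper, and the `cls_id`/`sep_id` defaults are assumed values for the `[CLS]` and `[SEP]` tokens.

```python
def encode_pair(question_ids, context_ids, max_length=384, pad_id=0,
                cls_id=101, sep_id=102):
    """Toy model of OnlySecond truncation plus fixed-length right padding.

    cls_id and sep_id are assumed special-token ids, used for illustration.
    """
    # Layout: [CLS] question [SEP] context [SEP] -> three special tokens.
    budget = max_length - len(question_ids) - 3
    if budget < 0:
        # OnlySecond never shortens the first sequence; this sketch just fails.
        raise ValueError("question alone does not fit in max_length")

    # Truncation direction "Right": drop tokens from the end of the context.
    context_ids = context_ids[:budget]

    ids = [cls_id] + question_ids + [sep_id] + context_ids + [sep_id]
    attention_mask = [1] * len(ids)

    # Fixed padding on the right up to max_length.
    pad = max_length - len(ids)
    return ids + [pad_id] * pad, attention_mask + [0] * pad


ids, mask = encode_pair([5, 6], [7, 8, 9])
print(len(ids), ids[:10], sum(mask))
```

With `stride` left at 0, context tokens that do not fit are simply dropped rather than carried into an overlapping window.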