ckandemir committed
Commit 5f81f8e
1 Parent(s): 046825e

Training completed!

Files changed (4)
  1. README.md +20 -22
  2. config.json +6 -2
  3. pytorch_model.bin +2 -2
  4. training_args.bin +1 -1
README.md CHANGED
@@ -3,39 +3,34 @@ license: mit
  base_model: xlm-roberta-base
  tags:
  - generated_from_trainer
- - NER
- - crypto
  metrics:
  - f1
  model-index:
- - name: xlm-roberta-base-finetuned-ner-crypto
+ - name: xlm-roberta-base-finetuned-NER-crypto
    results: []
- widget:
- - text: "Didn't I tell you that that was a decent entry point on $PROPHET? If you are in - congrats, Prophet is up 90% in the last 2 weeks and 50% up in the last week alone"
-
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->
 
- # xlm-roberta-base-finetuned-ner-crypto
+ # xlm-roberta-base-finetuned-NER-crypto
 
  This model is a fine-tuned version of [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.0100
- - F1: 0.9883
+ - Loss: 0.0041
+ - F1: 0.9960
 
  ## Model description
- This model is a fine-tuned version of xlm-roberta-base, specializing in Named Entity Recognition (NER) within the cryptocurrency domain. It is optimized to recognize and classify entities such as cryptocurrency ticker symbols, names, and addresses within text.
+
+ More information needed
 
- ## Intended uses
- Designed primarily for NER tasks in the cryptocurrency sector, this model excels in identifying and categorizing ticker symbols, cryptocurrency names, and addresses in textual content.
- ## Limitations
- The model might not perform well in identifying and classifying entities that were not part of the training data or those that are less frequent in the cryptocurrency domain. It may also be sensitive to the context and format in which the entities are presented.
+ ## Intended uses & limitations
+
+ More information needed
 
  ## Training and evaluation data
 
- The model was trained using a diverse dataset, including artificially generated tweets and ERC20 token metadata fetched through the [Covalent API](https://www.covalenthq.com/docs/unified-api/). GPT was employed to generate 500 synthetic tweets tailored for the cryptocurrency domain. The Covalent API was instrumental in obtaining a rich set of unique ERC20 token metadata entries, enhancing the model's understanding and recognition of cryptocurrency entities.
+ More information needed
 
  ## Training procedure
 
@@ -43,25 +38,28 @@ The model was trained using a diverse dataset, including artificially generated
 
  The following hyperparameters were used during training:
  - learning_rate: 5e-05
- - train_batch_size: 24
- - eval_batch_size: 24
+ - train_batch_size: 32
+ - eval_batch_size: 32
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - num_epochs: 3
+ - num_epochs: 6
 
  ### Training results
 
  | Training Loss | Epoch | Step | Validation Loss | F1     |
  |:-------------:|:-----:|:----:|:---------------:|:------:|
- | 0.0338        | 1.0   | 1000 | 0.0126          | 0.9877 |
- | 0.0105        | 2.0   | 2000 | 0.0112          | 0.9867 |
- | 0.0081        | 3.0   | 3000 | 0.0100          | 0.9883 |
+ | 0.1208        | 1.0   | 125  | 0.0181          | 0.9872 |
+ | 0.0061        | 2.0   | 250  | 0.0055          | 0.9951 |
+ | 0.0028        | 3.0   | 375  | 0.0037          | 0.9948 |
+ | 0.002         | 4.0   | 500  | 0.0037          | 0.9960 |
+ | 0.0016        | 5.0   | 625  | 0.0040          | 0.9960 |
+ | 0.0013        | 6.0   | 750  | 0.0041          | 0.9960 |
 
 
  ### Framework versions
 
  - Transformers 4.34.1
  - Pytorch 2.1.0+cu118
- - Datasets 2.14.5
+ - Datasets 2.14.6
  - Tokenizers 0.14.1
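
A minimal inference sketch for the updated card. It assumes the checkpoint is published as `ckandemir/xlm-roberta-base-finetuned-NER-crypto` (a repo id inferred from the commit author and model name, not stated in the diff) and reuses the example tweet from the removed `widget` entry:

```python
from transformers import pipeline

# Repo id is an assumption inferred from the commit author and model name.
ner = pipeline(
    "token-classification",
    model="ckandemir/xlm-roberta-base-finetuned-NER-crypto",
    aggregation_strategy="simple",  # merge B-/I- subword tags into whole entities
)

text = ("Didn't I tell you that that was a decent entry point on $PROPHET? "
        "If you are in - congrats, Prophet is up 90% in the last 2 weeks")
for entity in ner(text):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```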
config.json CHANGED
@@ -17,18 +17,22 @@
   "3": "I-NAME",
   "4": "B-TICKER_SYMBOL",
   "5": "I-TICKER_SYMBOL",
-  "6": "O"
+  "6": "B-CHAIN",
+  "7": "I-CHAIN",
+  "8": "O"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
   "B-ADDRESS": 0,
+  "B-CHAIN": 6,
   "B-NAME": 2,
   "B-TICKER_SYMBOL": 4,
   "I-ADDRESS": 1,
+  "I-CHAIN": 7,
   "I-NAME": 3,
   "I-TICKER_SYMBOL": 5,
-  "O": 6
+  "O": 8
  },
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5f0a69c1efc354106cedf2ffd3ea3c8047fc0ee94a1db45fa0625c0e880969bf
- size 1109902502
+ oid sha256:db46f3eedb6f759c61837663da1c8a0b1ef5f93a3b719236a354f995b5ccc11b
+ size 1109908646
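
A quick sanity check (our observation, not part of the commit): the checkpoint grows by 6,144 bytes, which matches the two extra float32 rows the new CHAIN labels add to the classification head's weight matrix (bias bytes and pickle overhead are not counted here):

```python
# Byte sizes taken from the two LFS pointers above.
old_size, new_size = 1109902502, 1109908646
delta = new_size - old_size      # 6144 bytes

# Two new label rows in a float32 Linear(in_features=768) head.
expected = 2 * 768 * 4           # 6144 bytes of weights
assert delta == expected
```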
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:dcd4d8e12a0b74cf2d0e3c758b224bc40e3b0dc0b86c2e36ff5abfc32839ad37
+ oid sha256:473a9325e049fc3b2929c677b29a23cb1de27175e31d340329803c03f19a81ab
  size 4536
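
training_args.bin is an opaque pickle, but the README hunk above lists the hyperparameters it encodes. A hypothetical reconstruction follows; `output_dir` and `evaluation_strategy` are assumptions, and the Adam betas/epsilon shown in the card are the Transformers defaults, so they need no explicit arguments:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="xlm-roberta-base-finetuned-NER-crypto",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=6,
    evaluation_strategy="epoch",  # assumption: the results table shows one eval per epoch
)
```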