Initial Commit

Files changed (5)

README.md +57 -58
config.json +1 -1
eval_results_cardiff.json +1 -1
pytorch_model.bin +2 -2
training_args.bin +2 -2

README.md CHANGED

@@ -1,14 +1,13 @@
 ---
 base_model: microsoft/mdeberta-v3-base
 datasets:
 - tweet_sentiment_multilingual
-library_name: transformers
-license: mit
 metrics:
 - accuracy
 - f1
-tags:
-- generated_from_trainer
 model-index:
 - name: scenario-NON-KD-PR-COPY-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_
   results: []
@@ -21,9 +20,9 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) on the tweet_sentiment_multilingual dataset.
 It achieves the following results on the evaluation set:
-- Loss: 5.0035
-- Accuracy: 0.5625
-- F1: 0.5617
 ## Model description
@@ -45,66 +44,66 @@ The following hyperparameters were used during training:
 - learning_rate: 5e-05
 - train_batch_size: 32
 - eval_batch_size: 32
-- seed: 55
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 50
 ### Training results
-| Training Loss | Epoch   | Step  | Validation Loss | Accuracy | F1     |
-|:-------------:|:-------:|:-----:|:---------------:|:--------:|:------:|
-| 1.021         | 1.0870  | 500   | 1.0027          | 0.5409   | 0.5367 |
-| 0.8432        | 2.1739  | 1000  | 1.0327          | 0.5814   | 0.5820 |
-| 0.6715        | 3.2609  | 1500  | 1.1554          | 0.5822   | 0.5778 |
-| 0.48          | 4.3478  | 2000  | 1.4182          | 0.5613   | 0.5573 |
-| 0.3384        | 5.4348  | 2500  | 1.8214          | 0.5567   | 0.5573 |
-| 0.2309        | 6.5217  | 3000  | 1.8385          | 0.5502   | 0.5445 |
-| 0.1737        | 7.6087  | 3500  | 2.0368          | 0.5444   | 0.5440 |
-| 0.1324        | 8.6957  | 4000  | 2.3667          | 0.5424   | 0.5414 |
-| 0.1132        | 9.7826  | 4500  | 2.0414          | 0.5509   | 0.5486 |
-| 0.1058        | 10.8696 | 5000  | 2.5673          | 0.5509   | 0.5491 |
-| 0.0833        | 11.9565 | 5500  | 2.7424          | 0.5513   | 0.5509 |
-| 0.0662        | 13.0435 | 6000  | 3.2582          | 0.5544   | 0.5529 |
-| 0.0664        | 14.1304 | 6500  | 3.5005          | 0.5556   | 0.5521 |
-| 0.0532        | 15.2174 | 7000  | 3.0692          | 0.5502   | 0.5509 |
-| 0.0494        | 16.3043 | 7500  | 3.1700          | 0.5478   | 0.5487 |
-| 0.0485        | 17.3913 | 8000  | 3.8948          | 0.5382   | 0.5377 |
-| 0.0359        | 18.4783 | 8500  | 3.5655          | 0.5583   | 0.5570 |
-| 0.0322        | 19.5652 | 9000  | 4.0121          | 0.5583   | 0.5547 |
-| 0.0294        | 20.6522 | 9500  | 3.5540          | 0.5579   | 0.5582 |
-| 0.026         | 21.7391 | 10000 | 4.0054          | 0.5525   | 0.5535 |
-| 0.0305        | 22.8261 | 10500 | 3.8289          | 0.5498   | 0.5453 |
-| 0.0232        | 23.9130 | 11000 | 4.4012          | 0.5556   | 0.5558 |
-| 0.0209        | 25.0    | 11500 | 4.0916          | 0.5559   | 0.5504 |
-| 0.0224        | 26.0870 | 12000 | 4.3087          | 0.5586   | 0.5583 |
-| 0.0192        | 27.1739 | 12500 | 4.0617          | 0.5467   | 0.5474 |
-| 0.0198        | 28.2609 | 13000 | 4.1456          | 0.5567   | 0.5555 |
-| 0.0148        | 29.3478 | 13500 | 4.5847          | 0.5505   | 0.5519 |
-| 0.016         | 30.4348 | 14000 | 4.3128          | 0.5494   | 0.5501 |
-| 0.0145        | 31.5217 | 14500 | 4.4021          | 0.5505   | 0.5500 |
-| 0.0146        | 32.6087 | 15000 | 4.3393          | 0.5509   | 0.5506 |
-| 0.0089        | 33.6957 | 15500 | 4.4852          | 0.5486   | 0.5499 |
-| 0.0089        | 34.7826 | 16000 | 4.8487          | 0.5475   | 0.5487 |
-| 0.0085        | 35.8696 | 16500 | 4.8052          | 0.5567   | 0.5573 |
-| 0.0077        | 36.9565 | 17000 | 4.6518          | 0.5502   | 0.5484 |
-| 0.0095        | 38.0435 | 17500 | 4.2742          | 0.5567   | 0.5554 |
-| 0.0054        | 39.1304 | 18000 | 4.7804          | 0.5548   | 0.5520 |
-| 0.0074        | 40.2174 | 18500 | 4.6940          | 0.5540   | 0.5516 |
-| 0.0053        | 41.3043 | 19000 | 4.6543          | 0.5590   | 0.5581 |
-| 0.003         | 42.3913 | 19500 | 5.0637          | 0.5563   | 0.5572 |
-| 0.0044        | 43.4783 | 20000 | 4.7918          | 0.5652   | 0.5657 |
-| 0.0053        | 44.5652 | 20500 | 4.7492          | 0.5625   | 0.5604 |
-| 0.0031        | 45.6522 | 21000 | 4.8642          | 0.5571   | 0.5567 |
-| 0.0026        | 46.7391 | 21500 | 4.9137          | 0.5617   | 0.5614 |
-| 0.0025        | 47.8261 | 22000 | 4.8985          | 0.5629   | 0.5626 |
-| 0.0007        | 48.9130 | 22500 | 4.9890          | 0.5633   | 0.5621 |
-| 0.0027        | 50.0    | 23000 | 5.0035          | 0.5625   | 0.5617 |
 ### Framework versions
-- Transformers 4.44.2
 - Pytorch 2.1.1+cu121
 - Datasets 2.14.5
-- Tokenizers 0.19.1

 ---
+license: mit
 base_model: microsoft/mdeberta-v3-base
+tags:
+- generated_from_trainer
 datasets:
 - tweet_sentiment_multilingual
 metrics:
 - accuracy
 - f1
 model-index:
 - name: scenario-NON-KD-PR-COPY-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_
   results: []
 This model is a fine-tuned version of [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) on the tweet_sentiment_multilingual dataset.
 It achieves the following results on the evaluation set:
+- Loss: 4.7763
+- Accuracy: 0.5525
+- F1: 0.5514
 ## Model description
 - learning_rate: 5e-05
 - train_batch_size: 32
 - eval_batch_size: 32
+- seed: 66
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 50
 ### Training results
+| Training Loss | Epoch | Step  | Validation Loss | Accuracy | F1     |
+|:-------------:|:-----:|:-----:|:---------------:|:--------:|:------:|
+| 1.0802        | 1.09  | 500   | 1.0761          | 0.4776   | 0.4641 |
+| 0.9869        | 2.17  | 1000  | 1.0145          | 0.5320   | 0.5250 |
+| 0.8881        | 3.26  | 1500  | 0.9953          | 0.5432   | 0.5391 |
+| 0.7877        | 4.35  | 2000  | 1.0837          | 0.5340   | 0.5311 |
+| 0.6671        | 5.43  | 2500  | 1.2702          | 0.5394   | 0.5411 |
+| 0.5716        | 6.52  | 3000  | 1.4643          | 0.5440   | 0.5409 |
+| 0.4817        | 7.61  | 3500  | 1.6304          | 0.5448   | 0.5336 |
+| 0.4102        | 8.7   | 4000  | 1.7103          | 0.5301   | 0.5267 |
+| 0.3334        | 9.78  | 4500  | 2.0038          | 0.5343   | 0.5334 |
+| 0.2776        | 10.87 | 5000  | 1.8016          | 0.5475   | 0.5472 |
+| 0.2349        | 11.96 | 5500  | 2.0203          | 0.5282   | 0.5280 |
+| 0.2017        | 13.04 | 6000  | 2.4490          | 0.5359   | 0.5334 |
+| 0.1727        | 14.13 | 6500  | 2.5313          | 0.5378   | 0.5382 |
+| 0.1491        | 15.22 | 7000  | 2.3797          | 0.5390   | 0.5388 |
+| 0.1425        | 16.3  | 7500  | 2.4724          | 0.5444   | 0.5446 |
+| 0.1265        | 17.39 | 8000  | 2.9398          | 0.5413   | 0.5389 |
+| 0.1185        | 18.48 | 8500  | 2.3527          | 0.5370   | 0.5370 |
+| 0.1038        | 19.57 | 9000  | 3.2756          | 0.5482   | 0.5442 |
+| 0.1071        | 20.65 | 9500  | 3.0308          | 0.5432   | 0.5441 |
+| 0.0865        | 21.74 | 10000 | 3.1408          | 0.5297   | 0.5296 |
+| 0.0813        | 22.83 | 10500 | 3.3928          | 0.5436   | 0.5434 |
+| 0.0831        | 23.91 | 11000 | 3.4793          | 0.5320   | 0.5339 |
+| 0.0736        | 25.0  | 11500 | 3.2782          | 0.5451   | 0.5452 |
+| 0.0672        | 26.09 | 12000 | 3.4270          | 0.5428   | 0.5396 |
+| 0.0616        | 27.17 | 12500 | 3.7192          | 0.5471   | 0.5425 |
+| 0.0588        | 28.26 | 13000 | 3.3739          | 0.5421   | 0.5424 |
+| 0.0537        | 29.35 | 13500 | 3.5891          | 0.5421   | 0.5393 |
+| 0.0534        | 30.43 | 14000 | 3.5400          | 0.5436   | 0.5391 |
+| 0.0503        | 31.52 | 14500 | 4.1166          | 0.5409   | 0.5378 |
+| 0.0431        | 32.61 | 15000 | 4.1346          | 0.5374   | 0.5339 |
+| 0.0423        | 33.7  | 15500 | 3.9483          | 0.5478   | 0.5456 |
+| 0.0371        | 34.78 | 16000 | 4.0371          | 0.5436   | 0.5429 |
+| 0.0339        | 35.87 | 16500 | 4.0302          | 0.5478   | 0.5480 |
+| 0.0381        | 36.96 | 17000 | 4.0057          | 0.5432   | 0.5425 |
+| 0.0274        | 38.04 | 17500 | 4.5734          | 0.5521   | 0.5520 |
+| 0.0288        | 39.13 | 18000 | 4.4791          | 0.5502   | 0.5472 |
+| 0.0203        | 40.22 | 18500 | 4.7187          | 0.5536   | 0.5538 |
+| 0.0248        | 41.3  | 19000 | 4.7855          | 0.5486   | 0.5490 |
+| 0.025         | 42.39 | 19500 | 4.4324          | 0.5502   | 0.5471 |
+| 0.0211        | 43.48 | 20000 | 4.7410          | 0.5475   | 0.5470 |
+| 0.0215        | 44.57 | 20500 | 4.6235          | 0.5478   | 0.5483 |
+| 0.0188        | 45.65 | 21000 | 4.6657          | 0.5517   | 0.5499 |
+| 0.0163        | 46.74 | 21500 | 4.7207          | 0.5509   | 0.5505 |
+| 0.0136        | 47.83 | 22000 | 4.7870          | 0.5525   | 0.5523 |
+| 0.0131        | 48.91 | 22500 | 4.8396          | 0.5505   | 0.5501 |
+| 0.0207        | 50.0  | 23000 | 4.7763          | 0.5525   | 0.5514 |
 ### Framework versions
+- Transformers 4.33.3
 - Pytorch 2.1.1+cu121
 - Datasets 2.14.5
+- Tokenizers 0.13.3
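
Reviewer note: in the added "Training results" table, training loss falls toward zero while validation loss climbs from about 1.0 to about 4.8, so the final step is not necessarily the best checkpoint. The sketch below is only an illustration of how one might pick a checkpoint from such a table. `EvalRow`, `EXCERPT`, and `parse_rows` are hypothetical names, and only six of the 46 added rows are copied in.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    """One row of the training-results table (hypothetical helper type)."""
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float
    f1: float


# Six rows copied from the added table; the full table has 46 rows.
EXCERPT = """
| 1.0802 | 1.09  | 500   | 1.0761 | 0.4776 | 0.4641 |
| 0.9869 | 2.17  | 1000  | 1.0145 | 0.5320 | 0.5250 |
| 0.8881 | 3.26  | 1500  | 0.9953 | 0.5432 | 0.5391 |
| 0.7877 | 4.35  | 2000  | 1.0837 | 0.5340 | 0.5311 |
| 0.0203 | 40.22 | 18500 | 4.7187 | 0.5536 | 0.5538 |
| 0.0207 | 50.0  | 23000 | 4.7763 | 0.5525 | 0.5514 |
"""


def parse_rows(table: str) -> list[EvalRow]:
    """Parse pipe-delimited markdown rows (no header) into EvalRow tuples."""
    rows = []
    for line in table.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        rows.append(
            EvalRow(
                train_loss=float(cells[0]),
                epoch=float(cells[1]),
                step=int(cells[2]),
                val_loss=float(cells[3]),
                accuracy=float(cells[4]),
                f1=float(cells[5]),
            )
        )
    return rows


rows = parse_rows(EXCERPT)
best_by_loss = min(rows, key=lambda r: r.val_loss)  # lowest validation loss
best_by_f1 = max(rows, key=lambda r: r.f1)          # highest F1

print(f"lowest validation loss: step {best_by_loss.step} ({best_by_loss.val_loss})")
print(f"highest F1:             step {best_by_f1.step} ({best_by_f1.f1})")
```

The two criteria disagree here: validation loss bottoms out early, while F1 peaks late and by a small margin. Which one to prefer depends on how the model will be used.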

config.json CHANGED

@@ -39,7 +39,7 @@
   "relative_attention": true,
   "share_att_key": true,
   "torch_dtype": "float32",
-  "transformers_version": "4.44.2",
   "type_vocab_size": 0,
   "vocab_size": 251000
 }

   "relative_attention": true,
   "share_att_key": true,
   "torch_dtype": "float32",
+  "transformers_version": "4.33.3",
   "type_vocab_size": 0,
   "vocab_size": 251000
 }

eval_results_cardiff.json CHANGED

@@ -1 +1 @@

- {"arabic": {"f1": 0.5505121631124021, "accuracy": 0.5482758620689655, "confusion_matrix": [[153, 106, 31], [74, 164, 52], [58, 72, 160]]}, "english": {"f1": 0.6252888515559376, "accuracy": 0.6229885057471264, "confusion_matrix": [[204, 75, 11], [84, 174, 32], [47, 79, 164]]}, "french": {"f1": 0.5844273689986265, "accuracy": 0.5850574712643678, "confusion_matrix": [[192, 50, 48], [46, 159, 85], [65, 67, 158]]}, "german": {"f1": 0.6770577974077062, "accuracy": 0.6770114942528735, "confusion_matrix": [[198, 62, 30], [55, 216, 19], [63, 52, 175]]}, "hindi": {"f1": 0.4860992225426562, "accuracy": 0.4885057471264368, "confusion_matrix": [[164, 56, 70], [100, 115, 75], [87, 57, 146]]}, "italian": {"f1": 0.5664004433041901, "accuracy": 0.5747126436781609, "confusion_matrix": [[110, 88, 92], [20, 200, 70], [23, 77, 190]]}, "portuguese": {"f1": 0.6162615581379695, "accuracy": 0.6137931034482759, "confusion_matrix": [[154, 110, 26], [56, 194, 40], [28, 76, 186]]}, "spanish": {"f1": 0.5890270816894922, "accuracy": 0.5896551724137931, "confusion_matrix": [[181, 77, 32], [80, 145, 65], [51, 52, 187]]}, "all": {"f1": 0.5881456796344003, "accuracy": 0.5875, "confusion_matrix": [[1356, 624, 340], [515, 1367, 438], [422, 532, 1366]]}}

+ {"arabic": {"f1": 0.4913237337553058, "accuracy": 0.4896551724137931, "confusion_matrix": [[148, 99, 43], [89, 135, 66], [59, 88, 143]]}, "english": {"f1": 0.5802643889534133, "accuracy": 0.5804597701149425, "confusion_matrix": [[212, 66, 12], [119, 137, 34], [51, 83, 156]]}, "french": {"f1": 0.5737235230438815, "accuracy": 0.5724137931034483, "confusion_matrix": [[171, 57, 62], [40, 169, 81], [47, 85, 158]]}, "german": {"f1": 0.6055909597413228, "accuracy": 0.6057471264367816, "confusion_matrix": [[175, 64, 51], [60, 168, 62], [54, 52, 184]]}, "hindi": {"f1": 0.45075025262918283, "accuracy": 0.45057471264367815, "confusion_matrix": [[141, 88, 61], [90, 128, 72], [70, 97, 123]]}, "italian": {"f1": 0.5260257984431246, "accuracy": 0.5367816091954023, "confusion_matrix": [[96, 112, 82], [18, 203, 69], [23, 99, 168]]}, "portuguese": {"f1": 0.5823465032900264, "accuracy": 0.5816091954022988, "confusion_matrix": [[165, 83, 42], [75, 153, 62], [30, 72, 188]]}, "spanish": {"f1": 0.545025097311418, "accuracy": 0.5448275862068965, "confusion_matrix": [[156, 92, 42], [76, 140, 74], [46, 66, 178]]}, "all": {"f1": 0.5461495899502564, "accuracy": 0.5452586206896551, "confusion_matrix": [[1264, 661, 395], [567, 1233, 520], [380, 642, 1298]]}}
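
Reviewer note: the `accuracy` and `f1` fields in this file can be cross-checked against the stored confusion matrices. The sketch below does this for the added `"all"` entry. It assumes `f1` is an unweighted (macro) average over the three classes, which the file itself does not state, and the function names are illustrative. Both metrics are unaffected by whether rows are true labels or predictions, since per-class F1 reduces to `2*TP / (row_sum + col_sum)`.

```python
# Confusion matrix of the added "all" entry in eval_results_cardiff.json.
ALL_CM = [[1264, 661, 395], [567, 1233, 520], [380, 642, 1298]]


def accuracy(cm: list[list[int]]) -> float:
    """Fraction of examples on the diagonal."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def macro_f1(cm: list[list[int]]) -> float:
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        row_sum = sum(cm[k])                       # all examples in row k
        col_sum = sum(cm[i][k] for i in range(n))  # all examples in column k
        denom = row_sum + col_sum                  # equals 2*TP + FP + FN
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


print(f"accuracy: {accuracy(ALL_CM):.6f}")
print(f"macro F1: {macro_f1(ALL_CM):.6f}")
```

Under that assumption, both values agree with the `accuracy` and `f1` stored in the added `"all"` entry.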

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:99dd1feff9f3278e92e79d51ce2bec9e17c1a0c55edb8f33955ed08156def8fe
-size 429199798

 version https://git-lfs.github.com/spec/v1
+oid sha256:93e487911902f10250cf9e34adbefeb7a66247822226160e41fd5fb272a78e02
+size 945174322
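
Reviewer note: the diff for `pytorch_model.bin` shows only Git LFS pointer files, not the weights themselves. Each pointer is three `key value` lines (`version`, `oid`, `size`). A minimal sketch for reading the two pointers and comparing their recorded sizes follows. `parse_lfs_pointer` is a hypothetical helper, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer contents copied from the "-" and "+" sides of the diff.
REMOVED = """version https://git-lfs.github.com/spec/v1
oid sha256:99dd1feff9f3278e92e79d51ce2bec9e17c1a0c55edb8f33955ed08156def8fe
size 429199798
"""
ADDED = """version https://git-lfs.github.com/spec/v1
oid sha256:93e487911902f10250cf9e34adbefeb7a66247822226160e41fd5fb272a78e02
size 945174322
"""

removed = parse_lfs_pointer(REMOVED)
added = parse_lfs_pointer(ADDED)
ratio = added["size"] / removed["size"]

print(f"removed: {removed['size']} bytes, added: {added['size']} bytes")
print(f"size ratio (added / removed): {ratio:.2f}")
```

The `+` side records a file more than twice the size of the `-` side. The diff does not say why, so the cause needs checking against the training setup.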

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:286d21ebe8c947df3bf342be6d4627995d796ef0b0af7e94d2b085f097bc8a6b
-size 5368

 version https://git-lfs.github.com/spec/v1
+oid sha256:51ce125a8e77b013b6621c2d66bd5caeb194ba0371f5b40904d5b89d99d1d131
+size 4664