---
license: mit
base_model: microsoft/mdeberta-v3-base
tags:
  - generated_from_trainer
datasets:
  - tweet_sentiment_multilingual
metrics:
  - accuracy
  - f1
model-index:
  - name: >-
      scenario-NON-KD-PR-COPY-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_
    results: []
---

scenario-NON-KD-PR-COPY-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_

This model is a fine-tuned version of microsoft/mdeberta-v3-base on the tweet_sentiment_multilingual dataset. It achieves the following results on the evaluation set:

  • Loss: 4.7763
  • Accuracy: 0.5525
  • F1: 0.5514
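
The snippet below is a minimal usage sketch, not part of the original card: it shows how a checkpoint like this could be loaded for three-way tweet sentiment classification with the transformers pipeline API. The repository id is a placeholder (the card does not state the hub id), and the negative/neutral/positive label scheme is an assumption based on the tweet_sentiment_multilingual dataset; check the checkpoint's id2label mapping before relying on it.

```python
from transformers import pipeline

# Placeholder repository id -- replace with the actual hub id or local path of this checkpoint.
MODEL_ID = "your-username/scenario-NON-KD-PR-COPY-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_"

# Text-classification pipeline built on the fine-tuned mDeBERTa-v3 checkpoint.
classifier = pipeline("text-classification", model=MODEL_ID, tokenizer=MODEL_ID)

# Assumed label scheme: negative / neutral / positive (verify against config.id2label).
print(classifier("I really enjoyed this movie!"))
# e.g. [{'label': 'LABEL_2', 'score': 0.87}]  -- output shape is real, the score is illustrative
```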

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 66
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
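
As an illustration only, the following is a minimal Trainer setup that mirrors the hyperparameters listed above; it is a sketch, not the author's original training script. It assumes the cardiffnlp/tweet_sentiment_multilingual dataset ("all" configuration, "text"/"label" columns, three classes), evaluation every 500 steps (inferred from the log below), and a placeholder output directory.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorWithPadding,
    Trainer,
    TrainingArguments,
)

model_name = "microsoft/mdeberta-v3-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=3  # assumption: negative / neutral / positive
)

# Assumption: the multilingual "all" configuration with "text"/"label" columns.
raw = load_dataset("cardiffnlp/tweet_sentiment_multilingual", "all")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True)

tokenized = raw.map(tokenize, batched=True)

# Values below mirror the hyperparameter list; eval/logging every 500 steps is
# inferred from the training log, and output_dir is a placeholder.
args = TrainingArguments(
    output_dir="./scenario-NON-KD-PR-COPY-CDF-ALL-D2",
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=66,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    evaluation_strategy="steps",
    eval_steps=500,
    logging_steps=500,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,
    data_collator=DataCollatorWithPadding(tokenizer),
)
trainer.train()
```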

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy | F1     |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|:------:|
| 1.0802        | 1.09  | 500   | 1.0761          | 0.4776   | 0.4641 |
| 0.9869        | 2.17  | 1000  | 1.0145          | 0.5320   | 0.5250 |
| 0.8881        | 3.26  | 1500  | 0.9953          | 0.5432   | 0.5391 |
| 0.7877        | 4.35  | 2000  | 1.0837          | 0.5340   | 0.5311 |
| 0.6671        | 5.43  | 2500  | 1.2702          | 0.5394   | 0.5411 |
| 0.5716        | 6.52  | 3000  | 1.4643          | 0.5440   | 0.5409 |
| 0.4817        | 7.61  | 3500  | 1.6304          | 0.5448   | 0.5336 |
| 0.4102        | 8.7   | 4000  | 1.7103          | 0.5301   | 0.5267 |
| 0.3334        | 9.78  | 4500  | 2.0038          | 0.5343   | 0.5334 |
| 0.2776        | 10.87 | 5000  | 1.8016          | 0.5475   | 0.5472 |
| 0.2349        | 11.96 | 5500  | 2.0203          | 0.5282   | 0.5280 |
| 0.2017        | 13.04 | 6000  | 2.4490          | 0.5359   | 0.5334 |
| 0.1727        | 14.13 | 6500  | 2.5313          | 0.5378   | 0.5382 |
| 0.1491        | 15.22 | 7000  | 2.3797          | 0.5390   | 0.5388 |
| 0.1425        | 16.3  | 7500  | 2.4724          | 0.5444   | 0.5446 |
| 0.1265        | 17.39 | 8000  | 2.9398          | 0.5413   | 0.5389 |
| 0.1185        | 18.48 | 8500  | 2.3527          | 0.5370   | 0.5370 |
| 0.1038        | 19.57 | 9000  | 3.2756          | 0.5482   | 0.5442 |
| 0.1071        | 20.65 | 9500  | 3.0308          | 0.5432   | 0.5441 |
| 0.0865        | 21.74 | 10000 | 3.1408          | 0.5297   | 0.5296 |
| 0.0813        | 22.83 | 10500 | 3.3928          | 0.5436   | 0.5434 |
| 0.0831        | 23.91 | 11000 | 3.4793          | 0.5320   | 0.5339 |
| 0.0736        | 25.0  | 11500 | 3.2782          | 0.5451   | 0.5452 |
| 0.0672        | 26.09 | 12000 | 3.4270          | 0.5428   | 0.5396 |
| 0.0616        | 27.17 | 12500 | 3.7192          | 0.5471   | 0.5425 |
| 0.0588        | 28.26 | 13000 | 3.3739          | 0.5421   | 0.5424 |
| 0.0537        | 29.35 | 13500 | 3.5891          | 0.5421   | 0.5393 |
| 0.0534        | 30.43 | 14000 | 3.5400          | 0.5436   | 0.5391 |
| 0.0503        | 31.52 | 14500 | 4.1166          | 0.5409   | 0.5378 |
| 0.0431        | 32.61 | 15000 | 4.1346          | 0.5374   | 0.5339 |
| 0.0423        | 33.7  | 15500 | 3.9483          | 0.5478   | 0.5456 |
| 0.0371        | 34.78 | 16000 | 4.0371          | 0.5436   | 0.5429 |
| 0.0339        | 35.87 | 16500 | 4.0302          | 0.5478   | 0.5480 |
| 0.0381        | 36.96 | 17000 | 4.0057          | 0.5432   | 0.5425 |
| 0.0274        | 38.04 | 17500 | 4.5734          | 0.5521   | 0.5520 |
| 0.0288        | 39.13 | 18000 | 4.4791          | 0.5502   | 0.5472 |
| 0.0203        | 40.22 | 18500 | 4.7187          | 0.5536   | 0.5538 |
| 0.0248        | 41.3  | 19000 | 4.7855          | 0.5486   | 0.5490 |
| 0.025         | 42.39 | 19500 | 4.4324          | 0.5502   | 0.5471 |
| 0.0211        | 43.48 | 20000 | 4.7410          | 0.5475   | 0.5470 |
| 0.0215        | 44.57 | 20500 | 4.6235          | 0.5478   | 0.5483 |
| 0.0188        | 45.65 | 21000 | 4.6657          | 0.5517   | 0.5499 |
| 0.0163        | 46.74 | 21500 | 4.7207          | 0.5509   | 0.5505 |
| 0.0136        | 47.83 | 22000 | 4.7870          | 0.5525   | 0.5523 |
| 0.0131        | 48.91 | 22500 | 4.8396          | 0.5505   | 0.5501 |
| 0.0207        | 50.0  | 23000 | 4.7763          | 0.5525   | 0.5514 |
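
The reported accuracy and F1 can be recomputed from model predictions with the evaluate library; the sketch below is illustrative, and the macro averaging is an assumption, since the card does not state how F1 was aggregated over the three classes. A function like this can be passed to the Trainer via `compute_metrics`.

```python
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

def compute_metrics(eval_pred):
    # eval_pred is a (logits, labels) pair as produced by Trainer.evaluate().
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy.compute(predictions=preds, references=labels)["accuracy"],
        # "macro" averaging is an assumption; the card does not specify the averaging mode.
        "f1": f1.compute(predictions=preds, references=labels, average="macro")["f1"],
    }
```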

Framework versions

  • Transformers 4.33.3
  • Pytorch 2.1.1+cu121
  • Datasets 2.14.5
  • Tokenizers 0.13.3