
deberta_finetune

This model is a fine-tuned version of microsoft/deberta-v3-base on an unknown dataset. It achieves the following results on the evaluation set:

  • eval_loss: 0.3943
  • eval_accuracy: 0.8673
  • eval_runtime: 164.2323
  • eval_samples_per_second: 29.178
  • eval_steps_per_second: 1.827
  • epoch: 2.0
  • step: 4164
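
Since the card does not document the task or label set, here is a minimal, hypothetical inference sketch, assuming the checkpoint carries a sequence-classification head (which the eval_accuracy metric suggests):

```python
# Minimal usage sketch; the task and labels are assumptions, as the card
# does not document them.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nc33/deberta_finetune")
model = AutoModelForSequenceClassification.from_pretrained("nc33/deberta_finetune")

inputs = tokenizer("Example input text.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring class and look up its (undocumented) label name.
pred = logits.argmax(dim=-1).item()
print(pred, model.config.id2label[pred])
```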

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
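
A minimal sketch of how these hyperparameters map onto `transformers.TrainingArguments`. The output directory is an assumption, and since the training data is undocumented, a two-example dummy dataset stands in purely to make the sketch runnable:

```python
# Hypothetical reconstruction of the training setup from the listed
# hyperparameters; the real dataset and output_dir are not documented.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-base")
model = AutoModelForSequenceClassification.from_pretrained("microsoft/deberta-v3-base")

# Dummy stand-in for the unknown training/evaluation data.
dummy = Dataset.from_dict({"text": ["good", "bad"], "label": [1, 0]})
dummy = dummy.map(lambda b: tokenizer(b["text"], truncation=True), batched=True)

args = TrainingArguments(
    output_dir="deberta_finetune",  # assumption; not stated on the card
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,                 # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3,
)

trainer = Trainer(model=model, args=args, train_dataset=dummy,
                  eval_dataset=dummy, tokenizer=tokenizer)
trainer.train()
```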

Framework versions

  • Transformers 4.25.1
  • Pytorch 1.13.0+cu116
  • Datasets 2.8.0
  • Tokenizers 0.13.2

Model Recycling

Evaluation on 36 datasets using nc33/deberta_finetune as a base model yields an average score of 79.51, compared to 79.04 for microsoft/deberta-v3-base.

The model is ranked 3rd among all tested models for the microsoft/deberta-v3-base architecture as of 06/02/2023.

Results:

| Dataset | Score |
|---|---|
| 20_newsgroup | 86.1922 |
| ag_news | 90.3667 |
| amazon_reviews_multi | 67.48 |
| anli | 58.5625 |
| boolq | 84.3425 |
| cb | 73.2143 |
| cola | 86.5772 |
| copa | 68 |
| dbpedia | 79.6667 |
| esnli | 91.5717 |
| financial_phrasebank | 88.6 |
| imdb | 94.472 |
| isear | 72.2295 |
| mnli | 89.6359 |
| mrpc | 90.1961 |
| multirc | 63.5314 |
| poem_sentiment | 87.5 |
| qnli | 93.5567 |
| qqp | 91.672 |
| rotten_tomatoes | 90.2439 |
| rte | 83.0325 |
| sst2 | 95.1835 |
| sst_5bins | 58.371 |
| stsb | 90.4054 |
| trec_coarse | 97.2 |
| trec_fine | 90.8 |
| tweet_ev_emoji | 47.122 |
| tweet_ev_emotion | 85.0809 |
| tweet_ev_hate | 59.3939 |
| tweet_ev_irony | 79.0816 |
| tweet_ev_offensive | 83.7209 |
| tweet_ev_sentiment | 70.197 |
| wic | 70.6897 |
| wnli | 67.6056 |
| wsc | 64.4231 |
| yahoo_answers | 72.3333 |
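
A hypothetical sketch of using this checkpoint as a base for one of the listed datasets (rotten_tomatoes here); the dataset choice, hyperparameters, and head replacement are assumptions, not the exact model-recycling setup:

```python
# Sketch: fine-tune from nc33/deberta_finetune on an assumed downstream task.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

dataset = load_dataset("rotten_tomatoes")
tokenizer = AutoTokenizer.from_pretrained("nc33/deberta_finetune")
tokenized = dataset.map(lambda b: tokenizer(b["text"], truncation=True),
                        batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "nc33/deberta_finetune",
    num_labels=2,                  # rotten_tomatoes is binary sentiment
    ignore_mismatched_sizes=True,  # swap out the original task head
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="recycle_rt",  # assumption
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,
)
trainer.train()
print(trainer.evaluate())
```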

For more information, see: Model Recycling
