---
license: apache-2.0
base_model: distilbert-base-cased
tags:
  - generated_from_trainer
model-index:
  - name: distilbert-base-vietnamese-case
    results: []
---

# distilbert-base-vietnamese-case

This model is a fine-tuned version of [distilbert-base-cased](https://huggingface.co/distilbert-base-cased) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 3.9239
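
The card does not state the downstream task. Since the base model is a masked language model, the sketch below assumes this checkpoint is used for fill-mask; the repo id `pengold/distilbert-base-vietnamese-case` and the example sentence are illustrative assumptions, not taken from the card.

```python
# Minimal usage sketch, assuming a masked-language-modeling fine-tune.
from transformers import pipeline

fill_mask = pipeline(
    "fill-mask",
    model="pengold/distilbert-base-vietnamese-case",  # assumed repo id
)

# DistilBERT's mask token is [MASK]; the Vietnamese sentence is illustrative only.
for prediction in fill_mask("Hà Nội là thủ đô của [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 4))
```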

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after the list):

- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
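
A minimal configuration sketch, assuming the model was trained with the `transformers` `Trainer` (consistent with the `generated_from_trainer` tag). Only the numeric values in the list above come from the card; the `output_dir`, the per-epoch evaluation, and the masked-language-modeling setup are assumptions.

```python
# Sketch only: maps the hyperparameters listed above onto TrainingArguments.
# output_dir, evaluation_strategy and the MLM collator are assumptions.
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-cased")
model = AutoModelForMaskedLM.from_pretrained("distilbert-base-cased")

training_args = TrainingArguments(
    output_dir="distilbert-base-vietnamese-case",  # hypothetical output path
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    num_train_epochs=100,
    lr_scheduler_type="linear",   # linear schedule, as listed
    adam_beta1=0.9,               # Adam betas/epsilon, as listed
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",  # assumed from the per-epoch results table below
)

# train_dataset / eval_dataset are placeholders; the card does not name the data.
# trainer = Trainer(
#     model=model,
#     args=training_args,
#     train_dataset=train_dataset,
#     eval_dataset=eval_dataset,
#     data_collator=DataCollatorForLanguageModeling(tokenizer),
# )
# trainer.train()
```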

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 6.1273 | 1.0 | 79 | 6.0333 |
| 5.9095 | 2.0 | 158 | 5.9172 |
| 5.8407 | 3.0 | 237 | 5.7789 |
| 5.7761 | 4.0 | 316 | 5.6779 |
| 5.6909 | 5.0 | 395 | 5.6731 |
| 5.6318 | 6.0 | 474 | 5.5712 |
| 5.5787 | 7.0 | 553 | 5.4994 |
| 5.4948 | 8.0 | 632 | 5.4146 |
| 5.4399 | 9.0 | 711 | 5.3760 |
| 5.3676 | 10.0 | 790 | 5.3624 |
| 5.3691 | 11.0 | 869 | 5.2900 |
| 5.2904 | 12.0 | 948 | 5.3213 |
| 5.228 | 13.0 | 1027 | 5.2162 |
| 5.2384 | 14.0 | 1106 | 5.2232 |
| 5.1101 | 15.0 | 1185 | 5.1858 |
| 5.1316 | 16.0 | 1264 | 4.9780 |
| 5.0517 | 17.0 | 1343 | 5.0227 |
| 5.0014 | 18.0 | 1422 | 4.9703 |
| 5.0012 | 19.0 | 1501 | 4.9751 |
| 4.9574 | 20.0 | 1580 | 4.9152 |
| 4.8492 | 21.0 | 1659 | 4.8699 |
| 4.8717 | 22.0 | 1738 | 4.8291 |
| 4.8014 | 23.0 | 1817 | 4.8247 |
| 4.7941 | 24.0 | 1896 | 4.7314 |
| 4.7218 | 25.0 | 1975 | 4.8128 |
| 4.6991 | 26.0 | 2054 | 4.7312 |
| 4.695 | 27.0 | 2133 | 4.6820 |
| 4.6339 | 28.0 | 2212 | 4.6659 |
| 4.5968 | 29.0 | 2291 | 4.6682 |
| 4.581 | 30.0 | 2370 | 4.5671 |
| 4.5606 | 31.0 | 2449 | 4.5874 |
| 4.4842 | 32.0 | 2528 | 4.4972 |
| 4.5101 | 33.0 | 2607 | 4.5457 |
| 4.4482 | 34.0 | 2686 | 4.4926 |
| 4.4563 | 35.0 | 2765 | 4.4372 |
| 4.4161 | 36.0 | 2844 | 4.3623 |
| 4.3537 | 37.0 | 2923 | 4.4122 |
| 4.3775 | 38.0 | 3002 | 4.3519 |
| 4.3519 | 39.0 | 3081 | 4.3866 |
| 4.3392 | 40.0 | 3160 | 4.3779 |
| 4.3011 | 41.0 | 3239 | 4.3855 |
| 4.2702 | 42.0 | 3318 | 4.2953 |
| 4.2614 | 43.0 | 3397 | 4.3726 |
| 4.2464 | 44.0 | 3476 | 4.3147 |
| 4.1984 | 45.0 | 3555 | 4.2556 |
| 4.2463 | 46.0 | 3634 | 4.2224 |
| 4.1559 | 47.0 | 3713 | 4.1839 |
| 4.1859 | 48.0 | 3792 | 4.2830 |
| 4.1063 | 49.0 | 3871 | 4.1803 |
| 4.1222 | 50.0 | 3950 | 4.1545 |
| 4.1423 | 51.0 | 4029 | 4.2308 |
| 4.0657 | 52.0 | 4108 | 4.1227 |
| 4.1018 | 53.0 | 4187 | 4.1687 |
| 4.0689 | 54.0 | 4266 | 4.1626 |
| 4.0676 | 55.0 | 4345 | 4.1790 |
| 4.0127 | 56.0 | 4424 | 4.0618 |
| 4.066 | 57.0 | 4503 | 4.0780 |
| 3.9994 | 58.0 | 4582 | 4.1382 |
| 4.0002 | 59.0 | 4661 | 4.0318 |
| 4.0064 | 60.0 | 4740 | 4.0891 |
| 3.9681 | 61.0 | 4819 | 4.0633 |
| 3.9608 | 62.0 | 4898 | 4.0223 |
| 3.9544 | 63.0 | 4977 | 4.0722 |
| 3.97 | 64.0 | 5056 | 4.0127 |
| 3.913 | 65.0 | 5135 | 3.9915 |
| 3.9177 | 66.0 | 5214 | 4.0256 |
| 3.9388 | 67.0 | 5293 | 3.9830 |
| 3.9429 | 68.0 | 5372 | 4.0162 |
| 3.9036 | 69.0 | 5451 | 4.0515 |
| 3.8851 | 70.0 | 5530 | 3.9716 |
| 3.8894 | 71.0 | 5609 | 3.9939 |
| 3.896 | 72.0 | 5688 | 3.9699 |
| 3.8893 | 73.0 | 5767 | 3.9772 |
| 3.8648 | 74.0 | 5846 | 4.0543 |
| 3.8511 | 75.0 | 5925 | 3.9879 |
| 3.8286 | 76.0 | 6004 | 3.9393 |
| 3.851 | 77.0 | 6083 | 4.0088 |
| 3.8407 | 78.0 | 6162 | 3.9580 |
| 3.8391 | 79.0 | 6241 | 3.9453 |
| 3.8537 | 80.0 | 6320 | 3.9377 |
| 3.823 | 81.0 | 6399 | 3.9423 |
| 3.8395 | 82.0 | 6478 | 3.9240 |
| 3.7859 | 83.0 | 6557 | 3.8921 |
| 3.8177 | 84.0 | 6636 | 3.9167 |
| 3.7862 | 85.0 | 6715 | 3.9479 |
| 3.7978 | 86.0 | 6794 | 3.9230 |
| 3.7939 | 87.0 | 6873 | 3.9401 |
| 3.8006 | 88.0 | 6952 | 3.9525 |
| 3.7697 | 89.0 | 7031 | 3.9304 |
| 3.7914 | 90.0 | 7110 | 3.8875 |
| 3.7799 | 91.0 | 7189 | 3.8851 |
| 3.812 | 92.0 | 7268 | 3.9349 |
| 3.7942 | 93.0 | 7347 | 3.8931 |
| 3.7671 | 94.0 | 7426 | 3.8653 |
| 3.7654 | 95.0 | 7505 | 3.8282 |
| 3.7648 | 96.0 | 7584 | 3.8408 |
| 3.8011 | 97.0 | 7663 | 3.8898 |
| 3.7781 | 98.0 | 7742 | 3.9560 |
| 3.8056 | 99.0 | 7821 | 3.8882 |
| 3.7749 | 100.0 | 7900 | 3.9239 |

### Framework versions

- Transformers 4.33.2
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3
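
A small sketch for checking that a local environment matches the versions listed above, assuming the standard PyPI package names (`torch` for Pytorch); the check itself is not part of the card.

```python
# Sketch: compare installed package versions against those listed in this card.
from importlib.metadata import PackageNotFoundError, version

expected = {
    "transformers": "4.33.2",
    "torch": "2.0.1+cu118",
    "datasets": "2.14.5",
    "tokenizers": "0.13.3",
}

for package, listed in expected.items():
    try:
        installed = version(package)
    except PackageNotFoundError:
        installed = "not installed"
    print(f"{package}: installed={installed}, card lists {listed}")
```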