
	---
	license: apache-2.0
	base_model: distilbert/distilbert-base-uncased
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	model-index:
	- name: distilbert_base_uncased_ledgar
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# distilbert_base_uncased_ledgar

	This model is a fine-tuned version of [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased). The Trainer did not record the training dataset; the model name suggests LEDGAR, but the card does not confirm this.
	It achieves the following results on the evaluation set:
	- Loss: 0.6496
	- Accuracy: 0.8311
	- F1 Macro: 0.7116
	- F1 Micro: 0.8311
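
Accuracy and F1 Micro are identical above (0.8311). That is expected for single-label multiclass classification: each misclassified example counts as exactly one false positive (for the predicted class) and one false negative (for the true class), so micro-averaged precision, recall, and F1 all reduce to accuracy. F1 Macro averages the per-class F1 scores without weighting, so a lower value (0.7116) is consistent with some classes being predicted less well than others. The sketch below illustrates this on made-up toy labels; the function name `f1_scores` and the data are illustrative assumptions, not part of the original training code.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (accuracy, micro_f1, macro_f1) for single-label multiclass predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # the predicted class gains a false positive
            fn[t] += 1  # the true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    accuracy = sum(tp.values()) / len(y_true)
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return accuracy, micro, macro


# Toy, made-up labels: class "a" is frequent, "b" and "c" are rare.
y_true = ["a", "a", "a", "a", "a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "a", "a", "a", "a", "b", "a", "a", "a"]

acc, micro, macro = f1_scores(y_true, y_pred)
print(f"accuracy={acc:.4f} micro_f1={micro:.4f} macro_f1={macro:.4f}")
```

In this toy case the rare classes are mostly missed, which pulls the macro score well below the micro score while accuracy and micro F1 stay equal.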

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 32
	- eval_batch_size: 32
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 2
	- total_train_batch_size: 64
	- total_eval_batch_size: 64
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 3.0
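
The totals in the list follow from the per-device values: 32 examples per device on 2 devices gives 64 per step. The sketch below reproduces that arithmetic and a linear learning-rate decay from 2e-05 to zero. It rests on several assumptions not stated in the card: no gradient accumulation, zero warmup steps, and roughly 2810 total optimizer steps (estimated from the training results table, where step 2800 corresponds to epoch 2.99). The helper names `effective_batch_size` and `linear_lr` are hypothetical.

```python
def effective_batch_size(per_device_batch_size: int, num_devices: int,
                         grad_accum_steps: int = 1) -> int:
    """Examples consumed per optimizer step across all devices."""
    return per_device_batch_size * num_devices * grad_accum_steps


def linear_lr(step: int, total_steps: int, base_lr: float = 2e-5,
              warmup_steps: int = 0) -> float:
    """Linear warmup (optional) followed by linear decay to zero."""
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: total step count estimated from the results table, not stated in the card.
TOTAL_STEPS = 2810

print(effective_batch_size(32, 2))
for step in (0, 1000, 2000, 2800):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the learning rate at the last logged evaluation (step 2800) is already close to zero, which is consistent with the small metric changes in the final rows of the results table.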

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 Macro | F1 Micro |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:--------:|
	| 3.8165 | 0.11 | 100 | 3.5952 | 0.3489 | 0.0995 | 0.3489 |
	| 2.8293 | 0.21 | 200 | 2.6737 | 0.5385 | 0.2375 | 0.5385 |
	| 2.2564 | 0.32 | 300 | 2.0960 | 0.6212 | 0.3339 | 0.6212 |
	| 1.8259 | 0.43 | 400 | 1.7118 | 0.6792 | 0.4269 | 0.6792 |
	| 1.5846 | 0.53 | 500 | 1.4543 | 0.7232 | 0.4987 | 0.7232 |
	| 1.3927 | 0.64 | 600 | 1.2635 | 0.758 | 0.5628 | 0.758 |
	| 1.2065 | 0.75 | 700 | 1.1217 | 0.7719 | 0.5782 | 0.7719 |
	| 1.16 | 0.85 | 800 | 1.0303 | 0.7832 | 0.5984 | 0.7832 |
	| 1.0168 | 0.96 | 900 | 0.9443 | 0.7887 | 0.6119 | 0.7887 |
	| 0.9006 | 1.07 | 1000 | 0.8958 | 0.7934 | 0.6142 | 0.7934 |
	| 0.8956 | 1.17 | 1100 | 0.8517 | 0.8002 | 0.6294 | 0.8002 |
	| 0.9159 | 1.28 | 1200 | 0.8184 | 0.8033 | 0.6412 | 0.8033 |
	| 0.8237 | 1.39 | 1300 | 0.7814 | 0.8077 | 0.6529 | 0.8077 |
	| 0.7341 | 1.49 | 1400 | 0.7654 | 0.8099 | 0.6600 | 0.8099 |
	| 0.7475 | 1.6 | 1500 | 0.7458 | 0.8135 | 0.6650 | 0.8135 |
	| 0.7699 | 1.71 | 1600 | 0.7288 | 0.8183 | 0.6810 | 0.8183 |
	| 0.7472 | 1.81 | 1700 | 0.7125 | 0.8179 | 0.6820 | 0.8179 |
	| 0.689 | 1.92 | 1800 | 0.6965 | 0.8201 | 0.6822 | 0.8201 |
	| 0.6807 | 2.03 | 1900 | 0.6904 | 0.8192 | 0.6799 | 0.8192 |
	| 0.6514 | 2.13 | 2000 | 0.6836 | 0.8239 | 0.6923 | 0.8239 |
	| 0.6662 | 2.24 | 2100 | 0.6750 | 0.8267 | 0.7019 | 0.8267 |
	| 0.6247 | 2.35 | 2200 | 0.6703 | 0.8284 | 0.7028 | 0.8284 |
	| 0.6443 | 2.45 | 2300 | 0.6662 | 0.8265 | 0.7001 | 0.8265 |
	| 0.632 | 2.56 | 2400 | 0.6571 | 0.8295 | 0.7078 | 0.8295 |
	| 0.5922 | 2.67 | 2500 | 0.6539 | 0.8298 | 0.7084 | 0.8298 |
	| 0.6423 | 2.77 | 2600 | 0.6519 | 0.8311 | 0.7139 | 0.8311 |
	| 0.6156 | 2.88 | 2700 | 0.6500 | 0.8311 | 0.7123 | 0.8311 |
	| 0.6097 | 2.99 | 2800 | 0.6496 | 0.8311 | 0.7116 | 0.8311 |
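
The final row (step 2800) matches the headline metrics and has the lowest validation loss, while F1 Macro peaks slightly earlier, at step 2600. The sketch below shows how the choice of selection metric changes which evaluation step looks best. It uses only the last five rows copied from the table; whether checkpoints were actually saved at these steps is not stated in the card, so this is an illustration rather than a description of the original run.

```python
# (step, validation_loss, accuracy, f1_macro), copied from the last rows of the table above.
rows = [
    (2400, 0.6571, 0.8295, 0.7078),
    (2500, 0.6539, 0.8298, 0.7084),
    (2600, 0.6519, 0.8311, 0.7139),
    (2700, 0.6500, 0.8311, 0.7123),
    (2800, 0.6496, 0.8311, 0.7116),
]

# Lower is better for loss; higher is better for macro F1.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_f1_macro = max(rows, key=lambda r: r[3])

print("best by validation loss:", best_by_loss[0])
print("best by F1 Macro:", best_by_f1_macro[0])
```

The gap between the two candidates is small (0.7139 versus 0.7116 F1 Macro), so either selection rule gives a similar model here.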


	### Framework versions

	- Transformers 4.39.0.dev0
	- PyTorch 2.2.1+cu121
	- Datasets 2.18.0
	- Tokenizers 0.15.2
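
To compare a local environment against the versions listed above, a small check along these lines may help. It is a minimal sketch using only the standard library; the PyPI distribution names (`torch` for PyTorch, and so on) and the helper name `compare_versions` are assumptions of this example, and an exact match is not necessarily required to load the model.

```python
from importlib import metadata

# Versions as listed in this card, keyed by PyPI distribution name.
EXPECTED = {
    "transformers": "4.39.0.dev0",
    "torch": "2.2.1+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def compare_versions(expected=EXPECTED):
    """Map each package to (card_version, installed_version_or_None, exact_match)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, ok) in compare_versions().items():
    status = "ok" if ok else "differs"
    print(f"{package}: card={wanted} installed={installed} [{status}]")
```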