	---
	license: other
	base_model: Qwen/Qwen1.5-1.8B
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	model-index:
	- name: Qwen1.5_1.8B_ledgar
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# Qwen1.5_1.8B_ledgar

	This model is a fine-tuned version of [Qwen/Qwen1.5-1.8B](https://huggingface.co/Qwen/Qwen1.5-1.8B) on a dataset that the Trainer did not record. The model name suggests LEDGAR, but the card does not confirm this.
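	It achieves the following results on the evaluation set. These values match the step-1800 row of the training results table, which has the lowest validation loss: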
	- Loss: 0.5064
	- Accuracy: 0.8669
	- F1 Macro: 0.7902
	- F1 Micro: 0.8669
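
The reported accuracy and micro F1 are identical, which is expected for single-label multi-class classification. The sketch below, in plain Python, illustrates why, and why macro F1 can be lower when some classes are predicted poorly. The toy labels and the helper names `accuracy` and `f1_scores` are hypothetical and are not taken from the training script.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: unweighted mean of per-class F1, so rare classes count equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: F1 over pooled counts; every error is one FP and one FN,
    # so this reduces to accuracy in the single-label setting.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Hypothetical toy data: class "b" is never predicted correctly.
y_true = ["a", "a", "a", "a", "b", "c"]
y_pred = ["a", "a", "a", "a", "a", "c"]
macro_f1, micro_f1 = f1_scores(y_true, y_pred)
```

In this toy case micro F1 equals accuracy while macro F1 is lower, because the missed class "b" contributes a per-class F1 of zero. A gap of the same kind (0.8669 versus 0.7902) appears in the results above.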

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-06
	- train_batch_size: 32
	- eval_batch_size: 32
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 2
	- total_train_batch_size: 64
	- total_eval_batch_size: 64
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 3.0
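
The arithmetic behind `total_train_batch_size` and the shape of a linear learning-rate schedule can be sketched as follows. This is an illustration rather than the training script: the card lists no warmup or gradient accumulation, so `warmup_steps=0` and `grad_accum_steps=1` are assumptions, and `STEPS_PER_EPOCH = 938` is an estimate inferred from the Epoch and Step columns of the results table.

```python
PER_DEVICE_TRAIN_BATCH_SIZE = 32  # train_batch_size from the card
NUM_DEVICES = 2                   # num_devices from the card
LEARNING_RATE = 5e-06             # learning_rate from the card
NUM_EPOCHS = 3                    # num_epochs from the card

# ASSUMPTION: inferred from the Epoch/Step columns, not stated in the card.
STEPS_PER_EPOCH = 938
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def total_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Samples consumed per optimizer step across all devices.

    grad_accum_steps=1 is an assumption; the card does not list it.
    """
    return per_device * num_devices * grad_accum_steps


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


effective_batch = total_batch_size(PER_DEVICE_TRAIN_BATCH_SIZE, NUM_DEVICES)
lr_start = linear_lr(0, TOTAL_STEPS, LEARNING_RATE)
lr_mid = linear_lr(TOTAL_STEPS // 2, TOTAL_STEPS, LEARNING_RATE)
lr_end = linear_lr(TOTAL_STEPS, TOTAL_STEPS, LEARNING_RATE)
```

Under these assumptions the effective batch is 32 × 2 = 64, matching `total_train_batch_size`, and the learning rate falls from 5e-06 at step 0 to zero at the final step.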

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 Macro | F1 Micro |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:--------:|
	| 1.3077 | 0.11 | 100 | 1.0945 | 0.7277 | 0.5771 | 0.7277 |
	| 0.8627 | 0.21 | 200 | 0.8368 | 0.7907 | 0.6657 | 0.7907 |
	| 0.7179 | 0.32 | 300 | 0.7824 | 0.7971 | 0.6862 | 0.7971 |
	| 0.6961 | 0.43 | 400 | 0.6952 | 0.8138 | 0.6992 | 0.8138 |
	| 0.7450 | 0.53 | 500 | 0.6719 | 0.8121 | 0.7034 | 0.8121 |
	| 0.6505 | 0.64 | 600 | 0.6220 | 0.8340 | 0.7469 | 0.8340 |
	| 0.5914 | 0.75 | 700 | 0.6110 | 0.8362 | 0.7411 | 0.8362 |
	| 0.5837 | 0.85 | 800 | 0.5767 | 0.8385 | 0.7413 | 0.8385 |
	| 0.5218 | 0.96 | 900 | 0.5365 | 0.8490 | 0.7703 | 0.8490 |
	| 0.2632 | 1.07 | 1000 | 0.5504 | 0.8562 | 0.7684 | 0.8562 |
	| 0.2607 | 1.17 | 1100 | 0.5497 | 0.8525 | 0.7657 | 0.8525 |
	| 0.2740 | 1.28 | 1200 | 0.5439 | 0.8584 | 0.7746 | 0.8584 |
	| 0.2216 | 1.39 | 1300 | 0.5687 | 0.8563 | 0.7754 | 0.8563 |
	| 0.2044 | 1.49 | 1400 | 0.5385 | 0.8610 | 0.7820 | 0.8610 |
	| 0.2508 | 1.60 | 1500 | 0.5658 | 0.8577 | 0.7711 | 0.8577 |
	| 0.2513 | 1.71 | 1600 | 0.5367 | 0.8589 | 0.7872 | 0.8589 |
	| 0.2787 | 1.81 | 1700 | 0.5133 | 0.8653 | 0.7903 | 0.8653 |
	| 0.2357 | 1.92 | 1800 | 0.5064 | 0.8669 | 0.7902 | 0.8669 |
	| 0.0490 | 2.03 | 1900 | 0.5344 | 0.8719 | 0.7978 | 0.8719 |
	| 0.0298 | 2.13 | 2000 | 0.5762 | 0.8737 | 0.7992 | 0.8737 |
	| 0.0427 | 2.24 | 2100 | 0.5961 | 0.8708 | 0.7976 | 0.8708 |
	| 0.0360 | 2.35 | 2200 | 0.6128 | 0.8728 | 0.7988 | 0.8728 |
	| 0.0551 | 2.45 | 2300 | 0.6165 | 0.8708 | 0.7976 | 0.8708 |
	| 0.0392 | 2.56 | 2400 | 0.6023 | 0.8749 | 0.8038 | 0.8749 |
	| 0.0364 | 2.67 | 2500 | 0.6168 | 0.8729 | 0.8001 | 0.8729 |
	| 0.0416 | 2.77 | 2600 | 0.6103 | 0.8753 | 0.8048 | 0.8753 |
	| 0.0353 | 2.88 | 2700 | 0.6118 | 0.8749 | 0.8054 | 0.8749 |
	| 0.0308 | 2.99 | 2800 | 0.6114 | 0.8750 | 0.8057 | 0.8750 |
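
The table can be read in more than one way depending on which metric defines the "best" checkpoint. The sketch below copies the Step, Validation Loss, Accuracy, and F1 Macro columns into a list and finds the best row under each criterion. The variable names are illustrative, and the card does not say how the reported checkpoint was actually chosen.

```python
# (step, validation_loss, accuracy, f1_macro), copied from the table above.
ROWS = [
    (100, 1.0945, 0.7277, 0.5771), (200, 0.8368, 0.7907, 0.6657),
    (300, 0.7824, 0.7971, 0.6862), (400, 0.6952, 0.8138, 0.6992),
    (500, 0.6719, 0.8121, 0.7034), (600, 0.6220, 0.8340, 0.7469),
    (700, 0.6110, 0.8362, 0.7411), (800, 0.5767, 0.8385, 0.7413),
    (900, 0.5365, 0.8490, 0.7703), (1000, 0.5504, 0.8562, 0.7684),
    (1100, 0.5497, 0.8525, 0.7657), (1200, 0.5439, 0.8584, 0.7746),
    (1300, 0.5687, 0.8563, 0.7754), (1400, 0.5385, 0.8610, 0.7820),
    (1500, 0.5658, 0.8577, 0.7711), (1600, 0.5367, 0.8589, 0.7872),
    (1700, 0.5133, 0.8653, 0.7903), (1800, 0.5064, 0.8669, 0.7902),
    (1900, 0.5344, 0.8719, 0.7978), (2000, 0.5762, 0.8737, 0.7992),
    (2100, 0.5961, 0.8708, 0.7976), (2200, 0.6128, 0.8728, 0.7988),
    (2300, 0.6165, 0.8708, 0.7976), (2400, 0.6023, 0.8749, 0.8038),
    (2500, 0.6168, 0.8729, 0.8001), (2600, 0.6103, 0.8753, 0.8048),
    (2700, 0.6118, 0.8749, 0.8054), (2800, 0.6114, 0.8750, 0.8057),
]

best_by_loss = min(ROWS, key=lambda row: row[1])      # lowest validation loss
best_by_accuracy = max(ROWS, key=lambda row: row[2])  # highest accuracy
best_by_f1_macro = max(ROWS, key=lambda row: row[3])  # highest macro F1

for name, row in [
    ("validation loss", best_by_loss),
    ("accuracy", best_by_accuracy),
    ("F1 macro", best_by_f1_macro),
]:
    print(f"best by {name}: step {row[0]} "
          f"(loss={row[1]:.4f}, acc={row[2]:.4f}, f1_macro={row[3]:.4f})")
```

Validation loss is lowest at step 1800 and rises during the third epoch, while training loss drops below 0.06. Accuracy and macro F1 still improve slightly, peaking at steps 2600 and 2800. The headline metrics in this card match the step-1800 row, which would be consistent with selecting the checkpoint by validation loss.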


	### Framework versions

	- Transformers 4.39.0.dev0
	- PyTorch 2.2.1+cu121
	- Datasets 2.18.0
	- Tokenizers 0.15.2
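
To check a local environment against the versions listed above, a small standard-library sketch such as the following may help. The helper `compare_versions` is hypothetical, and the mapping to distribution names (for example `torch` for PyTorch) is an assumption about how the packages were installed. Note that `4.39.0.dev0` is a development build, so an exact match may require installing Transformers from source.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the card; keys are the assumed distribution names.
CARD_VERSIONS = {
    "transformers": "4.39.0.dev0",
    "torch": "2.2.1+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def compare_versions(expected):
    """Map each package to (expected, installed_or_None, exact_match)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


report = compare_versions(CARD_VERSIONS)
for package, (wanted, installed, matches) in report.items():
    status = "OK" if matches else "differs"
    print(f"{package}: card={wanted} installed={installed} [{status}]")
```

A mismatch does not necessarily mean the model will fail to load; the report only shows where the environment differs from the one recorded in the card.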