MAdAiLab/SLM_vs_LLM_experiments: Qwen/Qwen1.5_1.8B_twitter/README.md

	---
	license: other
	base_model: Qwen/Qwen1.5-1.8B
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	model-index:
	- name: Qwen1.5_1.8B_twitter
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# Qwen1.5_1.8B_twitter

	This model is a fine-tuned version of [Qwen/Qwen1.5-1.8B](https://huggingface.co/Qwen/Qwen1.5-1.8B) on an unknown dataset.
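	It achieves the following results on the evaluation set. The values match the step-250 checkpoint (epoch 0.92), which has the lowest validation loss in the training results table: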
	- Loss: 0.5039
	- Accuracy: 0.7776
	- F1 Macro: 0.7420
	- F1 Micro: 0.7776
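
Accuracy and F1 Micro are identical here, as they are in every row of the training results. That is expected for single-label classification: micro-averaged F1 pools true positives, false positives, and false negatives over all classes, and each misclassified example contributes exactly one false positive and one false negative. The lower F1 Macro is consistent with, but does not prove, uneven per-class performance. Below is a minimal sketch in plain Python; the function name `f1_scores` and the toy labels are hypothetical and are not taken from this model's dataset or evaluation code.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (accuracy, f1_macro, f1_micro) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    accuracy = sum(tp.values()) / len(y_true)
    # Macro: average the per-class F1 scores, each class weighted equally.
    f1_macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts over all classes, then compute a single F1.
    f1_micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return accuracy, f1_macro, f1_micro


# Hypothetical, imbalanced toy labels (not data from this model).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 0, 0, 1, 1]
acc, macro, micro = f1_scores(y_true, y_pred)
print(f"accuracy={acc:.4f} f1_macro={macro:.4f} f1_micro={micro:.4f}")
```

In this toy case the minority class is predicted less reliably, so the macro average falls below the micro average while accuracy and micro F1 coincide.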

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-06
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 2
	- total_train_batch_size: 32
	- total_eval_batch_size: 32
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 3.0
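
The batch-size entries above are related by simple arithmetic: the total batch size is the per-device batch size times the number of devices (16 × 2 = 32). The sketch below reproduces that relation and a linear learning-rate decay from the peak of 5e-06. It rests on assumptions the card does not state: no warmup and no gradient accumulation (neither is listed), and roughly 272 optimizer steps per epoch, a figure estimated from the training results table (step 800 at epoch 2.94). The helper names are hypothetical, and this is not the Trainer's own scheduler code.

```python
PER_DEVICE_TRAIN_BATCH_SIZE = 16
NUM_DEVICES = 2
PEAK_LR = 5e-06
NUM_EPOCHS = 3

# Assumption: estimated from the results table (800 steps at about 2.94 epochs).
STEPS_PER_EPOCH = round(800 / 2.94)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def total_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Examples consumed per optimizer step across all devices."""
    return per_device * num_devices * grad_accum_steps


def linear_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup to peak_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


print(total_batch_size(PER_DEVICE_TRAIN_BATCH_SIZE, NUM_DEVICES))
for step in (0, 250, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate at step 250, where validation loss is lowest, is still about 70% of its peak.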

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 Macro | F1 Micro |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:--------:|
	| 0.6585 | 0.18 | 50 | 0.6435 | 0.7123 | 0.5811 | 0.7123 |
	| 0.6396 | 0.37 | 100 | 0.6016 | 0.7298 | 0.6998 | 0.7298 |
	| 0.5108 | 0.55 | 150 | 0.5227 | 0.7528 | 0.6963 | 0.7528 |
	| 0.5065 | 0.74 | 200 | 0.5503 | 0.7417 | 0.6347 | 0.7417 |
	| 0.4883 | 0.92 | 250 | 0.5039 | 0.7776 | 0.7420 | 0.7776 |
	| 0.3296 | 1.1 | 300 | 0.5250 | 0.7730 | 0.7307 | 0.7730 |
	| 0.322 | 1.29 | 350 | 0.5510 | 0.7721 | 0.7423 | 0.7721 |
	| 0.3287 | 1.47 | 400 | 0.5392 | 0.7583 | 0.6932 | 0.7583 |
	| 0.3097 | 1.65 | 450 | 0.5631 | 0.7629 | 0.7223 | 0.7629 |
	| 0.3397 | 1.84 | 500 | 0.5669 | 0.7675 | 0.7334 | 0.7675 |
	| 0.2618 | 2.02 | 550 | 0.5891 | 0.75 | 0.6870 | 0.75 |
	| 0.1745 | 2.21 | 600 | 0.6400 | 0.7583 | 0.7123 | 0.7583 |
	| 0.1572 | 2.39 | 650 | 0.6694 | 0.7518 | 0.6967 | 0.7518 |
	| 0.1804 | 2.57 | 700 | 0.6870 | 0.7610 | 0.7173 | 0.7610 |
	| 0.1817 | 2.76 | 750 | 0.6656 | 0.7537 | 0.7045 | 0.7537 |
	| 0.1984 | 2.94 | 800 | 0.6783 | 0.7518 | 0.6949 | 0.7518 |
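
The table can be read programmatically to see which evaluation step is best under different criteria and how the gap between training and validation loss grows. The sketch below uses values copied from the rows above; the variable names are illustrative, and whether the published weights correspond to any particular step is not stated in this card.

```python
# (step, training_loss, validation_loss, accuracy, f1_macro), copied from the table.
ROWS = [
    (50, 0.6585, 0.6435, 0.7123, 0.5811),
    (100, 0.6396, 0.6016, 0.7298, 0.6998),
    (150, 0.5108, 0.5227, 0.7528, 0.6963),
    (200, 0.5065, 0.5503, 0.7417, 0.6347),
    (250, 0.4883, 0.5039, 0.7776, 0.7420),
    (300, 0.3296, 0.5250, 0.7730, 0.7307),
    (350, 0.322, 0.5510, 0.7721, 0.7423),
    (400, 0.3287, 0.5392, 0.7583, 0.6932),
    (450, 0.3097, 0.5631, 0.7629, 0.7223),
    (500, 0.3397, 0.5669, 0.7675, 0.7334),
    (550, 0.2618, 0.5891, 0.75, 0.6870),
    (600, 0.1745, 0.6400, 0.7583, 0.7123),
    (650, 0.1572, 0.6694, 0.7518, 0.6967),
    (700, 0.1804, 0.6870, 0.7610, 0.7173),
    (750, 0.1817, 0.6656, 0.7537, 0.7045),
    (800, 0.1984, 0.6783, 0.7518, 0.6949),
]

# Best evaluation step under three different selection criteria.
best_by_loss = min(ROWS, key=lambda r: r[2])
best_by_accuracy = max(ROWS, key=lambda r: r[3])
best_by_f1_macro = max(ROWS, key=lambda r: r[4])

# Validation loss minus training loss; a growing gap suggests overfitting.
gaps = {step: round(val - train, 4) for step, train, val, *_ in ROWS}

print("lowest validation loss at step", best_by_loss[0])
print("highest accuracy at step", best_by_accuracy[0])
print("highest F1 Macro at step", best_by_f1_macro[0])
print("loss gap at step 250:", gaps[250], "and at step 800:", gaps[800])
```

Step 250 is best by both validation loss and accuracy, while step 350 has a marginally higher F1 Macro (0.7423 against 0.7420). After the first epoch the training loss keeps falling while the validation loss rises, which suggests overfitting.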


	### Framework versions

	- Transformers 4.39.0.dev0
	- PyTorch 2.2.1+cu121
	- Datasets 2.18.0
	- Tokenizers 0.15.2