	---
	license: other
	library_name: peft
	tags:
	- generated_from_trainer
	base_model: Qwen/Qwen1.5-7B
	metrics:
	- accuracy
	model-index:
	- name: lex_glue_ledgar
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# lex_glue_ledgar

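	This model is a LoRA adapter (trained with PEFT) fine-tuned from [Qwen/Qwen1.5-7B](https://huggingface.co/Qwen/Qwen1.5-7B). The Trainer did not record the dataset; the model name suggests the LEDGAR task of LexGLUE, but this card does not confirm it.
	The reported figures match the step-3700 row (epoch 1.97) of the training results table, which has the lowest validation loss of the run. On the evaluation set the model achieves: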
	- Loss: 0.5025
	- Accuracy: 0.867
	- F1 Macro: 0.7910
	- F1 Micro: 0.867
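
Accuracy and F1 Micro are identical above. That is expected if the task is single-label multi-class classification: each misclassified example counts as one false positive and one false negative, so micro-averaged F1 reduces to accuracy. F1 Macro weights every class equally, so it drops below the micro score when some classes are predicted worse than others. The sketch below illustrates this on made-up labels. The toy data and the `f1_scores` helper are illustrative assumptions, not the evaluation code used for this model.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (accuracy, micro_f1, macro_f1) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    accuracy = sum(tp.values()) / len(y_true)
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return accuracy, micro, macro


# Toy data (hypothetical): class 0 is frequent, classes 1 and 2 are rare.
Y_TRUE = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
Y_PRED = [0, 0, 0, 0, 0, 0, 1, 0, 2, 0]

accuracy, micro_f1, macro_f1 = f1_scores(Y_TRUE, Y_PRED)
print(f"accuracy={accuracy:.4f} micro_f1={micro_f1:.4f} macro_f1={macro_f1:.4f}")
```

On this toy input the rare classes are the ones that get misclassified, so the macro score ends up lower than the micro score, mirroring the gap between F1 Macro and F1 Micro reported above.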

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 2
	- total_train_batch_size: 32
	- total_eval_batch_size: 32
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 3.0
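
As a rough illustration of how these settings combine, the sketch below derives the total batch size from the per-device value and traces a linear learning-rate decay. It assumes no gradient accumulation and no warmup, since neither is listed above. `TOTAL_STEPS = 5625` is a hypothetical value, chosen to be consistent with the step and epoch columns of the training results (about 1,875 steps per epoch over 3 epochs). The `total_batch_size` and `linear_lr` helpers are illustrative, not the Trainer's own code.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 16  # per device
NUM_DEVICES = 2

# Assumption: not stated in the card; estimated from the step/epoch columns.
TOTAL_STEPS = 5625


def total_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Effective batch size per optimizer step under data parallelism."""
    return per_device * num_devices * grad_accum_steps


def linear_lr(step: int, total_steps: int = TOTAL_STEPS,
              base_lr: float = LEARNING_RATE, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print("total_train_batch_size:", total_batch_size(TRAIN_BATCH_SIZE, NUM_DEVICES))
for step in (0, 1875, 3750, 5625):
    print(step, linear_lr(step))
```

With two devices at 16 examples each, the helper reproduces the listed total of 32; the learning rate falls from 5e-05 at step 0 to zero at the assumed final step.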

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 Macro | F1 Micro |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:--------:|
	| 1.7995 | 0.05 | 100 | 1.6894 | 0.6512 | 0.4676 | 0.6512 |
	| 1.3922 | 0.11 | 200 | 1.2208 | 0.7076 | 0.5868 | 0.7076 |
	| 1.0552 | 0.16 | 300 | 0.9665 | 0.7634 | 0.6329 | 0.7634 |
	| 0.8416 | 0.21 | 400 | 0.9615 | 0.767 | 0.6280 | 0.767 |
	| 0.8204 | 0.27 | 500 | 0.8469 | 0.7892 | 0.6680 | 0.7892 |
	| 0.7359 | 0.32 | 600 | 0.7820 | 0.8025 | 0.6859 | 0.8025 |
	| 0.7088 | 0.37 | 700 | 0.7905 | 0.7975 | 0.6808 | 0.7975 |
	| 0.6096 | 0.43 | 800 | 0.7862 | 0.8009 | 0.6823 | 0.8009 |
	| 0.8682 | 0.48 | 900 | 0.7768 | 0.7987 | 0.6967 | 0.7987 |
	| 0.6772 | 0.53 | 1000 | 0.7300 | 0.8094 | 0.6934 | 0.8094 |
	| 0.6224 | 0.59 | 1100 | 0.6760 | 0.8146 | 0.7190 | 0.8146 |
	| 0.5875 | 0.64 | 1200 | 0.6449 | 0.8253 | 0.7442 | 0.8253 |
	| 0.6147 | 0.69 | 1300 | 0.6603 | 0.8305 | 0.7208 | 0.8305 |
	| 0.6355 | 0.75 | 1400 | 0.6256 | 0.8285 | 0.7294 | 0.8285 |
	| 0.7076 | 0.8 | 1500 | 0.6340 | 0.8288 | 0.7290 | 0.8288 |
	| 0.4995 | 0.85 | 1600 | 0.6186 | 0.8315 | 0.7422 | 0.8315 |
	| 0.5754 | 0.91 | 1700 | 0.6105 | 0.8402 | 0.7482 | 0.8402 |
	| 0.6775 | 0.96 | 1800 | 0.5947 | 0.8369 | 0.7531 | 0.8369 |
	| 0.3267 | 1.01 | 1900 | 0.5678 | 0.8528 | 0.7704 | 0.8528 |
	| 0.2022 | 1.07 | 2000 | 0.6361 | 0.844 | 0.7639 | 0.844 |
	| 0.3831 | 1.12 | 2100 | 0.5957 | 0.8503 | 0.7672 | 0.8503 |
	| 0.3235 | 1.17 | 2200 | 0.6062 | 0.8476 | 0.7685 | 0.8476 |
	| 0.2279 | 1.23 | 2300 | 0.6255 | 0.847 | 0.7658 | 0.847 |
	| 0.3224 | 1.28 | 2400 | 0.5754 | 0.8537 | 0.7772 | 0.8537 |
	| 0.3281 | 1.33 | 2500 | 0.5763 | 0.8598 | 0.7769 | 0.8598 |
	| 0.3909 | 1.39 | 2600 | 0.5519 | 0.8545 | 0.7778 | 0.8545 |
	| 0.3064 | 1.44 | 2700 | 0.5842 | 0.8536 | 0.7790 | 0.8536 |
	| 0.2333 | 1.49 | 2800 | 0.6084 | 0.8447 | 0.7674 | 0.8447 |
	| 0.2361 | 1.55 | 2900 | 0.5975 | 0.8588 | 0.7853 | 0.8588 |
	| 0.3415 | 1.6 | 3000 | 0.5701 | 0.8572 | 0.7844 | 0.8572 |
	| 0.2535 | 1.65 | 3100 | 0.5557 | 0.8618 | 0.7828 | 0.8618 |
	| 0.2356 | 1.71 | 3200 | 0.5242 | 0.8612 | 0.7822 | 0.8612 |
	| 0.3383 | 1.76 | 3300 | 0.5250 | 0.8553 | 0.7873 | 0.8553 |
	| 0.1886 | 1.81 | 3400 | 0.5301 | 0.8658 | 0.7924 | 0.8658 |
	| 0.2468 | 1.87 | 3500 | 0.5459 | 0.8595 | 0.7813 | 0.8595 |
	| 0.2947 | 1.92 | 3600 | 0.5141 | 0.8688 | 0.7910 | 0.8688 |
	| 0.2625 | 1.97 | 3700 | 0.5025 | 0.867 | 0.7910 | 0.867 |
	| 0.0829 | 2.03 | 3800 | 0.5625 | 0.8697 | 0.8004 | 0.8697 |
	| 0.0297 | 2.08 | 3900 | 0.6303 | 0.8698 | 0.8018 | 0.8698 |
	| 0.0474 | 2.13 | 4000 | 0.6244 | 0.8713 | 0.8046 | 0.8713 |
	| 0.0267 | 2.19 | 4100 | 0.5801 | 0.8737 | 0.8061 | 0.8737 |
	| 0.0487 | 2.24 | 4200 | 0.5915 | 0.8745 | 0.8018 | 0.8745 |
	| 0.0272 | 2.29 | 4300 | 0.6174 | 0.8764 | 0.8043 | 0.8764 |
	| 0.02 | 2.35 | 4400 | 0.6261 | 0.87 | 0.7986 | 0.87 |
	| 0.0414 | 2.4 | 4500 | 0.6157 | 0.8748 | 0.8036 | 0.8748 |
	| 0.0394 | 2.45 | 4600 | 0.6051 | 0.8755 | 0.8076 | 0.8755 |
	| 0.0513 | 2.51 | 4700 | 0.6078 | 0.874 | 0.8072 | 0.874 |
	| 0.0553 | 2.56 | 4800 | 0.6021 | 0.8734 | 0.8023 | 0.8734 |
	| 0.0843 | 2.61 | 4900 | 0.6084 | 0.8766 | 0.8096 | 0.8766 |
	| 0.0361 | 2.67 | 5000 | 0.6129 | 0.8764 | 0.8091 | 0.8764 |
	| 0.0485 | 2.72 | 5100 | 0.6214 | 0.8789 | 0.8096 | 0.8789 |
	| 0.0209 | 2.77 | 5200 | 0.5887 | 0.8795 | 0.8102 | 0.8795 |
	| 0.028 | 2.83 | 5300 | 0.5953 | 0.8798 | 0.8132 | 0.8798 |
	| 0.0513 | 2.88 | 5400 | 0.5944 | 0.8818 | 0.8154 | 0.8818 |
	| 0.0073 | 2.93 | 5500 | 0.6021 | 0.8794 | 0.8136 | 0.8794 |
	| 0.0398 | 2.99 | 5600 | 0.6064 | 0.88 | 0.8124 | 0.88 |
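
The headline metrics at the top of this card coincide with the row that has the lowest validation loss (step 3700), while accuracy and F1 Macro peak later, at step 5400. The card does not say how the reported checkpoint was chosen, so the sketch below only contrasts the two selection rules on a handful of rows copied from the table. The `EvalRow` structure and the helper names are illustrative assumptions.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    accuracy: float
    f1_macro: float


# A few rows copied from the training results table (not the full log).
ROWS = [
    EvalRow(3600, 0.5141, 0.8688, 0.7910),
    EvalRow(3700, 0.5025, 0.867, 0.7910),
    EvalRow(3800, 0.5625, 0.8697, 0.8004),
    EvalRow(5400, 0.5944, 0.8818, 0.8154),
    EvalRow(5600, 0.6064, 0.88, 0.8124),
]


def best_by_loss(rows):
    """Checkpoint with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


def best_by_accuracy(rows):
    """Checkpoint with the highest accuracy."""
    return max(rows, key=lambda r: r.accuracy)


print("lowest validation loss at step", best_by_loss(ROWS).step)
print("highest accuracy at step", best_by_accuracy(ROWS).step)
```

Selecting on validation loss picks an earlier checkpoint than selecting on accuracy. In the third epoch the training loss falls below 0.1 while the validation loss stays around 0.6, so the two criteria diverge; which one is preferable depends on the downstream use.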


	### Framework versions

	- PEFT 0.9.0
	- Transformers 4.39.0.dev0
	- PyTorch 2.2.1+cu121
	- Datasets 2.18.0
	- Tokenizers 0.15.2
