tyzhu/find_marker_both_sent_train_400_eval_40_first_permute_Qwen_Qwen1.5-4B_3e-4_lora

PEFT

Safetensors

Generated from Trainer

tyzhu committed on Jun 4

Commit be55627 (1 parent: 371ab54)

Model save

README.md +117 -0

README.md ADDED

@@ -0,0 +1,117 @@

+---
+license: other
+base_model: Qwen/Qwen1.5-4B
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: find_marker_both_sent_train_400_eval_40_first_permute_Qwen_Qwen1.5-4B_3e-4_lora
+  results: []
+library_name: peft
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# find_marker_both_sent_train_400_eval_40_first_permute_Qwen_Qwen1.5-4B_3e-4_lora
+This model is a fine-tuned version of [Qwen/Qwen1.5-4B](https://huggingface.co/Qwen/Qwen1.5-4B) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.3150
+- Accuracy: 0.7659
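The front matter lists `library_name: peft` and `base_model: Qwen/Qwen1.5-4B`, so the repository presumably holds a LoRA adapter rather than full model weights. The following is a minimal sketch of loading it, not a documented recipe from the card: the adapter repo id is assembled from the owner and model name shown on the page, and the causal-LM head is an assumption because the card does not state the task type.

```python
BASE_MODEL_ID = "Qwen/Qwen1.5-4B"
# Assumption: repo id assembled from the owner and model name on the page.
ADAPTER_ID = (
    "tyzhu/find_marker_both_sent_train_400_eval_40_first_permute_"
    "Qwen_Qwen1.5-4B_3e-4_lora"
)


def load_adapter(base_model_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Load the base model and attach the PEFT adapter on top of it."""
    # Imports are deferred so this module can be imported without
    # the heavy dependencies installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    # Assumption: the adapter was trained on the causal-LM head.
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling `load_adapter()` downloads the base weights and the adapter, so it needs network access and enough memory for a 4B-parameter model.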
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0003
+- train_batch_size: 1
+- eval_batch_size: 2
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 4
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 32
+- total_eval_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: constant
+- lr_scheduler_warmup_ratio: 0.05
+- num_epochs: 50.0
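The two `total_*` values in the list above follow from the per-device settings. A short sketch of the arithmetic, using only the numbers listed in the card (the variable names mirror the hyperparameter names):

```python
# Values copied from the hyperparameter list.
train_batch_size = 1             # per device
eval_batch_size = 2              # per device
num_devices = 4
gradient_accumulation_steps = 8

# Each optimizer step sees one micro-batch per device per accumulation step.
total_train_batch_size = (
    train_batch_size * num_devices * gradient_accumulation_steps
)
# Evaluation does not accumulate gradients, so only the device count matters.
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size)  # 32
print(total_eval_batch_size)   # 8
```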
+### Training results
+| Training Loss | Epoch   | Step | Validation Loss | Accuracy |
+|:-------------:|:-------:|:----:|:---------------:|:--------:|
+| 1.5824        | 0.9933  | 130  | 1.1797          | 0.6851   |
+| 0.7977        | 1.9943  | 261  | 0.5359          | 0.7429   |
+| 0.3387        | 2.9952  | 392  | 0.3361          | 0.7614   |
+| 0.1537        | 3.9962  | 523  | 0.2855          | 0.7653   |
+| 0.1389        | 4.9971  | 654  | 0.2712          | 0.7666   |
+| 0.1383        | 5.9981  | 785  | 0.2502          | 0.7676   |
+| 0.1252        | 6.9990  | 916  | 0.2457          | 0.7684   |
+| 0.122         | 8.0     | 1047 | 0.2310          | 0.7694   |
+| 0.1169        | 8.9933  | 1177 | 0.2316          | 0.7689   |
+| 0.1167        | 9.9943  | 1308 | 0.2311          | 0.7699   |
+| 0.1161        | 10.9952 | 1439 | 0.2159          | 0.7708   |
+| 0.1126        | 11.9962 | 1570 | 0.2188          | 0.7694   |
+| 0.1088        | 12.9971 | 1701 | 0.2270          | 0.7661   |
+| 0.1104        | 13.9981 | 1832 | 0.2181          | 0.7677   |
+| 0.1076        | 14.9990 | 1963 | 0.2135          | 0.7680   |
+| 0.1069        | 16.0    | 2094 | 0.2219          | 0.7670   |
+| 0.1048        | 16.9933 | 2224 | 0.2298          | 0.7668   |
+| 0.1044        | 17.9943 | 2355 | 0.2341          | 0.7666   |
+| 0.1061        | 18.9952 | 2486 | 0.2628          | 0.7660   |
+| 0.1104        | 19.9962 | 2617 | 0.2712          | 0.7651   |
+| 0.1111        | 20.9971 | 2748 | 0.2921          | 0.7652   |
+| 0.1102        | 21.9981 | 2879 | 0.2700          | 0.7660   |
+| 0.1049        | 22.9990 | 3010 | 0.2905          | 0.7662   |
+| 0.1024        | 24.0    | 3141 | 0.2852          | 0.7664   |
+| 0.1079        | 24.9933 | 3271 | 0.2418          | 0.7653   |
+| 0.1066        | 25.9943 | 3402 | 0.2759          | 0.7662   |
+| 0.1054        | 26.9952 | 3533 | 0.2958          | 0.7656   |
+| 0.105         | 27.9962 | 3664 | 0.3109          | 0.7663   |
+| 0.1066        | 28.9971 | 3795 | 0.3062          | 0.7660   |
+| 0.1048        | 29.9981 | 3926 | 0.2714          | 0.7660   |
+| 0.1043        | 30.9990 | 4057 | 0.2821          | 0.7662   |
+| 0.1039        | 32.0    | 4188 | 0.2961          | 0.7661   |
+| 0.1055        | 32.9933 | 4318 | 0.2942          | 0.7662   |
+| 0.1045        | 33.9943 | 4449 | 0.3152          | 0.7659   |
+| 0.1045        | 34.9952 | 4580 | 0.2828          | 0.7666   |
+| 0.1038        | 35.9962 | 4711 | 0.2355          | 0.7662   |
+| 0.102         | 36.9971 | 4842 | 0.2926          | 0.7664   |
+| 0.103         | 37.9981 | 4973 | 0.2825          | 0.7660   |
+| 0.1061        | 38.9990 | 5104 | 0.2899          | 0.7663   |
+| 0.1064        | 40.0    | 5235 | 0.2930          | 0.7660   |
+| 0.105         | 40.9933 | 5365 | 0.2806          | 0.7657   |
+| 0.1038        | 41.9943 | 5496 | 0.2973          | 0.7664   |
+| 0.1016        | 42.9952 | 5627 | 0.3379          | 0.7662   |
+| 0.1046        | 43.9962 | 5758 | 0.3200          | 0.7655   |
+| 0.1039        | 44.9971 | 5889 | 0.3151          | 0.7652   |
+| 0.107         | 45.9981 | 6020 | 0.2969          | 0.7658   |
+| 0.1059        | 46.9990 | 6151 | 0.3146          | 0.7659   |
+| 0.1058        | 48.0    | 6282 | 0.3070          | 0.7656   |
+| 0.103         | 48.9933 | 6412 | 0.3060          | 0.7660   |
+| 0.1012        | 49.6657 | 6500 | 0.3150          | 0.7659   |
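The metrics reported at the top of the card are those of the final row (step 6500), which is not the best row in the table: validation loss is lowest at step 1963 (0.2135) and accuracy is highest at step 1439 (0.7708). Validation loss then drifts upward while training loss stays near 0.10, which may indicate overfitting. A small sketch of picking the best evaluation point, using a handful of rows copied from the table (the tuple layout is an illustrative choice):

```python
# (step, validation_loss, accuracy), copied from selected table rows.
rows = [
    (130, 1.1797, 0.6851),
    (1439, 0.2159, 0.7708),
    (1963, 0.2135, 0.7680),
    (4580, 0.2828, 0.7666),
    (6500, 0.3150, 0.7659),  # final row, reported at the top of the card
]

best_by_loss = min(rows, key=lambda row: row[1])
best_by_accuracy = max(rows, key=lambda row: row[2])
final = rows[-1]

print(best_by_loss[0])      # 1963
print(best_by_accuracy[0])  # 1439
print(final[1] > best_by_loss[1])  # True: final loss is above the minimum
```

Whether intermediate checkpoints were kept is not stated in the card, so this only identifies which evaluation point looked best.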
+### Framework versions
+- PEFT 0.5.0
+- Transformers 4.40.2
+- Pytorch 2.3.0
+- Datasets 2.19.1
+- Tokenizers 0.19.1
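To reproduce the environment, it can help to compare locally installed packages against the versions listed above. A minimal sketch using only the standard library follows. The mapping from the card's names to pip distribution names (for example `Pytorch` to `torch`) is an assumption, and `check_versions` is a hypothetical helper.

```python
from importlib import metadata

# Versions from the card; keys are assumed pip distribution names.
EXPECTED = {
    "peft": "0.5.0",
    "transformers": "4.40.2",
    "torch": "2.3.0",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def check_versions(expected=EXPECTED):
    """Return {package: (wanted, installed_or_None, matches)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        # Drop local build tags such as "+cu121" before comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[package] = (wanted, installed, matches)
    return report


for name, (wanted, installed, ok) in check_versions().items():
    print(f"{name}: wanted {wanted}, installed {installed}, match={ok}")
```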