tyzhu/find_marker_both_sent_train_400_eval_40_random_permute_rerun_4_Qwen_Qwen1.5-4B_3e-4_lora

tyzhu committed on Jun 4

Commit 1d74bf3 (1 parent: 7337ef0)

Model save

Files changed (1)

README.md +117 -0

README.md ADDED

	@@ -0,0 +1,117 @@

+---
+license: other
+base_model: Qwen/Qwen1.5-4B
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: find_marker_both_sent_train_400_eval_40_random_permute_rerun_4_Qwen_Qwen1.5-4B_3e-4_lora
+  results: []
+library_name: peft
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# find_marker_both_sent_train_400_eval_40_random_permute_rerun_4_Qwen_Qwen1.5-4B_3e-4_lora
+This model is a fine-tuned version of [Qwen/Qwen1.5-4B](https://huggingface.co/Qwen/Qwen1.5-4B) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.3920
+- Accuracy: 0.7643
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0003
+- train_batch_size: 1
+- eval_batch_size: 2
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 4
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 32
+- total_eval_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: constant
+- lr_scheduler_warmup_ratio: 0.05
+- num_epochs: 50.0
+### Training results
+| Training Loss | Epoch   | Step | Validation Loss | Accuracy |
+|:-------------:|:-------:|:----:|:---------------:|:--------:|
+| 1.6149        | 0.9968  | 117  | 1.2999          | 0.6755   |
+| 0.862         | 1.9936  | 234  | 0.6614          | 0.7318   |
+| 0.3519        | 2.9989  | 352  | 0.3529          | 0.7612   |
+| 0.2097        | 3.9957  | 469  | 0.3055          | 0.7604   |
+| 0.1746        | 4.9925  | 586  | 0.2799          | 0.7659   |
+| 0.1507        | 5.9979  | 704  | 0.2721          | 0.7651   |
+| 0.1431        | 6.9947  | 821  | 0.2592          | 0.7674   |
+| 0.1381        | 8.0     | 939  | 0.2589          | 0.7667   |
+| 0.1337        | 8.9968  | 1056 | 0.2509          | 0.7679   |
+| 0.1292        | 9.9936  | 1173 | 0.2452          | 0.7682   |
+| 0.1232        | 10.9989 | 1291 | 0.2604          | 0.7656   |
+| 0.1214        | 11.9957 | 1408 | 0.2679          | 0.7653   |
+| 0.119         | 12.9925 | 1525 | 0.2421          | 0.7681   |
+| 0.1165        | 13.9979 | 1643 | 0.2545          | 0.7654   |
+| 0.1161        | 14.9947 | 1760 | 0.2666          | 0.7630   |
+| 0.1193        | 16.0    | 1878 | 0.2661          | 0.7645   |
+| 0.1264        | 16.9968 | 1995 | 0.2994          | 0.7626   |
+| 0.1201        | 17.9936 | 2112 | 0.2607          | 0.7647   |
+| 0.1144        | 18.9989 | 2230 | 0.2665          | 0.7655   |
+| 0.1147        | 19.9957 | 2347 | 0.2606          | 0.7646   |
+| 0.1143        | 20.9925 | 2464 | 0.2834          | 0.7645   |
+| 0.1105        | 21.9979 | 2582 | 0.2843          | 0.7645   |
+| 0.1103        | 22.9947 | 2699 | 0.2959          | 0.7639   |
+| 0.1081        | 24.0    | 2817 | 0.3331          | 0.7640   |
+| 0.1093        | 24.9968 | 2934 | 0.3566          | 0.7640   |
+| 0.1086        | 25.9936 | 3051 | 0.2995          | 0.7630   |
+| 0.1124        | 26.9989 | 3169 | 0.2889          | 0.7624   |
+| 0.1169        | 27.9957 | 3286 | 0.3392          | 0.7630   |
+| 0.1225        | 28.9925 | 3403 | 0.2916          | 0.7633   |
+| 0.1179        | 29.9979 | 3521 | 0.2572          | 0.7645   |
+| 0.1139        | 30.9947 | 3638 | 0.3382          | 0.7635   |
+| 0.1141        | 32.0    | 3756 | 0.3028          | 0.7635   |
+| 0.1119        | 32.9968 | 3873 | 0.3388          | 0.7637   |
+| 0.1124        | 33.9936 | 3990 | 0.3304          | 0.7636   |
+| 0.1089        | 34.9989 | 4108 | 0.3556          | 0.7641   |
+| 0.1095        | 35.9957 | 4225 | 0.3314          | 0.7641   |
+| 0.1082        | 36.9925 | 4342 | 0.3770          | 0.7640   |
+| 0.1071        | 37.9979 | 4460 | 0.3392          | 0.7645   |
+| 0.1076        | 38.9947 | 4577 | 0.3363          | 0.7640   |
+| 0.1074        | 40.0    | 4695 | 0.3731          | 0.7629   |
+| 0.1289        | 40.9968 | 4812 | 0.3028          | 0.7634   |
+| 0.1264        | 41.9936 | 4929 | 0.3093          | 0.7639   |
+| 0.1126        | 42.9989 | 5047 | 0.3074          | 0.7643   |
+| 0.1122        | 43.9957 | 5164 | 0.3375          | 0.7646   |
+| 0.1096        | 44.9925 | 5281 | 0.3388          | 0.7645   |
+| 0.1077        | 45.9979 | 5399 | 0.3173          | 0.7644   |
+| 0.1063        | 46.9947 | 5516 | 0.3343          | 0.7643   |
+| 0.1086        | 48.0    | 5634 | 0.3137          | 0.7644   |
+| 0.1052        | 48.9968 | 5751 | 0.3941          | 0.7645   |
+| 0.1078        | 49.8403 | 5850 | 0.3920          | 0.7643   |
+### Framework versions
+- PEFT 0.5.0
+- Transformers 4.40.2
+- PyTorch 2.3.0
+- Datasets 2.19.1
+- Tokenizers 0.19.1
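
The hyperparameter list and the training results table in the diff above invite two quick checks: whether the reported `total_train_batch_size` of 32 follows from the per-device batch size, device count, and gradient accumulation steps, and which evaluation row has the lowest validation loss. The following is a minimal sketch in Python and is not part of the commit. The helper names `effective_batch_size` and `parse_rows` and the `TABLE` excerpt are illustrative assumptions; only four rows are copied from the table, so paste the full table to scan every evaluation.

```python
# Illustrative sketch only; helper names are hypothetical, not from the commit.

# Excerpt of the results table as it appears in the diff (leading "+" kept).
TABLE = """\
+| Training Loss | Epoch   | Step | Validation Loss | Accuracy |
+|:-------------:|:-------:|:----:|:---------------:|:--------:|
+| 1.6149        | 0.9968  | 117  | 1.2999          | 0.6755   |
+| 0.1292        | 9.9936  | 1173 | 0.2452          | 0.7682   |
+| 0.119         | 12.9925 | 1525 | 0.2421          | 0.7681   |
+| 0.1078        | 49.8403 | 5850 | 0.3920          | 0.7643   |
"""


def effective_batch_size(per_device, devices, accumulation_steps):
    """Samples consumed per optimizer step across all devices."""
    return per_device * devices * accumulation_steps


def parse_rows(text):
    """Parse markdown table rows into dicts, skipping header and separator lines."""
    rows = []
    for line in text.splitlines():
        # Drop the diff "+" marker and the outer pipes, then split into cells.
        cells = [c.strip() for c in line.lstrip("+").strip().strip("|").split("|")]
        if len(cells) != 5:
            continue
        try:
            rows.append({
                "train_loss": float(cells[0]),
                "epoch": float(cells[1]),
                "step": int(cells[2]),
                "val_loss": float(cells[3]),
                "accuracy": float(cells[4]),
            })
        except ValueError:
            continue  # header or separator row, not numeric
    return rows


# Values copied from the hyperparameter list in the model card.
total = effective_batch_size(per_device=1, devices=4, accumulation_steps=8)

rows = parse_rows(TABLE)
best = min(rows, key=lambda r: r["val_loss"])  # lowest validation loss in the excerpt
final = rows[-1]                               # last evaluation in the excerpt

print(f"effective train batch size: {total}")
print(f"best step: {best['step']} (val_loss={best['val_loss']})")
print(f"final step: {final['step']} (val_loss={final['val_loss']})")
```

Over the full table, the lowest validation loss is 0.2421 at step 1525 (epoch 12.9925), whereas the summary at the top of the card reports the final evaluation at step 5850 (loss 0.3920, accuracy 0.7643).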