tyzhu/lmind_nq_train6000_eval6489_v1_docidx_v3_1e-4_lora2

tyzhu committed on Jun 8

Commit c6d4bbe • 1 Parent(s): de47e8c

Model save

Files changed (1)

README.md +117 -0

README.md ADDED

@@ -0,0 +1,117 @@

+---
+license: other
+base_model: Qwen/Qwen1.5-4B
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: lmind_nq_train6000_eval6489_v1_docidx_v3_1e-4_lora2
+  results: []
+library_name: peft
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# lmind_nq_train6000_eval6489_v1_docidx_v3_1e-4_lora2
+This model is a fine-tuned version of [Qwen/Qwen1.5-4B](https://huggingface.co/Qwen/Qwen1.5-4B) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 6.7424
+- Accuracy: 0.4188
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 1
+- eval_batch_size: 2
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 4
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 32
+- total_eval_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: constant
+- lr_scheduler_warmup_ratio: 0.05
+- num_epochs: 50.0
+### Training results
+| Training Loss | Epoch   | Step  | Validation Loss | Accuracy |
+|:-------------:|:-------:|:-----:|:---------------:|:--------:|
+| 1.9571        | 0.9985  | 341   | 3.9512          | 0.4538   |
+| 1.8819        | 2.0     | 683   | 4.1128          | 0.4483   |
+| 1.7702        | 2.9985  | 1024  | 4.3277          | 0.4461   |
+| 1.6163        | 4.0     | 1366  | 4.5849          | 0.4424   |
+| 1.4427        | 4.9985  | 1707  | 4.8503          | 0.4386   |
+| 1.2498        | 6.0     | 2049  | 5.0926          | 0.4349   |
+| 1.0655        | 6.9985  | 2390  | 5.2708          | 0.4326   |
+| 0.8733        | 8.0     | 2732  | 5.4024          | 0.4317   |
+| 0.7219        | 8.9985  | 3073  | 5.5348          | 0.4294   |
+| 0.5932        | 10.0    | 3415  | 5.7690          | 0.4261   |
+| 0.4719        | 10.9985 | 3756  | 5.8943          | 0.4254   |
+| 0.3838        | 12.0    | 4098  | 6.0191          | 0.4247   |
+| 0.329         | 12.9985 | 4439  | 6.1044          | 0.4246   |
+| 0.2742        | 14.0    | 4781  | 6.1465          | 0.4216   |
+| 0.2432        | 14.9985 | 5122  | 6.3254          | 0.4227   |
+| 0.2158        | 16.0    | 5464  | 6.4410          | 0.4228   |
+| 0.2013        | 16.9985 | 5805  | 6.3924          | 0.4215   |
+| 0.1851        | 18.0    | 6147  | 6.5217          | 0.4201   |
+| 0.1721        | 18.9985 | 6488  | 6.5573          | 0.4209   |
+| 0.1676        | 20.0    | 6830  | 6.5661          | 0.4214   |
+| 0.1579        | 20.9985 | 7171  | 6.5663          | 0.4213   |
+| 0.1575        | 22.0    | 7513  | 6.6259          | 0.4202   |
+| 0.15          | 22.9985 | 7854  | 6.5955          | 0.4214   |
+| 0.1427        | 24.0    | 8196  | 6.6297          | 0.4216   |
+| 0.145         | 24.9985 | 8537  | 6.5757          | 0.4227   |
+| 0.1393        | 26.0    | 8879  | 6.5675          | 0.4213   |
+| 0.1405        | 26.9985 | 9220  | 6.6650          | 0.4213   |
+| 0.1365        | 28.0    | 9562  | 6.6427          | 0.4210   |
+| 0.1372        | 28.9985 | 9903  | 6.5481          | 0.4209   |
+| 0.134         | 30.0    | 10245 | 6.6617          | 0.4199   |
+| 0.1287        | 30.9985 | 10586 | 6.6241          | 0.4207   |
+| 0.1305        | 32.0    | 10928 | 6.6094          | 0.4199   |
+| 0.1274        | 32.9985 | 11269 | 6.6823          | 0.4165   |
+| 0.1296        | 34.0    | 11611 | 6.6210          | 0.4195   |
+| 0.1271        | 34.9985 | 11952 | 6.7042          | 0.4185   |
+| 0.1239        | 36.0    | 12294 | 6.6016          | 0.4204   |
+| 0.1263        | 36.9985 | 12635 | 6.5736          | 0.4195   |
+| 0.1234        | 38.0    | 12977 | 6.6094          | 0.4169   |
+| 0.1236        | 38.9985 | 13318 | 6.6395          | 0.4151   |
+| 0.1211        | 40.0    | 13660 | 6.6604          | 0.4132   |
+| 0.1235        | 40.9985 | 14001 | 6.7098          | 0.4172   |
+| 0.1206        | 42.0    | 14343 | 6.6072          | 0.4172   |
+| 0.1165        | 42.9985 | 14684 | 6.7641          | 0.4178   |
+| 0.1207        | 44.0    | 15026 | 6.6669          | 0.4187   |
+| 0.1168        | 44.9985 | 15367 | 6.7258          | 0.4185   |
+| 0.1194        | 46.0    | 15709 | 6.7819          | 0.4187   |
+| 0.1179        | 46.9985 | 16050 | 6.7337          | 0.4189   |
+| 0.1158        | 48.0    | 16392 | 6.7115          | 0.4196   |
+| 0.1197        | 48.9985 | 16733 | 6.7568          | 0.4179   |
+| 0.1163        | 49.9268 | 17050 | 6.7424          | 0.4188   |
+### Framework versions
+- PEFT 0.5.0
+- Transformers 4.41.1
+- Pytorch 2.1.0+cu121
+- Datasets 2.19.1
+- Tokenizers 0.19.1
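
The batch-size totals in the hyperparameter list and the trend in the training results table of the added README can be checked with a short sketch. The snippet below is illustrative only and is not part of the commit: the function and variable names are hypothetical, the formula assumes the usual per-device batch × devices × accumulation steps convention, and the `rows` list is a hand-copied subset of the table rather than the full log.

```python
# Illustrative check of the model card numbers; names here are hypothetical.

def effective_train_batch_size(per_device: int, devices: int, accumulation_steps: int) -> int:
    """Samples contributing to one optimizer step across all devices."""
    return per_device * devices * accumulation_steps


def effective_eval_batch_size(per_device: int, devices: int) -> int:
    """Samples evaluated per step across all devices (no accumulation)."""
    return per_device * devices


# Values copied from the "Training hyperparameters" list.
total_train = effective_train_batch_size(per_device=1, devices=4, accumulation_steps=8)
total_eval = effective_eval_batch_size(per_device=2, devices=4)

# A subset of rows copied from the "Training results" table:
# (epoch, step, training_loss, validation_loss, accuracy)
rows = [
    (0.9985, 341, 1.9571, 3.9512, 0.4538),
    (10.0, 3415, 0.5932, 5.7690, 0.4261),
    (20.0, 6830, 0.1676, 6.5661, 0.4214),
    (30.0, 10245, 0.1340, 6.6617, 0.4199),
    (49.9268, 17050, 0.1163, 6.7424, 0.4188),
]

# Pick the logged row with the lowest validation loss.
best = min(rows, key=lambda row: row[3])

# Training loss falls while validation loss rises between the first and last row.
first, last = rows[0], rows[-1]
train_loss_fell = last[2] < first[2]
val_loss_rose = last[3] > first[3]

print(f"total_train_batch_size = {total_train}")
print(f"total_eval_batch_size = {total_eval}")
print(f"lowest validation loss at step {best[1]} (epoch {best[0]})")
print(f"training loss fell: {train_loss_fell}, validation loss rose: {val_loss_rose}")
```

The computed totals match the `total_train_batch_size: 32` and `total_eval_batch_size: 8` entries. In the table, the lowest validation loss (3.9512) and the highest accuracy (0.4538) both occur at the first logged evaluation, step 341, while the final row reports 6.7424 and 0.4188, so anyone selecting a checkpoint may want to compare early epochs rather than assume the last one is best.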