VitaliiVrublevskyi/Llama-2-7b-hf-finetuned-mrpc-v3


VitaliiVrublevskyi committed on Sep 26, 2023

Commit 76ce245 • 1 Parent(s): 6d9068a

update model card README.md


README.md +95 -13


@@ -1,20 +1,102 @@
 ---
-library_name: peft
 ---
 ## Training procedure
-The following `bitsandbytes` quantization config was used during training:
-- load_in_8bit: True
-- load_in_4bit: False
-- llm_int8_threshold: 6.0
-- llm_int8_skip_modules: None
-- llm_int8_enable_fp32_cpu_offload: False
-- llm_int8_has_fp16_weight: False
-- bnb_4bit_quant_type: fp4
-- bnb_4bit_use_double_quant: False
-- bnb_4bit_compute_dtype: float32
-### Framework versions
-- PEFT 0.4.0

 ---
+base_model: meta-llama/Llama-2-7b-hf
+tags:
+- generated_from_trainer
+datasets:
+- glue
+metrics:
+- accuracy
+- f1
+model-index:
+- name: Llama-2-7b-hf-finetuned-mrpc-v3
+  results: []
 ---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# Llama-2-7b-hf-finetuned-mrpc-v3
+This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on the glue dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.6823
+- Accuracy: 0.7475
+- F1: 0.8245
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
 ## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 16
+- eval_batch_size: 16
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 40
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
+| No log        | 1.0   | 230  | 0.6528          | 0.625    | 0.6982 |
+| No log        | 2.0   | 460  | 0.6217          | 0.6936   | 0.8159 |
+| 0.6443        | 3.0   | 690  | 0.6033          | 0.6985   | 0.7993 |
+| 0.6443        | 4.0   | 920  | 0.6240          | 0.6838   | 0.8089 |
+| 0.6173        | 5.0   | 1150 | 0.5451          | 0.7255   | 0.8170 |
+| 0.6173        | 6.0   | 1380 | 0.5380          | 0.7451   | 0.8188 |
+| 0.5776        | 7.0   | 1610 | 0.5376          | 0.7426   | 0.8346 |
+| 0.5776        | 8.0   | 1840 | 0.5518          | 0.7230   | 0.8243 |
+| 0.5353        | 9.0   | 2070 | 0.5270          | 0.7475   | 0.8325 |
+| 0.5353        | 10.0  | 2300 | 0.5381          | 0.7377   | 0.8086 |
+| 0.5071        | 11.0  | 2530 | 0.5453          | 0.7181   | 0.7842 |
+| 0.5071        | 12.0  | 2760 | 0.5335          | 0.7475   | 0.8341 |
+| 0.5071        | 13.0  | 2990 | 0.5617          | 0.7083   | 0.7733 |
+| 0.492         | 14.0  | 3220 | 0.5343          | 0.7426   | 0.8115 |
+| 0.492         | 15.0  | 3450 | 0.5133          | 0.7696   | 0.8423 |
+| 0.4608        | 16.0  | 3680 | 0.5573          | 0.7549   | 0.8366 |
+| 0.4608        | 17.0  | 3910 | 0.5282          | 0.7721   | 0.8447 |
+| 0.4283        | 18.0  | 4140 | 0.5894          | 0.7132   | 0.7710 |
+| 0.4283        | 19.0  | 4370 | 0.5875          | 0.7328   | 0.8239 |
+| 0.4042        | 20.0  | 4600 | 0.5447          | 0.7647   | 0.8339 |
+| 0.4042        | 21.0  | 4830 | 0.5712          | 0.7598   | 0.8399 |
+| 0.3904        | 22.0  | 5060 | 0.5563          | 0.7623   | 0.8301 |
+| 0.3904        | 23.0  | 5290 | 0.5718          | 0.7623   | 0.8364 |
+| 0.3597        | 24.0  | 5520 | 0.5592          | 0.7525   | 0.8250 |
+| 0.3597        | 25.0  | 5750 | 0.5941          | 0.7574   | 0.8364 |
+| 0.3597        | 26.0  | 5980 | 0.5811          | 0.7623   | 0.8370 |
+| 0.3445        | 27.0  | 6210 | 0.6083          | 0.7549   | 0.8339 |
+| 0.3445        | 28.0  | 6440 | 0.6049          | 0.75     | 0.8265 |
+| 0.3197        | 29.0  | 6670 | 0.6042          | 0.7549   | 0.8311 |
+| 0.3197        | 30.0  | 6900 | 0.6260          | 0.7377   | 0.8099 |
+| 0.3           | 31.0  | 7130 | 0.6438          | 0.75     | 0.8229 |
+| 0.3           | 32.0  | 7360 | 0.6319          | 0.7402   | 0.8233 |
+| 0.2873        | 33.0  | 7590 | 0.6502          | 0.7402   | 0.8191 |
+| 0.2873        | 34.0  | 7820 | 0.6591          | 0.7426   | 0.8187 |
+| 0.2719        | 35.0  | 8050 | 0.6474          | 0.7451   | 0.8219 |
+| 0.2719        | 36.0  | 8280 | 0.6803          | 0.7598   | 0.8367 |
+| 0.2583        | 37.0  | 8510 | 0.6903          | 0.7475   | 0.8221 |
+| 0.2583        | 38.0  | 8740 | 0.6965          | 0.7525   | 0.8279 |
+| 0.2583        | 39.0  | 8970 | 0.6850          | 0.75     | 0.8235 |
+| 0.2423        | 40.0  | 9200 | 0.6823          | 0.7475   | 0.8245 |
+### Framework versions
+- Transformers 4.31.0
+- Pytorch 2.0.1+cu118
+- Datasets 2.14.5
+- Tokenizers 0.13.3
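
The card reports the final-epoch metrics (loss 0.6823, accuracy 0.7475, F1 0.8245), but the per-epoch table shows validation loss reaching its minimum much earlier while training loss keeps falling. Below is a minimal sketch for locating the strongest epochs, with the rows transcribed from the "Training results" table above. `RESULTS` and `best_row` are hypothetical names introduced here for illustration and are not part of the commit.

```python
# Each tuple is (epoch, validation_loss, accuracy, f1), copied from the
# "Training results" table in the diff above. Nothing here is re-measured.
RESULTS = [
    (1, 0.6528, 0.625, 0.6982), (2, 0.6217, 0.6936, 0.8159),
    (3, 0.6033, 0.6985, 0.7993), (4, 0.6240, 0.6838, 0.8089),
    (5, 0.5451, 0.7255, 0.8170), (6, 0.5380, 0.7451, 0.8188),
    (7, 0.5376, 0.7426, 0.8346), (8, 0.5518, 0.7230, 0.8243),
    (9, 0.5270, 0.7475, 0.8325), (10, 0.5381, 0.7377, 0.8086),
    (11, 0.5453, 0.7181, 0.7842), (12, 0.5335, 0.7475, 0.8341),
    (13, 0.5617, 0.7083, 0.7733), (14, 0.5343, 0.7426, 0.8115),
    (15, 0.5133, 0.7696, 0.8423), (16, 0.5573, 0.7549, 0.8366),
    (17, 0.5282, 0.7721, 0.8447), (18, 0.5894, 0.7132, 0.7710),
    (19, 0.5875, 0.7328, 0.8239), (20, 0.5447, 0.7647, 0.8339),
    (21, 0.5712, 0.7598, 0.8399), (22, 0.5563, 0.7623, 0.8301),
    (23, 0.5718, 0.7623, 0.8364), (24, 0.5592, 0.7525, 0.8250),
    (25, 0.5941, 0.7574, 0.8364), (26, 0.5811, 0.7623, 0.8370),
    (27, 0.6083, 0.7549, 0.8339), (28, 0.6049, 0.75, 0.8265),
    (29, 0.6042, 0.7549, 0.8311), (30, 0.6260, 0.7377, 0.8099),
    (31, 0.6438, 0.75, 0.8229), (32, 0.6319, 0.7402, 0.8233),
    (33, 0.6502, 0.7402, 0.8191), (34, 0.6591, 0.7426, 0.8187),
    (35, 0.6474, 0.7451, 0.8219), (36, 0.6803, 0.7598, 0.8367),
    (37, 0.6903, 0.7475, 0.8221), (38, 0.6965, 0.7525, 0.8279),
    (39, 0.6850, 0.75, 0.8235), (40, 0.6823, 0.7475, 0.8245),
]

LOSS, ACCURACY, F1 = 1, 2, 3  # column indices within each tuple


def best_row(rows, column, higher_is_better):
    """Return the row with the best value in `column` (first one on ties)."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])


lowest_loss = best_row(RESULTS, LOSS, higher_is_better=False)
highest_accuracy = best_row(RESULTS, ACCURACY, higher_is_better=True)
highest_f1 = best_row(RESULTS, F1, higher_is_better=True)
final = RESULTS[-1]

print(f"lowest validation loss: epoch {lowest_loss[0]} ({lowest_loss[LOSS]})")
print(f"highest accuracy:       epoch {highest_accuracy[0]} ({highest_accuracy[ACCURACY]})")
print(f"highest F1:             epoch {highest_f1[0]} ({highest_f1[F1]})")
print(f"final epoch {final[0]}: loss {final[LOSS]}, accuracy {final[ACCURACY]}, F1 {final[F1]}")
```

On these rows the lowest validation loss falls at epoch 15, and the highest accuracy and F1 both fall at epoch 17. That suggests a shorter run or best-checkpoint selection might have served better than the epoch-40 weights. The card does not say whether either was tried.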