falcon-7b-sharded

Files changed (5)

README.md +23 -6
adapter_config.json +1 -1
adapter_model.bin +2 -2
tokenizer.json +1 -6
training_args.bin +2 -2

README.md CHANGED

@@ -1,7 +1,9 @@
 ---
-base_model: ybelkada/falcon-7b-sharded-bf16
 tags:
 - generated_from_trainer
 model-index:
 - name: falcon-7b-sharded
   results: []
@@ -12,7 +14,10 @@ should probably proofread and complete it, then remove this comment. -->
 # falcon-7b-sharded
-This model is a fine-tuned version of [ybelkada/falcon-7b-sharded-bf16](https://huggingface.co/ybelkada/falcon-7b-sharded-bf16) on an unknown dataset.
 ## Model description
@@ -31,19 +36,31 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.0002
 - train_batch_size: 4
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
 - total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: constant
-- lr_scheduler_warmup_ratio: 0.03
 - training_steps: 500
 ### Training results
 ### Framework versions
@@ -51,4 +68,4 @@ The following hyperparameters were used during training:
 - Transformers 4.34.0
 - Pytorch 2.0.1+cu118
 - Datasets 2.14.5
-- Tokenizers 0.14.0

 ---
+base_model: cosmin/falcon-7b-sharded-bf16
 tags:
 - generated_from_trainer
+metrics:
+- f1
 model-index:
 - name: falcon-7b-sharded
   results: []
 # falcon-7b-sharded
+This model is a fine-tuned version of [cosmin/falcon-7b-sharded-bf16](https://huggingface.co/cosmin/falcon-7b-sharded-bf16) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 2.3060
+- F1: 0.0027
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 0.0001
 - train_batch_size: 4
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
 - total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_ratio: 0.1
 - training_steps: 500
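
The arithmetic behind these hyperparameters can be sketched in plain Python. This is an illustration under stated assumptions, not the trainer's actual code: it assumes a single device (which is what 4 × 4 = 16 implies), that the warmup step count is the warmup ratio times the total step count rounded up, and that the linear schedule ramps from 0 to the peak rate and then decays linearly to 0. `linear_lr` and `num_devices` are hypothetical names introduced for this example.

```python
import math

# Values copied from the hyperparameter list above.
learning_rate = 1e-4
train_batch_size = 4
gradient_accumulation_steps = 4
warmup_ratio = 0.1
training_steps = 500

# Assumption: one device, since 4 * 4 already equals the reported total of 16.
num_devices = 1
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Assumption: warmup steps are the ratio times the total step count, rounded up.
warmup_steps = math.ceil(warmup_ratio * training_steps)


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates (hypothetical helper)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return learning_rate * step / max(1, warmup_steps)
    # Linear decay from the peak down to 0 at the final step.
    remaining = max(0, training_steps - step)
    return learning_rate * remaining / max(1, training_steps - warmup_steps)


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size)
    print("warmup_steps:", warmup_steps)
    for step in (0, 25, 50, 275, 500):
        print(step, linear_lr(step))
```

Under these assumptions the rate peaks at step 50 and has fallen to half the peak by step 275, the midpoint of the decay phase.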
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | F1     |
+|:-------------:|:-----:|:----:|:---------------:|:------:|
+| 3.0776        | 1.0   | 55   | 2.8379          | 0.0027 |
+| 2.3055        | 1.99  | 110  | 2.5165          | 0.0220 |
+| 2.3104        | 2.99  | 165  | 2.4452          | 0.0027 |
+| 2.1221        | 4.0   | 221  | 2.3845          | 0.0027 |
+| 2.2114        | 5.0   | 276  | 2.3660          | 0.0151 |
+| 2.0432        | 5.99  | 331  | 2.3325          | 0.0124 |
+| 2.0811        | 6.99  | 386  | 2.3185          | 0.0027 |
+| 2.0372        | 8.0   | 442  | 2.3066          | 0.0027 |
+| 2.019         | 9.0   | 497  | 2.3058          | 0.0027 |
+| 2.0906        | 9.05  | 500  | 2.3060          | 0.0027 |
 ### Framework versions
 - Transformers 4.34.0
 - Pytorch 2.0.1+cu118
 - Datasets 2.14.5
+- Tokenizers 0.14.1

adapter_config.json CHANGED

@@ -12,6 +12,6 @@
   "num_virtual_tokens": 20,
   "peft_type": "P_TUNING",
   "revision": null,
-  "task_type": "CAUSAL_LM",
   "token_dim": 4544
 }

   "num_virtual_tokens": 20,
   "peft_type": "P_TUNING",
   "revision": null,
+  "task_type": "QUESTION_ANS",
   "token_dim": 4544
 }

adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:786916376b10ef6a7b55ee41861cc4a5c373af52834dd6d1ff4e80da0a0603a0
-size 364349

 version https://git-lfs.github.com/spec/v1
+oid sha256:ead77dd8d9757e9a6d8ca2492750b4283a7c25a7ff7dbe3353ee5665f5ea73c8
+size 401281

tokenizer.json CHANGED

@@ -1,11 +1,6 @@
 {
   "version": "1.0",
-  "truncation": {
-    "direction": "Right",
-    "max_length": 512,
-    "strategy": "LongestFirst",
-    "stride": 0
-  },
   "padding": null,
   "added_tokens": [
     {

 {
   "version": "1.0",
+  "truncation": null,
   "padding": null,
   "added_tokens": [
     {

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9770ccf02dca346af435546e5b35ef8d63969b5a94a4084a52971fa36cc895be
-size 4091

 version https://git-lfs.github.com/spec/v1
+oid sha256:a20fb378e3ee2744a6b215dcb603a4afeeedcf7ae064a21d7e9ef727f18990db
+size 4027
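
The `.bin` entries in this diff are Git LFS pointer files, so the diff shows only a new object hash and byte size rather than the binary contents. A minimal sketch of reading such a pointer is below; `parse_lfs_pointer` is a hypothetical helper written for this example and handles only the three fields that appear in this diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its fields (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>".
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid value has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# New pointer for training_args.bin, copied from the diff above.
new_pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a20fb378e3ee2744a6b215dcb603a4afeeedcf7ae064a21d7e9ef727f18990db
size 4027
"""
old_size = 4091  # size from the removed pointer lines

info = parse_lfs_pointer(new_pointer)
size_delta = info["size"] - old_size

if __name__ == "__main__":
    print(info["hash_algo"], info["size"], size_delta)
```

Comparing the parsed `size` fields is a quick way to see how much each binary changed without downloading it.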