hung200504 committed
Commit 9dbe82c · 1 parent: e27d3ee

falcon-7b-sharded
Files changed (5):
1. README.md (+23 -6)
2. adapter_config.json (+1 -1)
3. adapter_model.bin (+2 -2)
4. tokenizer.json (+1 -6)
5. training_args.bin (+2 -2)
README.md CHANGED

@@ -1,7 +1,9 @@
 ---
-base_model: ybelkada/falcon-7b-sharded-bf16
+base_model: cosmin/falcon-7b-sharded-bf16
 tags:
 - generated_from_trainer
+metrics:
+- f1
 model-index:
 - name: falcon-7b-sharded
   results: []
@@ -12,7 +14,10 @@ should probably proofread and complete it, then remove this comment. -->
 
 # falcon-7b-sharded
 
-This model is a fine-tuned version of [ybelkada/falcon-7b-sharded-bf16](https://huggingface.co/ybelkada/falcon-7b-sharded-bf16) on an unknown dataset.
+This model is a fine-tuned version of [cosmin/falcon-7b-sharded-bf16](https://huggingface.co/cosmin/falcon-7b-sharded-bf16) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 2.3060
+- F1: 0.0027
 
 ## Model description
 
@@ -31,19 +36,31 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.0002
+- learning_rate: 0.0001
 - train_batch_size: 4
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
 - total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: constant
-- lr_scheduler_warmup_ratio: 0.03
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_ratio: 0.1
 - training_steps: 500
 
 ### Training results
 
+| Training Loss | Epoch | Step | Validation Loss | F1     |
+|:-------------:|:-----:|:----:|:---------------:|:------:|
+| 3.0776        | 1.0   | 55   | 2.8379          | 0.0027 |
+| 2.3055        | 1.99  | 110  | 2.5165          | 0.0220 |
+| 2.3104        | 2.99  | 165  | 2.4452          | 0.0027 |
+| 2.1221        | 4.0   | 221  | 2.3845          | 0.0027 |
+| 2.2114        | 5.0   | 276  | 2.3660          | 0.0151 |
+| 2.0432        | 5.99  | 331  | 2.3325          | 0.0124 |
+| 2.0811        | 6.99  | 386  | 2.3185          | 0.0027 |
+| 2.0372        | 8.0   | 442  | 2.3066          | 0.0027 |
+| 2.019         | 9.0   | 497  | 2.3058          | 0.0027 |
+| 2.0906        | 9.05  | 500  | 2.3060          | 0.0027 |
 
 ### Framework versions
@@ -51,4 +68,4 @@ The following hyperparameters were used during training:
 - Transformers 4.34.0
 - Pytorch 2.0.1+cu118
 - Datasets 2.14.5
-- Tokenizers 0.14.0
+- Tokenizers 0.14.1
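The README diff above changes the schedule from constant (warmup_ratio 0.03) to linear with warmup_ratio 0.1 over the same 500 steps. A minimal plain-Python sketch of that warmup-then-decay curve (the function name and standalone form are illustrative, not the Trainer's internal API):

```python
def linear_schedule_lr(step, base_lr=1e-4, total_steps=500, warmup_ratio=0.1):
    """Linear warmup to base_lr, then linear decay to 0, per the new card."""
    warmup_steps = int(total_steps * warmup_ratio)  # 50 of 500 steps
    if step < warmup_steps:
        return base_lr * step / warmup_steps  # ramp up from 0
    # decay linearly from base_lr at the end of warmup down to 0 at total_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# Effective batch size implied by the card: 4 per device x 4 accumulation steps
effective_batch = 4 * 4  # matches total_train_batch_size: 16
```

The peak LR of 1e-4 is the value from the new side of the diff; the old card used 2e-4 with no decay.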
adapter_config.json CHANGED

@@ -12,6 +12,6 @@
   "num_virtual_tokens": 20,
   "peft_type": "P_TUNING",
   "revision": null,
-  "task_type": "CAUSAL_LM",
+  "task_type": "QUESTION_ANS",
   "token_dim": 4544
 }
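The adapter_config.json change is a one-field switch of the PEFT task type on a P-tuning adapter (20 virtual tokens, token_dim 4544). With the peft library this would typically be expressed through a prompt-encoder config object; the plain-Python dict below just mirrors the fields visible in the diff:

```python
import json

# Fields copied from the diff; task_type is the value after this commit.
adapter_config = {
    "num_virtual_tokens": 20,
    "peft_type": "P_TUNING",
    "revision": None,
    "task_type": "QUESTION_ANS",  # was "CAUSAL_LM" before this commit
    "token_dim": 4544,
}

print(json.dumps(adapter_config, indent=2))
```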
adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:786916376b10ef6a7b55ee41861cc4a5c373af52834dd6d1ff4e80da0a0603a0
-size 364349
+oid sha256:ead77dd8d9757e9a6d8ca2492750b4283a7c25a7ff7dbe3353ee5665f5ea73c8
+size 401281
tokenizer.json CHANGED

@@ -1,11 +1,6 @@
 {
   "version": "1.0",
-  "truncation": {
-    "direction": "Right",
-    "max_length": 512,
-    "strategy": "LongestFirst",
-    "stride": 0
-  },
+  "truncation": null,
   "padding": null,
   "added_tokens": [
     {
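The tokenizer.json change replaces a fixed right-side truncation block (max_length 512) with "truncation": null, i.e. no truncation at encode time. A toy illustration of the behavioral difference in plain Python (this mimics the effect of the two settings; it is not the tokenizers library API):

```python
def apply_truncation(token_ids, truncation):
    """Mimic the two settings from the diff: a truncation dict vs. null."""
    if truncation is None:  # after the commit: keep every token id
        return token_ids
    # before the commit: direction "Right" drops tokens from the end
    return token_ids[: truncation["max_length"]]

before = {"direction": "Right", "max_length": 512,
          "strategy": "LongestFirst", "stride": 0}
long_input = list(range(600))  # stand-in for 600 token ids

capped = apply_truncation(long_input, before)  # capped at 512 ids
full = apply_truncation(long_input, None)      # all 600 ids survive
```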
training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9770ccf02dca346af435546e5b35ef8d63969b5a94a4084a52971fa36cc895be
-size 4091
+oid sha256:a20fb378e3ee2744a6b215dcb603a4afeeedcf7ae064a21d7e9ef727f18990db
+size 4027