seriouspark committed
Commit 3542532
1 Parent(s): 5940267

seriouspark/gemma-2b-it-v0.3-lora-persona1_epoch1

README.md CHANGED
@@ -5,7 +5,7 @@ tags:
 - trl
 - sft
 - generated_from_trainer
- base_model: google/gemma-7b-it
+ base_model: google/gemma-2b-it
 model-index:
 - name: outputs
   results: []
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # outputs
 
- This model is a fine-tuned version of [google/gemma-7b-it](https://huggingface.co/google/gemma-7b-it) on an unknown dataset.
+ This model is a fine-tuned version of [google/gemma-2b-it](https://huggingface.co/google/gemma-2b-it) on an unknown dataset.
 
 ## Model description
 
@@ -44,13 +44,17 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 0.1
- - training_steps: 1000
+ - training_steps: 16000
 - mixed_precision_training: Native AMP
 
+ ### Training results
+
+
+
 ### Framework versions
 
 - PEFT 0.8.2
 - Transformers 4.39.0
- - Pytorch 2.1.2
- - Datasets 2.19.2
+ - Pytorch 2.2.1+cu121
+ - Datasets 2.17.0
 - Tokenizers 0.15.2
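The substance of this README fix is the base-model correction from google/gemma-7b-it to google/gemma-2b-it. A minimal usage sketch for the corrected card, assuming the adapter repo id from the commit message; the prompt, dtype, and generation settings are illustrative, not from this commit:

```python
# Sketch (not part of this commit): attach the LoRA adapter to the corrected
# base model. Repo id taken from the commit message; dtype and generation
# settings are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-2b-it"
adapter_id = "seriouspark/gemma-2b-it-v0.3-lora-persona1_epoch1"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)

# Load adapter_model.safetensors on top of the base weights.
model = PeftModel.from_pretrained(base_model, adapter_id)

inputs = tokenizer("Hello, who are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```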
adapter_config.json CHANGED
@@ -19,13 +19,13 @@
 "rank_pattern": {},
 "revision": null,
 "target_modules": [
- "q_proj",
+ "v_proj",
 "gate_proj",
- "down_proj",
+ "o_proj",
 "k_proj",
+ "q_proj",
 "up_proj",
- "o_proj",
- "v_proj"
+ "down_proj"
 ],
 "task_type": "CAUSAL_LM",
 "use_rslora": false
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:7d7fb3cb96f78b6bf279c3ff078caf4bd6fbdcdaa0ad3b4b030d8ba0ce8edb74
- size 400104600
+ oid sha256:5e00cb039c054e7b8b981586a207081e21bcb9453dcd86cf2aa8925343bdf3ab
+ size 156965440
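The adapter shrinks from 400,104,600 to 156,965,440 bytes, consistent with retraining against the smaller gemma-2b-it base (smaller hidden dimensions and fewer layers mean smaller LoRA matrices). A sketch for inspecting the new file, assuming it has been downloaded to the working directory:

```python
# Sketch (assumption: adapter_model.safetensors downloaded locally).
# List each LoRA tensor and tally the bytes behind the new file size.
from safetensors import safe_open

with safe_open("adapter_model.safetensors", framework="pt") as f:
    total = 0
    for name in f.keys():
        t = f.get_tensor(name)
        total += t.numel() * t.element_size()
        print(name, tuple(t.shape))
    print(f"total tensor bytes: {total}")
```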
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:c5e18dd421112bed671fb07745e80f2721b67f3a2fcf4fdf49c1a938140902da
- size 4856
+ oid sha256:6f9e68f7c00431ff948ed595f55879bf54e07b92103602e906af35a43fa83280
+ size 4920
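training_args.bin is the pickled TrainingArguments object that the transformers Trainer saves alongside checkpoints. A sketch for inspecting it, assuming a local download; loading a pickle requires trusting the source, and PyTorch versions newer than the 2.2.1 listed above need weights_only=False passed explicitly:

```python
# Sketch (assumption: training_args.bin downloaded locally). This is a full
# pickle, so only load it from a source you trust.
import torch

args = torch.load("training_args.bin", weights_only=False)
print(type(args).__name__)  # expected: TrainingArguments
print(args.max_steps)       # should line up with the README's training_steps
print(args.learning_rate)
```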