Commit 1f8beb6 (OpOp1/TI-GPT-735M), committed by OpOp1
1 Parent(s): 2398d42
README.md CHANGED
@@ -1,9 +1,9 @@
 ---
-license: other
+license: cc-by-nc-4.0
 library_name: peft
 tags:
 - generated_from_trainer
-base_model: google/gemma-2b-it
+base_model: MBZUAI/LaMini-GPT-774M
 model-index:
 - name: shawgpt-ft
   results: []
@@ -14,9 +14,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 # shawgpt-ft
 
-This model is a fine-tuned version of [google/gemma-2b-it](https://huggingface.co/google/gemma-2b-it) on an unknown dataset.
+This model is a fine-tuned version of [MBZUAI/LaMini-GPT-774M](https://huggingface.co/MBZUAI/LaMini-GPT-774M) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.7817
+- Loss: 1.4635
 
 ## Model description
 
@@ -51,16 +51,16 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 4.063         | 1.0   | 10   | 3.5780          |
-| 3.31          | 2.0   | 20   | 3.0259          |
-| 2.8245        | 3.0   | 30   | 2.5810          |
-| 2.4092        | 4.0   | 40   | 2.2151          |
-| 2.1057        | 5.0   | 50   | 1.9864          |
-| 1.9341        | 6.0   | 60   | 1.8753          |
-| 1.8583        | 7.0   | 70   | 1.8140          |
-| 1.7906        | 8.0   | 80   | 1.7611          |
-| 1.7858        | 9.0   | 90   | 1.7852          |
-| 1.7948        | 10.0  | 100  | 1.7817          |
+| 3.5991        | 1.0   | 5    | 3.4306          |
+| 3.3522        | 2.0   | 10   | 3.0302          |
+| 2.9388        | 3.0   | 15   | 2.6452          |
+| 2.621         | 4.0   | 20   | 2.3555          |
+| 2.3501        | 5.0   | 25   | 2.1047          |
+| 2.1243        | 6.0   | 30   | 1.8846          |
+| 1.9309        | 7.0   | 35   | 1.6957          |
+| 1.7786        | 8.0   | 40   | 1.5726          |
+| 1.6718        | 9.0   | 45   | 1.4961          |
+| 1.6283        | 10.0  | 50   | 1.4635          |
 
 
 ### Framework versions
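As a side note on reading the table above: a causal-LM validation loss is a mean cross-entropy, so exp(loss) gives the model's perplexity on the evaluation set. A minimal sketch using only the standard library (loss values copied from the two runs in this diff):

```python
import math

def perplexity(cross_entropy_loss: float) -> float:
    """Perplexity is the exponential of the mean cross-entropy loss."""
    return math.exp(cross_entropy_loss)

# Final validation losses from the two runs in this diff.
old_final_loss = 1.7817  # google/gemma-2b-it run
new_final_loss = 1.4635  # MBZUAI/LaMini-GPT-774M run

print(round(perplexity(old_final_loss), 2))  # 5.94
print(round(perplexity(new_final_loss), 2))  # 4.32
```

The drop from roughly 5.9 to 4.3 perplexity is what the headline loss change (1.7817 to 1.4635) amounts to in more interpretable units.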
adapter_config.json CHANGED
@@ -1,7 +1,7 @@
 {
   "alpha_pattern": {},
   "auto_mapping": null,
-  "base_model_name_or_path": "google/gemma-2b-it",
+  "base_model_name_or_path": "MBZUAI/LaMini-GPT-774M",
   "bias": "none",
   "fan_in_fan_out": false,
   "inference_mode": true,
@@ -20,7 +20,7 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "q_proj"
+    "c_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a093ecf135bad9fd1e1ebb7c3ea83e0e0af4f097b473775303233f1539a943d3
-size 2364032
+oid sha256:af6f0d7bf8f3b5b58143633d46df622bbb118957f496b1cf34bd9a0d73cca5b8
+size 10340256
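The binary files in this diff are stored as Git LFS pointer stubs: short text files whose `key value` lines record the spec version, the content hash, and the byte size of the real blob. A minimal parser sketch, assuming the spec-v1 line format shown above:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS spec-v1 pointer file into a dict of its key/value lines."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "key value", split on the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The new adapter_model.safetensors pointer from this commit.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:af6f0d7bf8f3b5b58143633d46df622bbb118957f496b1cf34bd9a0d73cca5b8
size 10340256
"""

fields = parse_lfs_pointer(pointer)
print(fields["size"])  # 10340256
```

Note the size jump from 2,364,032 to 10,340,256 bytes: the diff only swaps one pointer for another, while the adapter weights themselves live in LFS storage.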
runs/Apr11_18-22-57_4c15467c46e6/events.out.tfevents.1712859780.4c15467c46e6.7200.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:82e6a799a862a08aa382d1811f13f7c409d2df244e3eafbd52bd592f69bbdcfa
+size 10355
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cc24de6a7360e87caadd0feaf93abc73aaa0d2eacf9acbb786807cc86a0012f4
+oid sha256:dc4897ca3e2228ef1fe3dc1b55f4565f261a9c7fef6d507548504092b908bdcb
 size 4856