Aharneish committed on
Commit cf0c98b
1 Parent(s): a55ef58

Upload model

Files changed (3):
  1. README.md +189 -132
  2. adapter_config.json +2 -0
  3. adapter_model.bin +2 -2
README.md CHANGED
@@ -1,150 +1,207 @@
  ---
- license: mit
  base_model: Aharneish/gpt2-spiritual
- tags:
- - generated_from_trainer
- model-index:
- - name: gpt-2-spiritualtest-LoRA
-   results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # gpt-2-spiritualtest-LoRA

- This model is a fine-tuned version of [Aharneish/gpt2-spiritual](https://huggingface.co/Aharneish/gpt2-spiritual) on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.6818

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 1e-05
- - train_batch_size: 32
- - eval_batch_size: 32
- - seed: 42
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - num_epochs: 200
-
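The removed card lists a `linear` scheduler with no warmup, which decays the learning rate from 1e-05 to zero over training. A minimal sketch of that schedule; note the total step count used here is an estimate extrapolated from the results table (the log stops at step 47000, epoch 199.15), not a logged value:

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 1e-05) -> float:
    # transformers-style "linear" schedule without warmup:
    # decay from base_lr at step 0 to 0 at total_steps.
    return base_lr * max(0.0, (total_steps - step) / total_steps)

TOTAL = 47200  # assumption: rough total, extrapolated from the training log

print(linear_lr(0, TOTAL))            # full base LR at the first step
print(linear_lr(TOTAL // 2, TOTAL))   # half the base LR at the midpoint
```

With 200 epochs at batch size 32, the effective LR at any logged step can be read off this line, which is why the validation loss keeps improving slowly through the last epochs.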
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:------:|:-----:|:---------------:|
- | 2.489 | 2.12 | 500 | 1.9065 |
- | 2.2722 | 4.24 | 1000 | 1.6764 |
- | 2.1401 | 6.36 | 1500 | 1.5225 |
- | 2.0433 | 8.47 | 2000 | 1.3953 |
- | 1.9827 | 10.59 | 2500 | 1.3053 |
- | 1.9249 | 12.71 | 3000 | 1.2289 |
- | 1.8814 | 14.83 | 3500 | 1.1599 |
- | 1.8562 | 16.95 | 4000 | 1.1164 |
- | 1.8285 | 19.07 | 4500 | 1.0753 |
- | 1.8037 | 21.19 | 5000 | 1.0442 |
- | 1.7835 | 23.31 | 5500 | 1.0104 |
- | 1.7675 | 25.42 | 6000 | 0.9916 |
- | 1.7554 | 27.54 | 6500 | 0.9726 |
- | 1.7389 | 29.66 | 7000 | 0.9672 |
- | 1.7284 | 31.78 | 7500 | 0.9443 |
- | 1.7196 | 33.9 | 8000 | 0.9335 |
- | 1.7104 | 36.02 | 8500 | 0.9153 |
- | 1.7013 | 38.14 | 9000 | 0.9058 |
- | 1.6862 | 40.25 | 9500 | 0.8875 |
- | 1.6828 | 42.37 | 10000 | 0.8942 |
- | 1.6779 | 44.49 | 10500 | 0.8804 |
- | 1.67 | 46.61 | 11000 | 0.8699 |
- | 1.6648 | 48.73 | 11500 | 0.8617 |
- | 1.6576 | 50.85 | 12000 | 0.8481 |
- | 1.6506 | 52.97 | 12500 | 0.8562 |
- | 1.647 | 55.08 | 13000 | 0.8444 |
- | 1.6382 | 57.2 | 13500 | 0.8349 |
- | 1.6401 | 59.32 | 14000 | 0.8380 |
- | 1.6304 | 61.44 | 14500 | 0.8254 |
- | 1.6283 | 63.56 | 15000 | 0.8234 |
- | 1.6159 | 65.68 | 15500 | 0.8119 |
- | 1.622 | 67.8 | 16000 | 0.8119 |
- | 1.6146 | 69.92 | 16500 | 0.8091 |
- | 1.6101 | 72.03 | 17000 | 0.8034 |
- | 1.6049 | 74.15 | 17500 | 0.7934 |
- | 1.5976 | 76.27 | 18000 | 0.7905 |
- | 1.5949 | 78.39 | 18500 | 0.7883 |
- | 1.5907 | 80.51 | 19000 | 0.7874 |
- | 1.5952 | 82.63 | 19500 | 0.7869 |
- | 1.5843 | 84.75 | 20000 | 0.7811 |
- | 1.5857 | 86.86 | 20500 | 0.7793 |
- | 1.5813 | 88.98 | 21000 | 0.7725 |
- | 1.5753 | 91.1 | 21500 | 0.7727 |
- | 1.5725 | 93.22 | 22000 | 0.7663 |
- | 1.5687 | 95.34 | 22500 | 0.7643 |
- | 1.5696 | 97.46 | 23000 | 0.7667 |
- | 1.5605 | 99.58 | 23500 | 0.7615 |
- | 1.5681 | 101.69 | 24000 | 0.7581 |
- | 1.5587 | 103.81 | 24500 | 0.7563 |
- | 1.5573 | 105.93 | 25000 | 0.7559 |
- | 1.5532 | 108.05 | 25500 | 0.7482 |
- | 1.5488 | 110.17 | 26000 | 0.7496 |
- | 1.5468 | 112.29 | 26500 | 0.7440 |
- | 1.5496 | 114.41 | 27000 | 0.7427 |
- | 1.5471 | 116.53 | 27500 | 0.7449 |
- | 1.5367 | 118.64 | 28000 | 0.7405 |
- | 1.5375 | 120.76 | 28500 | 0.7368 |
- | 1.5362 | 122.88 | 29000 | 0.7302 |
- | 1.5347 | 125.0 | 29500 | 0.7294 |
- | 1.5309 | 127.12 | 30000 | 0.7306 |
- | 1.5267 | 129.24 | 30500 | 0.7240 |
- | 1.5289 | 131.36 | 31000 | 0.7288 |
- | 1.523 | 133.47 | 31500 | 0.7268 |
- | 1.5197 | 135.59 | 32000 | 0.7200 |
- | 1.5184 | 137.71 | 32500 | 0.7192 |
- | 1.5188 | 139.83 | 33000 | 0.7140 |
- | 1.5161 | 141.95 | 33500 | 0.7182 |
- | 1.5156 | 144.07 | 34000 | 0.7136 |
- | 1.5066 | 146.19 | 34500 | 0.7079 |
- | 1.5063 | 148.31 | 35000 | 0.7099 |
- | 1.5103 | 150.42 | 35500 | 0.7099 |
- | 1.5046 | 152.54 | 36000 | 0.7059 |
- | 1.503 | 154.66 | 36500 | 0.7057 |
- | 1.5005 | 156.78 | 37000 | 0.7026 |
- | 1.4998 | 158.9 | 37500 | 0.7014 |
- | 1.4989 | 161.02 | 38000 | 0.6996 |
- | 1.4931 | 163.14 | 38500 | 0.6997 |
- | 1.4915 | 165.25 | 39000 | 0.6957 |
- | 1.489 | 167.37 | 39500 | 0.6974 |
- | 1.4906 | 169.49 | 40000 | 0.6969 |
- | 1.4859 | 171.61 | 40500 | 0.6956 |
- | 1.4881 | 173.73 | 41000 | 0.6921 |
- | 1.4836 | 175.85 | 41500 | 0.6928 |
- | 1.4818 | 177.97 | 42000 | 0.6901 |
- | 1.482 | 180.08 | 42500 | 0.6912 |
- | 1.4778 | 182.2 | 43000 | 0.6885 |
- | 1.4763 | 184.32 | 43500 | 0.6885 |
- | 1.4807 | 186.44 | 44000 | 0.6848 |
- | 1.474 | 188.56 | 44500 | 0.6833 |
- | 1.4712 | 190.68 | 45000 | 0.6829 |
- | 1.4715 | 192.8 | 45500 | 0.6826 |
- | 1.4682 | 194.92 | 46000 | 0.6831 |
- | 1.4706 | 197.03 | 46500 | 0.6819 |
- | 1.4674 | 199.15 | 47000 | 0.6818 |

  ### Framework versions

- - Transformers 4.34.0
- - Pytorch 2.0.1+cu118
- - Datasets 2.14.5
- - Tokenizers 0.14.1
 
  ---
+ library_name: peft
  base_model: Aharneish/gpt2-spiritual
  ---

+ # Model Card for Model ID

+ <!-- Provide a quick summary of what the model is/does. -->

+ ## Model Details

+ ### Model Description

+ <!-- Provide a longer summary of what this model is. -->

+ - **Developed by:** [More Information Needed]
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** [More Information Needed]
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]
+
+ ### Model Sources [optional]
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** [More Information Needed]
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]
+
+ ## Uses
+
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+ ### Direct Use
+
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+ [More Information Needed]
+
+ ### Downstream Use [optional]
+
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+ [More Information Needed]
+
+ ### Out-of-Scope Use
+
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+ [More Information Needed]
+
+ ## Bias, Risks, and Limitations
+
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+ [More Information Needed]
+
+ ### Recommendations
+
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ [More Information Needed]
+
+ ## Training Details
+
+ ### Training Data
+
+ <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ #### Preprocessing [optional]
+
+ [More Information Needed]
+
+ #### Training Hyperparameters
+
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+ #### Speeds, Sizes, Times [optional]
+
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+ [More Information Needed]
+
+ ## Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ <!-- This should link to a Data Card if possible. -->
+
+ [More Information Needed]
+
+ #### Factors
+
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+ [More Information Needed]
+
+ #### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ [More Information Needed]
+
+ ### Results
+
+ [More Information Needed]
+
+ #### Summary
+
+ ## Model Examination [optional]
+
+ <!-- Relevant interpretability work for the model goes here -->
+
+ [More Information Needed]
+
+ ## Environmental Impact
+
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]
+
+ ## Technical Specifications [optional]
+
+ ### Model Architecture and Objective
+
+ [More Information Needed]
+
+ ### Compute Infrastructure
+
+ [More Information Needed]
+
+ #### Hardware
+
+ [More Information Needed]
+
+ #### Software
+
+ [More Information Needed]
+
+ ## Citation [optional]
+
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+ **BibTeX:**
+
+ [More Information Needed]
+
+ **APA:**
+
+ [More Information Needed]
+
+ ## Glossary [optional]
+
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+ [More Information Needed]
+
+ ## More Information [optional]
+
+ [More Information Needed]
+
+ ## Model Card Authors [optional]
+
+ [More Information Needed]
+
+ ## Model Card Contact
+
+ [More Information Needed]
+
+ ## Training procedure

  ### Framework versions

+ - PEFT 0.6.0.dev0

adapter_config.json CHANGED
@@ -1,4 +1,5 @@
  {
+   "alpha_pattern": {},
    "auto_mapping": null,
    "base_model_name_or_path": "Aharneish/gpt2-spiritual",
    "bias": "none",
@@ -12,6 +13,7 @@
    "modules_to_save": null,
    "peft_type": "LORA",
    "r": 16,
+   "rank_pattern": {},
    "revision": null,
    "target_modules": [
      "c_attn"
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:329f8fbc351b7cf1dd372933782553d8b549854ddc056efc7be162a751095d65
- size 2367673
+ oid sha256:e5e1621f48d9ad8feb1d6d31050275f0aafd080c5c07153301fe2f48411f4406
+ size 443
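What the repository stores for `adapter_model.bin` is a git-lfs pointer file, not the weights themselves: the sha256 `oid` keys the object in LFS storage, and `size` is the stored object's byte size (here the new object is only 443 bytes, versus 2367673 before). A small sketch parsing the new pointer shown above:

```python
# git-lfs pointer files (spec v1) are plain text "key value" lines.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:e5e1621f48d9ad8feb1d6d31050275f0aafd080c5c07153301fe2f48411f4406
size 443"""

fields = dict(line.split(" ", 1) for line in pointer.splitlines())
algo, digest = fields["oid"].split(":", 1)

print(algo)                 # sha256
print(int(fields["size"]))  # 443
```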