Update README.md
README.md
CHANGED
@@ -1,6 +1,13 @@
 ---
 library_name: transformers
-tags: []
+tags:
+- merge
+- sliced
+- minimalist
+license: apache-2.0
+metrics:
+- accuracy
+- bleu
 ---
 
 # Model Card for Model ID
@@ -17,13 +24,13 @@ tags: []
 
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
 
-- **Developed by:**
-- **Funded by [optional]:**
-- **Shared by [optional]:**
-- **Model type:**
-- **Language(s) (NLP):**
-- **License:**
-- **Finetuned from model [optional]:**
+- **Developed by:** Tatman Electric
+- **Funded by [optional]:** Spare Pocket Lint
+- **Shared by [optional]:** TRL
+- **Model type:** Sliced Layered
+- **Language(s) (NLP):** Mixed
+- **License:** Pythia @ EleutherAI
+- **Finetuned from model [optional]:** EleutherAI/pythia-2.8b-deduped
 
 ### Model Sources [optional]
 
@@ -35,11 +42,14 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 
 ## Uses
 
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+Before there were merged models, there were slices of shards of... stuff. Those slices have meaning. Those slices are real slices too.
 
 ### Direct Use
+Part of a series of slice and dice mods.
 
 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+##### Single Hidden Layer Pythia
+What does a single hidden layer preserve from a 12 layer base model?
 
 [More Information Needed]
 
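The card doesn't spell out how the slice was produced. As a minimal sketch only (not the author's recipe; the kept layer index and output path are illustrative assumptions), a single GPT-NeoX block could be carved out of the Pythia base referenced above with 🤗 transformers like this:

```python
# Minimal sketch only -- NOT the exact recipe behind this model.
# Assumes the GPT-NeoX architecture used by EleutherAI/pythia-2.8b-deduped.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "EleutherAI/pythia-2.8b-deduped"
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)

# GPT-NeoX keeps its transformer blocks in model.gpt_neox.layers;
# retain one block and shrink the config so it still describes the checkpoint.
keep = [0]  # which hidden layer(s) to keep -- an illustrative choice
model.gpt_neox.layers = torch.nn.ModuleList([model.gpt_neox.layers[i] for i in keep])
model.config.num_hidden_layers = len(keep)

out_dir = "pythia-single-layer"  # hypothetical output path
model.save_pretrained(out_dir)
AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)
```

Shrinking `num_hidden_layers` to match the kept blocks matters: otherwise the saved config no longer agrees with the weights and the checkpoint won't reload cleanly.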
@@ -102,6 +112,50 @@ Use the code below to get started with the model.
 
 ## Evaluation
 
+| Groups |Version| Filter |n-shot| Metric | Value | |Stderr|
+|--------------------|-------|----------------|-----:|-----------|------:|---|-----:|
+|Open LLM Leaderboard|N/A |none | 5|rouge1_max |36.3550|± |0.9462|
+| | |flexible-extract| 5|exact_match| 0.0220|± |0.0066|
+| - arc_challenge | 1|none | 25|acc | 0.1760|± |0.0170|
+| | |none | 25|acc_norm | 0.2320|± |0.0189|
+| - gsm8k | 3|strict-match | 5|exact_match| 0.0060|± |0.0035|
+| | |flexible-extract| 5|exact_match| 0.0220|± |0.0066|
+| - hellaswag | 1|none | 10|acc | 0.3520|± |0.0214|
+| | |none | 10|acc_norm | 0.4040|± |0.0220|
+| - winogrande | 1|none | 5|acc | 0.5120|± |0.0224|
+| | |none | 5|bleu_diff |-0.6500|± |0.6421|
+| | |none | 5|rouge1_acc | 0.3700|± |0.0216|
+| | |none | 5|rouge1_diff|-1.5564|± |1.0223|
+| | |none | 5|acc | 0.2664|± |0.0036|
+| | |none | 5|rougeL_max |33.8798|± |0.9367|
+| | |none | 5|rouge2_diff|-3.3178|± |0.9477|
+| | |none | 5|bleu_max |15.2292|± |0.6714|
+| | |none | 5|bleu_acc | 0.4360|± |0.0222|
+| | |none | 5|rouge2_max |16.4873|± |1.0172|
+| | |none | 5|acc_norm | 0.3180|± |0.0145|
+| | |strict-match | 5|exact_match| 0.0060|± |0.0035|
+| | |none | 5|rougeL_diff|-0.7765|± |1.0034|
+| | |none | 5|rougeL_acc | 0.3860|± |0.0218|
+| | |none | 5|rouge2_acc | 0.1920|± |0.0176|
+| - mmlu |N/A |none | 0|acc | 0.2533|± |0.0039|
+| - humanities |N/A |none | 5|acc | 0.2408|± |0.0075|
+| - other |N/A |none | 5|acc | 0.2443|± |0.0080|
+| - social_sciences |N/A |none | 5|acc | 0.2538|± |0.0081|
+| - stem |N/A |none | 5|acc | 0.2740|± |0.0079|
+| - truthfulqa |N/A |none | 0|rouge1_max |36.3550|± |0.9462|
+| | |none | 0|bleu_diff |-0.6500|± |0.6421|
+| | |none | 0|rouge1_acc | 0.3700|± |0.0216|
+| | |none | 0|rouge1_diff|-1.5564|± |1.0223|
+| | |none | 0|acc | 0.3435|± |0.0137|
+| | |none | 0|rougeL_max |33.8798|± |0.9367|
+| | |none | 0|bleu_max |15.2292|± |0.6714|
+| | |none | 0|bleu_acc | 0.4360|± |0.0222|
+| | |none | 0|rouge2_max |16.4873|± |1.0172|
+| | |none | 0|rougeL_acc | 0.3860|± |0.0218|
+| | |none | 0|rougeL_diff|-0.7765|± |1.0034|
+| | |none | 0|rouge2_acc | 0.1920|± |0.0176|
+| | |none | 0|rouge2_diff|-3.3178|± |0.9477|
+
 <!-- This section describes the evaluation protocols and provides the results. -->
 
 ### Testing Data, Factors & Metrics
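The table reads like output from EleutherAI's lm-evaluation-harness. As a hedged sketch only (the `simple_evaluate` call, the model id, the shot count, and the batch size below are assumptions; the Open LLM Leaderboard tasks each use their own shot counts), one row could be reproduced roughly like this:

```python
# Rough sketch of reproducing one row of the table with lm-evaluation-harness
# (the simple_evaluate API shipped from roughly v0.4 onward). Swap the
# pretrained id for this repo's sliced checkpoint; values here are placeholders.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=EleutherAI/pythia-2.8b-deduped,dtype=float16",
    tasks=["arc_challenge"],   # 25-shot ARC, as in the table
    num_fewshot=25,
    batch_size=8,
)
print(results["results"]["arc_challenge"])
```

The other rows follow the leaderboard's usual settings (10-shot HellaSwag, 5-shot Winogrande and GSM8K, 0-shot TruthfulQA), so each task needs its own `num_fewshot`.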
@@ -144,11 +198,13 @@ Use the code below to get started with the model.
 
 Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
 
-- **Hardware Type:**
-- **Hours used:**
-- **Cloud Provider:**
-- **Compute Region:**
-- **Carbon Emitted:**
+- **Hardware Type:** OldAsDirt
+- **Hours used:** 5
+- **Cloud Provider:** YourMomsBasement
+- **Compute Region:** Siberia
+- **Carbon Emitted:** 8ppm
+
+No yaks were harmed in the making of this model.
 
 ## Technical Specifications [optional]
 
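For context on the Environmental Impact bullets above: the linked calculator essentially multiplies hardware power draw by hours used and by the carbon intensity of the compute region. A back-of-the-envelope sketch, with every number a placeholder rather than a measurement for this model:

```python
# Back-of-the-envelope CO2 estimate in the spirit of the ML Impact calculator.
# All values are placeholders -- substitute the real hardware power draw,
# runtime, and regional carbon intensity.
power_kw = 0.3             # average draw of the training hardware, in kW
hours = 5                  # from the "Hours used" bullet above
grid_kg_co2_per_kwh = 0.4  # carbon intensity of the compute region

energy_kwh = power_kw * hours
co2_kg = energy_kwh * grid_kg_co2_per_kwh
print(f"~{energy_kwh:.2f} kWh, ~{co2_kg:.2f} kg CO2eq")
```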