Update README.md

README.md CHANGED
@@ -5,14 +5,14 @@ tags: []
 
 # Model Card for Model ID
 
-
-
-
-
-
-
-
-
+this model isn't really made for benchmarks; it's worse on everything besides ARC-C and TruthfulQA
+
+| Model | ARC-C | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8k |
+| ------------------------------------------------------------ | --------- | --------- | ---------- | ---------- | ---------- | --------- |
+| [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) | 59.98 | **83.31** | **64.16** | 42.15 | **78.37** | **37.83** |
+| [crumb/92d52f-ame-full-7B](https://hf.co/crumb/92d52f-ame-full-7B) | **61.18** | 81.52 | 63.44 | **42.39** | 77.58 | 35.41 |
+
+it's got extra tokens which can all equally be used as masks: you can replace all instances of one token in context with one of the extra tokens (`[f'<ID-{i:06X}>' for i in range(2048)]`) to give the model an extra hard time. it was trained with context length 2048 on three separate replacement techniques through a schedule, with 80% of all sequences being completely replaced with the mask tokens near the end of training.
 
 ## Model Details
 
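The token-replacement idea from the added paragraph can be sketched in a few lines. This is a hedged illustration, not the model's training code: the mask-token vocabulary follows the `f'<ID-{i:06X}>'` pattern quoted in the diff, while the helper name `mask_one_token` and the use of `random.Random` are assumptions made here for the example.

```python
import random

# Extra-token vocabulary described in the card:
# 2048 mask tokens named <ID-000000> ... <ID-0007FF>.
MASK_TOKENS = [f"<ID-{i:06X}>" for i in range(2048)]

def mask_one_token(tokens, rng=random):
    """Pick one token type at random and replace every occurrence
    of it in the sequence with a single randomly chosen mask token.

    This is a hypothetical helper sketching the replacement scheme
    the card describes, not the actual training implementation.
    """
    target = rng.choice(tokens)          # token type to hide
    mask = rng.choice(MASK_TOKENS)       # mask token standing in for it
    return [mask if t == target else t for t in tokens]

# Usage: length is preserved; only the chosen token type changes.
tokens = ["the", "cat", "sat", "on", "the", "mat"]
masked = mask_one_token(tokens, random.Random(0))
```

Because every mask token is interchangeable, the model cannot rely on the identity of the replaced token and must recover its role from context alone, which matches the "extra hard time" framing in the card.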