Commit caa3d03, committed by crumb
1 Parent(s): ed7034d

Update README.md

Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -5,14 +5,14 @@ tags: []
 
 # Model Card for Model ID
 
-| Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
-|-------------|------:|------|-----:|--------|-----:|---|-----:|
-|arc_challenge| 1|none | 25|acc |0.5691|± |0.0145|
-| | |none | 25|acc_norm|0.6118|± |0.0142|
-|truthfulqa_mc2| 2|none | 0|acc |0.4239|± |0.0145|
-|winogrande| 1|none | 5|acc |0.7758|± |0.0117|
-|hellaswag| 1|none | 10|acc |0.6310|± |0.0048|
-| | |none | 10|acc_norm|0.8152|± |0.0039|
+this model isn't really made for benchmarks, it's worse on everything besides ARC-C and TruthfulQA
+
+| Model | ARC-C | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8k |
+| ------------------------------------------------------------ | --------- | --------- | ---------- | ---------- | ---------- | --------- |
+| [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) | 59.98 | **83.31** | **64.16** | 42.15 | **78.37** | **37.83** |
+| [crumb/92d52f-ame-full-7B](https://hf.co/crumb/92d52f-ame-full-7B) | **61.18** | 81.52 | 63.44 | **42.39** | 77.58 | 35.41 |
+
+it's got extra tokens which can all equally be used as masks, you can replace all instances of one token in context with one of the extra tokens (`[f'<ID-{i:06X}>' for i in range(2048)]`) to give the model an extra hard time. it was trained with context length 2048 on three separate replacement techniques through a schedule, with 80% of all sequences being completely replaced with the mask tokens near the end of training.
 
 ## Model Details
 
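
The new README text describes replacing every instance of one token in the context with one of the extra mask tokens. A minimal sketch of that idea follows. It works on a plain list of strings rather than real tokenizer output, and `mask_one_token` is a hypothetical helper written for illustration, not something shipped with the model. It does not attempt to reproduce the three training-time replacement techniques, which the README does not specify.

```python
import random

# The extra mask tokens, enumerated exactly as in the README.
MASK_TOKENS = [f'<ID-{i:06X}>' for i in range(2048)]


def mask_one_token(tokens, target, mask=None, rng=random):
    """Replace every occurrence of `target` in `tokens` with one mask token.

    Hypothetical helper: if `mask` is not given, one of the 2048 mask
    tokens is picked at random, since the README says they are all
    equally usable as masks.
    """
    if mask is None:
        mask = rng.choice(MASK_TOKENS)
    return [mask if tok == target else tok for tok in tokens]


# Toy example on word-level "tokens" (an assumption for readability;
# real usage would operate on the model tokenizer's output).
tokens = ["the", "cat", "saw", "the", "other", "cat"]
masked = mask_one_token(tokens, "cat", mask=MASK_TOKENS[10])
print(masked)
```

Every occurrence of the target gets the same mask token, so the model can still tell that the hidden positions refer to the same thing even though the token's identity is gone.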