crumb/92d52f-ame-full-7B

crumb committed on Jul 12

Commit 785f5a9 • 1 Parent(s): 46318ec

Update README.md

Files changed (1)

README.md +1 -1

@@ -12,7 +12,7 @@ this model isn't really made for benchmarks, it's worse on everything besides AR
 | [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) | 59.98     | **83.31** | **64.16** | 42.15      | **78.37**  | **37.83** |
 | [crumb/92d52f-ame-full-7B](https://hf.co/crumb/92d52f-ame-full-7B) | **61.18** | 81.52     | 63.44      | **42.39**  | 77.58      | 35.41     |
-it's got extra tokens which can all equally be used as masks, you can replace all instances of one token in context with one of the extra tokens (`[f'<ID-{i:06X}>' for i in range(2048)]`) to give the model an extra hard time. it was trained with context length 2048 on three separate replacement techniques through a schedule, with 80% of all sequences being completely replaced with the mask tokens near the end of training.
+it's got extra tokens which can all equally be used as masks, you can replace all instances of one token in context with one of the extra tokens (`[f'<ID-{i:06X}>' for i in range(2048)]`) to give the model an extra hard time. it was trained with context length 2048 on three separate replacement techniques through a schedule, with 80% of all sequences being completely replaced with the mask tokens near the end of training. it was trained over ~0.5B tokens
 > what? how is that useful?
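
The replacement described in the changed line can be sketched roughly as below. This is only an illustration, not the author's training code: the helper name `mask_token_instances`, the `seed` parameter, and the whitespace split standing in for the model's real tokenizer are all assumptions. Only the mask-token expression comes from the README.

```python
import random

# The 2048 extra mask tokens, built with the expression given in the README.
MASK_TOKENS = [f'<ID-{i:06X}>' for i in range(2048)]


def mask_token_instances(tokens, targets, seed=0):
    """Replace every instance of each target token with one mask token.

    Hypothetical helper: the README does not say how the masks are chosen,
    so this sketch simply samples one distinct mask per target token.
    """
    rng = random.Random(seed)
    unique_targets = sorted(set(targets))
    # Sample without replacement so two targets never share a mask.
    chosen = rng.sample(MASK_TOKENS, k=len(unique_targets))
    mapping = dict(zip(unique_targets, chosen))
    # Tokens that are not targets pass through unchanged.
    return [mapping.get(tok, tok) for tok in tokens], mapping


# Assumption: a whitespace split stands in for real tokenization.
tokens = "the cat sat on the mat".split()
masked, mapping = mask_token_instances(tokens, ["the"])
print(masked)
```

Both occurrences of `the` end up as the same `<ID-…>` token, so the model has to work out what the mask stands for from context alone.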