this model isn't really made for benchmarks, it's worse on everything besides AR

it's got extra tokens which can all equally be used as masks: you can replace all instances of one token in context with one of the extra tokens (`[f'<ID-{i:06X}>' for i in range(2048)]`) to give the model an extra hard time. it was trained with context length 2048 on three separate replacement techniques through a schedule, with 80% of all sequences being completely replaced with the mask tokens near the end of training.
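
the three scheduled techniques aren't spelled out here, but a minimal sketch of the simplest one (and of the full-replacement case) might look like the following; `mask_id_pool`, `rng`, and the function names are illustrative, assuming the tokenizer exposes the ids it assigned to the extra tokens:

```python
import random

# the 2048 extra tokens that double as interchangeable mask ids
MASK_TOKENS = [f'<ID-{i:06X}>' for i in range(2048)]

def replace_one_token(ids, mask_id_pool, rng=random):
    """replace every occurrence of one randomly chosen token id in `ids`
    with a single mask id drawn from `mask_id_pool` (the ids the
    tokenizer assigned to the <ID-xxxxxx> tokens)."""
    target = rng.choice(sorted(set(ids)))
    mask = rng.choice(mask_id_pool)
    return [mask if t == target else t for t in ids]

def replace_all_tokens(ids, mask_id_pool, rng=random):
    """map every distinct token id to its own random mask id: the
    'completely replaced' case applied to 80% of sequences near the end
    of training. a 2048-token context can never contain more distinct
    ids than the 2048-id mask pool, so the sample always fits."""
    vocab = sorted(set(ids))
    mapping = dict(zip(vocab, rng.sample(mask_id_pool, len(vocab))))
    return [mapping[t] for t in ids]
```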
> what? how is that useful?

i'm hoping to finetune it further while replacing the entire tokenizer with any number of other tokenizers, all utilizing the unique mask ids, to hopefully build a causal model of any sufficiently long artifact from any domain, for example the voynich manuscript or an alien artifact.
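
a sketch of that retokenization idea, under the assumption that each symbol of the foreign vocabulary simply gets pinned to one of the 2048 mask ids; `mask_token_ids` and the helper are hypothetical names, not part of any released code:

```python
def encode_with_mask_vocab(symbols, mask_token_ids):
    """pin each distinct symbol of an arbitrary alphabet to its own
    mask id, so any sufficiently long artifact becomes a token
    sequence the model can consume."""
    alphabet = sorted(set(symbols))
    assert len(alphabet) <= len(mask_token_ids), "alphabet must fit in the 2048 mask ids"
    table = {s: mask_token_ids[i] for i, s in enumerate(alphabet)}
    return [table[s] for s in symbols]

# e.g. an EVA transliteration of the voynich manuscript, one id per glyph:
# ids = encode_with_mask_vocab(list("daiin.okeey.qokeey"), mask_token_ids)
```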
## Model Details

### Model Description