crumb committed on
Commit
46318ec
1 Parent(s): caa3d03

Update README.md

Files changed (1)
  1. README.md +4 -0
README.md CHANGED
@@ -14,6 +14,10 @@ this model isn't really made for benchmarks, it's worse on everything besides AR
 
  it's got extra tokens which can all equally be used as masks, you can replace all instances of one token in context with one of the extra tokens (`[f'<ID-{i:06X}>' for i in range(2048)]`) to give the model an extra hard time. it was trained with context length 2048 on three separate replacement techniques through a schedule, with 80% of all sequences being completely replaced with the mask tokens near the end of training.
 
+ > what? how is that useful?
+
+ i'm hoping to finetune it further while replacing the entire tokenizer with any number of other tokenizers, all utilizing the unique mask ids, to hopefully build a causal model of any sufficiently long artifact from any domain, for example, the voynich manuscript or an alien artifact
+
  ## Model Details
 
  ### Model Description
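
For readers of this diff, a minimal sketch of the token replacement the README text describes. Only the `<ID-{i:06X}>` format and the count of 2048 come from the README. The helper names `replace_one_token` and `replace_all_tokens`, the use of whitespace-split words in place of real tokenizer output, and the reading of "completely replaced" as one mask per distinct token are all assumptions. The three replacement techniques and the schedule used in training are not described in the source, so this is probably not the author's exact procedure.

```python
import random

# Mask-token strings, built exactly as written in the README.
MASK_TOKENS = [f'<ID-{i:06X}>' for i in range(2048)]


def replace_one_token(tokens, target, mask):
    """Replace every occurrence of `target` in `tokens` with `mask`.

    Hypothetical helper: one reading of "replace all instances of one
    token in context with one of the extra tokens".
    """
    return [mask if tok == target else tok for tok in tokens]


def replace_all_tokens(tokens, rng):
    """Map each distinct token to its own randomly chosen mask token.

    Hypothetical helper: one possible reading of a sequence being
    "completely replaced with the mask tokens". Repeated tokens keep
    sharing a mask, so the repetition structure of the sequence survives.
    """
    distinct = list(dict.fromkeys(tokens))  # distinct tokens, first-seen order
    if len(distinct) > len(MASK_TOKENS):
        raise ValueError("more distinct tokens than available mask tokens")
    masks = rng.sample(MASK_TOKENS, len(distinct))  # sampled without repeats
    mapping = dict(zip(distinct, masks))
    return [mapping[tok] for tok in tokens]


# Whitespace-split words stand in for tokenizer output (an assumption).
tokens = "the cat sat on the mat".split()

one = replace_one_token(tokens, "the", MASK_TOKENS[0])
print(one)  # ['<ID-000000>', 'cat', 'sat', 'on', '<ID-000000>', 'mat']

full = replace_all_tokens(tokens, random.Random(0))
print(full)
```

In the fully replaced sequence the surface identity of every token is gone and only the pattern of repeats is left, which appears to be what would let the same mask ids sit under a different tokenizer later.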