this model isn't really made for benchmarks, it's worse on everything besides AR

it's got extra tokens which can all equally be used as masks: you can replace all instances of one token in context with one of the extra tokens (`[f'<ID-{i:06X}>' for i in range(2048)]`) to give the model an extra hard time. it was trained with context length 2048 on three separate replacement techniques through a schedule, with 80% of all sequences being completely replaced with the mask tokens near the end of training.
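
the three scheduled techniques aren't spelled out here, but a minimal sketch of the simplest one (and of the full-replacement case) might look like the following; `mask_id_pool`, `rng`, and the function names are illustrative, assuming the tokenizer exposes the ids it assigned to the extra tokens:

```python
import random

# the 2048 extra tokens that double as interchangeable mask ids
MASK_TOKENS = [f'<ID-{i:06X}>' for i in range(2048)]

def replace_one_token(ids, mask_id_pool, rng=random):
    """replace every occurrence of one randomly chosen token id in `ids`
    with a single mask id drawn from `mask_id_pool` (the ids the
    tokenizer assigned to the <ID-xxxxxx> tokens)."""
    target = rng.choice(sorted(set(ids)))
    mask = rng.choice(mask_id_pool)
    return [mask if t == target else t for t in ids]

def replace_all_tokens(ids, mask_id_pool, rng=random):
    """map every distinct token id to its own random mask id: the
    'completely replaced' case applied to 80% of sequences near the end
    of training. a 2048-token context can never contain more distinct
    ids than the 2048-id mask pool, so the sample always fits."""
    vocab = sorted(set(ids))
    mapping = dict(zip(vocab, rng.sample(mask_id_pool, len(vocab))))
    return [mapping[t] for t in ids]
```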
> what? how is that useful?

i'm hoping to finetune it further while replacing the entire tokenizer with any number of other tokenizers, all utilizing the unique mask ids, to hopefully build a causal model of any sufficiently long artifact from any domain, for example the voynich manuscript or an alien artifact.
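
a sketch of that retokenization idea, under the assumption that each symbol of the foreign vocabulary simply gets pinned to one of the 2048 mask ids; `mask_token_ids` and the helper are hypothetical names, not part of any released code:

```python
def encode_with_mask_vocab(symbols, mask_token_ids):
    """pin each distinct symbol of an arbitrary alphabet to its own
    mask id, so any sufficiently long artifact becomes a token
    sequence the model can consume."""
    alphabet = sorted(set(symbols))
    assert len(alphabet) <= len(mask_token_ids), "alphabet must fit in the 2048 mask ids"
    table = {s: mask_token_ids[i] for i, s in enumerate(alphabet)}
    return [table[s] for s in symbols]

# e.g. an EVA transliteration of the voynich manuscript, one id per glyph:
# ids = encode_with_mask_vocab(list("daiin.okeey.qokeey"), mask_token_ids)
```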
## Model Details

### Model Description