---
datasets:
  - wikitext
  - wikitext-103-v1
language:
  - en
metrics:
  - perplexity
  - cross_entropy
---

(!) Don't forget to preprocess unknown tokens: substitute `<unk>` with `<|endoftext|>` before tokenization. Otherwise, each `<unk>` token in the dataset will be split by the GPT-2 tokenizer into the separate `'<'`, `'unk'`, and `'>'` tokens.
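
A minimal preprocessing sketch using the 🤗 `datasets` and `transformers` libraries (the dataset config name, split, and tokenizer choice here are assumptions, not part of this model card):

```python
# Sketch: replace WikiText-103's "<unk>" placeholder with GPT-2's
# "<|endoftext|>" token before tokenization, so it is encoded as a
# single special token instead of being split into '<', 'unk', '>'.
from datasets import load_dataset
from transformers import GPT2TokenizerFast

dataset = load_dataset("wikitext", "wikitext-103-v1", split="validation")
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

def replace_unk(example):
    example["text"] = example["text"].replace("<unk>", "<|endoftext|>")
    return example

dataset = dataset.map(replace_unk)

print(tokenizer.encode("<|endoftext|>"))  # a single special-token id
print(tokenizer.encode("<unk>"))          # split into several ids
```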

Dependence of the cross-entropy loss on the length of the context used for prediction:

- x-axis value × 128 = context length (in tokens)
- y-axis = cross-entropy loss

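As a rough illustration of how such a curve could be produced (a hypothetical sketch, not the author's evaluation script; the model id `irodkin/gpt2-wiki103`, the single 1024-token window, and the 128-token bucketing are assumptions):

```python
import torch
from datasets import load_dataset
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("irodkin/gpt2-wiki103")
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model.eval()

# Preprocess as above: replace <unk> with <|endoftext|>.
val = load_dataset("wikitext", "wikitext-103-v1", split="validation")
text = "\n".join(val["text"]).replace("<unk>", "<|endoftext|>")
ids = tokenizer(text, return_tensors="pt").input_ids[:, :1024]

with torch.no_grad():
    logits = model(ids).logits

# Per-position cross-entropy: the prediction at position t targets token t+1,
# so the position index approximates the context length available to the model.
loss = torch.nn.functional.cross_entropy(
    logits[0, :-1], ids[0, 1:], reduction="none"
)

# Average cross-entropy within each 128-token context-length bucket.
for start in range(0, loss.numel(), 128):
    chunk = loss[start:start + 128]
    print(f"context {start}-{start + 128}: {chunk.mean().item():.3f}")
```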