---
datasets:
- wikitext
- wikitext-103-v1
language:
- en
metrics:
- perplexity
- cross_entropy
---

**(!) _Don't forget to preprocess unknown tokens and substitute them with `<|endoftext|>`; otherwise, the `<unk>` tokens in the dataset will be split into the '<', 'unk' and '>' tokens._**
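
A minimal sketch of this preprocessing step, assuming the `datasets` library is used to load `wikitext-103-v1` and that the tokenizer's end-of-text marker is the GPT-2-style `<|endoftext|>`:

```python
# Sketch: replace the literal <unk> marker with <|endoftext|> before tokenization,
# so it is not split into the '<', 'unk' and '>' sub-tokens.
from datasets import load_dataset

raw = load_dataset("wikitext", "wikitext-103-v1")

def replace_unk(example):
    example["text"] = example["text"].replace("<unk>", "<|endoftext|>")
    return example

clean = raw.map(replace_unk)
```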


**Dependence of the cross-entropy loss on the context length used for prediction**

- x-axis × 128 = context length (in tokens)
- y-axis = cross-entropy loss


![image/png](https://cdn-uploads.huggingface.co/production/uploads/63c1ac8cc58fcfeac186bda2/BSEfNr1ca53CkMtAF2jhV.png)
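
A rough sketch of how such a curve can be produced: for each context length (a multiple of 128 tokens), score the token that follows a full window of that length and average the resulting losses. The model and tokenizer names below are placeholders, not necessarily the checkpoint used for the plot above.

```python
# Illustrative only: measure cross entropy of the next token as a function of
# the number of preceding context tokens.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; substitute the evaluated checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

def cross_entropy_at_context_length(text, context_len):
    """Average cross entropy of the token following each `context_len`-token window."""
    ids = tokenizer(text, return_tensors="pt").input_ids[0]
    losses = []
    for start in range(0, len(ids) - context_len - 1, context_len):
        window = ids[start : start + context_len + 1].unsqueeze(0)
        with torch.no_grad():
            logits = model(window[:, :-1]).logits
        # Score only the final prediction, i.e. the token after the full context.
        losses.append(F.cross_entropy(logits[:, -1], window[:, -1]).item())
    return sum(losses) / len(losses)
```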