iwiwi committed on
Commit
f20b9a8
1 Parent(s): 3a7892f

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -76,7 +76,7 @@ Around 100B tokens from a mixture of the following corpora were used for the con
  - [Japanese mc4](https://huggingface.co/datasets/mc4)
  - [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz)
  - [Japanese OSCAR](https://oscar-project.github.io/documentation/)
- - [SlimPajama](https://huggingface.co/datasets/cerebras/SlimPajama-627B)
+ - [SlimPajama](https://huggingface.co/datasets/cerebras/SlimPajama-627B) without the Books3 subset
 
 
  ## Use and Limitations