Spell correction
Perhaps US Patent Office?
README.md
CHANGED
@@ -133,7 +133,7 @@ Falcon-7B was trained on 1,500B tokens of [RefinedWeb](https://huggingface.co/da
 | Conversations | 6% | 85B | Reddit, StackOverflow, HackerNews |
 | Code | 3% | 45B | |
 | RefinedWeb-French | 3% | 45B | massive web crawl |
-| Technical | 2% | 30B | arXiv, PubMed,
+| Technical | 2% | 30B | arXiv, PubMed, USPTO, etc. |
 
 
 The data was tokenized with the Falcon-[7B](https://huggingface.co/tiiuae/falcon-7b)/[40B](https://huggingface.co/tiiuae/falcon-40b) tokenizer.