Update README.md
README.md
CHANGED
@@ -10,6 +10,8 @@ tags:
 - masked-lm
 - pytorch
 license: agpl-3.0
+datasets:
+- mideind/icelandic-common-crawl-corpus-IC3
 ---
 
 # IceBERT
@@ -25,4 +27,4 @@ IceBERT was trained with fairseq using the RoBERTa-base architecture. The traini
 | Open Icelandic e-books (Rafbókavefurinn) | 14 MB | 2.6M |
 | Data from the medical library of Landspitali | 33 MB | 5.2M |
 | Student theses from Icelandic universities (Skemman) | 2.2 GB | 367M |
-| Total | 15.8 GB | 2,664M |
+| Total | 15.8 GB | 2,664M |