robinq committed on
Commit 32153df
1 Parent(s): 8edc2d3

Create README.md

Files changed (1)
  1. README.md +19 -0
README.md ADDED
@@ -0,0 +1,19 @@
+ ---
+ language:
+ - sv
+
+ ---
+
+ # Megatron-BERT-base Swedish 600k
+
+ This BERT model was trained using the Megatron-LM library.
+ The model has the standard BERT-base architecture with 110M parameters.
+ It was trained on about 70GB of data, consisting mostly of OSCAR and Swedish newspaper text curated by the National Library of Sweden.
+
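The 110M figure quoted above can be sanity-checked with a little arithmetic. The sketch below is only an illustration: every hyperparameter in it (hidden size 768, 12 layers, feed-forward size 3072, 512 positions, 2 token types, and a 30,522-token vocabulary) is an assumption taken from the original English BERT-base configuration, not a value read from this checkpoint, and `bert_param_count` is a hypothetical helper name.

```python
# Rough parameter count for a BERT-base encoder without task-specific heads.
# All defaults are ASSUMPTIONS from the original English BERT-base config,
# not values read from the Swedish checkpoint described in this README.

def bert_param_count(vocab_size, hidden=768, layers=12, intermediate=3072,
                     max_positions=512, type_vocab=2):
    layer_norm = 2 * hidden  # one scale and one bias vector
    # Word, position and token-type embeddings, followed by a LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value and output projections (weights + biases), plus LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Two dense layers (hidden -> intermediate -> hidden), plus LayerNorm.
    feed_forward = ((hidden * intermediate + intermediate)
                    + (intermediate * hidden + hidden)
                    + layer_norm)
    # Dense pooler applied to the first token's representation.
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + feed_forward) + pooler


total = bert_param_count(vocab_size=30522)  # assumed vocabulary size
print(f"{total:,} parameters")
```

Under these assumptions the count comes to 109,482,240, which is the figure usually rounded to 110M. The embedding table scales with the vocabulary, so a tokenizer with a larger vocabulary would raise the total accordingly.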
+ The model was trained for 600k steps. Its [sister model](https://huggingface.co/KBLab/megatron-bert-base-swedish-cased-125k) used the same setup but was trained for only 125k steps.
+
+
+ This model is one of three sister models trained on the same dataset:
+ - [Megatron-BERT-base-125k](https://huggingface.co/KBLab/megatron-bert-base-swedish-cased-125k)
+ - [Megatron-BERT-base-600k](https://huggingface.co/KBLab/megatron-bert-base-swedish-cased-600k)
+ - [Megatron-BERT-large-110k]()