aseker00 committed on
Commit
2286340
1 Parent(s): eb5dd6b

Update README.md

Files changed (1): README.md (+2 -2)
README.md CHANGED
@@ -31,7 +31,7 @@ alephbert.eval()
 
 ## Training data
 1. OSCAR [(Ortiz, 2019)](https://oscar-corpus.com/) Hebrew section (10GB text, 20M sentences).
-2. Hebrew dump of [Wikipedia](https://dumps.wikimedia.org/hewiki/latest/) (650 MB text, 3.8M sentences).
+2. Hebrew dump of [Wikipedia](https://dumps.wikimedia.org/hewiki/latest/) (650 MB text, 3M sentences).
 3. Hebrew Tweets collected from the Twitter sample stream (7G text, 70M sentences).
 
 ## Training procedure
@@ -43,7 +43,7 @@ To optimize training time we split the data into 4 sections based on max number
 1. num tokens < 32 (70M sentences)
 2. 32 <= num tokens < 64 (12M sentences)
 3. 64 <= num tokens < 128 (10M sentences)
-4. 128 <= num tokens < 512 (70M sentences)
+4. 128 <= num tokens < 512 (1.5M sentences)
 
 Each section was first trained for 5 epochs with an initial learning rate set to 1e-4. Then each section was trained for another 5 epochs with an initial learning rate set to 1e-5, for a total of 10 epochs.
 
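The second hunk corrects the sentence count for the longest length section. As a minimal sketch of the length-based bucketing the README describes (the function name is illustrative, and whitespace splitting stands in for AlephBERT's actual wordpiece tokenizer):

```python
def bucket_by_length(sentences, bounds=(32, 64, 128, 512)):
    """Place each sentence in the first section whose upper bound exceeds its token count."""
    sections = {b: [] for b in bounds}
    for sent in sentences:
        n_tokens = len(sent.split())  # stand-in for a real tokenizer (assumption)
        for b in bounds:
            if n_tokens < b:
                sections[b].append(sent)
                break
        # sentences of >= 512 tokens fall outside all sections (handling is an assumption)
    return sections

sections = bucket_by_length(["short sentence", " ".join(["tok"] * 40)])
```

Grouping batches by similar length this way reduces padding waste, which is the "optimize training time" motivation the README gives for the split.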
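The training-procedure text quoted in the diff describes a two-phase schedule: 5 epochs at an initial learning rate of 1e-4, then 5 more at 1e-5. A sketch of that schedule as a plain function (the function name and 0-based epoch indexing are assumptions):

```python
def initial_lr(epoch, phase_epochs=5, lr_phase1=1e-4, lr_phase2=1e-5):
    """Initial learning rate for a given 0-based epoch under the two-phase schedule."""
    return lr_phase1 if epoch < phase_epochs else lr_phase2
```

In practice this would be applied per section: each of the four length sections runs through the full 10-epoch schedule independently, per the README.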