pkoiralap committed on
Commit
92a3afc
1 Parent(s): 190daaa

update number of papers in readme file

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -1,4 +1,4 @@
- This model is further trained on top of scibert-base using masked language modeling loss (MLM). The corpus is roughly 100,000 earth science-based publications.
+ This model is further trained on top of scibert-base using masked language modeling loss (MLM). The corpus is roughly 270,000 earth science-based publications.
 
  The tokenizer used is AutoTokenizer, which is trained on the same corpus.
 
@@ -7,4 +7,4 @@ Stay tuned for further downstream task tests and updates to the model.
  in the works
  - MLM + NSP task loss
  - Add more data sources for training
- - Test using downstream tasks
+ - Test using downstream tasks