MultiBertGunjanPatrick/multiberts-seed-4-800k


gchhablani committed on Sep 25, 2021

Commit a0ce6f6 • 1 Parent(s): 2e36a56

Add or Fix Model

Files changed (1)

README.md +4 -3


@@ -3,13 +3,14 @@ language: en
 tags:
 - exbert
 - multiberts
+- multiberts-seed-1
 license: apache-2.0
 datasets:
 - bookcorpus
 - wikipedia
 ---
-# MultiBERTs Seed 0 Checkpoint 800k (uncased)
-Seed 0 intermediate checkoint 800k MultiBERTs (pretrained BERT) model on English language using a masked language modeling (MLM) objective. It was introduced in
+# MultiBERTs Seed 1 Checkpoint 20k (uncased)
+Seed 1 intermediate checkpoint 20k MultiBERTs (pretrained BERT) model on English language using a masked language modeling (MLM) objective. It was introduced in
 [this paper](https://arxiv.org/pdf/2106.16163.pdf) and first released in
 [this repository](https://github.com/google-research/language/tree/master/language/multiberts). This model is uncased: it does not make a difference
 between english and English.
@@ -46,7 +47,7 @@ Here is how to use this model to get the features of a given text in PyTorch:
 ```python
 from transformers import BertTokenizer, BertModel
 tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
-model = BertModel.from_pretrained("multiberts-seed-4-800k")
+model = BertModel.from_pretrained("multiberts-seed-1-20k")
 text = "Replace me by any text you'd like."
 encoded_input = tokenizer(text, return_tensors='pt')
 output = model(**encoded_input)