GiliGold/Knesset-DictaBERT

masked-language-model

parliamentary-proceedings

GiliGold committed on Jul 20

Commit 4a98108 • 1 Parent(s): 23c00f5

Update README.md

Files changed (1)

README.md +1 -1

README.md CHANGED

@@ -62,7 +62,7 @@ print('\n'.join(tokenizer.convert_ids_to_tokens(top_2_tokens)))
 ## Evaluation
 The evaluation was conducted on a 10% test set of the Knesset Corpus, consisting of approximately 3.2 million sentences.
 The perplexity was calculated on this full test set.
-Due to time constraints, accuracy measures were calculated on a subset of this test set, consisting of approximately 3 million sentences (approximately 520 million tokens).
+Due to time constraints, accuracy measures were calculated on a subset of this test set, consisting of approximately 300,000 sentences (approximately 3.5 million tokens).
 #### Perplexity
 The perplexity of the original DictaBERT on the full test set is 22.87.