Update README.md
README.md
CHANGED
@@ -62,7 +62,7 @@ print('\n'.join(tokenizer.convert_ids_to_tokens(top_2_tokens)))
 ## Evaluation
 The evaluation was conducted on a 10% test set of the Knesset Corpus, consisting of approximately 3.2 million sentences.
 The perplexity was calculated on this full test set.
-Due to time constraints, accuracy measures were calculated on a subset of this test set, consisting of approximately
+Due to time constraints, accuracy measures were calculated on a subset of this test set, consisting of approximately 300,000 sentences (approximately 3.5 million tokens).
 
 #### Perplexity
 The perplexity of the original DictaBERT on the full test set is 22.87.
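
The README reports a perplexity figure but this diff does not show how it was computed. As a minimal sketch only, assuming the common convention that perplexity is the exponential of the mean per-token negative log-likelihood (natural log) over the scored tokens, the calculation could look like the following. The function name `perplexity` and the sample losses are hypothetical and are not taken from the repository or from the Knesset Corpus evaluation.

```python
import math


def perplexity(token_nlls):
    """Return exp(mean NLL) for per-token negative log-likelihoods (natural log).

    Assumption: this is the conventional definition; the README's exact
    procedure (e.g. which tokens were masked and scored) is not shown here.
    """
    if not token_nlls:
        raise ValueError("need at least one scored token")
    return math.exp(sum(token_nlls) / len(token_nlls))


# Hypothetical losses: every token predicted with probability 1/4.
example_nlls = [math.log(4.0), math.log(4.0), math.log(4.0)]
print(perplexity(example_nlls))
```

Under this definition, a model that assigns probability 1/4 to every scored token has a perplexity of about 4, so a lower value means the model is, on average, less "surprised" by the test tokens.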