Vít Novotný
committed on
Commit: f353573
Parent(s): c5f9bdf
Update link to repository
README.md
CHANGED
@@ -14,13 +14,13 @@ CLEF 2022 and first released in [this repository][2]. This model is case-sensiti
 it makes a difference between english and English.

 [1]: https://www.cs.rit.edu/~dprl/ARQMath/
-[2]: https://github.com/witiko/scm-at-arqmath3
+[2]: https://github.com/witiko/scm-at-arqmath3

 ## Model description

-MathBERTa is [the RoBERTa base transformer model][3] whose tokenizer has been
-extended with LaTeX math symbols and which has been fine-tuned on a large
-corpus of English mathematical texts.
+MathBERTa is [the RoBERTa base transformer model][3] whose [tokenizer has been
+extended with LaTeX math symbols][7] and which has been [fine-tuned on a large
+corpus of English mathematical texts][8].

 Like RoBERTa, MathBERTa has been fine-tuned with the Masked language modeling
 (MLM) objective. Taking a sentence, the model randomly masks 15% of the words
@@ -30,6 +30,8 @@ learns an inner representation of the English language and the language of
 LaTeX that can then be used to extract features useful for downstream tasks.

 [3]: https://huggingface.co/roberta-base
+[7]: https://github.com/Witiko/scm-at-arqmath3/blob/main/02-train-tokenizers.ipynb
+[8]: https://github.com/witiko/scm-at-arqmath3/blob/main/03-finetune-roberta.ipynb

 ## Intended uses & limitations
