ai-forever committed
Commit e0dc07c (1 parent: 2744513)
Update README.md

README.md CHANGED
@@ -28,7 +28,18 @@ An extensive dataset with “artificial” errors was taken as a training corpus
 ### Quality
 Below are automatic metrics for determining the correctness of the spell checkers.
 We compare our solution with both open automatic spell checkers and the ChatGPT family of models on all four available datasets:
-- **RUSpellRU**:
+- **RUSpellRU**: texts collected from ([LiveJournal](https://www.livejournal.com/media)), with manually corrected typos and errors;
 - **MultidomainGold**: examples from 7 text sources, including the open web, news, social media, reviews, subtitles, policy documents and literary works;
 - **MedSpellChecker**: texts with errors from medical anamnesis;
 - **GitHubTypoCorpusRu**: spelling errors and typos in commits from [GitHub](https://github.com);
+
+**RUSpellRU**
+| Model | Precision | Recall | F1 |
+| --- | --- | --- | --- |
+| M2M100-1.2B | 59.4 | 43.3 | 50.1 |
+| ChatGPT gpt-3.5-turbo-0301 | 55.8 | 75.3 | 64.1 |
+| ChatGPT gpt-4-0314 | 57.0 | 75.9 | 63.9 |
+| ChatGPT text-davinci-003 | 55.9 | 75.3 | 64.2 |
+| Yandex.Speller | 83.0 | 59.8 | 69.5 |
+| JamSpell | 42.1 | 32.8 | 36.9 |
+| HunSpell | 31.3 | 34.9 | 33.0 |
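
Review note: the added table reports Precision, Recall and F1 per model, so the rows can be sanity-checked against each other. The sketch below assumes that F1 is the plain harmonic mean of the reported precision and recall. The README does not say how the scores were aggregated, so this is an assumption. The names `f1_score` and `check_rows` and the 0.1 tolerance are illustrative choices, not part of the repository.

```python
# Rows copied from the RUSpellRU table: (model, precision, recall, reported F1).
ROWS = [
    ("M2M100-1.2B", 59.4, 43.3, 50.1),
    ("ChatGPT gpt-3.5-turbo-0301", 55.8, 75.3, 64.1),
    ("ChatGPT gpt-4-0314", 57.0, 75.9, 63.9),
    ("ChatGPT text-davinci-003", 55.9, 75.3, 64.2),
    ("Yandex.Speller", 83.0, 59.8, 69.5),
    ("JamSpell", 42.1, 32.8, 36.9),
    ("HunSpell", 31.3, 34.9, 33.0),
]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall, on the same scale as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def check_rows(rows, tolerance=0.1):
    """Return (model, reported F1, recomputed F1) for rows that disagree.

    Assumption: a gap larger than `tolerance` cannot be explained by the
    one-decimal rounding of the table values.
    """
    mismatches = []
    for model, precision, recall, reported in rows:
        recomputed = f1_score(precision, recall)
        if abs(recomputed - reported) > tolerance:
            mismatches.append((model, reported, round(recomputed, 1)))
    return mismatches


if __name__ == "__main__":
    for model, reported, recomputed in check_rows(ROWS):
        print(f"{model}: reported F1 {reported}, recomputed {recomputed}")
```

Under that assumption, six of the seven rows agree to within rounding. The `ChatGPT gpt-4-0314` row is the exception: the table reports 63.9, while the harmonic mean of 57.0 and 75.9 is about 65.1. This may point to a typo in one of the three figures, or to F1 being computed differently from what is assumed here (for example, averaged per sample), so it is worth confirming with the authors.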