# Second Millennium Babylonian model for [BabyLemmatizer](https://github.com/asahala/BabyLemmatizer)

The data set totals ca. 120k words (including lacunae). It consists of all Oracc texts labeled as any variant of Babylonian or Akkadian from the second millennium BCE.

For first-millennium Babylonian, see the model Babylonian-1st.

## Evaluation results

```
Neural Net Evaluation
COMPONENT            AVG     CI      MODEL0
POS-tagger           97.85   ±0.00   97.85
Lemmatizer           94.58   ±0.00   94.58
Combined             93.87   ±0.00   93.87
POS-tagger OOV       91.94   ±0.00   91.94
Lemmatizer OOV       71.33   ±0.00   71.33
Combined OOV         69.91   ±0.00   69.91
-----------------------------------------------
OOV input rate       13.04           13.04

Post-correct Evaluation
COMPONENT            AVG     CI      MODEL0
POS-tagger           97.85   ±0.00   97.85
Lemmatizer           94.59   ±0.00   94.59
Combined             93.88   ±0.00   93.88
POS-tagger OOV       91.94   ±0.00   91.94
Lemmatizer OOV       71.33   ±0.00   71.33
Combined OOV         69.91   ±0.00   69.91
-----------------------------------------------
OOV input rate       13.04           13.04
```
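
The tables report overall and OOV accuracy but not in-vocabulary accuracy. As a rough sketch, it can be estimated from the reported figures, assuming (this is not stated in the evaluation output) that each overall score is a token-weighted average of the in-vocabulary and OOV scores and that the OOV input rate is a token-level percentage. The helper name `in_vocabulary_accuracy` is hypothetical and not part of BabyLemmatizer.

```python
def in_vocabulary_accuracy(overall: float, oov: float, oov_rate: float) -> float:
    """Solve overall = (1 - r) * iv + r * oov for iv (all values in percent).

    Assumption: the overall score is a token-weighted mix of in-vocabulary
    and OOV accuracy, with r = oov_rate / 100.
    """
    r = oov_rate / 100.0
    if not 0.0 <= r < 1.0:
        raise ValueError("oov_rate must be in [0, 100)")
    return (overall - r * oov) / (1.0 - r)


# Figures copied from the "Neural Net Evaluation" table: (overall, OOV).
OOV_RATE = 13.04
SCORES = {
    "POS-tagger": (97.85, 91.94),
    "Lemmatizer": (94.58, 71.33),
    "Combined": (93.87, 69.91),
}

# Estimated in-vocabulary accuracy per component, under the assumption above.
IN_VOCAB = {
    name: in_vocabulary_accuracy(overall, oov, OOV_RATE)
    for name, (overall, oov) in SCORES.items()
}

for name, value in IN_VOCAB.items():
    print(f"{name:<12}{value:6.2f}")
```

Under those assumptions the lemmatizer comes out at roughly 98% on in-vocabulary tokens, which suggests that most of its errors fall within the 13.04% of input that is OOV.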