# Second Millennium Babylonian model for [BabyLemmatizer](https://github.com/asahala/BabyLemmatizer)

Total data set size ca. 120k words (including lacunae). Consists of all Oracc texts labeled as any variant of Babylonian or Akkadian in the second millennium BCE. See model Babylonian-1st for the first millennium Babylonian.

## Evaluation results

```
Neural Net Evaluation
COMPONENT            AVG     CI      MODEL0
POS-tagger           97.85   ±0.00   97.85
Lemmatizer           94.58   ±0.00   94.58
Combined             93.87   ±0.00   93.87
POS-tagger OOV       91.94   ±0.00   91.94
Lemmatizer OOV       71.33   ±0.00   71.33
Combined OOV         69.91   ±0.00   69.91
-----------------------------------------------
OOV input rate       13.04           13.04

Post-correct Evaluation
COMPONENT            AVG     CI      MODEL0
POS-tagger           97.85   ±0.00   97.85
Lemmatizer           94.59   ±0.00   94.59
Combined             93.88   ±0.00   93.88
POS-tagger OOV       91.94   ±0.00   91.94
Lemmatizer OOV       71.33   ±0.00   71.33
Combined OOV         69.91   ±0.00   69.91
-----------------------------------------------
OOV input rate       13.04           13.04
```
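
The overall and OOV scores can be combined to estimate how the model does on words it has seen in training. The sketch below is not part of BabyLemmatizer; the function name `in_vocab_accuracy` is hypothetical. It assumes that each overall score is a token-weighted average of in-vocabulary and OOV accuracy, and that the OOV input rate is measured over the same evaluation tokens.

```python
def in_vocab_accuracy(overall: float, oov: float, oov_rate: float) -> float:
    """Estimate in-vocabulary accuracy; all arguments are percentages.

    Assumes overall = (1 - r) * in_vocab + r * oov, with r = oov_rate / 100.
    """
    r = oov_rate / 100.0
    return (overall - r * oov) / (1.0 - r)


# Neural net scores copied from the evaluation results: (overall, OOV).
SCORES = {
    "POS-tagger": (97.85, 91.94),
    "Lemmatizer": (94.58, 71.33),
    "Combined": (93.87, 69.91),
}
OOV_RATE = 13.04  # percent of input tokens that are out of vocabulary

for name, (overall, oov) in SCORES.items():
    estimate = in_vocab_accuracy(overall, oov, OOV_RATE)
    print(f"{name:<12}{estimate:6.2f}")
```

Under these assumptions the lemmatizer's estimated in-vocabulary accuracy comes out at roughly 98%, which suggests that most of its errors fall on the OOV tokens.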