Sumerian models for BabyLemmatizer

These models use indexed logo-syllabic tokenization and require BabyLemmatizer 2.1. The repository contains two models: sumerian-lit for literary Sumerian and sumerian-adm for administrative Sumerian.

sumerian-adm is based on all Sumerian Early Dynastic, Old Babylonian, Old Akkadian, Ebla and Lagaš II administrative texts in Oracc's ePSD2 corpus, totaling 570k words.

sumerian-lit is based on all Sumerian literary texts from Oracc, comprising 268k words.

Evaluation results for administrative model

Neural Net Evaluation
COMPONENT       AVG     CI       MODEL0
POS-tagger      96.48   ±0.00    96.48
Lemmatizer      95.39   ±0.00    95.39
Combined        94.42   ±0.00    94.42
POS-tagger OOV  82.03   ±0.00    82.03
Lemmatizer OOV  71.87   ±0.00    71.87
Combined   OOV  68.00   ±0.00    68.00
-----------------------------------------------
OOV input rate  5.44             5.44

Post-correct Evaluation
COMPONENT       AVG     CI       MODEL0
POS-tagger      96.48   ±0.00    96.48
Lemmatizer      95.42   ±0.00    95.42
Combined        94.44   ±0.00    94.44
POS-tagger OOV  82.03   ±0.00    82.03
Lemmatizer OOV  71.87   ±0.00    71.87
Combined   OOV  68.00   ±0.00    68.00
-----------------------------------------------
OOV input rate  5.44             5.44

Evaluation results for literary model

Neural Net Evaluation
COMPONENT       AVG     CI       MODEL0
POS-tagger      94.00   ±0.00    94.00
Lemmatizer      93.71   ±0.00    93.71
Combined        91.37   ±0.00    91.37
POS-tagger OOV  82.61   ±0.00    82.61
Lemmatizer OOV  80.87   ±0.00    80.87
Combined   OOV  74.54   ±0.00    74.54
-----------------------------------------------
OOV input rate  19.04            19.04

Post-correct Evaluation
COMPONENT       AVG     CI       MODEL0
POS-tagger      94.00   ±0.00    94.00
Lemmatizer      93.70   ±0.00    93.70
Combined        91.36   ±0.00    91.36
POS-tagger OOV  82.61   ±0.00    82.61
Lemmatizer OOV  80.87   ±0.00    80.87
Combined   OOV  74.54   ±0.00    74.54
-----------------------------------------------
OOV input rate  19.04            19.04
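
The tables above report overall and out-of-vocabulary (OOV) accuracy but not accuracy on in-vocabulary tokens. As a rough sketch, that figure can be backed out from the reported numbers, assuming the overall score is a token-weighted average of in-vocabulary and OOV accuracy and that "OOV input rate" is a percentage of input tokens. Both are assumptions about the evaluation output, and `in_vocab_accuracy` is a hypothetical helper, not part of BabyLemmatizer.

```python
def in_vocab_accuracy(overall: float, oov: float, oov_rate: float) -> float:
    """Estimate in-vocabulary accuracy (in percent).

    Assumes: overall = (1 - r) * in_vocab + r * oov, where r is the
    OOV input rate expressed as a fraction. This weighting is an
    assumption, not something the model card states.
    """
    if not 0.0 <= oov_rate < 100.0:
        raise ValueError("oov_rate must be a percentage in [0, 100)")
    r = oov_rate / 100.0
    return (overall - r * oov) / (1.0 - r)


# Lemmatizer rows from the "Neural Net Evaluation" tables above.
adm = in_vocab_accuracy(overall=95.39, oov=71.87, oov_rate=5.44)
lit = in_vocab_accuracy(overall=93.71, oov=80.87, oov_rate=19.04)

print(f"sumerian-adm lemmatizer, in-vocabulary estimate: {adm:.2f}")
print(f"sumerian-lit lemmatizer, in-vocabulary estimate: {lit:.2f}")
```

Under these assumptions the two models come out nearly identical on in-vocabulary tokens, which would suggest that the lower overall score of the literary model is driven mainly by its higher OOV input rate (19.04 versus 5.44).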