# Sumerian models for [BabyLemmatizer](https://github.com/asahala/BabyLemmatizer)

These models use indexed logo-syllabic tokenization and require BabyLemmatizer 2.1. Two models are provided: ```sumerian-lit``` for literary Sumerian and ```sumerian-adm``` for administrative Sumerian.

```sumerian-adm``` is based on all Sumerian Early Dynastic, Old Babylonian, Old Akkadian, Ebla and Lagaš II administrative texts in Oracc's ePSD2 corpus, comprising 570k words.

```sumerian-lit``` is based on all Sumerian literary texts in Oracc, comprising 268k words.

## Evaluation results for administrative model

```
Neural Net Evaluation
COMPONENT       AVG     CI       MODEL0
POS-tagger      96.48   ±0.00    96.48
Lemmatizer      95.39   ±0.00    95.39
Combined        94.42   ±0.00    94.42
POS-tagger OOV  82.03   ±0.00    82.03
Lemmatizer OOV  71.87   ±0.00    71.87
Combined   OOV  68.00   ±0.00    68.00
-----------------------------------------------
OOV input rate  5.44             5.44

Post-correct Evaluation
COMPONENT       AVG     CI       MODEL0
POS-tagger      96.48   ±0.00    96.48
Lemmatizer      95.42   ±0.00    95.42
Combined        94.44   ±0.00    94.44
POS-tagger OOV  82.03   ±0.00    82.03
Lemmatizer OOV  71.87   ±0.00    71.87
Combined   OOV  68.00   ±0.00    68.00
-----------------------------------------------
OOV input rate  5.44             5.44
```

## Evaluation results for literary model

```
Neural Net Evaluation
COMPONENT       AVG     CI       MODEL0
POS-tagger      94.00   ±0.00    94.00
Lemmatizer      93.71   ±0.00    93.71
Combined        91.37   ±0.00    91.37
POS-tagger OOV  82.61   ±0.00    82.61
Lemmatizer OOV  80.87   ±0.00    80.87
Combined   OOV  74.54   ±0.00    74.54
-----------------------------------------------
OOV input rate  19.04            19.04

Post-correct Evaluation
COMPONENT       AVG     CI       MODEL0
POS-tagger      94.00   ±0.00    94.00
Lemmatizer      93.70   ±0.00    93.70
Combined        91.36   ±0.00    91.36
POS-tagger OOV  82.61   ±0.00    82.61
Lemmatizer OOV  80.87   ±0.00    80.87
Combined   OOV  74.54   ±0.00    74.54
-----------------------------------------------
OOV input rate  19.04            19.04
```
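
The tables report overall and OOV scores but not the score on in-vocabulary tokens. A rough estimate can be backed out from the OOV input rate, as in the sketch below. It assumes that the overall score is a token-weighted average of in-vocabulary and OOV scores and that the OOV input rate is measured over the same tokens; neither assumption is confirmed here, and the helper name `in_vocab_accuracy` is made up for illustration rather than taken from BabyLemmatizer.

```python
def in_vocab_accuracy(overall, oov, oov_rate):
    """Estimate in-vocabulary accuracy; all arguments are percentages.

    Assumes overall = (1 - r) * in_vocab + r * oov, where r is the OOV rate
    expressed as a fraction. This weighting is an assumption, not something
    the evaluation output states.
    """
    r = oov_rate / 100.0
    return (overall - r * oov) / (1.0 - r)


# "Combined" scores from the Neural Net Evaluation tables above.
models = {
    "sumerian-adm": {"overall": 94.42, "oov": 68.00, "oov_rate": 5.44},
    "sumerian-lit": {"overall": 91.37, "oov": 74.54, "oov_rate": 19.04},
}

for name, s in models.items():
    iv = in_vocab_accuracy(s["overall"], s["oov"], s["oov_rate"])
    print(f"{name}: estimated in-vocabulary combined accuracy {iv:.2f}")
```

Under these assumptions, the gap between the two models' overall combined scores would be explained largely by the much higher OOV input rate of the literary data rather than by weaker performance on known tokens.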