Model and entities

roberta_classics_ner is a domain-specific RoBERTa-based model for named entity recognition in Classical Studies. It recognises bibliographical entities, such as:

id label desciption Example
0 'O' Out of entity
1 'B-AAUTHOR' Ancient authors Herodotus
2 'I-AAUTHOR'
3 'B-AWORK' The title of an ancient work Symposium, Aeneid
4 'I-AWORK'
5 'B-REFAUWORK' A structured reference to an ancient work Homer, Il.
6 'I-REFAUWORK'
7 'B-REFSCOPE' The scope of a reference II.1.993a30–b11
8 'I-REFSCOPE'
9 'B-FRAGREF' A reference to fragmentary texts or scholia Frag. 19. West
10 'I-FRAGREF'

Example

B-AAUTHOR   B-AWORK                                      B-REFSCOPE
Homer  's   Iliad opens with an invocation to the muse ( 1. 1).

Dataset

roberta_classics_ner was fine-tuned and evaluated on EpiBau, a dataset which has not been released publicly yet. It is composed of four volumes of Structures of Epic Poetry, a compendium on the narrative patterns and structural elements in ancient epic. Entity counts of the Epibau dataset are the following:

train-set dev-set test-set
word count 712462 125729 122324
AAUTHOR 4436 1368 1511
AWORK 3145 780 670
REFAUWORK 5102 988 1209
REFSCOPE 14768 3193 2847
FRAGREF 266 29 33
total entities 13822 1415 2419

Results

The model was developed in the context of experiments reported here.Trained and tested on EpiBau with a 85-15 split, the model yields a general F1 score of .82 (micro-averages). Detailed scores are displayed below. Evaluation was performed with the CLEF-HIPE-scorer, in strict mode)

metric AAUTHOR AWORK REFSCOPE REFAUWORK
F1 .819 .796 .863 .756
Precision .842 .818 .860 .755
Recall .797 .766 .756 .866

Questions, remarks, help or contribution ? Get in touch here, we'll be happy to chat !

Downloads last month
13
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.