altsoph committed on
Commit
8cb570b
1 Parent(s): 28f1616

Create README.md

Files changed (1)
  1. README.md +48 -0
README.md ADDED
@@ -0,0 +1,48 @@
+ ---
+ language:
+ - en
+ thumbnail: https://raw.githubusercontent.com/altsoph/misc/main/imgs/aer_logo.png
+ tags:
+ - nlp
+ - roberta
+ - xlmr
+ - classifier
+ - aer
+ - narrative
+ - entity recognition
+ license: mit
+ ---
+
+ An XLM-RoBERTa-based language model fine-tuned for Actionable Entities Recognition (AER), i.e. the recognition of entities that a protagonist could interact with to advance the plot.
+
+ We collected 5K+ locations from 1K interactive text fiction games and extracted each location's textual description together with the list of actionable entities in it.
+ The resulting BAER dataset is [available here](https://github.com/altsoph/BAER); we used it to train this model.
+
+ Usage example:
+ ```py
+ from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline
+
+ MODEL_NAME = "altsoph/xlmr-AER"
+
+ text = """This bedroom is extremely spare, with dirty laundry scattered haphazardly all over the floor. Cleaner clothing can be found in the dresser.
+ A bathroom lies to the south, while a door to the east leads to the living room."""
+
+ model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME)
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
+
+ pipe = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="simple", ignore_labels=["O", "PAD"])
+ entities = pipe(text)
+
+ print(entities)
+ ```
+
+
+ If you use the model, please cite the following:
+
+ ```
+ @inproceedings{Tikhonov-etal-2022-AER,
+ title = "Actionable Entities Recognition Benchmark for Interactive Fiction",
+ author = "Alexey Tikhonov and Ivan P. Yamshchikov",
+ year = "2022",
+ }
+ ```
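
The usage example in the README prints the raw pipeline output. As a follow-up, here is a minimal sketch of turning that output into a plain list of entity strings. It relies only on the `score`, `start` and `end` keys that the `transformers` token-classification pipeline returns when an `aggregation_strategy` is set. The helper name `extract_entity_spans`, the `min_score` threshold, the `mock` helper, the `"ENT"` label and the scores are all assumptions made for illustration; they are not part of the model card, and the mock list is hand-written, not real model output.

```python
from typing import Dict, List


def extract_entity_spans(text: str, entities: List[Dict], min_score: float = 0.5) -> List[str]:
    """Return unique entity strings, in order of appearance, sliced from `text`.

    `entities` is expected to look like the output of a token-classification
    pipeline with an aggregation strategy: dicts with `score`, `start`, `end`.
    """
    spans: List[str] = []
    seen = set()
    for ent in sorted(entities, key=lambda e: e["start"]):
        if ent["score"] < min_score:
            continue  # drop low-confidence predictions (threshold is an assumption)
        # Slice the original text instead of using `word`, which may be
        # normalized by the tokenizer.
        surface = text[ent["start"]:ent["end"]].strip()
        key = surface.lower()
        if surface and key not in seen:
            seen.add(key)
            spans.append(surface)
    return spans


text = "Cleaner clothing can be found in the dresser. A door to the east leads to the living room."


def mock(word: str, score: float) -> Dict:
    """Build a hand-written stand-in for one pipeline result (hypothetical label and score)."""
    start = text.index(word)
    return {"entity_group": "ENT", "score": score, "word": word,
            "start": start, "end": start + len(word)}


# Stand-in for `pipe(text)`; NOT real model output.
mock_entities = [mock("door", 0.91), mock("dresser", 0.97), mock("clothing", 0.32)]

print(extract_entity_spans(text, mock_entities))
```

With the real model, the same helper could be called as `extract_entity_spans(text, entities)` on the `entities` list produced by the README example.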