---
language:
- en
license: cc-by-nc-sa-4.0
tags:
- seq2seq
- relation-extraction
- triple-generation
- entity-linking
- entity-type-linking
- relation-linking
datasets: Babelscape/rebel-dataset
widget:
- text: The Italian Space Agency’s Light Italian CubeSat for Imaging of Asteroids,
    or LICIACube, will fly by Dimorphos to capture images and video of the impact
    plume as it sprays up off the asteroid and maybe even spy the crater it could
    leave behind.
model-index:
- name: knowgl
  results:
  - task:
      type: Relation-Extraction
      name: Relation Extraction
    dataset:
      name: Babelscape/rebel-dataset
      type: REBEL
    metrics:
    - type: re+ macro f1
      value: 70.74
      name: RE+ Macro F1
---

# KnowGL: Knowledge Generation and Linking from Text

The `knowgl-large` model is trained by combining Wikidata with an extended version of the training data in the [REBEL](https://huggingface.co/datasets/Babelscape/rebel-dataset) dataset. Given a sentence, KnowGL generates one or more triples in the following format:
```
[(subject mention # subject label # subject type) | relation label | (object mention # object label # object type)]
```
If more than one triple is generated, the triples are separated by `$` in the output. For more details, see [Rossiello et al. (AAAI 2023)](https://arxiv.org/pdf/2210.13952.pdf).
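
The output format described above can be turned into structured records with a small parser. The following is a minimal sketch, not part of the KnowGL release: the names `Entity`, `Triple`, and `parse_knowgl_output` are hypothetical, and the sample string is hand-written for illustration rather than real model output. It assumes that mentions, labels, and types never contain the delimiter characters `#`, `|`, or `$`.

```python
import re
from typing import List, NamedTuple


class Entity(NamedTuple):
    mention: str  # surface form as it appears in the sentence
    label: str    # generated entity label
    type: str     # generated entity type label


class Triple(NamedTuple):
    subject: Entity
    relation: str
    obj: Entity


# One triple: [(mention # label # type) | relation | (mention # label # type)]
_TRIPLE_RE = re.compile(r"\[\((.*?)\)\s*\|\s*(.*?)\s*\|\s*\((.*?)\)\]")


def _parse_entity(text: str) -> Entity:
    """Split 'mention # label # type' into its three fields."""
    parts = [part.strip() for part in text.split("#")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 '#'-separated fields, got: {text!r}")
    return Entity(*parts)


def parse_knowgl_output(output: str) -> List[Triple]:
    """Parse a '$'-separated string of triples; malformed chunks are skipped."""
    triples = []
    for chunk in output.split("$"):
        match = _TRIPLE_RE.fullmatch(chunk.strip())
        if match is None:
            continue  # not in the expected bracketed format
        try:
            subject = _parse_entity(match.group(1))
            obj = _parse_entity(match.group(3))
        except ValueError:
            continue  # entity did not have exactly three fields
        triples.append(Triple(subject, match.group(2).strip(), obj))
    return triples


if __name__ == "__main__":
    # Hand-written illustrative string, NOT actual model output.
    sample = (
        "[(LICIACube # LICIACube # CubeSat) | operator | "
        "(Italian Space Agency # Italian Space Agency # space agency)]"
        "$[(Dimorphos#Dimorphos#asteroid)|instance of|(asteroid#asteroid#class)]"
    )
    for triple in parse_knowgl_output(sample):
        print(triple.subject.label, "->", triple.relation, "->", triple.obj.label)
```

Skipping malformed chunks instead of raising is a design choice for this sketch, since generated text is not guaranteed to be well-formed; a stricter caller may prefer to surface those cases.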

The model achieves state-of-the-art results for relation extraction on the REBEL dataset. See results in [Mihindukulasooriya et al. (ISWC 2022)](https://arxiv.org/pdf/2207.05188.pdf).

The generated labels (for the subject, relation, and object) and their types can be mapped directly to their associated Wikidata IDs.

#### Citation
```bibtex
@inproceedings{DBLP:conf/aaai/RossielloCMCG23,
  author       = {Gaetano Rossiello and
                  Md. Faisal Mahbub Chowdhury and
                  Nandana Mihindukulasooriya and
                  Owen Cornec and
                  Alfio Massimiliano Gliozzo},
  title        = {KnowGL: Knowledge Generation and Linking from Text},
  booktitle    = {{AAAI}},
  pages        = {16476--16478},
  publisher    = {{AAAI} Press},
  year         = {2023}
}
```

```bibtex
@inproceedings{DBLP:conf/semweb/Mihindukulasooriya22,
  author    = {Nandana Mihindukulasooriya and
               Mike Sava and
               Gaetano Rossiello and
               Md. Faisal Mahbub Chowdhury and
               Irene Yachbes and
               Aditya Gidh and
               Jillian Duckwitz and
               Kovit Nisar and
               Michael Santos and
               Alfio Gliozzo},
  title     = {Knowledge Graph Induction Enabling Recommending and Trend Analysis:
               {A} Corporate Research Community Use Case},
  booktitle = {{ISWC}},
  series    = {Lecture Notes in Computer Science},
  volume    = {13489},
  pages     = {827--844},
  publisher = {Springer},
  year      = {2022}
}
```