---
license: mit
language:
- en
widget:
- text: "fish lives in ocean[SEP]"
  example_title: "Mapping1"
- text: "elephant be killed in africa[SEP]"
  example_title: "Mapping2"
- text: "doctor write prescription[SEP]"
  example_title: "Mapping3"
- text: "fish "
  example_title: "KB generation1"
- text: "elephant capable of "
  example_title: "KB generation2"
- text: "doctor at location "
  example_title: "KB generation3"
- text: "Some air pollutants fall to earth in the form of acid rain.[SEP]"
  example_title: "Relation Extraction1"
- text: "Elon Musk Races to Secure Financing for Twitter Bid.[SEP]"
  example_title: "Relation Extraction2"
---

# Quasimodo-GenT LM-based Alignment

This model is trained to translate an open triple (initially from Quasimodo) into a closed triple that uses relations from ConceptNet.
28
+
29
+ ## Model Details
30
+
31
+ ### Model Description
32
+
33
+ <!-- Provide a longer summary of what this model is. -->
34
+
35
+
36
+
37
+ - **Developed by: Julien Romero
38
+ - **Model type: GPT2
39
+ - **Language(s) (NLP): English
40
+ - **Finetuned from model: gpt2-large
41
+

### Model Sources

- **Repository:** [https://github.com/Aunsiels/GenT](https://github.com/Aunsiels/GenT)
- **Paper:** [https://arxiv.org/pdf/2306.12766.pdf](https://arxiv.org/pdf/2306.12766.pdf)

## Uses

We observed good results with beam search decoding; other decoding strategies may be less suitable.

### Direct Use

Give the open triple as subject, predicate, and object separated by a tab character, followed by `[SEP]`. Examples:

```
fish lives in ocean[SEP]
elephant be killed in africa[SEP]
doctor write prescription[SEP]
```
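
For programmatic use, the prompt above can be assembled with a small helper. This is a minimal sketch with a hypothetical function name; note that the widget examples show spaces between the fields, while the description specifies tab separation, so adjust the separator to match the format the model was trained on.

```python
def make_prompt(subject: str, predicate: str, obj: str) -> str:
    """Build a mapping prompt: a tab-separated open triple ending in [SEP].

    Hypothetical helper; the tab separator follows the description above.
    """
    return f"{subject}\t{predicate}\t{obj}[SEP]"


# The resulting string is what you feed to the tokenizer as the model input.
print(make_prompt("fish", "lives in", "ocean"))
```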

### From Subject/Subject-Predicate

It is also possible to give only a subject or a subject-predicate pair to generate a knowledge base directly. In this case, the output must be parsed to recover the generated triples. Examples:

```
fish 
elephant capable of 
doctor at location 
```
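
Once decoded, the generated text has to be split back into triples. The sketch below assumes the model emits tab-separated closed triples, each terminated by `[SEP]`; the function name and the exact output format are assumptions, so adapt the parsing to what the model actually produces.

```python
def parse_triples(generated: str):
    """Split generated text into (subject, relation, object) tuples.

    Assumes each closed triple is tab-separated and terminated by "[SEP]";
    chunks that do not contain exactly three fields are discarded.
    """
    triples = []
    for chunk in generated.split("[SEP]"):
        fields = [f.strip() for f in chunk.split("\t") if f.strip()]
        if len(fields) == 3:
            triples.append(tuple(fields))
    return triples


# Synthetic example of what a decoded beam might look like.
sample = "elephant\tCapableOf\tremember[SEP]elephant\tAtLocation\tafrica[SEP]"
print(parse_triples(sample))
```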

### From Text

When given raw text as input, this model can behave like a relation extractor, although it was not trained on this task. Examples:

```
Some air pollutants fall to earth in the form of acid rain.[SEP]
Elon Musk Races to Secure Financing for Twitter Bid.[SEP]
```

## Citation

**BibTeX:**

```
@InProceedings{10.1007/978-3-031-47240-4_20,
  author="Romero, Julien
  and Razniewski, Simon",
  editor="Payne, Terry R.
  and Presutti, Valentina
  and Qi, Guilin
  and Poveda-Villal{\'o}n, Mar{\'i}a
  and Stoilos, Giorgos
  and Hollink, Laura
  and Kaoudi, Zoi
  and Cheng, Gong
  and Li, Juanzi",
  title="Mapping and Cleaning Open Commonsense Knowledge Bases with Generative Translation",
  booktitle="The Semantic Web -- ISWC 2023",
  year="2023",
  publisher="Springer Nature Switzerland",
  address="Cham",
  pages="368--387",
  abstract="Structured knowledge bases (KBs) are the backbone of many knowledge-intensive applications, and their automated construction has received considerable attention. In particular, open information extraction (OpenIE) is often used to induce structure from a text. However, although it allows high recall, the extracted knowledge tends to inherit noise from the sources and the OpenIE algorithm. Besides, OpenIE tuples contain an open-ended, non-canonicalized set of relations, making the extracted knowledge's downstream exploitation harder. In this paper, we study the problem of mapping an open KB into the fixed schema of an existing KB, specifically for the case of commonsense knowledge. We propose approaching the problem by generative translation, i.e., by training a language model to generate fixed-schema assertions from open ones. Experiments show that this approach occupies a sweet spot between traditional manual, rule-based, or classification-based canonicalization and purely generative KB construction like COMET. Moreover, it produces higher mapping accuracy than the former while avoiding the association-based noise of the latter. Code and data are available. (https://github.com/Aunsiels/GenT, julienromero.fr/data/GenT)",
  isbn="978-3-031-47240-4"
}
```