arXiv:2302.14785

Joint Representations of Text and Knowledge Graphs for Retrieval and Evaluation

Published on Feb 28, 2023

Abstract

A key feature of neural models is that they can produce semantic vector representations of objects (texts, images, speech, etc.) ensuring that similar objects are close to each other in the vector space. While much work has focused on learning representations for other modalities, there are no aligned cross-modal representations for text and knowledge base (KB) elements. One challenge for learning such representations is the lack of parallel data, which we overcome through contrastive training on heuristics-based datasets and data augmentation, training embedding models on (KB graph, text) pairs. On WebNLG, a cleaner, manually crafted dataset, we show that these models learn aligned representations suitable for retrieval. We then fine-tune on annotated data to create EREDAT (Ensembled Representations for Evaluation of DAta-to-Text), a similarity metric between English text and KB graphs. EREDAT outperforms or matches state-of-the-art metrics in terms of correlation with human judgments on WebNLG even though, unlike them, it does not require a reference text to compare against.
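
As a rough illustration of the approach described in the abstract (not the paper's actual code), the sketch below shows in-batch contrastive training over paired graph and text embeddings, plus a cosine-based similarity score. The encoders, the graph linearization, and the temperature value are assumptions; the paper's method is only summarized here as two encoders whose matching (KB graph, text) outputs are pulled together while the other pairs in the batch act as negatives.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(graph_emb: torch.Tensor, text_emb: torch.Tensor,
                     temperature: float = 0.05) -> torch.Tensor:
    """Symmetric in-batch contrastive (InfoNCE-style) loss for (KB graph, text) pairs.

    graph_emb, text_emb: [batch, dim] embeddings; row i of each tensor is a matching pair,
    and all other rows in the batch serve as negatives. Hypothetical sketch, not the paper's code.
    """
    graph_emb = F.normalize(graph_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    # Cosine-similarity matrix scaled by a temperature (value assumed, not from the paper).
    logits = graph_emb @ text_emb.T / temperature
    targets = torch.arange(logits.size(0), device=logits.device)
    # Average the graph-to-text and text-to-graph directions.
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets)) / 2

def similarity_score(graph_emb: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
    """Reference-free similarity between a KB graph and a candidate text: cosine of their embeddings."""
    return F.cosine_similarity(graph_emb, text_emb, dim=-1)
```

Once the two encoders are aligned, the same embeddings support both retrieval (nearest text for a given graph, or vice versa) and a reference-free data-to-text score in the spirit of EREDAT, which compares a candidate text directly against the input graph rather than against a reference text.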
