---
language: ja
thumbnail: https://github.com/studio-ousia/luke/raw/master/resources/luke_logo.png
tags:
  - luke
  - named entity recognition
  - entity typing
  - relation classification
  - question answering
license: apache-2.0
---

# luke-japanese

**luke-japanese** is the Japanese version of [LUKE](https://github.com/studio-ousia/luke) (Language Understanding with Knowledge-based Embeddings), a knowledge-enhanced pre-trained contextualized representation of words and entities based on the transformer architecture. LUKE treats words and entities in a given text as independent tokens and outputs contextualized representations of each. Please refer to our [GitHub repository](https://github.com/studio-ousia/luke) for more details and updates.
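
The card does not include a usage snippet; the sketch below shows how the model could be loaded with the 🤗 Transformers LUKE classes, assuming the checkpoint is available on the Hub as `studio-ousia/luke-japanese-base` and pairs with `MLukeTokenizer` (both assumptions, not stated in this card).

```python
# Minimal usage sketch (not from the original card).
# Assumed Hub id and tokenizer class; adjust if the published checkpoint differs.
import torch
from transformers import MLukeTokenizer, LukeModel

model_name = "studio-ousia/luke-japanese-base"  # assumed Hub id
tokenizer = MLukeTokenizer.from_pretrained(model_name)
model = LukeModel.from_pretrained(model_name)

text = "イチローは愛知県出身の野球選手です。"
# Character span of the entity "イチロー"; marked spans become entity tokens.
entity_spans = [(0, 4)]

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

word_embeddings = outputs.last_hidden_state            # contextualized word representations
entity_embeddings = outputs.entity_last_hidden_state   # contextualized entity representations
print(word_embeddings.shape, entity_embeddings.shape)
```

Passing `entity_spans` is what distinguishes LUKE from a standard encoder: the marked spans are treated as independent entity tokens, and the model returns `entity_last_hidden_state` alongside the usual word representations.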

## Experimental results on JGLUE

The performance of luke-japanese evaluated on the dev set of JGLUE is shown as follows:

| Model | MARC-ja (acc) | JSTS (Pearson/Spearman) | JNLI (acc) | JCommonsenseQA (acc) |
| --- | --- | --- | --- | --- |
| **luke-japanese-base** | 0.963 | 0.912/0.875 | 0.912 | 0.842 |
| *Baselines:* | | | | |
| Tohoku BERT base | 0.958 | 0.899/0.859 | 0.899 | 0.808 |
| NICT BERT base | 0.958 | 0.903/0.867 | 0.902 | 0.823 |
| Waseda RoBERTa base | 0.962 | 0.901/0.865 | 0.895 | 0.840 |
| XLM RoBERTa base | 0.961 | 0.870/0.825 | 0.893 | 0.687 |
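
The numbers above come from task-specific fine-tuning; this card does not describe the training recipe. Purely as an illustrative sketch, a JGLUE text-classification task such as MARC-ja could be fine-tuned with a standard sequence-classification head roughly as follows (the Hub id, dataset column names, and hyperparameters are assumptions, not the setup used for the table):

```python
# Hypothetical fine-tuning sketch for a JGLUE-style classification task (e.g. MARC-ja).
# This is NOT the configuration used to produce the results above.
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TrainingArguments,
    Trainer,
)

model_name = "studio-ousia/luke-japanese-base"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_name)
# MARC-ja is binary sentiment classification over product reviews.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

def preprocess(batch):
    # "sentence" is the assumed text column name in the JGLUE MARC-ja split.
    return tokenizer(batch["sentence"], truncation=True, max_length=512)

args = TrainingArguments(
    output_dir="luke-japanese-base-marc-ja",
    learning_rate=2e-5,
    num_train_epochs=3,
    per_device_train_batch_size=32,
)

# With tokenized JGLUE train/dev splits in hand:
# trainer = Trainer(model=model, args=args, train_dataset=..., eval_dataset=...)
# trainer.train()
```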

## Citation

```bibtex
@inproceedings{yamada2020luke,
  title={LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention},
  author={Ikuya Yamada and Akari Asai and Hiroyuki Shindo and Hideaki Takeda and Yuji Matsumoto},
  booktitle={EMNLP},
  year={2020}
}
```