---
language: ja
thumbnail: https://github.com/studio-ousia/luke/raw/master/resources/luke_logo.png
tags:
  - luke
  - named entity recognition
  - entity typing
  - relation classification
  - question answering
license: apache-2.0
---

# luke-japanese

**luke-japanese** is the Japanese version of [LUKE](https://github.com/studio-ousia/luke) (Language Understanding with Knowledge-based Embeddings), a knowledge-enhanced pre-trained contextualized representation of words and entities based on the transformer architecture. LUKE treats words and entities in a given text as independent tokens and outputs contextualized representations of each. Please refer to our [GitHub repository](https://github.com/studio-ousia/luke) for more details and updates.
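
The card does not include a usage snippet; the sketch below shows how the model could be loaded with the 🤗 Transformers LUKE classes, assuming the checkpoint is available on the Hub as `studio-ousia/luke-japanese-base` and pairs with `MLukeTokenizer` (both assumptions, not stated in this card).

```python
# Minimal usage sketch (not from the original card).
# Assumed Hub id and tokenizer class; adjust if the published checkpoint differs.
import torch
from transformers import MLukeTokenizer, LukeModel

model_name = "studio-ousia/luke-japanese-base"  # assumed Hub id
tokenizer = MLukeTokenizer.from_pretrained(model_name)
model = LukeModel.from_pretrained(model_name)

text = "イチローは愛知県出身の野球選手です。"
# Character span of the entity "イチロー"; marked spans become entity tokens.
entity_spans = [(0, 4)]

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

word_embeddings = outputs.last_hidden_state            # contextualized word representations
entity_embeddings = outputs.entity_last_hidden_state   # contextualized entity representations
print(word_embeddings.shape, entity_embeddings.shape)
```

Passing `entity_spans` is what distinguishes LUKE from a standard encoder: the marked spans are treated as independent entity tokens, and the model returns `entity_last_hidden_state` alongside the usual word representations.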

## Experimental results on JGLUE

The performance of luke-japanese evaluated on the dev set of JGLUE is shown as follows:

| Model | MARC-ja (acc) | JSTS (Pearson/Spearman) | JNLI (acc) | JCommonsenseQA (acc) |
| --- | --- | --- | --- | --- |
| **luke-japanese-base** | 0.963 | 0.912/0.875 | 0.912 | 0.842 |
| *Baselines:* | | | | |
| Tohoku BERT base | 0.958 | 0.899/0.859 | 0.899 | 0.808 |
| NICT BERT base | 0.958 | 0.903/0.867 | 0.902 | 0.823 |
| Waseda RoBERTa base | 0.962 | 0.901/0.865 | 0.895 | 0.840 |
| XLM RoBERTa base | 0.961 | 0.870/0.825 | 0.893 | 0.687 |
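
The numbers above come from task-specific fine-tuning; this card does not describe the training recipe. Purely as an illustrative sketch, a JGLUE text-classification task such as MARC-ja could be fine-tuned with a standard sequence-classification head roughly as follows (the Hub id, dataset column names, and hyperparameters are assumptions, not the setup used for the table):

```python
# Hypothetical fine-tuning sketch for a JGLUE-style classification task (e.g. MARC-ja).
# This is NOT the configuration used to produce the results above.
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TrainingArguments,
    Trainer,
)

model_name = "studio-ousia/luke-japanese-base"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_name)
# MARC-ja is binary sentiment classification over product reviews.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

def preprocess(batch):
    # "sentence" is the assumed text column name in the JGLUE MARC-ja split.
    return tokenizer(batch["sentence"], truncation=True, max_length=512)

args = TrainingArguments(
    output_dir="luke-japanese-base-marc-ja",
    learning_rate=2e-5,
    num_train_epochs=3,
    per_device_train_batch_size=32,
)

# With tokenized JGLUE train/dev splits in hand:
# trainer = Trainer(model=model, args=args, train_dataset=..., eval_dataset=...)
# trainer.train()
```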

## Citation

```bibtex
@inproceedings{yamada2020luke,
  title={LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention},
  author={Ikuya Yamada and Akari Asai and Hiroyuki Shindo and Hideaki Takeda and Yuji Matsumoto},
  booktitle={EMNLP},
  year={2020}
}
```