Edit model card


luke-japanese is the Japanese version of LUKE (Language Understanding with Knowledge-based Embeddings), a pre-trained knowledge-enhanced contextualized representation of words and entities. LUKE treats words and entities in a given text as independent tokens, and outputs contextualized representations of them. Please refer to our GitHub repository for more details and updates.

This model is a lightweight version which does not contain Wikipedia entity embeddings. Please use the full version for tasks that use Wikipedia entities as inputs.

luke-japaneseは、単語とエンティティの知識拡張型訓練済み Transformer モデルLUKEの日本語版です。LUKE は単語とエンティティを独立したトークンとして扱い、これらの文脈を考慮した表現を出力します。詳細については、GitHub リポジトリを参照してください。

このモデルは、Wikipedia エンティティのエンベディングを含まない軽量版のモデルです。Wikipedia エンティティを入力として使うタスクには、full versionを使用してください。

Experimental results on JGLUE

The experimental results evaluated on the dev set of JGLUE are shown as follows:

Model MARC-ja JSTS JNLI JCommonsenseQA
acc Pearson/Spearman acc acc
LUKE Japanese base 0.965 0.916/0.877 0.912 0.842
Tohoku BERT base 0.958 0.909/0.868 0.899 0.808
NICT BERT base 0.958 0.910/0.871 0.902 0.823
Waseda RoBERTa base 0.962 0.913/0.873 0.895 0.840
XLM RoBERTa base 0.961 0.877/0.831 0.893 0.687

The baseline scores are obtained from here.


  title={LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention},
  author={Ikuya Yamada and Akari Asai and Hiroyuki Shindo and Hideaki Takeda and Yuji Matsumoto},
Downloads last month

Space using studio-ousia/luke-japanese-base-lite 1