---
language: ja
thumbnail: https://github.com/studio-ousia/luke/raw/master/resources/luke_logo.png
tags:
- luke
- named entity recognition
- entity typing
- relation classification
- question answering
license: apache-2.0
---

## luke-japanese

**luke-japanese** is the Japanese version of **LUKE** (**L**anguage **U**nderstanding with **K**nowledge-based **E**mbeddings), a pre-trained _knowledge-enhanced_ contextualized representation of words and entities based on the Transformer. LUKE treats words and entities in a given text as independent tokens and outputs contextualized representations of them. Please refer to our [GitHub repository](https://github.com/studio-ousia/luke) for more details and updates.

### Experimental results on JGLUE

The performance of luke-japanese on the dev set of [JGLUE](https://github.com/yahoojapan/JGLUE) is shown below:

| Model                  | MARC-ja   | JSTS                | JNLI      | JCommonsenseQA |
| ---------------------- | --------- | ------------------- | --------- | -------------- |
|                        | acc       | Pearson/Spearman    | acc       | acc            |
| **luke-japanese-base** | **0.963** | **0.912**/**0.875** | **0.912** | **0.842**      |
| _Baselines:_           |           |                     |           |                |
| Tohoku BERT base       | 0.958     | 0.899/0.859         | 0.899     | 0.808          |
| NICT BERT base         | 0.958     | 0.903/0.867         | 0.902     | 0.823          |
| Waseda RoBERTa base    | 0.962     | 0.901/0.865         | 0.895     | 0.840          |
| XLM RoBERTa base       | 0.961     | 0.870/0.825         | 0.893     | 0.687          |

### Citation

```latex
@inproceedings{yamada2020luke,
  title={LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention},
  author={Ikuya Yamada and Akari Asai and Hiroyuki Shindo and Hideaki Takeda and Yuji Matsumoto},
  booktitle={EMNLP},
  year={2020}
}
```
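
### Example: passing entity spans

Because LUKE treats words and entities as separate tokens, the model needs to be told where the entity mentions are. The sketch below is one possible way to do that with the Hugging Face `transformers` library, not an official usage recipe. It assumes the checkpoint is published on the Hub as `studio-ousia/luke-japanese-base` (this card does not state the ID) and that its tokenizer accepts `entity_spans` as character offsets, as the LUKE tokenizers in `transformers` do. The helper names `find_entity_spans` and `encode_with_entities` are hypothetical and exist only for illustration.

```python
from typing import List, Tuple

# Assumption: the Hub ID is not given in this card; adjust it if it differs.
ASSUMED_MODEL_ID = "studio-ousia/luke-japanese-base"


def find_entity_spans(text: str, mentions: List[str]) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets of the first occurrence of each mention."""
    spans = []
    for mention in mentions:
        start = text.find(mention)
        if start == -1:
            raise ValueError(f"mention not found in text: {mention!r}")
        spans.append((start, start + len(mention)))
    return spans


def encode_with_entities(text: str, mentions: List[str], model_id: str = ASSUMED_MODEL_ID):
    """Encode text plus entity spans; returns (word states, entity states).

    Imports are local so that the span helper above works without
    transformers or torch installed. Calling this downloads the weights.
    """
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    spans = find_entity_spans(text, mentions)
    inputs = tokenizer(text, entity_spans=spans, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # One contextualized vector per word token and one per entity span.
    return outputs.last_hidden_state, outputs.entity_last_hidden_state


text = "東京は日本の首都です。"
spans = find_entity_spans(text, ["東京", "日本"])
print(spans)  # [(0, 2), (3, 5)]
```

Calling `encode_with_entities(text, ["東京", "日本"])` should return the word-level and entity-level hidden states, provided `transformers`, `torch`, and `sentencepiece` are installed and the assumed model ID resolves. Character offsets are used rather than token indices so that the spans do not depend on how the tokenizer splits the text.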