---
language: 
- multilingual
- ar 
- bn 
- de 
- el 
- en
- es
- fi
- fr
- hi
- id 
- it
- ja
- ko
- nl
- pl
- pt
- ru
- sv
- sw
- te
- th
- tr
- vi
- zh
thumbnail: https://github.com/studio-ousia/luke/raw/master/resources/luke_logo.png
tags:
  - luke
  - named entity recognition
  - relation classification
  - question answering
license: apache-2.0
---

## mLUKE

**mLUKE** (multilingual LUKE) is a multilingual extension of LUKE.

Please check the [official repository](https://github.com/studio-ousia/luke) for
more details and updates.

This is the mLUKE base model with 12 hidden layers and a hidden size of 768. The total number
of parameters in this model is 585M (278M for the word embeddings and encoder, 307M for the entity embeddings).
The model was initialized with the weights of XLM-RoBERTa (base) and trained on the December 2020 version of Wikipedia in 24 languages.
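
To get a feel for the inputs and outputs, here is a minimal loading sketch with 🤗 Transformers (the checkpoint name is the one shown in the warning below; the example sentence, entity span, and variable names are purely illustrative):

```python
from transformers import AutoModel, MLukeTokenizer

# Checkpoint name taken from the warning in the note below (assumption: same checkpoint).
model_name = "studio-ousia/mluke-base-lite"
tokenizer = MLukeTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Illustrative sentence; the character span (0, 5) marks "Tokyo" as an entity mention.
text = "Tokyo is the capital of Japan."
inputs = tokenizer(text, entity_spans=[(0, 5)], return_tensors="pt")

outputs = model(**inputs)
word_states = outputs.last_hidden_state            # word-token representations, shape (1, seq_len, 768)
entity_states = outputs.entity_last_hidden_state   # entity representation, shape (1, 1, 768)
```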


## Note
When you load the model with `AutoModel.from_pretrained` using the default configuration, you will see the following warning:

```
Some weights of the model checkpoint at studio-ousia/mluke-base-lite were not used when initializing LukeModel: [
'luke.encoder.layer.0.attention.self.w2e_query.weight', 'luke.encoder.layer.0.attention.self.w2e_query.bias', 
'luke.encoder.layer.0.attention.self.e2w_query.weight', 'luke.encoder.layer.0.attention.self.e2w_query.bias', 
'luke.encoder.layer.0.attention.self.e2e_query.weight', 'luke.encoder.layer.0.attention.self.e2e_query.bias', 
...]
```

These are the weights for entity-aware attention (described in [the LUKE paper](https://arxiv.org/abs/2010.01057)).
The warning is expected: `use_entity_aware_attention` is set to `false` by default, but the checkpoint still contains these weights so that they can be loaded if you enable `use_entity_aware_attention`.
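
If you do want the entity-aware attention weights to be used, one possible approach (a sketch, not an official recipe) is to load the configuration, enable the flag, and pass the modified configuration back to `from_pretrained`:

```python
from transformers import AutoConfig, AutoModel

# Load the default configuration, enable entity-aware attention, and reload the
# checkpoint with that configuration so the extra attention weights are picked up.
config = AutoConfig.from_pretrained("studio-ousia/mluke-base-lite")
config.use_entity_aware_attention = True
model = AutoModel.from_pretrained("studio-ousia/mluke-base-lite", config=config)
```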

### Citation

If you find mLUKE useful for your work, please cite the following paper:

```latex
@inproceedings{ri-etal-2022-mluke,
    title = "m{LUKE}: {T}he Power of Entity Representations in Multilingual Pretrained Language Models",
    author = "Ri, Ryokan  and
      Yamada, Ikuya  and
      Tsuruoka, Yoshimasa",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    year = "2022",
    url = "https://aclanthology.org/2022.acl-long.505",
}
```