Update README.md
README.md
@@ -47,6 +47,21 @@ The model was initialized with the weights of XLM-RoBERTa(large) and trained usi
This model is a lite-weight version of [studio-ousia/mluke-large](https://huggingface.co/studio-ousia/mluke-large), without Wikipedia entity embeddings but only with special entities such as `[MASK]`.

## Note
When you load the model with `AutoModel.from_pretrained` and the default configuration, you will see the following warning:

```
Some weights of the model checkpoint at studio-ousia/mluke-base-lite were not used when initializing LukeModel: [
'luke.encoder.layer.0.attention.self.w2e_query.weight', 'luke.encoder.layer.0.attention.self.w2e_query.bias',
'luke.encoder.layer.0.attention.self.e2w_query.weight', 'luke.encoder.layer.0.attention.self.e2w_query.bias',
'luke.encoder.layer.0.attention.self.e2e_query.weight', 'luke.encoder.layer.0.attention.self.e2e_query.bias',
...]
```

These are the weights for entity-aware attention (as described in [the LUKE paper](https://arxiv.org/abs/2010.01057)).
This is expected: `use_entity_aware_attention` is set to `false` by default, so the model does not create these layers, but the checkpoint still includes their weights so that they can be loaded into the model if you enable `use_entity_aware_attention`.
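The effect of the flag can be sketched without downloading anything. The example below is a minimal illustration, not the authors' code: it builds a tiny, randomly initialized `LukeModel` from an arbitrary small `LukeConfig` and lists the entity-aware attention parameters it contains. The helper names `entity_attention_parameter_names` and `load_with_entity_aware_attention`, and all the config sizes, are assumptions made for illustration.

```python
from typing import List

from transformers import AutoConfig, AutoModel, LukeConfig, LukeModel

# Name fragments of the extra query projections used by entity-aware attention.
ENTITY_AWARE_QUERIES = ("w2e_query", "e2w_query", "e2e_query")


def entity_attention_parameter_names(use_entity_aware_attention: bool) -> List[str]:
    """Hypothetical helper: list entity-aware attention parameters of a tiny LukeModel.

    The model is randomly initialized from a small, arbitrary config,
    so no checkpoint is downloaded.
    """
    config = LukeConfig(
        vocab_size=100,
        entity_vocab_size=10,
        hidden_size=32,
        entity_emb_size=16,
        num_hidden_layers=1,
        num_attention_heads=2,
        intermediate_size=64,
        max_position_embeddings=64,
        use_entity_aware_attention=use_entity_aware_attention,
    )
    model = LukeModel(config)
    return [
        name
        for name in model.state_dict()
        if any(fragment in name for fragment in ENTITY_AWARE_QUERIES)
    ]


def load_with_entity_aware_attention(model_name: str):
    """Hypothetical helper: load a checkpoint with entity-aware attention enabled.

    This downloads the checkpoint, so it is defined here but not called.
    """
    config = AutoConfig.from_pretrained(model_name)
    config.use_entity_aware_attention = True
    return AutoModel.from_pretrained(model_name, config=config)
```

With the flag off, `entity_attention_parameter_names(False)` finds no such parameters, which is why the checkpoint's copies are reported as unused. Calling `load_with_entity_aware_attention("studio-ousia/mluke-base-lite")` should instead create those layers so the checkpoint weights have somewhere to load.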

### Citation

If you find mLUKE useful for your work, please cite the following paper: