fenchri committed
Commit 2dce6c2
1 Parent(s): 45cd7b6

Update README.md

Files changed (1)
  1. README.md +3 -4
README.md CHANGED
@@ -58,7 +58,7 @@ To train models on the corpus, we first employ the conventional 80-10-10 MLM obj
  with [MASK] 80% of the time, with Random subwords (from the entire vocabulary) 10% of the time, and leave the remaining 10% unchanged (Same).

  To integrate entity-level cross-lingual knowledge into the model, we propose Entity Prediction objectives, where we only mask subwords belonging
- to an entity. By predicting the masked entities in ENTITYCS sentences, we expect the model to capture the semantics of the same entity in different
+ to an entity. By predicting the masked entities in EntityCS sentences, we expect the model to capture the semantics of the same entity in different
  languages.
  Two different masking strategies are proposed for predicting entities: Whole Entity Prediction (`WEP`) and Partial Entity Prediction (`PEP`).

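To make the masking scheme above concrete, here is a minimal sketch (assumed, illustrative code, not taken from the EntityCS repository; all function and variable names are invented) of selecting entity subwords as masking candidates and applying the 80-10-10 corruption:

```python
import random

def corrupt_80_10_10(tokens, candidate_positions, vocab, mask_token="[MASK]"):
    """Conventional MLM corruption over the selected candidate positions:
    80% mask token, 10% Random subword from the entire vocabulary, 10% unchanged (Same)."""
    corrupted = list(tokens)
    for pos in candidate_positions:
        r = random.random()
        if r < 0.8:
            corrupted[pos] = mask_token
        elif r < 0.9:
            corrupted[pos] = random.choice(vocab)  # Random subword
        # else: leave the subword unchanged (Same)
    return corrupted

def entity_candidates(entity_spans):
    """Entity Prediction: masking candidates are only the subwords belonging to entities.
    `entity_spans` is a list of (start, end) subword index ranges, end exclusive."""
    positions = []
    for start, end in entity_spans:
        positions.extend(range(start, end))
    return positions

# Example: a sentence whose entity subwords occupy positions 4-5.
tokens = ["The", "film", "premiered", "in", "_Ber", "lin", "."]
spans = [(4, 6)]
print(corrupt_80_10_10(tokens, entity_candidates(spans), vocab=["_a", "_the", "_city"]))
```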
@@ -78,8 +78,7 @@ setting, PEP<sub>MS</sub>, we remove the 10% Random subwords substitution, i.e.
  subwords and 10% Same subwords from the masking candidates. In the third setting, PEP<sub>M</sub>, we
  further remove the 10% Same subwords prediction, essentially predicting only the masked subwords.

- Prior work has proven it is effective to combine
- Entity Prediction with MLM for cross-lingual transfer ([Jiang et al., 2020](https://aclanthology.org/2020.emnlp-main.479/)), therefore we investigate the
+ Prior work has proven it is effective to combine Entity Prediction with MLM for cross-lingual transfer ([Jiang et al., 2020](https://aclanthology.org/2020.emnlp-main.479/)), therefore we investigate the
  combination of the Entity Prediction objectives together with MLM on non-entity subwords. Specifically, when combined with MLM, we lower the
  entity masking probability (p) to 50% to roughly keep the same overall masking percentage.
  This results in the following objectives: WEP + MLM, PEP<sub>MRS</sub> + MLM, PEP<sub>MS</sub> + MLM, PEP<sub>M</sub> + MLM.
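As a compact summary of the PEP variants and the MLM combination described above, the following sketch (illustrative constants only, not the repository's configuration format) records which replacement types each setting keeps and the lowered entity masking probability used alongside MLM:

```python
# Which replacement types are applied to the selected entity subwords in each PEP variant.
# Only the presence or absence of each replacement type is encoded here; the exact probability
# split per variant follows the description above.
PEP_VARIANTS = {
    "PEP_MRS": {"mask": True,  "random": True,  "same": True},   # Mask / Random / Same kept
    "PEP_MS":  {"mask": True,  "random": False, "same": True},   # Random substitution removed
    "PEP_M":   {"mask": True,  "random": False, "same": False},  # only masked subwords are predicted
}

# When an Entity Prediction objective is combined with MLM on non-entity subwords,
# the entity masking probability p is lowered to 50% to keep the overall masking rate similar.
ENTITY_MASKING_PROB_WITH_MLM = 0.5

COMBINED_OBJECTIVES = ["WEP + MLM", "PEP_MRS + MLM", "PEP_MS + MLM", "PEP_M + MLM"]
```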
@@ -112,7 +111,7 @@ For results on each downstream task, please refer to the [paper](https://aclanth

  ## How to Get Started with the Model

- Use the code below to get started with the model: https://github.com/huawei-noah/noah-research/tree/master/NLP/EntityCS
+ Use the code below to get started with training: https://github.com/huawei-noah/noah-research/tree/master/NLP/EntityCS

  ## Citation
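
The diff above only links to the training code, so a minimal loading sketch may be useful. It relies on the standard `transformers` masked-LM API, and the checkpoint id below is a placeholder, since the actual model id of this card is not stated in the diff:

```python
# Minimal usage sketch; "your-org/your-entitycs-checkpoint" is a placeholder model id,
# not a name confirmed by this model card.
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "your-org/your-entitycs-checkpoint"  # replace with the id of this model card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

text = f"Paris is the capital of {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")
logits = model(**inputs).logits  # (batch, sequence_length, vocab_size)
```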