metadata
license: apache-2.0
language:
- ja
Leia-Swallow-7B
LEIA is a training technique for autoregressive LLMs that improves their performance in languages other than English by enhancing cross-lingual knowledge transfer from English to a target language. This model was built by applying LEIA to Swallow, a Japanese-English bilingual LLM based on LLaMA 2. It scores higher than the base Swallow model on all six Japanese question-answering benchmarks reported below.
Please refer to our paper or blog post (in Japanese) for further technical details.
- LEIA: Facilitating Cross-Lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation (arxiv.org)
- LEIA: 言語間転移学習でLLMを賢くする新しい方法 ("LEIA: A New Method for Making LLMs Smarter with Cross-Lingual Transfer Learning", in Japanese) (zenn.dev)
Model List
Empirical Results
The model is evaluated on the following six question-answering benchmarks:
- X-CODAH
- X-CSQA
- JCommonsenseQA
- NIILC
- JEMHopQA
- JAQKET v2
Model | X-CODAH | X-CSQA | JCommonsenseQA | NIILC | JEMHopQA | JAQKET v2 |
---|---|---|---|---|---|---|
Swallow | 42.0 | 41.0 | 80.3 | 59.5 | 50.8 | 86.2 |
LEIA | 42.7 | 42.4 | 80.6 | 60.3 | 54.7 | 86.5 |
For further details of this experiment, please refer to our paper.
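To make the size of the gains easier to read, the short sketch below computes the per-benchmark difference between the two rows of the table. The scores are copied from the table; the helper name `score_deltas` and the unweighted mean across benchmarks are illustrative assumptions, not figures reported by the authors.

```python
# Scores copied from the results table above, in column order.
BENCHMARKS = ["X-CODAH", "X-CSQA", "JCommonsenseQA", "NIILC", "JEMHopQA", "JAQKET v2"]
SWALLOW = [42.0, 41.0, 80.3, 59.5, 50.8, 86.2]
LEIA = [42.7, 42.4, 80.6, 60.3, 54.7, 86.5]


def score_deltas(baseline, improved):
    """Return improved - baseline per benchmark, rounded to one decimal place."""
    return [round(b - a, 1) for a, b in zip(baseline, improved)]


deltas = dict(zip(BENCHMARKS, score_deltas(SWALLOW, LEIA)))

# Unweighted mean of the differences (an illustrative summary only;
# the benchmarks may use different metrics, so treat it with care).
mean_delta = round(sum(deltas.values()) / len(deltas), 2)

for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f}")
print(f"mean: {mean_delta:+.2f}")
```

By this calculation the largest gain is on JEMHopQA (+3.9 points), while the gains on JCommonsenseQA and JAQKET v2 are the smallest (+0.3 each).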
Contributors
- Ikuya Yamada (Studio Ousia, RIKEN)
- Ryokan Ri (LY Corporation, SB Intuitions)