bingo123122121 committed on
Commit a845bf4 · verified · 1 Parent(s): 2dbd9f4

Upload README.md

Files changed (1)
  1. README.md +41 -3
README.md CHANGED
@@ -1,3 +1,41 @@
- ---
- license: apache-2.0
- ---
+ # 💡Model Description
+
+ Official model repository for our **ACL 2026 Main Conference** paper "*Language on Demand, Knowledge at Core*: Composing LLMs with Encoder-Decoder Translation Models for Extensible Multilinguality".
+
+ ## ✨XBridge-base
+
+ [`XBridge-base`](https://huggingface.co/ICTNLP/XBridge-base) is trained with stage 1 (cross-model alignment) on trilingual translation data, composing [`LLaMA3-8B`](https://huggingface.co/meta-llama/Meta-Llama-3-8B) with [`NLLB-200-1.3B`](https://huggingface.co/facebook/nllb-200-1.3B). Training covers 10 languages:
+
+ > Bn, De, En, Es, Fr, Ja, Ru, Sw, Th, Zh
+
+ Despite training on this limited set of languages, our analysis shows that **stage 1 learns a language-agnostic cross-model alignment**, which generalizes well beyond the seen languages.
+
+ ## ✨XBridge-SFT
+
+ [`XBridge-SFT`](https://huggingface.co/ICTNLP/XBridge-SFT) further extends `XBridge-base` by training stage 2 (encoder-side adaptation) and stage 3 (decoder-side adaptation) for instruction-following tasks. Notably, we scale directly to 50 languages in these stages, a design motivated by our finding of cross-model generalization. We train on the multilingual instruction-following dataset [`Bactrian-X`](https://huggingface.co/datasets/MBZUAI/Bactrian-X) and expand to the following additional languages:
+
+ > Af, Ar, Az, Cs, El, Et, Fa, Fi, Gl, Gu, He, Hi, Hr, Id, It, Ka, Kk, Km, Lt, Lv, Mk, Ml, Mn, Mr, My, Ne, Nl, Pl, Ps, Pt, Ro, Sl, Sv, Ta, Te, Tr, Uk, Ur, Vi, Xh
+
+ Empirically, this direct scaling strategy achieves strong performance, demonstrating the robustness and generalization ability of the stage 1 alignment.
+
+ See our [paper](https://arxiv.org/abs/2603.17512) for more details, and try our Gradio demo in the [GitHub repository](https://github.com/ictnlp/XBridge)!
+
+ # 📚Citation
+
+ If you find this model or our work useful, please cite:
+
+ ```bibtex
+ @misc{bu2026languagedemandknowledgecore,
+       title={Language on Demand, Knowledge at Core: Composing LLMs with Encoder-Decoder Translation Models for Extensible Multilinguality},
+       author={Mengyu Bu and Yang Feng},
+       year={2026},
+       eprint={2603.17512},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL},
+       url={https://arxiv.org/abs/2603.17512},
+ }
+ ```
+
+ # 📮Contact
+
+ For questions, please contact: `bumengyu23z@ict.ac.cn`