
Model Sources

Paper: https://arxiv.org/abs/2407.05975

Model Description

🔥 LLaMAX-7B-X-XNLI is an NLI model with multilingual capability, obtained by fully fine-tuning the powerful multilingual model LLaMAX-7B on the MultiNLI dataset.

🔥 Compared with fine-tuning Llama-2 under the same setting, LLaMAX-7B-X-XNLI improves accuracy on the XNLI benchmark by 5.6 points on average.

Experiments

| XNLI | Avg. | Sw | Ur | Hi | Th | Ar | Tr | El | Vi | Zh | Ru | Bg | De | Fr | Es | En |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Llama2-7B-X-XNLI | 70.6 | 44.6 | 55.1 | 62.2 | 58.4 | 64.7 | 64.9 | 65.6 | 75.4 | 75.9 | 78.9 | 78.6 | 80.7 | 81.7 | 83.1 | 89.5 |
| LLaMAX-7B-X-XNLI | 76.2 | 66.7 | 65.3 | 69.1 | 66.2 | 73.6 | 71.8 | 74.3 | 77.4 | 78.3 | 80.3 | 81.6 | 82.2 | 83.0 | 84.1 | 89.7 |
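
As a rough sketch of how such per-language accuracies can be reproduced, the snippet below scores the model on one XNLI language split using the prompt format from the usage example below. The label wording (Entailment / Neutral / Contradiction), greedy decoding, and the max_new_tokens budget are assumptions for illustration, not the authors' exact evaluation protocol.

# Hedged evaluation sketch; the prompt format follows the usage example,
# while the label wording and decoding settings are assumptions.
from datasets import load_dataset
from transformers import AutoTokenizer, LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained(PATH_TO_CONVERTED_WEIGHTS)     # placeholder path
tokenizer = AutoTokenizer.from_pretrained(PATH_TO_CONVERTED_TOKENIZER)  # placeholder path

labels = ["Entailment", "Neutral", "Contradiction"]  # XNLI label ids 0/1/2
data = load_dataset("xnli", "sw", split="test")      # one language split (Swahili)

correct = 0
for ex in data:
    query = f"Premise: {ex['premise']} Hypothesis: {ex['hypothesis']} Label:"
    inputs = tokenizer(query, return_tensors="pt")
    out = model.generate(inputs.input_ids, max_new_tokens=5)
    # Decode only the newly generated tokens, not the echoed prompt.
    pred = tokenizer.decode(out[0, inputs.input_ids.shape[1]:], skip_special_tokens=True)
    correct += pred.strip().startswith(labels[ex["label"]])
print(f"xnli-sw accuracy: {correct / len(data):.3f}")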

Model Usage

Code Example:

from transformers import AutoTokenizer, LlamaForCausalLM

# Replace the placeholders with the local paths (or Hub id) of the model and tokenizer.
model = LlamaForCausalLM.from_pretrained(PATH_TO_CONVERTED_WEIGHTS)
tokenizer = AutoTokenizer.from_pretrained(PATH_TO_CONVERTED_TOKENIZER)

# XNLI-style prompt: a premise, a hypothesis, and a trailing "Label:" cue.
query = "Premise: She doesn’t really understand. Hypothesis: Actually, she doesn’t get it. Label:"
inputs = tokenizer(query, return_tensors="pt")

generate_ids = model.generate(inputs.input_ids, max_length=30)
tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
# => the prompt followed by the predicted label, e.g. "... Label: Entailment"
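
Note that batch_decode returns the prompt together with the generated continuation. As a small sketch (not part of the original example), the prediction can be isolated by decoding only the tokens produced after the prompt:

# Keep only the tokens generated after the prompt, then decode them.
new_tokens = generate_ids[0, inputs.input_ids.shape[1]:]
label = tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
print(label)  # e.g. Entailment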

Citation

If our model helps your work, please cite this paper:

@misc{lu2024llamaxscalinglinguistichorizons,
      title={LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages}, 
      author={Yinquan Lu and Wenhao Zhu and Lei Li and Yu Qiao and Fei Yuan},
      year={2024},
      eprint={2407.05975},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2407.05975}, 
}