LLaMAX2-7B-X-CSQA / README.md
metadata
tags:
  - Multilingual

Model Sources

Model Description

🔥 LLaMAX-7B-X-CSQA is a commonsense reasoning model with multilingual capability. It is obtained by fully fine-tuning the multilingual model LLaMAX-7B on five English commonsense reasoning datasets: X-CSQA, ARC-Easy, ARC-Challenge, OpenBookQA, and QASC.

🔥 Compared with Llama-2 fine-tuned under the same setting, LLaMAX-7B-X-CSQA improves average accuracy on the X-CSQA dataset by 4.2 points (from 50.9 to 55.1).

Experiments

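| X-CSQA | Avg. | Sw | Ur | Hi | Ar | Vi | Ja | Pl | Zh | Nl | Ru | It | De | Pt | Fr | Es | En |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Llama2-7B-X-CSQA | 50.9 | 23.2 | 24.7 | 32.9 | 32.4 | 51.0 | 50.0 | 51.5 | 55.6 | 56.9 | 55.8 | 58.8 | 59.9 | 60.4 | 61.8 | 61.9 | 78.1 |
| LLaMAX-7B-X-CSQA | 55.1 | 43.5 | 39.0 | 44.1 | 45.1 | 54.0 | 49.9 | 54.6 | 58.2 | 58.9 | 57.1 | 59.1 | 59.0 | 60.9 | 61.6 | 62.7 | 74.0 |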
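
The averages and the 4.2-point gain can be checked directly from the per-language scores. The sketch below only re-enters the numbers from the table above; the variable names are illustrative and not part of the released code.

```python
# Per-language X-CSQA accuracies copied from the table above.
langs = ["Sw", "Ur", "Hi", "Ar", "Vi", "Ja", "Pl", "Zh",
         "Nl", "Ru", "It", "De", "Pt", "Fr", "Es", "En"]
llama2 = [23.2, 24.7, 32.9, 32.4, 51.0, 50.0, 51.5, 55.6,
          56.9, 55.8, 58.8, 59.9, 60.4, 61.8, 61.9, 78.1]
llamax = [43.5, 39.0, 44.1, 45.1, 54.0, 49.9, 54.6, 58.2,
          58.9, 57.1, 59.1, 59.0, 60.9, 61.6, 62.7, 74.0]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


avg_llama2 = mean(llama2)
avg_llamax = mean(llamax)
gain = avg_llamax - avg_llama2

# Per-language difference (LLaMAX minus Llama2), rounded to one decimal.
deltas = {lang: round(b - a, 1) for lang, a, b in zip(langs, llama2, llamax)}
best = max(deltas, key=deltas.get)
worst = min(deltas, key=deltas.get)

print(f"Llama2 avg: {avg_llama2:.1f}, LLaMAX avg: {avg_llamax:.1f}, gain: {gain:.1f}")
print(f"Largest gain: {best} ({deltas[best]:+.1f}), largest drop: {worst} ({deltas[worst]:+.1f})")
```

The recomputed averages match the Avg. column. Most of the gain comes from the lower-resource languages (Sw, Ur, Hi, Ar), while the English score is 4.1 points lower than the Llama2 baseline.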

Model Usage

Code Example:

from transformers import AutoTokenizer, LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained(PATH_TO_CONVERTED_WEIGHTS)
tokenizer = AutoTokenizer.from_pretrained(PATH_TO_CONVERTED_TOKENIZER)

query = "What is someone operating a vehicle likely to be accused of after becoming inebriated? \n Options: A.punish \t B. arrest \t C. automobile accidents \t D. talking nonsense \t E.drunk driving \n Answer:"
inputs = tokenizer(query, return_tensors="pt")

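# max_length also counts the prompt tokens, and this prompt is longer than 30 tokens,
# so the number of newly generated tokens is bounded instead.
generate_ids = model.generate(inputs.input_ids, max_new_tokens=30)
tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
# => the prompt followed by the predicted answer (E for this example)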
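
For evaluating many questions, the prompt layout used above can be assembled programmatically and the predicted option pulled out of the decoded text. This is a minimal sketch: `build_prompt` and `extract_answer` are hypothetical helpers, not part of the released code, and the uniform `A. text` spacing is an assumption, since the example prompt above is not consistent about the space after each label.

```python
import re

LABELS = "ABCDE"  # assumption: at most five options, as in the example above


def build_prompt(question, options):
    """Format a question and its options in the layout of the example prompt.

    Hypothetical helper; the exact spacing used during fine-tuning is not
    documented here, so treat this layout as an approximation.
    """
    if len(options) > len(LABELS):
        raise ValueError(f"expected at most {len(LABELS)} options")
    formatted = " \t ".join(
        f"{label}. {text}" for label, text in zip(LABELS, options)
    )
    return f"{question} \n Options: {formatted} \n Answer:"


def extract_answer(decoded_text, prompt):
    """Return the first standalone option letter after the prompt, or None."""
    # Decoded output of a causal LM normally starts with the prompt itself.
    if decoded_text.startswith(prompt):
        completion = decoded_text[len(prompt):]
    else:
        completion = decoded_text
    match = re.search(rf"\b([{LABELS}])\b", completion)
    return match.group(1) if match else None


prompt = build_prompt(
    "What is someone operating a vehicle likely to be accused of after becoming inebriated?",
    ["punish", "arrest", "automobile accidents", "talking nonsense", "drunk driving"],
)
# Stand-in for the decoded model output, so the sketch runs without the model.
decoded = prompt + " E"
print(extract_answer(decoded, prompt))
```

In real use, `decoded` would be the string returned by `tokenizer.batch_decode(...)[0]`. If decoding changes whitespace so that the text no longer starts with the prompt, the helper falls back to searching the whole string, which can pick up an option label from the prompt, so slicing off the prompt tokens before decoding is the safer route.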

Citation

If our model helps your work, please cite this paper:

@misc{lu2024llamaxscalinglinguistichorizons,
      title={LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages}, 
      author={Yinquan Lu and Wenhao Zhu and Lei Li and Yu Qiao and Fei Yuan},
      year={2024},
      eprint={2407.05975},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2407.05975}, 
}