- **Repository**: https://github.com/CONE-MT/LLaMAX/

### Model Description

LLaMAX3-8B is a multilingual base language model, developed through continued pre-training on Llama3, and it supports over 100 languages.

LLaMAX3-8B can serve as a base model for downstream multilingual tasks, but it has no instruction-following capability.

We further fine-tuned LLaMAX3-8B on the Alpaca dataset to enhance its instruction-following capabilities; the resulting model is available at https://huggingface.co/LLaMAX/LLaMAX3-8B-Alpaca.
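Because it is a base model, LLaMAX3-8B is queried with plain text continuation rather than instructions. Below is a minimal sketch using the standard `transformers` API; the repository id follows this model card, while the prompt text and generation length are illustrative assumptions:

```python
from transformers import AutoTokenizer, LlamaForCausalLM

# Load the base model and tokenizer from the Hub (repo id of this model card).
model = LlamaForCausalLM.from_pretrained("LLaMAX/LLaMAX3-8B")
tokenizer = AutoTokenizer.from_pretrained("LLaMAX/LLaMAX3-8B")

# A base model continues text; it will not reliably follow instructions.
text = "Machine translation systems bridge languages by"
inputs = tokenizer(text, return_tensors="pt")

# max_new_tokens=20 is an arbitrary illustrative bound on the continuation.
generate_ids = model.generate(inputs.input_ids, max_new_tokens=20)
print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True)[0])
```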
### 🔥 Effortless Multilingual Translation with a Simple Prompt

LLaMAX supports translation between more than 100 languages, surpassing the performance of similarly scaled LLMs.

```python
def Prompt_template(query, src_language, trg_language):
    # Alpaca-style instruction prompt used for translation.
    instruction = f'Translate the following sentences from {src_language} to {trg_language}.'
    prompt = (
        'Below is an instruction that describes a task, paired with an input that provides further context. '
        'Write a response that appropriately completes the request.\n'
        f'### Instruction:\n{instruction}\n'
        f'### Input:\n{query}\n### Response:'
    )
    return prompt
```
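For example, the template renders a query into the following layout (output shown as comments; the sample query is the one used in the snippet below):

```python
print(Prompt_template("你好，今天是个好日子", 'Chinese', 'English'))
# Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
# ### Instruction:
# Translate the following sentences from Chinese to English.
# ### Input:
# 你好，今天是个好日子
# ### Response:
```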

And then run the following code to execute the translation:

```python
from transformers import AutoTokenizer, LlamaForCausalLM

# Replace the placeholders with your local checkpoint paths or Hub ids.
model = LlamaForCausalLM.from_pretrained(PATH_TO_CONVERTED_WEIGHTS)
tokenizer = AutoTokenizer.from_pretrained(PATH_TO_CONVERTED_TOKENIZER)

query = "你好，今天是个好日子"
prompt = Prompt_template(query, 'Chinese', 'English')
inputs = tokenizer(prompt, return_tensors="pt")

# max_new_tokens bounds the generated continuation; max_length would also
# count the prompt tokens and could cut the response off early.
generate_ids = model.generate(inputs.input_ids, max_new_tokens=30)
# The decoded string contains the prompt followed by the model's response.
tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
# => "Hello, today is a good day"
```
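On a GPU machine, the same snippet runs after moving the model and the tokenized inputs to the device first; this is standard `transformers`/PyTorch usage, not anything LLaMAX-specific:

```python
import torch

# Pick a device and move the already-loaded model onto it.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# BatchEncoding.to() moves all input tensors to the same device as the model.
inputs = tokenizer(prompt, return_tensors="pt").to(device)
generate_ids = model.generate(inputs.input_ids, max_new_tokens=30)
```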

### 🔥 Effective Base Model for Multilingual Task

LLaMAX preserves its efficacy in general tasks while improving performance on multilingual tasks.
We fine-tuned LLaMAX using only the English training set of each downstream task, yet the resulting models also show significant improvements in non-English languages. We provide the fine-tuned LLaMAX models for the following three tasks:

- **Math Reasoning**: https://huggingface.co/LLaMAX/LLaMAX2-7B-MetaMath