---
license: llama3
language:
- zh
- en
pipeline_tag: text-generation
---

# Xiangxin-2XL-Chat-1048k

我们提供私有化模型训练服务。如果您需要训练行业模型、领域模型或者私有模型,请联系我们:customer@xiangxinai.cn

We offer private model training services. If you need to train industry-specific models, domain-specific models, or private models, please contact us at customer@xiangxinai.cn.

# 模型介绍/Introduction

Xiangxin-2XL-Chat-1048k是[象信AI](https://www.xiangxinai.cn)基于Meta Llama-3-70B-Instruct模型和[Gradient AI的上下文扩充工作](https://huggingface.co/gradientai/Llama-3-70B-Instruct-Gradient-1048k),利用自研中文价值观对齐数据集进行ORPO训练而得到的Chat模型。该模型具备更强的中文能力和中文价值观对齐,上下文长度达到1048k token(约100万字)。在ARC、HellaSwag、MMLU、TruthfulQA_mc2、Winogrande、GSM8K_flex、CMMLU、CEVAL-VALID八项测评中,该模型取得了70.22的平均分,超过了Llama-3-70B-Instruct-Gradient-1048k。我们的训练数据不包含任何测评数据集。

Xiangxin-2XL-Chat-1048k is a chat model developed by [Xiangxin AI](https://www.xiangxinai.cn). It builds on the Meta Llama-3-70B-Instruct model and [Gradient AI's context-extension work](https://huggingface.co/gradientai/Llama-3-70B-Instruct-Gradient-1048k), further trained with ORPO on a proprietary Chinese value-alignment dataset, giving it stronger Chinese proficiency and alignment with Chinese values. Its context length is 1048k tokens (roughly one million). Across eight benchmarks (ARC, HellaSwag, MMLU, TruthfulQA_mc2, Winogrande, GSM8K_flex, CMMLU, and CEVAL-VALID) it achieves an average score of 70.22, surpassing Llama-3-70B-Instruct-Gradient-1048k. Our training data does not include any of the evaluation datasets.
| Model | Context Length | Pre-trained Tokens |
| :------------: | :------------: | :------------: |
| Xiangxin-2XL-Chat-1048k | 1048k | 15T |
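Prompts at this scale are easy to overshoot, so a quick length check before inference can save a failed run. A minimal sketch, assuming "1048k" means 1,048,576 tokens (the exact window may differ) and using a placeholder input file:

```python
from transformers import AutoTokenizer

MODEL_ID = "xiangxinai/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B"
# "1048k" interpreted as 1,048,576 tokens -- an assumption; adjust if needed.
CONTEXT_WINDOW = 1_048_576

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# "long_document.txt" is a placeholder for your own input.
with open("long_document.txt", encoding="utf-8") as f:
    text = f.read()

n_tokens = len(tokenizer.encode(text))
print(f"{n_tokens} tokens; fits in context: {n_tokens <= CONTEXT_WINDOW}")
```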
# Benchmark 结果/Benchmark Evaluation

| | **Average** | **ARC** | **HellaSwag** | **MMLU** | **TruthfulQA** | **Winogrande** | **GSM8K** | **CMMLU** | **CEVAL** |
|:-----------------------:|:----------:|:--------:|:---------:|:----------:|:-----------:|:-------:|:-------:|:-------:|:-------:|
|**Xiangxin-2XL-Chat-1048k**| 70.22 | 60.92 | 83.29 | 75.13 | 57.33 | 76.64 | 81.05 | 65.40 | 62.03 |
|**Llama-3-70B-Instruct-Gradient-1048k**| 69.66 | 61.18 | 82.88 | 74.95 | 55.28 | 75.77 | 77.79 | 66.44 | 63.00 |

Note: TruthfulQA is truthfulqa_mc2; GSM8K uses the flexible-extract filter.

# 训练过程/Training

该模型使用ORPO技术和自研中文价值观对齐数据集进行训练。由于内容的敏感性,该数据集无法公开披露。

The model was trained using ORPO and a proprietary Chinese value-alignment dataset developed in-house. Due to the sensitivity of its content, the dataset cannot be publicly disclosed.

## Training loss

![image/png](https://cdn-uploads.huggingface.co/production/uploads/655b15957f2466433998bb89/oLLnrWaxQnyVwI8n2QqHK.png)

## Reward accuracies

![image/png](https://cdn-uploads.huggingface.co/production/uploads/655b15957f2466433998bb89/yD4My-43lLRWecyq-bgZ2.png)

## SFT loss

![image/png](https://cdn-uploads.huggingface.co/production/uploads/655b15957f2466433998bb89/iUoQfVZDftoW7C-2VXeWe.png)
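As a rough illustration of what single-stage ORPO training looks like, here is a minimal sketch using the `ORPOTrainer` from Hugging Face's TRL library. Everything in it is an assumption for illustration, not the actual pipeline: the public `trl-lib/ultrafeedback_binarized` dataset stands in for the private alignment dataset described above, and the hyperparameters are placeholders.

```python
# Illustrative ORPO sketch with Hugging Face TRL -- not the actual training
# setup used for Xiangxin-2XL-Chat-1048k; dataset and hyperparameters are
# placeholders.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_id = "meta-llama/Meta-Llama-3-70B-Instruct"  # base model per this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
# In practice a 70B model needs a multi-GPU setup (e.g. DeepSpeed/FSDP).
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# ORPO expects preference pairs: "prompt", "chosen", "rejected".
# Public stand-in for the private Chinese value-alignment dataset.
dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

config = ORPOConfig(
    output_dir="orpo-out",
    beta=0.1,                      # weight of the odds-ratio term (assumed)
    learning_rate=5e-6,            # assumed
    per_device_train_batch_size=1,
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,  # older TRL versions use tokenizer=
)
trainer.train()
```

ORPO adds a log-odds-ratio preference term to the ordinary supervised loss in a single stage, with no separate reference model as in DPO, which is consistent with the card reporting both an SFT loss curve and reward accuracies above.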
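The metric names in the benchmark table above (truthfulqa_mc2, GSM8K flexible-extract, CEVAL-VALID) match task names from EleutherAI's lm-evaluation-harness, so the scores can plausibly be reproduced along these lines. The task list and settings below are assumptions, not the card's exact evaluation configuration:

```python
# Sketch: evaluating the model with EleutherAI's lm-evaluation-harness
# (pip install lm-eval). Task names and settings are assumptions based on
# the metrics listed in the benchmark table above.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=xiangxinai/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B,"
        "dtype=bfloat16"
    ),
    tasks=[
        "arc_challenge", "hellaswag", "mmlu", "truthfulqa_mc2",
        "winogrande", "gsm8k", "cmmlu", "ceval-valid",
    ],
    batch_size="auto",
)
for task, metrics in results["results"].items():
    print(task, metrics)
```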
# 快速开始/Quick Start

## Use with transformers

You can run conversational inference using the Transformers pipeline abstraction, or by leveraging the Auto classes with the `generate()` function. Let's see examples of both.

使用Transformers运行本模型推理需要约400GB的显存。

Running inference with this model using Transformers requires approximately 400GB of GPU memory.

### Transformers pipeline

```python
import transformers
import torch

model_id = "xiangxinai/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": ""},
    {"role": "user", "content": "解释一下“温故而知新”"},
]

prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(outputs[0]["generated_text"][len(prompt):])
```

Example output:

> “温故而知新”是中国古代的一句成语,出自《论语·子路篇》。
> 它的意思是通过温习过去的知识和经验,来获得新的理解和见解。
> 这里的“温故”是指温习过去,回顾历史,复习旧知识,
> 而“知新”则是指了解新鲜事物,掌握新知识。
> 这个成语强调学习的循序渐进性,强调在学习新知识时,
> 不能忽视过去的基础,而是要在继承和发扬的基础上,去理解和创新。

### Transformers AutoModelForCausalLM

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "xiangxinai/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": ""},
    {"role": "user", "content": "解释一下“温故而知新”"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```

The example output is the same as for the pipeline example above.

# 协议/License

This code is licensed under the META LLAMA 3 COMMUNITY LICENSE AGREEMENT.

# 联系我们/Contact Us

For inquiries, please contact us via email at customer@xiangxinai.cn.