---
language:
- en
- zh
pipeline_tag: text-generation
license: apache-2.0
tags:
- llm
---
# Introduction
A Llama-format version of [Nanbeige2-16B-Chat](https://huggingface.co/Nanbeige/Nanbeige2-16B-Chat) that can be loaded with `LlamaForCausalLM`.
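
Below is a minimal loading sketch using the standard Hugging Face `transformers` API. The repository id `duoqi/Nanbeige2-16B-Chat` and the bf16/`device_map` settings are assumptions for illustration; adjust them to your actual repo path and hardware.

```python
# Minimal sketch: load the Llama-format checkpoint with LlamaForCausalLM.
# NOTE: the model id below is a placeholder assumption, not confirmed by this card.
import torch
from transformers import AutoTokenizer, LlamaForCausalLM

model_id = "duoqi/Nanbeige2-16B-Chat"  # hypothetical repo id for this conversion

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = LlamaForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 weights and compatible hardware
    device_map="auto",
)

# Build a chat prompt with the tokenizer's chat template and generate a reply.
messages = [{"role": "user", "content": "Hello, who are you?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```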
Nanbeige2-16B-Chat is the latest 16B model developed by the Nanbeige Lab, pretrained on 4.5T tokens of high-quality data. During the alignment phase, the model was first trained on 1 million samples through Supervised Fine-Tuning (SFT), then refined with curriculum learning on 400,000 high-quality samples of greater difficulty, and finally aligned with human feedback through Direct Preference Optimization (DPO), resulting in Nanbeige2-16B-Chat. The model achieves strong performance across a range of authoritative benchmark datasets.