|
--- |
|
license: other |
|
license_name: license |
|
license_link: LICENSE |
|
--- |
|
<div align="center"> |
|
<h1> |
|
Index-1.9B |
|
</h1> |
|
</div> |
|
|
|
## Model Introduction |
|
|
|
We are excited to announce the release of a lightweight member of the Index model family: the Index-1.9B series.
|
The open-source Index-1.9B series includes the following models: |
|
- Index-1.9B base: The base model, with 1.9 billion non-embedding parameters, pre-trained on a 2.8T corpus mainly in Chinese and English. It leads models of comparable size on multiple evaluation benchmarks.
|
- **Index-1.9B pure (this repository's model)**: A control version of the base model with the same parameters and training strategy, but with all instruction-related data strictly filtered from the corpus, used to verify the impact of instruction data on benchmarks.
|
- Index-1.9B chat: A dialogue model aligned with SFT and DPO on top of Index-1.9B base. Because a large amount of internet community corpus was included in pre-training, the model exhibits notably more engaging chat capabilities.
|
- Index-1.9B character: Introduces RAG on top of SFT and DPO to achieve few-shot role-playing customization.
|
|
|
**Note: This is the Base model. It is capable only of text continuation and of serving as a starting point for further alignment training; it cannot be used directly for dialogue.**
|
- For the **Chat model**, see [Index-1.9B-Chat](https://huggingface.co/IndexTeam/Index-1.9B-Chat) |
|
- For the **Role-playing model**, see [Index-1.9B-Character](https://huggingface.co/IndexTeam/Index-1.9B-Character) |
|
|
|
For more details, see our [GitHub](https://github.com/bilibili/Index-1.9B) and [Index-1.9B Technical Report](https://github.com/bilibili/Index-1.9B/blob/main/Index-1.9B%20%E6%8A%80%E6%9C%AF%E6%8A%A5%E5%91%8A.pdf) |
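Since this is a base model, it is only suited to text continuation. A minimal loading-and-continuation sketch using the standard `transformers` causal-LM interface is shown below; it assumes the repository ships custom model code (hence `trust_remote_code=True`), as is common for Index-1.9B repositories:

```python
# Minimal continuation sketch for Index-1.9B-Pure (a sketch, not official usage).
# Assumes the standard transformers causal-LM API and that the repo's custom
# model code requires trust_remote_code=True.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "IndexTeam/Index-1.9B-Pure"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# The base model continues a prompt; it is not a dialogue model.
prompt = "The ruins of the ancient city were first discovered in"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For dialogue use cases, prefer the aligned Index-1.9B-Chat model linked above rather than prompting the base model directly.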
|
|
|
## Evaluation Results |
|
Index-1.9B shows excellent performance on general understanding evaluations, leading recently open-sourced models of similar size and remaining comparable to some 7B and 10B+ models.
|
|Model|Average score|Average English score|MMLU|CEVAL|CMMLU|HellaSwag|Arc-C|Arc-E| |
|
|----|----|----|----|----|----|----|----|----| |
|
|Google Gemma 2B|41.58|46.77|41.81|31.36|31.02|66.82|36.39|42.07| |
|
|Phi-2 (2.7B)|58.89|**72.54**|57.61|31.12|32.05|70.94|74.51|87.1| |
|
|Qwen1.5-1.8B|58.96|59.28|47.05|59.48|57.12|58.33|56.82|74.93| |
|
|Qwen2-1.5B(report)|**65.17**|62.52 |56.5|70.6|70.3|66.6|43.9|83.09| |
|
|MiniCPM-2.4B-SFT|62.53|68.75|53.8|49.19|50.97|67.29|69.44|84.48| |
|
|**Index-1.9B-Pure**|49.55 |52.83 |43.75|42.35|43.61|63.21|42.75|61.61| |
|
|**Index-1.9B**|**64.92** |**69.93**|52.53|57.01|52.79|80.69|65.15|81.35| |
|
|Llama2-7B|50.79|60.31|44.32|32.42|31.11|76|46.3|74.6| |
|
|Mistral-7B (report) |/|**69.23**|60.1|/|/|81.3|55.5|80| |
|
|Baichuan2-7B|54.53|53.51|54.64|56.19|56.95|25.04|57.25|77.12| |
|
|Llama2-13B|57.51|66.61|55.78|39.93|38.7|76.22|58.88|75.56| |
|
|Baichuan2-13B|68.90|71.69|59.63|59.21|61.27|72.61|70.04|84.48| |
|
|MPT-30B (report)|/|63.48|46.9|/|/|79.9|50.6|76.5| |
|
|Falcon-40B (report)|/|68.18|55.4|/|/|83.6|54.5|79.2| |
|
|
|
Evaluation code is based on [OpenCompass](https://github.com/open-compass/opencompass) with compatibility modifications. See the [evaluate](./evaluate/) folder for details. |
|
|
|
|
|
|