|
--- |
|
license: other |
|
license_name: license |
|
license_link: LICENSE |
|
--- |
|
<div align="center"> |
|
<h1> |
|
Index-1.9B |
|
</h1> |
|
</div> |
|
|
|
## Model Introduction |
|
|
|
We are excited to announce the release of a lightweight member of the Index model family: the Index-1.9B series.
|
The open-source Index-1.9B series includes the following models: |
|
- Index-1.9B base: The base model, with 1.9 billion non-embedding parameters, pre-trained on a 2.8T corpus mainly in Chinese and English. It leads models of comparable size on multiple evaluation benchmarks.
|
- **Index-1.9B pure (this repository's model)**: A control version of the base model with the same parameters and training strategy, but with all instruction-related data strictly filtered from the corpus, used to verify the impact of instruction data on benchmarks.
|
- Index-1.9B chat: A dialogue model aligned with SFT and DPO on top of Index-1.9B base. Because a large amount of internet community corpus was included in pre-training, the model exhibits notably more engaging chat capabilities.
|
- Index-1.9B character: Introduces RAG on top of SFT and DPO to achieve few-shot role-playing customization.
|
|
|
**Note: This is the Base model. It is capable only of text continuation and of serving as a starting point for further alignment training; it cannot be used directly for dialogue.**
|
- For the **Chat model**, see [Index-1.9B-Chat](https://huggingface.co/IndexTeam/Index-1.9B-Chat) |
|
- For the **Role-playing model**, see [Index-1.9B-Character](https://huggingface.co/IndexTeam/Index-1.9B-Character) |
|
|
|
For more details, see our [GitHub](https://github.com/bilibili/Index-1.9B) and [Index-1.9B Technical Report](https://github.com/bilibili/Index-1.9B/blob/main/Index-1.9B%20%E6%8A%80%E6%9C%AF%E6%8A%A5%E5%91%8A.pdf) |
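Since this is a base model, it is only suited to text continuation. A minimal loading-and-continuation sketch using the standard `transformers` causal-LM interface is shown below; it assumes the repository ships custom model code (hence `trust_remote_code=True`), as is common for Index-1.9B repositories:

```python
# Minimal continuation sketch for Index-1.9B-Pure (a sketch, not official usage).
# Assumes the standard transformers causal-LM API and that the repo's custom
# model code requires trust_remote_code=True.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "IndexTeam/Index-1.9B-Pure"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# The base model continues a prompt; it is not a dialogue model.
prompt = "The ruins of the ancient city were first discovered in"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For dialogue use cases, prefer the aligned Index-1.9B-Chat model linked above rather than prompting the base model directly.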
|
|
|
## Evaluation Results |
|
Index-1.9B shows excellent performance on general understanding evaluations, leading recently open-sourced models of similar size and remaining comparable to some 7B and 10B+ models.
|
|Model|Average score|Average English score|MMLU|CEVAL|CMMLU|HellaSwag|Arc-C|Arc-E| |
|
|----|----|----|----|----|----|----|----|----| |
|
|Google Gemma 2B|41.58|46.77|41.81|31.36|31.02|66.82|36.39|42.07| |
|
|Phi-2 (2.7B)|58.89|**72.54**|57.61|31.12|32.05|70.94|74.51|87.1| |
|
|Qwen1.5-1.8B|58.96|59.28|47.05|59.48|57.12|58.33|56.82|74.93| |
|
|Qwen2-1.5B(report)|**65.17**|62.52 |56.5|70.6|70.3|66.6|43.9|83.09| |
|
|MiniCPM-2.4B-SFT|62.53|68.75|53.8|49.19|50.97|67.29|69.44|84.48| |
|
|**Index-1.9B-Pure**|49.55 |52.83 |43.75|42.35|43.61|63.21|42.75|61.61| |
|
|**Index-1.9B**|**64.92** |**69.93**|52.53|57.01|52.79|80.69|65.15|81.35| |
|
|Llama2-7B|50.79|60.31|44.32|32.42|31.11|76|46.3|74.6| |
|
|Mistral-7B (report) |/|**69.23**|60.1|/|/|81.3|55.5|80| |
|
|Baichuan2-7B|54.53|53.51|54.64|56.19|56.95|25.04|57.25|77.12| |
|
|Llama2-13B|57.51|66.61|55.78|39.93|38.7|76.22|58.88|75.56| |
|
|Baichuan2-13B|68.90|71.69|59.63|59.21|61.27|72.61|70.04|84.48| |
|
|MPT-30B (report)|/|63.48|46.9|/|/|79.9|50.6|76.5| |
|
|Falcon-40B (report)|/|68.18|55.4|/|/|83.6|54.5|79.2| |
|
|
|
Evaluation code is based on [OpenCompass](https://github.com/open-compass/opencompass) with compatibility modifications. See the [evaluate](./evaluate/) folder for details. |
|
|
|
|
|
|