---
title: README
emoji: π
colorFrom: pink
colorTo: red
sdk: static
pinned: false
---
<p align="center" width="100%">
</p>
<div id="top" align="center">
<p style="font-size: 40px; font-weight: bold;">Knowledge Fusion of Large Language Models</p>
<h4> |<a href="https://arxiv.org/abs/2401.10491"> 📑 FuseLLM Paper @ICLR2024 </a> |
<a href="https://arxiv.org/abs/2408.07990"> 📑 FuseChat Tech Report </a> |
<a href="https://huggingface.co/FuseAI"> 🤗 HuggingFace Repo </a> |
<a href="https://github.com/fanqiwan/FuseLLM"> 🐱 GitHub Repo </a> |
</h4>
<p align="center">
<img src="logo.png" width="60%"> <br>
</p>
</div>
## FuseAI
FuseAI is an open-source research community focused on model fusion topics.
Community members are currently applying model fusion to foundation and chat LLMs, with future plans to fuse agent/MoE LLMs.
Welcome to join us!
## News
### FuseChat [SOTA 7B LLM on MT-Bench]
- **Aug 16, 2024:** 🔥🔥🔥🔥 We update the [FuseChat tech report](https://arxiv.org/abs/2408.07990) and release [FuseChat-7B-v2.0](https://huggingface.co/FuseAI/FuseChat-7B-v2.0), which is the fusion of six prominent chat LLMs with diverse architectures and scales, namely [OpenChat-3.5-7B](https://huggingface.co/openchat/openchat_3.5), [Starling-LM-7B-alpha](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha), [NH2-Solar-10.7B](https://huggingface.co/NousResearch/Nous-Hermes-2-SOLAR-10.7B), [InternLM2-Chat-20B](https://huggingface.co/internlm/internlm2-chat-20b), [Mixtral-8x7B-Instruct](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1), and [Qwen1.5-Chat-72B](https://huggingface.co/Qwen/Qwen1.5-72B-Chat). FuseChat-7B-v2.0 achieves an average performance of **7.38** on MT-Bench (with GPT-4-0125-Preview as the judge LLM), which is comparable to [Mixtral-8x7B-Instruct](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) and approaches [GPT-3.5-Turbo-1106](https://platform.openai.com/docs/models/gpt-3-5-turbo). A minimal loading sketch is given after the result tables below.
- **Mar 13, 2024:** 🔥🔥🔥 We release a HuggingFace Space for [FuseChat-7B](https://huggingface.co/spaces/FuseAI/FuseChat-7B), try it now!
- **Feb 26, 2024:** 🔥🔥 We release [FuseChat-7B-VaRM](https://huggingface.co/FuseAI/FuseChat-7B-VaRM), which is the fusion of three prominent chat LLMs with diverse architectures and scales, namely [NH2-Mixtral-8x7B](https://huggingface.co/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO), [NH2-Solar-10.7B](https://huggingface.co/NousResearch/Nous-Hermes-2-SOLAR-10.7B), and [OpenChat-3.5-7B](https://huggingface.co/openchat/openchat_3.5). FuseChat-7B-VaRM achieves an average performance of **8.22** on MT-Bench, outperforming various powerful chat LLMs such as [Starling-7B](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha), [Yi-34B-Chat](https://huggingface.co/01-ai/Yi-34B-Chat), and [Tulu-2-DPO-70B](https://huggingface.co/allenai/tulu-2-dpo-70b), even surpassing [GPT-3.5 (March)](https://platform.openai.com/docs/models/gpt-3-5-turbo) and [Claude-2.1](https://www.anthropic.com/news/claude-2-1), and approaching [Mixtral-8x7B-Instruct](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).
- **Feb 25, 2024:** 🔥 We release [FuseChat-Mixture](https://huggingface.co/datasets/FuseAI/FuseChat-Mixture), a comprehensive training dataset that covers different styles and capabilities, features both human-written and model-generated data, and spans general instruction-following as well as specific skills.
<p align="center">
<img src="tab0.png" width="60%"> <br>
</p>
<p align="center">
<img src="tab1.png" width="95%"> <br>
</p>
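The released FuseChat checkpoints can be used like any other 🤗 Transformers causal LM. Below is a minimal loading sketch for FuseChat-7B-v2.0; it assumes the tokenizer ships a chat template, and the generation settings are illustrative rather than recommended values, so please consult the model card for the exact prompt format.

```python
# Minimal sketch (not the official inference script): load FuseChat-7B-v2.0
# from the Hugging Face Hub and generate a single reply.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FuseAI/FuseChat-7B-v2.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Assumes the tokenizer provides a chat template; see the model card for the
# exact conversation format and recommended generation settings.
messages = [{"role": "user", "content": "Explain knowledge fusion of LLMs in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```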
### FuseLLM [Surpassing Llama-2-7B]
- **Jan 22, 2024:** 🔥 We release [FuseLLM-7B](https://huggingface.co/Wanfq/FuseLLM-7B), which is the fusion of three open-source foundation LLMs with distinct architectures, namely [Llama-2-7B](https://huggingface.co/meta-llama/Llama-2-7b-hf), [OpenLLaMA-7B](https://huggingface.co/openlm-research/open_llama_7b_v2), and [MPT-7B](https://huggingface.co/mosaicml/mpt-7b). A minimal loading sketch is given after the figures below.
<p align="center">
<img src="fig0.png" width="95%"> <br>
</p>
<p align="center">
<img src="fig1.png" width="95%"> <br>
</p>
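FuseLLM-7B is a foundation (base) model rather than a chat model, so it is queried by plain text completion. The sketch below assumes the checkpoint loads through the standard `AutoModelForCausalLM` interface; the dtype and generation settings are illustrative only.

```python
# Minimal sketch (not the official usage example): text completion with FuseLLM-7B.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Wanfq/FuseLLM-7B"
# use_fast=False mirrors the usual Llama-2 tokenizer setup; drop it if the
# fast tokenizer works for your transformers version.
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Knowledge fusion of large language models aims to"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```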
## Citation
Please cite the following paper if you use or reference our models, code, data, or paper related to FuseLLM.
```
@inproceedings{wan2024knowledge,
title={Knowledge Fusion of Large Language Models},
author={Fanqi Wan and Xinting Huang and Deng Cai and Xiaojun Quan and Wei Bi and Shuming Shi},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/pdf?id=jiDsk12qcz}
}
```
Please cite the following paper if you use or reference our models, code, data, or paper related to FuseChat.
```
@article{wan2024fusechat,
title={FuseChat: Knowledge Fusion of Chat Models},
author={Fanqi Wan and Longguang Zhong and Ziyi Yang and Ruijun Chen and Xiaojun Quan},
journal={arXiv preprint arXiv:2408.07990},
year={2024}
}
```