LM-cocktail Mistral 7B v1

This is a 50%-50% merge of two of the best Mistral models:

https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2

https://huggingface.co/xDAN-AI/xDAN-L1-Chat-RL-v1

Both are claimed to be better than chatgpt-3.5-turbo on almost all metrics.

Alpaca Eval

I am thrilled to announce that ChatGPT has ranked LMCocktail 7B as the second-best model, next to GPT-4, on AlpacaEval in my local community run. It scores even higher than my previous best model, LMCocktail-10.7B-v1. You can also check the leaderboard at ./Alpaca_eval/chatgpt_fn_--LMCocktail-Mistral-7B-v1/

                        win_rate  standard_error  n_total  avg_length
gpt4                       73.79            1.54      805        1365
LMCocktail-7B-v1(new)      73.54            1.55      805        1870
LMCocktail-10.7B-v1(new)   73.45            1.56      804        1203
claude                     70.37            1.60      805        1082
chatgpt                    66.09            1.66      805         811
wizardlm-13b               65.16            1.67      805         985
vicuna-13b                 64.10            1.69      805        1037
guanaco-65b                62.36            1.71      805        1249
oasst-rlhf-llama-33b       62.05            1.71      805        1079
alpaca-farm-ppo-human      60.25            1.72      805         803
falcon-40b-instruct        56.52            1.74      805         662
text_davinci_003           50.00            0.00      805         307
alpaca-7b                  45.22            1.74      805         396
text_davinci_001           28.07            1.56      805         296
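
To put the top of the leaderboard in perspective, here is a small sketch that compares the win-rate gap between two rows with their reported standard errors. The numbers are copied from the table above; the helper name `gap_within_error` and the "gap no larger than one standard error" rule are assumptions chosen for illustration, not part of AlpacaEval.

```python
# (win_rate, standard_error) copied from the leaderboard above.
rows = {
    "gpt4": (73.79, 1.54),
    "LMCocktail-7B-v1": (73.54, 1.55),
    "LMCocktail-10.7B-v1": (73.45, 1.56),
}


def gap_within_error(a: str, b: str) -> tuple[float, bool]:
    """Return the win-rate gap between two models and whether it is
    no larger than the bigger of their two standard errors.

    This is a rough heuristic, not a formal significance test.
    """
    win_a, se_a = rows[a]
    win_b, se_b = rows[b]
    gap = abs(win_a - win_b)
    return gap, gap <= max(se_a, se_b)


gap, within = gap_within_error("gpt4", "LMCocktail-7B-v1")
print(f"gap = {gap:.2f} points, within one standard error: {within}")
```

By this rough check, the top three rows are too close to separate with confidence in this run.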

Code

LM-cocktail is a novel technique for merging multiple models: https://arxiv.org/abs/2311.13534

The code is backed by this repo: https://github.com/FlagOpen/FlagEmbedding.git

Merging scripts are available under the ./scripts folder.
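
For intuition, a 50%-50% merge can be sketched as a weighted average of the two models' parameters. This is a minimal illustration under stated assumptions, not the script in ./scripts and not the FlagEmbedding API: the function name `merge_weighted`, the `Params` type, and the toy parameter dictionaries are all hypothetical, and plain Python lists stand in for real weight tensors.

```python
from typing import Dict, List, Sequence

# Toy stand-in for a model state dict: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def merge_weighted(models: Sequence[Params], weights: Sequence[float]) -> Params:
    """Average matching parameters across models using the given weights."""
    if len(models) != len(weights):
        raise ValueError("need exactly one weight per model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")

    names = models[0].keys()
    for model in models[1:]:
        if model.keys() != names:
            raise ValueError("models must share the same parameter names")

    merged: Params = {}
    for name in names:
        size = len(models[0][name])
        if any(len(model[name]) != size for model in models):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise weighted sum over all models.
        merged[name] = [
            sum(w * model[name][i] for model, w in zip(models, weights))
            for i in range(size)
        ]
    return merged


# Hypothetical two-parameter "models" merged 50%-50%.
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 6.0], "layer.bias": [1.0]}
merged = merge_weighted([model_a, model_b], [0.5, 0.5])
print(merged)
```

Averaging like this only makes sense when both models share the same architecture and parameter names, which is why the sketch checks for that before merging.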

Yhyu13/LMCocktail-Mistral-7B-v1
