---
license: apache-2.0
base_model:
- senseable/WestLake-7B-v2
- cognitivecomputations/dolphin-2.8-mistral-7b-v02
library_name: transformers
tags:
- mergekit
- merge
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63cf23cffbd0cc580bc65c73/Kludqn78R4zztPL48g6QM.png)

My first successful DARE-TIES merge. Because the two models use different tokenizers (and one is bf16 while the other is f16), I had to use SLERP for the embeddings as well. It seems to perform well: a local lm-eval run puts HellaSwag at around 84.5, which seems decent. I will also be submitting it for evaluation on the Open LLM Leaderboard.

The intended prompt preset is ChatML, but standard default presets should work fine too.

# Noodlz_DolphinLake-DARE_TIE_SLERP-tokenwest

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with [cognitivecomputations/dolphin-2.8-mistral-7b-v02](https://huggingface.co/cognitivecomputations/dolphin-2.8-mistral-7b-v02) as the base model.

### Models Merged

The following models were included in the merge:

* [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
merge_method: dare_ties
parameters:
  int8_mask: true
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
  embed_slerp: true
models:
  - model: cognitivecomputations/dolphin-2.8-mistral-7b-v02
    # No parameters necessary for base model
  - model: senseable/WestLake-7B-v2
    parameters:
      density: 0.58
      weight: 0.8
base_model: cognitivecomputations/dolphin-2.8-mistral-7b-v02
tokenizer_source: model:senseable/WestLake-7B-v2
dtype: bfloat16
```
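
To reproduce the merge, the configuration above can be passed to mergekit either via its `mergekit-yaml` command line tool or its Python API. Below is a minimal sketch using the Python API; the config filename and output path are placeholders, and the exact `MergeOptions` fields may vary between mergekit versions.

```python
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_YML = "dolphinlake.yaml"        # placeholder: the YAML config above, saved to disk
OUTPUT_PATH = "./DolphinLake-merged"   # placeholder: output directory for the merged model

# Parse the merge configuration from the YAML file.
with open(CONFIG_YML, "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Run the DARE-TIES merge; copy_tokenizer pulls in the tokenizer named by tokenizer_source.
run_merge(
    merge_config,
    out_path=OUTPUT_PATH,
    options=MergeOptions(
        cuda=torch.cuda.is_available(),
        copy_tokenizer=True,
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```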
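
Since the recommended preset is ChatML, prompts should be wrapped in `<|im_start|>`/`<|im_end|>` tags. A minimal inference sketch with `transformers` follows; the model id is a placeholder for the published repo or a local path to the merged weights, and the sampling settings are just examples.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: replace with the actual Hugging Face repo id or a local path.
model_id = "Noodlz_DolphinLake-DARE_TIE_SLERP-tokenwest"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a ChatML-formatted prompt by hand.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nWrite a haiku about merged language models.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```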
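
The HellaSwag figure quoted above comes from a local lm-eval run. Something along the following lines reproduces that kind of check with lm-evaluation-harness (assuming v0.4+); the model path, few-shot count, and batch size are assumptions, not necessarily the exact settings used.

```python
import lm_eval

# Placeholder model path/repo id; adjust dtype and batch size to your hardware.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Noodlz_DolphinLake-DARE_TIE_SLERP-tokenwest,dtype=bfloat16",
    tasks=["hellaswag"],
    num_fewshot=10,  # the Open LLM Leaderboard evaluates HellaSwag 10-shot
    batch_size=8,
)

# Print the HellaSwag metrics (accuracy and normalized accuracy).
print(results["results"]["hellaswag"])
```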