---
license: cc-by-nc-4.0
tags:
- merge
- conversational
- multi-task
pipeline_tag: text-generation
base_model:
- paulml/OmniBeagleSquaredMBX-v3-7B
- ZySec-AI/ZySec-7B-v1
- liminerity/Omningotex-7b-slerp
- localfultonextractor/Erosumika-7B
- KatyTheCutie/LemonadeRP-4.5.3
- cgato/Thespis-Krangled-7b
- CorticalStack/pastiche-crown-clown-7b-dare
- snorkelai/Snorkel-Mistral-PairRM-DPO
- MTSAIR/multi_verse_model
model-index:
- name: winter-garden-7b-alpha - "Smart Assistant"
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 65.19
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maldv/winter-garden-7b-alpha
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 85.36
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maldv/winter-garden-7b-alpha
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.2
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maldv/winter-garden-7b-alpha
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 50.94
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maldv/winter-garden-7b-alpha
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 80.35
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maldv/winter-garden-7b-alpha
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 54.44
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maldv/winter-garden-7b-alpha
      name: Open LLM Leaderboard
---

# Winter Garden 7B - α - "Smart Assistant"

It was mentioned that we are in the open AI dark winter, so I thought I would make myself a nice winter garden.

## An experiment

I've merged four partitions successfully in the past, so let's go for 9! I started with:

* Mistral-7B-v0.1

and merged in:

* OmniBeagleSquaredMBX-v3-7B
* ZySec-7B-v1
* Omningotex-7b-slerp
* Erosumika-7B
* LemonadeRP-4.5.3
* Thespis-Krangled-7b
* pastiche-crown-clown-7b-dare
* Snorkel-Mistral-PairRM-DPO
* multi_verse_model

### 9-partition merge

All of the layers were partitioned into 9 random bins. Alternating models were slerped along [0...1] and [1...0] gradients, except attention, which was slerped at 0.03.

This means that the model is still predominantly ordered around base Mistral, including half of the input and output layers, and 28% of attention.

### Other

Includes a fast tokenizer.

## Chat Template

I included a conversational chat template, which takes "name", "to" (optional), and "content" for each turn. It is designed to follow the transcript-style chat used by some of the merged models. This type of use case works best when you outline a scene and create a character card.

```
### {% title %}
{% metadata %}
USER: Hello
ASSISTANT: Hi, how are you?
```

It leans toward coding when given an `### Instruction`, follows `[INST][/INST]`, and likes `<|user|>`, `<|assistant|>` as well.

A quite cheery and intelligent model.
Very good with science and math, but still capable of a decent amount of creativity for a 7B model.

## Scores

Metric | Score
---|---
Average | 66.91
ARC | 65.19
HellaSwag | 85.36
MMLU | 65.2
TruthfulQA | 50.94
Winogrande | 80.35
GSM8K | 54.44

[Details](https://huggingface.co/datasets/open-llm-leaderboard/details_maldv__winter-garden-7b-alpha)
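
For readers unfamiliar with the slerp step mentioned under "9-partition merge", here is a minimal pure-Python sketch of the idea. It is not the script used to build this model. The helper names (`slerp`, `random_bins`, `gradient`), the seed, the count of 32 layers, and the way bins are paired with gradients are all assumptions made for illustration. Only the 9 bins, the [0...1] / [1...0] gradients, and the 0.03 attention factor come from the description above.

```python
import math
import random


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate from v0 (t=0) to v1 (t=1) on flat weight lists."""
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 < eps or norm1 < eps:
        # Degenerate input: fall back to plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding drift
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        # Nearly parallel or antiparallel vectors: use linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def random_bins(items, n_bins=9, seed=0):
    """Shuffle items and deal them into n_bins roughly equal bins."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_bins] for i in range(n_bins)]


def gradient(n, ascending=True):
    """Return n interpolation factors running 0 -> 1, or 1 -> 0 if not ascending."""
    if n == 1:
        return [0.0 if ascending else 1.0]
    ts = [i / (n - 1) for i in range(n)]
    return ts if ascending else ts[::-1]


ATTENTION_T = 0.03  # fixed factor the card gives for attention weights

# Assumption: 32 layer indices, dealt into 9 bins, with the gradient
# direction alternating from one bin to the next.
bins = random_bins(range(32))
schedule = {
    i: gradient(len(layer_ids), ascending=(i % 2 == 0))
    for i, layer_ids in enumerate(bins)
}
```

With a small `t`, such as the 0.03 used for attention, `slerp(t, base, donor)` stays very close to `base`. That is consistent with the note that the merged model remains predominantly ordered around base Mistral.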
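
To make the transcript-style prompt from the "Chat Template" section concrete, the sketch below builds one by hand. `render_transcript` is a hypothetical helper, not the template shipped with the tokenizer, and the exact line breaks are a guess based on the example above. The optional "to" field is accepted but not rendered, because the card does not show how the real template prints it.

```python
def render_transcript(title, metadata, turns):
    """Build a transcript-style prompt from turns with "name" and "content" keys."""
    lines = [f"### {title}", metadata]
    for turn in turns:
        # "to" is optional and deliberately ignored here (rendering unknown).
        lines.append(f"{turn['name']}: {turn['content']}")
    return "\n".join(lines)


prompt = render_transcript(
    "A quiet greenhouse",           # hypothetical scene title
    "Two friends talk about frost",  # hypothetical scene metadata
    [
        {"name": "USER", "content": "Hello"},
        {"name": "ASSISTANT", "to": "USER", "content": "Hi, how are you?"},
    ],
)
print(prompt)
```

In practice the bundled chat template should be preferred over hand-built strings. This only illustrates the shape of the prompt the model expects.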