---
license: apache-2.0
inference:
  parameters:
    temperature: 0.79
widget:
- messages:
  - role: user
    content: How to gain more money?
model-index:
- name: TinyllamaMix-1.1B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 31.48
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Aryanne/TinyllamaMix-1.1B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 48.39
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Aryanne/TinyllamaMix-1.1B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 25.05
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Aryanne/TinyllamaMix-1.1B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 33.45
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Aryanne/TinyllamaMix-1.1B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 58.48
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Aryanne/TinyllamaMix-1.1B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 1.06
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Aryanne/TinyllamaMix-1.1B
      name: Open LLM Leaderboard
---

This is an experimental TinyLlama mix merge, built with a custom merge method (`task_swapping`). It should be better at RP (roleplay).

The recipe below performs two `task_swapping` merges (named `bye` and `hello`) and then combines them with `task_arithmetic`, applying the PIPPA LoRA to `bye`:

```yaml
# Step 1: task_swapping merge onto the 32k base, saved as "bye"
merge_method: task_swapping
base_model: Doctor-Shotgun/TinyLlama-1.1B-32k
models:
  - model: cognitivecomputations/TinyDolphin-2.8.2-1.1b-laser
    parameters:
      weight: 0.75
      diagonal_offset: 5
  - model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
    parameters:
      weight: 0.85
      diagonal_offset: 17
      invert_offset: True
dtype: bfloat16
name: bye
---
# Step 2: task_swapping merge onto the 32k instruct base, saved as "hello"
merge_method: task_swapping
base_model: Doctor-Shotgun/TinyLlama-1.1B-32k-Instruct
models:
  - model: vihangd/DopeyTinyLlama-1.1B-v1
    parameters:
      weight: 0.8
      diagonal_offset: 3
      invert_offset: False
dtype: bfloat16
name: hello
---
# Step 3: task_arithmetic merge of the two intermediates
# ("bye" has the PIPPA LoRA applied via the "+" syntax)
merge_method: task_arithmetic
base_model: Doctor-Shotgun/TinyLlama-1.1B-32k
models:
  - model: hello
    parameters:
      weight: 0.66
  - model: bye+Anarchist/PIPPA_LORA_TinyLlama
    parameters:
      weight: 0.5
dtype: bfloat16
```
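# Usage

A minimal loading/generation sketch with 🤗 Transformers is below. Only the repo id (`Aryanne/TinyllamaMix-1.1B`) and the sampling temperature of 0.79 come from this card (the inference widget settings above); the prompt handling and the remaining generation parameters are illustrative assumptions, not settings specified by the model.

```python
# Sketch only: assumes the merged checkpoint loads as a standard Llama-architecture
# causal LM. temperature=0.79 mirrors this card's widget config; everything else
# (raw prompt, max_new_tokens) is an illustrative choice.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Aryanne/TinyllamaMix-1.1B"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")

prompt = "How to gain more money?"  # the example prompt from the widget
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.79,  # matches the inference widget setting
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```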
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Aryanne__TinyllamaMix-1.1B)

| Metric                          |Value|
|---------------------------------|----:|
|Avg.                             |32.99|
|AI2 Reasoning Challenge (25-Shot)|31.48|
|HellaSwag (10-Shot)              |48.39|
|MMLU (5-Shot)                    |25.05|
|TruthfulQA (0-shot)              |33.45|
|Winogrande (5-shot)              |58.48|
|GSM8k (5-shot)                   | 1.06|