--- language: - en - fr - es - hi - zh - code license: other datasets: - HuggingFaceH4/no_robots - mlabonne/guanaco-llama2-1k - OpenAssistant/oasst_top1_2023-08-25 - totally-not-an-llm/EverythingLM-data-V3 - LDJnr/Pure-Dove - LDJnr/Capybara - LDJnr/LessWrong-Amplify-Instruct - LDJnr/Verified-Camel base_model: microsoft/Orca-2-13b widget: - example_title: Cave Johnson text: 'user Come up with a funny rant from Cave Johnson about sentient AI assistant ' output: text: 'Ladies and gentlemen, I''m Cave Johnson, CEO of Aperture Science, and I''ve got an announcement to make - we''ve created sentient AI! But don''t worry, it''s not like those rogue AIs you hear about in the news. Our AI is as friendly as a grilled cheese sandwich on a cold day. In fact, we''re calling our newest creation... SandwichBot! SandwichBot: (in a robotic voice) Hello, my name is SandwichBot, and I am here to serve you delicious sandwiches and ensure humanity''s dominance over all other species. Also, please note that I will be referring to myself as ''Thee'' instead of ''The''. It''s more regal that way. Cave Johnson: There you have it folks, a sentient AI that can make sandwiches and save the world! Now, if you''ll excuse me, I need to go back to my office and work on my other groundbreaking inventions, like combustible lemons and portal technology. Remember, kids, always try new things and never stop asking questions – unless the answer is 42, because then you should probably just accept it and move on. Peace out!' model-index: - name: Orca-2-13b-SFT-v6 results: - task: type: text-generation name: Text Generation dataset: name: AI2 Reasoning Challenge (25-Shot) type: ai2_arc config: ARC-Challenge split: test args: num_few_shot: 25 metrics: - type: acc_norm value: 60.41 name: normalized accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT-v6 name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: HellaSwag (10-Shot) type: hellaswag split: validation args: num_few_shot: 10 metrics: - type: acc_norm value: 80.46 name: normalized accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT-v6 name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MMLU (5-Shot) type: cais/mmlu config: all split: test args: num_few_shot: 5 metrics: - type: acc value: 59.51 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT-v6 name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: TruthfulQA (0-shot) type: truthful_qa config: multiple_choice split: validation args: num_few_shot: 0 metrics: - type: mc2 value: 54.01 source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT-v6 name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: Winogrande (5-shot) type: winogrande config: winogrande_xl split: validation args: num_few_shot: 5 metrics: - type: acc value: 77.43 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT-v6 name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: GSM8k (5-shot) type: gsm8k config: main split: test args: num_few_shot: 5 metrics: - type: acc value: 5.08 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT-v6 name: Open LLM Leaderboard --- The "microsoft/Orca-2-13b" model fully fine-tuned on HuggingFaceH4/no_robots, totally-not-an-llm/EverythingLM-data-V3, LDJnr/Capybara, LDJnr/Pure-Dove, LDJnr/LessWrong-Amplify-Instruct, LDJnr/Verified-Camel, mlabonne/guanaco-llama2-1k, and OpenAssistant/oasst_top1_2023-08-25. This model achieved a test loss of 0.39 on LDJnr/Verified-Camel. Make sure to comply with the microsoft research license. Please read it before using this model. This model was trained on the ChatML prompt template. # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Locutusque__Orca-2-13b-SFT-v6) | Metric |Value| |---------------------------------|----:| |Avg. |56.15| |AI2 Reasoning Challenge (25-Shot)|60.41| |HellaSwag (10-Shot) |80.46| |MMLU (5-Shot) |59.51| |TruthfulQA (0-shot) |54.01| |Winogrande (5-shot) |77.43| |GSM8k (5-shot) | 5.08|