A merge of many different models, including Hermes, Beluga, Airoboros, Chronos, and LimaRP.
Significantly better quality than my previous chronos-beluga merge.
Huginn is intended as a general-purpose model that retains a lot of good knowledge, can reason logically and follow instructions accurately, and keeps the prose and creativity of more writing-oriented models. This makes it great for roleplay while still being good as a normal chatbot or assistant.
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
Metric | Value |
---|---|
Avg. | 54.89 |
AI2 Reasoning Challenge (25-Shot) | 60.58 |
HellaSwag (10-Shot) | 82.53 |
MMLU (5-Shot) | 53.71 |
TruthfulQA (0-shot) | 54.46 |
Winogrande (5-shot) | 73.72 |
GSM8k (5-shot) | 4.32 |
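The Avg. row appears to be the plain arithmetic mean of the six benchmark scores. A minimal sketch that checks this, assuming equal weighting (the `scores` dictionary is just an illustrative name for the values copied from the table above):

```python
# Benchmark scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 60.58,
    "HellaSwag (10-Shot)": 82.53,
    "MMLU (5-Shot)": 53.71,
    "TruthfulQA (0-shot)": 54.46,
    "Winogrande (5-shot)": 73.72,
    "GSM8k (5-shot)": 4.32,
}

# Assumption: the leaderboard average is an unweighted mean of the six scores.
average = sum(scores.values()) / len(scores)
print(f"Avg. = {average:.2f}")  # prints "Avg. = 54.89"
```

Note how much the low GSM8k score pulls the average down: the other five benchmarks alone average about 65.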
Evaluation results
- AI2 Reasoning Challenge (25-Shot), test set: normalized accuracy 60.58 (Open LLM Leaderboard)
- HellaSwag (10-Shot), validation set: normalized accuracy 82.53 (Open LLM Leaderboard)
- MMLU (5-Shot), test set: accuracy 53.71 (Open LLM Leaderboard)
- TruthfulQA (0-shot), validation set: mc2 54.46 (Open LLM Leaderboard)
- Winogrande (5-shot), validation set: accuracy 73.72 (Open LLM Leaderboard)
- GSM8k (5-shot), test set: accuracy 4.32 (Open LLM Leaderboard)