Quantization made by Richard Erkhov.

alpaca-dragon-72b-v1 - GGUF

Model creator: https://huggingface.co/ibivibiv/
Original model: https://huggingface.co/ibivibiv/alpaca-dragon-72b-v1/

Name	Quant method	Size
alpaca-dragon-72b-v1.Q2_K.gguf	Q2_K	25.22GB
alpaca-dragon-72b-v1.IQ3_XS.gguf	IQ3_XS	27.88GB
alpaca-dragon-72b-v1.IQ3_S.gguf	IQ3_S	29.4GB
alpaca-dragon-72b-v1.Q3_K_S.gguf	Q3_K_S	29.4GB
alpaca-dragon-72b-v1.IQ3_M.gguf	IQ3_M	30.98GB
alpaca-dragon-72b-v1.Q3_K.gguf	Q3_K	32.85GB
alpaca-dragon-72b-v1.Q3_K_M.gguf	Q3_K_M	32.85GB
alpaca-dragon-72b-v1.Q3_K_L.gguf	Q3_K_L	35.85GB
alpaca-dragon-72b-v1.IQ4_XS.gguf	IQ4_XS	36.41GB
alpaca-dragon-72b-v1.Q4_0.gguf	Q4_0	38.19GB
alpaca-dragon-72b-v1.IQ4_NL.gguf	IQ4_NL	38.42GB
alpaca-dragon-72b-v1.Q4_K_S.gguf	Q4_K_S	38.45GB
alpaca-dragon-72b-v1.Q4_K.gguf	Q4_K	40.77GB
alpaca-dragon-72b-v1.Q4_K_M.gguf	Q4_K_M	40.77GB
alpaca-dragon-72b-v1.Q4_1.gguf	Q4_1	42.32GB
alpaca-dragon-72b-v1.Q5_0.gguf	Q5_0	46.46GB
alpaca-dragon-72b-v1.Q5_K_S.gguf	Q5_K_S	46.46GB
alpaca-dragon-72b-v1.Q5_K.gguf	Q5_K	47.79GB
alpaca-dragon-72b-v1.Q5_K_M.gguf	Q5_K_M	47.79GB
alpaca-dragon-72b-v1.Q5_1.gguf	Q5_1	50.59GB
alpaca-dragon-72b-v1.Q6_K.gguf	Q6_K	55.24GB
alpaca-dragon-72b-v1.Q8_0.gguf	Q8_0	71.55GB
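
Choosing a file from the table above usually comes down to how much memory is available. The following is a minimal sketch of that selection, not part of the original card: the sizes are copied from the table, while the function name `largest_quant_that_fits` and the `headroom_gb` default are hypothetical and chosen only for illustration. File size is a lower bound on memory use (context and runtime buffers come on top) and only a rough proxy for output quality.

```python
# File sizes in GB, copied from the quantization table above.
QUANT_SIZES_GB = {
    "Q2_K": 25.22, "IQ3_XS": 27.88, "IQ3_S": 29.4, "Q3_K_S": 29.4,
    "IQ3_M": 30.98, "Q3_K": 32.85, "Q3_K_M": 32.85, "Q3_K_L": 35.85,
    "IQ4_XS": 36.41, "Q4_0": 38.19, "IQ4_NL": 38.42, "Q4_K_S": 38.45,
    "Q4_K": 40.77, "Q4_K_M": 40.77, "Q4_1": 42.32, "Q5_0": 46.46,
    "Q5_K_S": 46.46, "Q5_K": 47.79, "Q5_K_M": 47.79, "Q5_1": 50.59,
    "Q6_K": 55.24, "Q8_0": 71.55,
}


def largest_quant_that_fits(budget_gb, headroom_gb=4.0):
    """Return the largest listed quant that fits in budget_gb, or None.

    headroom_gb is an assumed allowance for context and runtime buffers;
    it is a placeholder value, not a measured requirement.
    """
    limit = budget_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= limit}
    if not fitting:
        return None
    # On equal sizes, max() keeps the first entry in table order.
    return max(fitting, key=fitting.get)


def gguf_filename(quant):
    """Build the file name used in the table for a given quant method."""
    return f"alpaca-dragon-72b-v1.{quant}.gguf"


choice = largest_quant_that_fits(48)
print(choice, gguf_filename(choice))
```

With a 48 GB budget and the assumed 4 GB of headroom, the cutoff is 44 GB, so the sketch selects the largest file at or below that size.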

Original model description:

language:
- en
license: other
library_name: transformers
model-index:
- name: alpaca-dragon-72b-v1
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 73.89
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibivibiv/alpaca-dragon-72b-v1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 88.16
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibivibiv/alpaca-dragon-72b-v1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 77.4
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibivibiv/alpaca-dragon-72b-v1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 72.69
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibivibiv/alpaca-dragon-72b-v1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 86.03
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibivibiv/alpaca-dragon-72b-v1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 77.63
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibivibiv/alpaca-dragon-72b-v1
      name: Open LLM Leaderboard

Model Card for Alpaca Dragon 72B V1

A fine-tune of Smaug 72b v0.1 on an Alpaca dataset I had on hand. The data covers planning and reasoning, and I use it to help a model break a set of asks down into a logical plan. For some odd reason, it bumps the MMLU and Winogrande scores. I would have expected ARC to go up more than those two, but this is often more of an art form than a science. All thanks to Abacus.AI for sharing their work.

I used the same dataset to train Strix Rufipes 70B, one of my owl-series models, which has worked well for planning out development tasks and other technical work.

LICENSE

Note that the license points back to the Smaug base license, since this is only a fine-tune of their model. Respect and abide by their conditions. Again, many thanks to Abacus for making their work open. Use that as inspiration to keep your own work open, and respect their license agreements. License Link

How to Get Started with the Model

Use the code below to get started with the model.

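# Load the model directly from the Hugging Face Hub
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ibivibiv/alpaca-dragon-72b-v1")
# Assumption (not in the original card): a 72B model rarely fits on one GPU at full
# precision, so torch_dtype="auto" keeps the checkpoint's dtype and device_map="auto"
# (requires the accelerate package) spreads the weights across available devices.
model = AutoModelForCausalLM.from_pretrained(
    "ibivibiv/alpaca-dragon-72b-v1", torch_dtype="auto", device_map="auto"
)

# Alpaca-style prompt: the instruction followed by an empty response slot
prompt = "### Instruction: Create a plan for developing the game of snake in python using pygame.\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# max_length counts the prompt tokens as well as the generated ones
outputs = model.generate(**inputs, max_length=200)
text = tokenizer.batch_decode(outputs)[0]
print(text)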
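
The prompt in the snippet above follows an instruction/response layout. Below is a small sketch of helpers for building such prompts and pulling the answer back out of the decoded text. The layout is inferred from that single example only, and the names `build_prompt` and `extract_response` are hypothetical, not something the model card or the transformers library provides.

```python
# Marker that separates the instruction from the model's answer,
# as it appears in the example prompt above.
RESPONSE_MARKER = "### Response:\n"


def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the layout used by the example prompt."""
    return f"### Instruction: {instruction.strip()}\n{RESPONSE_MARKER}"


def extract_response(decoded: str) -> str:
    """Return the text after the first response marker.

    If the marker is missing, the decoded text is returned unchanged.
    """
    if RESPONSE_MARKER not in decoded:
        return decoded
    return decoded.split(RESPONSE_MARKER, 1)[1]


prompt = build_prompt("Create a plan for developing the game of snake in python using pygame.")
print(repr(prompt))
```

Since `generate` returns the prompt tokens followed by the completion, `extract_response(text)` can be applied to the decoded output to keep only the model's answer.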

Evaluation

Test Name	Accuracy (%)
All	77.31
arc:challenge	70.82
hellaswag	69.84
hendrycksTest-abstract_algebra	42.00
hendrycksTest-anatomy	71.85
hendrycksTest-astronomy	86.84
hendrycksTest-business_ethics	82.00
hendrycksTest-clinical_knowledge	84.53
hendrycksTest-college_biology	93.06
hendrycksTest-college_chemistry	54.00
hendrycksTest-college_computer_science	65.00
hendrycksTest-college_mathematics	52.00
hendrycksTest-college_medicine	75.14
hendrycksTest-college_physics	55.88
hendrycksTest-computer_security	82.00
hendrycksTest-conceptual_physics	80.43
hendrycksTest-econometrics	60.53
hendrycksTest-electrical_engineering	79.31
hendrycksTest-elementary_mathematics	70.37
hendrycksTest-formal_logic	58.73
hendrycksTest-global_facts	54.00
hendrycksTest-high_school_biology	88.39
hendrycksTest-high_school_chemistry	66.01
hendrycksTest-high_school_computer_science	82.00
hendrycksTest-high_school_european_history	84.24
hendrycksTest-high_school_geography	94.44
hendrycksTest-high_school_government_and_politics	98.96
hendrycksTest-high_school_macroeconomics	82.05
hendrycksTest-high_school_mathematics	45.93
hendrycksTest-high_school_microeconomics	86.13
hendrycksTest-high_school_physics	54.97
hendrycksTest-high_school_psychology	92.84
hendrycksTest-high_school_statistics	68.98
hendrycksTest-high_school_us_history	91.67
hendrycksTest-high_school_world_history	89.87
hendrycksTest-human_aging	78.03
hendrycksTest-human_sexuality	89.31
hendrycksTest-international_law	90.91
hendrycksTest-jurisprudence	87.96
hendrycksTest-logical_fallacies	84.05
hendrycksTest-machine_learning	58.93
hendrycksTest-management	87.38
hendrycksTest-marketing	95.30
hendrycksTest-medical_genetics	86.00
hendrycksTest-miscellaneous	92.21
hendrycksTest-moral_disputes	83.53
hendrycksTest-moral_scenarios	69.72
hendrycksTest-nutrition	85.62
hendrycksTest-philosophy	83.60
hendrycksTest-prehistory	87.04
hendrycksTest-professional_accounting	65.96
hendrycksTest-professional_law	60.69
hendrycksTest-professional_medicine	82.72
hendrycksTest-professional_psychology	81.86
hendrycksTest-public_relations	75.45
hendrycksTest-security_studies	82.04
hendrycksTest-sociology	88.56
hendrycksTest-us_foreign_policy	94.00
hendrycksTest-virology	57.23
hendrycksTest-world_religions	89.47
truthfulqa:mc	72.6
winogrande	86.03
gsm8k	77.63

Environmental Impact

Hardware Type: A100s (more than I wanted to use, since it's all on my $$$)
Hours used: 8
Cloud Provider: runpod.io
Compute Region: US
Carbon Emitted: unknown

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	79.30
AI2 Reasoning Challenge (25-Shot)	73.89
HellaSwag (10-Shot)	88.16
MMLU (5-Shot)	77.40
TruthfulQA (0-shot)	72.69
Winogrande (5-shot)	86.03
GSM8k (5-shot)	77.63
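
As a quick sanity check, the "Avg." row is consistent with the plain arithmetic mean of the six benchmark scores in the table. The sketch below reproduces that calculation; treating the average as an unweighted mean is an assumption that happens to match the listed value, not a statement of how the leaderboard computes it internally.

```python
# Scores copied from the leaderboard table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 73.89,
    "HellaSwag (10-Shot)": 88.16,
    "MMLU (5-Shot)": 77.40,
    "TruthfulQA (0-shot)": 72.69,
    "Winogrande (5-shot)": 86.03,
    "GSM8k (5-shot)": 77.63,
}

# Unweighted mean over the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 79.30
```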