Description

4-bit quantization of meta-llama/Meta-Llama-3-8B-Instruct using GPTQ. Quantization and evaluation use the config below, with HuggingFaceH4/ultrachat_200k as the calibration data. The code is available in this repository.

bits: 4
damp_percent: 0.01
desc_act: true
exllama_config:
  version: 2
group_size: 128
quant_method: gptq
static_groups: false
sym: true
true_sequential: true
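
To make the bits, group_size, and sym settings above concrete, here is a minimal sketch of the quantization grid they imply: every group of 128 weights gets its own scale, and each weight is stored as a 4-bit code around a fixed midpoint. This is plain round-to-nearest written for illustration, not the GPTQ procedure itself (GPTQ additionally compensates rounding error using calibration data), and the function names are hypothetical rather than taken from this repository.

```python
# Illustrative only: round-to-nearest onto the grid implied by
# bits=4, group_size=128, sym=true. GPTQ's error compensation is NOT shown.

BITS = 4          # "bits" in the config
GROUP_SIZE = 128  # "group_size" in the config


def quantize_group(weights, bits=BITS):
    """Map one group of floats to integer codes in [0, 2**bits - 1]."""
    maxq = 2 ** bits - 1                      # 15 for 4 bits
    absmax = max(abs(w) for w in weights)
    # Symmetric grid: the representable range is centred on zero.
    scale = 2 * absmax / maxq if absmax > 0 else 1.0
    zero = (maxq + 1) // 2                    # fixed midpoint code (8 for 4 bits)
    codes = [min(maxq, max(0, round(w / scale) + zero)) for w in weights]
    return codes, scale, zero


def dequantize_group(codes, scale, zero):
    """Recover approximate floats from integer codes."""
    return [scale * (c - zero) for c in codes]


def quantize_row(row, group_size=GROUP_SIZE):
    """Quantize a weight row group by group; each group has its own scale."""
    return [
        quantize_group(row[start:start + group_size])
        for start in range(0, len(row), group_size)
    ]


if __name__ == "__main__":
    # Hypothetical toy weights, two groups of 128 values each.
    row = [((i % 17) - 8) / 100 for i in range(256)]
    for index, (codes, scale, zero) in enumerate(quantize_row(row)):
        chunk = row[index * GROUP_SIZE:(index + 1) * GROUP_SIZE]
        restored = dequantize_group(codes, scale, zero)
        worst = max(abs(a - b) for a, b in zip(chunk, restored))
        print(f"group {index}: scale={scale:.6f} zero={zero} max_error={worst:.6f}")
```

A smaller group_size gives each scale fewer weights to cover, which typically lowers rounding error at the cost of storing more scales.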

Evaluations

The tables below report a full evaluation of this model and a comparison with casperhansen/llama-3-8b-instruct-awq, both run with mosaicml/llm-foundry.

model_name	core_average	world_knowledge	commonsense_reasoning	language_understanding	symbolic_problem_solving	reading_comprehension
ISTA-DASLab/Llama-3-8B-Instruct-GPTQ-4bit	0.552944	0.584061	0.547598	0.663904	0.431017	0.538141
casperhansen/llama-3-8b-instruct-awq	0.531504	0.557663	0.528201	0.657211	0.391476	0.522971
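
The core_average values in the summary table agree, up to rounding, with the unweighted mean of the five category scores. The short check below recomputes that mean and the per-category gap between the two models; the numbers are copied from the table, while the variable and function names are illustrative.

```python
# Recompute core_average from the category scores reported in the summary table.

CATEGORIES = [
    "world_knowledge",
    "commonsense_reasoning",
    "language_understanding",
    "symbolic_problem_solving",
    "reading_comprehension",
]

GPTQ = "ISTA-DASLab/Llama-3-8B-Instruct-GPTQ-4bit"
AWQ = "casperhansen/llama-3-8b-instruct-awq"

# Values copied verbatim from the summary table.
SCORES = {
    GPTQ: {
        "core_average": 0.552944,
        "world_knowledge": 0.584061,
        "commonsense_reasoning": 0.547598,
        "language_understanding": 0.663904,
        "symbolic_problem_solving": 0.431017,
        "reading_comprehension": 0.538141,
    },
    AWQ: {
        "core_average": 0.531504,
        "world_knowledge": 0.557663,
        "commonsense_reasoning": 0.528201,
        "language_understanding": 0.657211,
        "symbolic_problem_solving": 0.391476,
        "reading_comprehension": 0.522971,
    },
}


def category_mean(row):
    """Unweighted mean of the five category scores."""
    return sum(row[name] for name in CATEGORIES) / len(CATEGORIES)


def score_gaps(first, second):
    """Per-column difference first - second, including core_average."""
    return {key: first[key] - second[key] for key in ["core_average"] + CATEGORIES}


if __name__ == "__main__":
    for model, row in SCORES.items():
        print(f"{model}: reported={row['core_average']:.6f} "
              f"recomputed={category_mean(row):.6f}")
    for key, gap in score_gaps(SCORES[GPTQ], SCORES[AWQ]).items():
        print(f"{key}: GPTQ - AWQ = {gap:+.6f}")
```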

Category	Benchmark	Subtask	Accuracy (GPTQ)	Accuracy (AWQ)	Few-shot setting
symbolic_problem_solving	gsm8k		0.721759	0.59818	0-shot
commonsense_reasoning	copa		0.85	0.84	0-shot
commonsense_reasoning	commonsense_qa		0.78706	0.782146	0-shot
commonsense_reasoning	piqa		0.784004	0.781828	0-shot
commonsense_reasoning	bigbench_strange_stories		0.764368	0.752874	0-shot
commonsense_reasoning	bigbench_strategy_qa		0.680647	0.659677	0-shot
language_understanding	lambada_openai		0.716476	0.717834	0-shot
language_understanding	hellaswag		0.750647	0.753137	0-shot
reading_comprehension	coqa		0.198797	0.109733	0-shot
reading_comprehension	boolq		0.8263	0.836391	0-shot
world_knowledge	triviaqa_sm_sub		0.590667	0.511333	3-shot
world_knowledge	jeopardy	Average	0.4975	0.489451	3-shot
world_knowledge		american_history	0.535109	0.544794	3-shot
world_knowledge		literature	0.622449	0.626531	3-shot
world_knowledge		science	0.420168	0.390756	3-shot
world_knowledge		word_origins	0.293151	0.271233	3-shot
world_knowledge		world_history	0.616622	0.613941	3-shot
world_knowledge	bigbench_qa_wikidata		0.684366	0.644358	3-shot
world_knowledge	arc_easy		0.808923	0.808081	3-shot
world_knowledge	arc_challenge		0.571672	0.571672	3-shot
commonsense_reasoning	siqa		0.827533	0.814227	3-shot
language_understanding	winograd		0.871795	0.860806	3-shot
symbolic_problem_solving	bigbench_operators		0.547619	0.552381	3-shot
reading_comprehension	squad		0.581552	0.58789	3-shot
symbolic_problem_solving	svamp		0.68	0.57	5-shot
world_knowledge	mmlu	Average	0.668279	0.645874	5-shot
world_knowledge		abstract_algebra	0.29	0.33	5-shot
world_knowledge		anatomy	0.681481	0.651852	5-shot
world_knowledge		astronomy	0.703947	0.671053	5-shot
world_knowledge		business_ethics	0.67	0.68	5-shot
world_knowledge		clinical_knowledge	0.750943	0.701887	5-shot
world_knowledge		college_biology	0.784722	0.729167	5-shot
world_knowledge		college_chemistry	0.47	0.46	5-shot
world_knowledge		college_computer_science	0.56	0.54	5-shot
world_knowledge		college_mathematics	0.36	0.28	5-shot
world_knowledge		college_medicine	0.653179	0.635838	5-shot
world_knowledge		college_physics	0.5	0.431373	5-shot
world_knowledge		computer_security	0.78	0.75	5-shot
world_knowledge		conceptual_physics	0.548936	0.557447	5-shot
world_knowledge		econometrics	0.45614	0.482456	5-shot
world_knowledge		electrical_engineering	0.668966	0.586207	5-shot
world_knowledge		elementary_mathematics	0.439153	0.417989	5-shot
world_knowledge		formal_logic	0.47619	0.412698	5-shot
world_knowledge		global_facts	0.37	0.41	5-shot
world_knowledge		high_school_biology	0.790323	0.754839	5-shot
world_knowledge		high_school_chemistry	0.581281	0.507389	5-shot
world_knowledge		high_school_computer_science	0.71	0.74	5-shot
world_knowledge		high_school_european_history	0.745455	0.775758	5-shot
world_knowledge		high_school_geography	0.823232	0.823232	5-shot
world_knowledge		high_school_government_and_politics	0.917098	0.875648	5-shot
world_knowledge		high_school_macroeconomics	0.635897	0.620513	5-shot
world_knowledge		high_school_mathematics	0.407407	0.392593	5-shot
world_knowledge		high_school_microeconomics	0.726891	0.714286	5-shot
world_knowledge		high_school_physics	0.423841	0.410596	5-shot
world_knowledge		high_school_psychology	0.842202	0.838532	5-shot
world_knowledge		high_school_statistics	0.592593	0.513889	5-shot
world_knowledge		high_school_us_history	0.852941	0.852941	5-shot
world_knowledge		high_school_world_history	0.843882	0.831224	5-shot
world_knowledge		human_aging	0.717489	0.713004	5-shot
world_knowledge		human_sexuality	0.763359	0.70229	5-shot
world_knowledge		international_law	0.793388	0.77686	5-shot
world_knowledge		jurisprudence	0.814815	0.768519	5-shot
world_knowledge		logical_fallacies	0.754601	0.773006	5-shot
world_knowledge		machine_learning	0.553571	0.508929	5-shot
world_knowledge		management	0.84466	0.834951	5-shot
world_knowledge		marketing	0.92735	0.888889	5-shot
world_knowledge		medical_genetics	0.81	0.78	5-shot
world_knowledge		miscellaneous	0.825032	0.799489	5-shot
world_knowledge		moral_disputes	0.739884	0.722543	5-shot
world_knowledge		moral_scenarios	0.437989	0.38324	5-shot
world_knowledge		nutrition	0.764706	0.735294	5-shot
world_knowledge		philosophy	0.733119	0.713826	5-shot
world_knowledge		prehistory	0.719136	0.719136	5-shot
world_knowledge		professional_accounting	0.475177	0.485816	5-shot
world_knowledge		professional_law	0.480443	0.449153	5-shot
world_knowledge		professional_medicine	0.709559	0.676471	5-shot
world_knowledge		professional_psychology	0.694444	0.676471	5-shot
world_knowledge		public_relations	0.7	0.6	5-shot
world_knowledge		security_studies	0.730612	0.718367	5-shot
world_knowledge		sociology	0.830846	0.845771	5-shot
world_knowledge		us_foreign_policy	0.86	0.85	5-shot
world_knowledge		virology	0.542169	0.518072	5-shot
world_knowledge		world_religions	0.812865	0.795322	5-shot
symbolic_problem_solving	bigbench_dyck_languages		0.086	0.045	5-shot
language_understanding	winogrande		0.764009	0.759274	5-shot
symbolic_problem_solving	agi_eval_lsat_ar		0.3	0.278261	5-shot
symbolic_problem_solving	simple_arithmetic_nospaces		0.466	0.458	5-shot
symbolic_problem_solving	simple_arithmetic_withspaces		0.502	0.496	5-shot
reading_comprehension	agi_eval_lsat_rc		0.731343	0.708955	5-shot
reading_comprehension	agi_eval_lsat_lr		0.554902	0.560784	5-shot
reading_comprehension	agi_eval_sat_en		0.81068	0.805825	5-shot
world_knowledge	arc_challenge		0.582765	0.591297	25-shot
commonsense_reasoning	openbook_qa		0.478	0.468	10-shot
language_understanding	hellaswag		0.769468	0.771062	10-shot
symbolic_problem_solving	bigbench_cs_algorithms		0.715151	0.687879	10-shot
symbolic_problem_solving	bigbench_elementary_math_qa		0.533569	0.530922	1-shot
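
The detailed table is tab-separated, with the benchmark cell left empty on subtask rows, so it can be parsed with a few lines of code. The sketch below does this for the Jeopardy rows only (copied from the table), confirms that the reported "Average" matches the unweighted mean of the five subtasks up to rounding, and tallies which model scores higher on each subtask. The helper names are hypothetical.

```python
# Parse rows in the detailed-table layout and sanity-check the Jeopardy average.

# Rows copied from the table; columns are
# Category, Benchmark, Subtask, GPTQ accuracy, AWQ accuracy, few-shot setting.
JEOPARDY_ROWS = (
    "world_knowledge\tjeopardy\tAverage\t0.4975\t0.489451\t3-shot\n"
    "world_knowledge\t\tamerican_history\t0.535109\t0.544794\t3-shot\n"
    "world_knowledge\t\tliterature\t0.622449\t0.626531\t3-shot\n"
    "world_knowledge\t\tscience\t0.420168\t0.390756\t3-shot\n"
    "world_knowledge\t\tword_origins\t0.293151\t0.271233\t3-shot\n"
    "world_knowledge\t\tworld_history\t0.616622\t0.613941\t3-shot\n"
)


def parse_rows(text):
    """Turn tab-separated lines into dictionaries with float accuracies."""
    rows = []
    for line in text.splitlines():
        category, benchmark, subtask, gptq, awq, shots = line.split("\t")
        rows.append({
            "category": category,
            "benchmark": benchmark,
            "subtask": subtask,
            "gptq": float(gptq),
            "awq": float(awq),
            "shots": shots,
        })
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def tally(rows):
    """Count rows where GPTQ is higher, AWQ is higher, or the two tie."""
    counts = {"gptq": 0, "awq": 0, "tie": 0}
    for row in rows:
        if row["gptq"] > row["awq"]:
            counts["gptq"] += 1
        elif row["gptq"] < row["awq"]:
            counts["awq"] += 1
        else:
            counts["tie"] += 1
    return counts


if __name__ == "__main__":
    rows = parse_rows(JEOPARDY_ROWS)
    reported = rows[0]                      # the "Average" row
    subtasks = [r for r in rows if r["subtask"] != "Average"]
    print("reported GPTQ average:", reported["gptq"])
    print("recomputed GPTQ average:", round(mean(r["gptq"] for r in subtasks), 6))
    print("subtask tally:", tally(subtasks))
```

The same parser can be pointed at the full table to produce per-category tallies, provided subtask "Average" rows are excluded to avoid double counting.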