	---
	license: apache-2.0
	inference: false
	---

	# Description
	4-bit quantization of [upstage/SOLAR-10.7B-Instruct-v1.0](https://huggingface.co/upstage/SOLAR-10.7B-Instruct-v1.0) using GPTQ. The config below was used for quantization and evaluation, with [HuggingFaceH4/ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) as the calibration data. The code is available in [this repository](https://github.com/IST-DASLab/marlin/tree/2f6d7c10e124b3c5fa29ff8d77d568bd7af3274c/gptq).

	```yaml
	bits: 4
	damp_percent: 0.01
	desc_act: true
	exllama_config:
	  version: 2
	group_size: 128
	quant_method: gptq
	static_groups: false
	sym: true
	true_sequential: true
	```
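To make the `bits`, `group_size`, and `sym` settings concrete, here is a minimal sketch of symmetric group-wise quantization in plain Python. It uses simple round-to-nearest, not GPTQ's error-compensating weight updates, and it ignores `desc_act`, `damp_percent`, and `true_sequential`. The function names are hypothetical and are not taken from the linked repository.

```python
# Illustrative round-to-nearest quantizer; this is NOT the GPTQ algorithm.
BITS = 4          # "bits" in the config above
GROUP_SIZE = 128  # "group_size" in the config above


def quantize_group_symmetric(weights, bits=BITS):
    """Quantize one group of weights with a shared scale and a fixed zero point."""
    max_q = 2 ** bits - 1      # largest integer code (15 for 4 bits)
    zero = (max_q + 1) // 2    # symmetric grid: zero point pinned mid-range (8)
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = 2 * max_abs / max_q
    # Round to the nearest code, then clamp into [0, max_q].
    codes = [min(max_q, max(0, round(w / scale) + zero)) for w in weights]
    return codes, scale, zero


def dequantize_group(codes, scale, zero):
    """Map integer codes back to approximate floating-point weights."""
    return [(c - zero) * scale for c in codes]


def quantize_row(row, group_size=GROUP_SIZE):
    """Split a weight row into consecutive groups and quantize each one."""
    return [
        quantize_group_symmetric(row[start:start + group_size])
        for start in range(0, len(row), group_size)
    ]
```

With `group_size: 128`, every 128 consecutive weights in a row share one scale, so a row of 256 weights yields two `(codes, scale, zero)` tuples, and each reconstructed weight differs from the original by at most about half a scale step.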

	## Evaluations

	The tables below report evaluation results obtained with the [mosaicml/llm-foundry](https://github.com/mosaicml/llm-foundry/tree/main/scripts/eval) evaluation scripts: first the per-category averages, then the per-benchmark accuracies.

	| model_name | core_average | world_knowledge | commonsense_reasoning | language_understanding | symbolic_problem_solving | reading_comprehension |
	|:----------------------------------|---------------:|------------------:|------------------------:|-------------------------:|---------------------------:|------------------------:|
	| upstage/SOLAR-10.7B-Instruct-v1.0 | 0.594131 | 0.602579 | 0.600195 | 0.747605 | 0.406245 | 0.614029 |
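As a quick sanity check on the summary row, the reported `core_average` agrees with the unweighted mean of the five category scores. The snippet below only checks the numbers printed above; it is an observation about this table, not a description of how llm-foundry computes the aggregate internally.

```python
from statistics import mean

# Category scores copied from the summary row above.
category_scores = {
    "world_knowledge": 0.602579,
    "commonsense_reasoning": 0.600195,
    "language_understanding": 0.747605,
    "symbolic_problem_solving": 0.406245,
    "reading_comprehension": 0.614029,
}
reported_core_average = 0.594131

# Unweighted mean across the five categories.
recomputed_core_average = mean(category_scores.values())
difference = abs(recomputed_core_average - reported_core_average)
print(f"recomputed={recomputed_core_average:.6f} difference={difference:.1e}")
```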

	| Category | Benchmark | Subtask | Accuracy | Few-shot setting |
	| :----------------------- | :--------------------------- | :---------------------------------- | -------: | :-------------- |
	| symbolic_problem_solving | gsm8k | | 0.638362 | 0-shot |
	| commonsense_reasoning | copa | | 0.84 | 0-shot |
	| commonsense_reasoning | commonsense_qa | | 0.841933 | 0-shot |
	| commonsense_reasoning | piqa | | 0.818281 | 0-shot |
	| commonsense_reasoning | bigbench_strange_stories | | 0.793103 | 0-shot |
	| commonsense_reasoning | bigbench_strategy_qa | | 0.66623 | 0-shot |
	| language_understanding | lambada_openai | | 0.735882 | 0-shot |
	| language_understanding | hellaswag | | 0.855208 | 0-shot |
	| reading_comprehension | coqa | | 0.222723 | 0-shot |
	| reading_comprehension | boolq | | 0.893884 | 0-shot |
	| world_knowledge | triviaqa_sm_sub | | 0.628333 | 3-shot |
	| world_knowledge | jeopardy | Average | 0.500792 | 3-shot |
	| world_knowledge | | american_history | 0.581114 | 3-shot |
	| world_knowledge | | literature | 0.655102 | 3-shot |
	| world_knowledge | | science | 0.371849 | 3-shot |
	| world_knowledge | | word_origins | 0.271233 | 3-shot |
	| world_knowledge | | world_history | 0.624665 | 3-shot |
	| world_knowledge | bigbench_qa_wikidata | | 0.669209 | 3-shot |
	| world_knowledge | arc_easy | | 0.815657 | 3-shot |
	| world_knowledge | arc_challenge | | 0.650171 | 3-shot |
	| commonsense_reasoning | siqa | | 0.881781 | 3-shot |
	| language_understanding | winograd | | 0.897436 | 3-shot |
	| symbolic_problem_solving | bigbench_operators | | 0.595238 | 3-shot |
	| reading_comprehension | squad | | 0.626395 | 3-shot |
	| symbolic_problem_solving | svamp | | 0.603333 | 5-shot |
	| world_knowledge | mmlu | Average | 0.647028 | 5-shot |
	| world_knowledge | | abstract_algebra | 0.29 | 5-shot |
	| world_knowledge | | anatomy | 0.577778 | 5-shot |
	| world_knowledge | | astronomy | 0.710526 | 5-shot |
	| world_knowledge | | business_ethics | 0.73 | 5-shot |
	| world_knowledge | | clinical_knowledge | 0.701887 | 5-shot |
	| world_knowledge | | college_biology | 0.729167 | 5-shot |
	| world_knowledge | | college_chemistry | 0.39 | 5-shot |
	| world_knowledge | | college_computer_science | 0.5 | 5-shot |
	| world_knowledge | | college_mathematics | 0.31 | 5-shot |
	| world_knowledge | | college_medicine | 0.66474 | 5-shot |
	| world_knowledge | | college_physics | 0.411765 | 5-shot |
	| world_knowledge | | computer_security | 0.72 | 5-shot |
	| world_knowledge | | conceptual_physics | 0.582979 | 5-shot |
	| world_knowledge | | econometrics | 0.473684 | 5-shot |
	| world_knowledge | | electrical_engineering | 0.565517 | 5-shot |
	| world_knowledge | | elementary_mathematics | 0.470899 | 5-shot |
	| world_knowledge | | formal_logic | 0.460317 | 5-shot |
	| world_knowledge | | global_facts | 0.33 | 5-shot |
	| world_knowledge | | high_school_biology | 0.770968 | 5-shot |
	| world_knowledge | | high_school_chemistry | 0.448276 | 5-shot |
	| world_knowledge | | high_school_computer_science | 0.71 | 5-shot |
	| world_knowledge | | high_school_european_history | 0.830303 | 5-shot |
	| world_knowledge | | high_school_geography | 0.848485 | 5-shot |
	| world_knowledge | | high_school_government_and_politics | 0.896373 | 5-shot |
	| world_knowledge | | high_school_macroeconomics | 0.646154 | 5-shot |
	| world_knowledge | | high_school_mathematics | 0.348148 | 5-shot |
	| world_knowledge | | high_school_microeconomics | 0.722689 | 5-shot |
	| world_knowledge | | high_school_physics | 0.344371 | 5-shot |
	| world_knowledge | | high_school_psychology | 0.833028 | 5-shot |
	| world_knowledge | | high_school_statistics | 0.523148 | 5-shot |
	| world_knowledge | | high_school_us_history | 0.852941 | 5-shot |
	| world_knowledge | | high_school_world_history | 0.827004 | 5-shot |
	| world_knowledge | | human_aging | 0.713004 | 5-shot |
	| world_knowledge | | human_sexuality | 0.755725 | 5-shot |
	| world_knowledge | | international_law | 0.768595 | 5-shot |
	| world_knowledge | | jurisprudence | 0.796296 | 5-shot |
	| world_knowledge | | logical_fallacies | 0.723926 | 5-shot |
	| world_knowledge | | machine_learning | 0.508929 | 5-shot |
	| world_knowledge | | management | 0.825243 | 5-shot |
	| world_knowledge | | marketing | 0.871795 | 5-shot |
	| world_knowledge | | medical_genetics | 0.73 | 5-shot |
	| world_knowledge | | miscellaneous | 0.814815 | 5-shot |
	| world_knowledge | | moral_disputes | 0.736994 | 5-shot |
	| world_knowledge | | moral_scenarios | 0.43352 | 5-shot |
	| world_knowledge | | nutrition | 0.728758 | 5-shot |
	| world_knowledge | | philosophy | 0.700965 | 5-shot |
	| world_knowledge | | prehistory | 0.765432 | 5-shot |
	| world_knowledge | | professional_accounting | 0.507092 | 5-shot |
	| world_knowledge | | professional_law | 0.487614 | 5-shot |
	| world_knowledge | | professional_medicine | 0.727941 | 5-shot |
	| world_knowledge | | professional_psychology | 0.661765 | 5-shot |
	| world_knowledge | | public_relations | 0.718182 | 5-shot |
	| world_knowledge | | security_studies | 0.669388 | 5-shot |
	| world_knowledge | | sociology | 0.81592 | 5-shot |
	| world_knowledge | | us_foreign_policy | 0.89 | 5-shot |
	| world_knowledge | | virology | 0.518072 | 5-shot |
	| world_knowledge | | world_religions | 0.789474 | 5-shot |
	| symbolic_problem_solving | bigbench_dyck_languages | | 0.458 | 5-shot |
	| language_understanding | winogrande | | 0.826361 | 5-shot |
	| symbolic_problem_solving | agi_eval_lsat_ar | | 0.269565 | 5-shot |
	| symbolic_problem_solving | simple_arithmetic_nospaces | | 0.372 | 5-shot |
	| symbolic_problem_solving | simple_arithmetic_withspaces | | 0.367 | 5-shot |
	| reading_comprehension | agi_eval_lsat_rc | | 0.794776 | 5-shot |
	| reading_comprehension | agi_eval_lsat_lr | | 0.641176 | 5-shot |
	| reading_comprehension | agi_eval_sat_en | | 0.849515 | 5-shot |
	| world_knowledge | arc_challenge | | 0.670648 | 25-shot |
	| commonsense_reasoning | openbook_qa | | 0.56 | 10-shot |
	| language_understanding | hellaswag | | 0.866461 | 10-shot |
	| | bigbench_cs_algorithms | | 0.652273 | 10-shot |
	| symbolic_problem_solving | bigbench_elementary_math_qa | | 0.392453 | 1-shot |
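For benchmarks with subtasks, the `Average` row can be cross-checked against the subtask rows. The sketch below does this for `jeopardy`, whose reported average agrees with the unweighted mean of its five subtasks up to rounding. It assumes nothing beyond the values in the table, and whether other benchmarks such as `mmlu` use the same unweighted aggregation is not checked here.

```python
from statistics import mean

# Jeopardy subtask accuracies (3-shot) copied from the table above.
jeopardy_subtasks = {
    "american_history": 0.581114,
    "literature": 0.655102,
    "science": 0.371849,
    "word_origins": 0.271233,
    "world_history": 0.624665,
}
reported_jeopardy_average = 0.500792

# Unweighted mean over the subtasks, compared with the reported "Average" row.
recomputed_jeopardy_average = mean(jeopardy_subtasks.values())
gap = abs(recomputed_jeopardy_average - reported_jeopardy_average)
print(f"recomputed={recomputed_jeopardy_average:.6f} gap={gap:.1e}")

# Weakest and strongest subtasks, useful when reading the per-subtask rows.
weakest = min(jeopardy_subtasks, key=jeopardy_subtasks.get)
strongest = max(jeopardy_subtasks, key=jeopardy_subtasks.get)
```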
