---
language:
- en
- fr
- ro
- de
- multilingual
license: apache-2.0
metrics:
- mmlu
---

# flan-ul2 4-bit 128-groupsize GPTQ

Quantized using qwopqwop200's [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa) repository, `t5` branch.<br>
Original model can be found here: [google/flan-ul2](https://huggingface.co/google/flan-ul2)

Quantization command (4-bit weights, group size 128, activation-order quantization, 256 WikiText-2 calibration samples; the `PYTORCH_CUDA_ALLOC_CONF` setting reduces CUDA memory fragmentation):

```
PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512 python t5.py ../full-models/flan-ul2 wikitext2 --nsamples 256 --wbits 4 --act-order --groupsize 128 --save ../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq.pt
```
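
For intuition, here is a minimal sketch of what 4-bit quantization with `--groupsize 128` means: every contiguous group of 128 weights in a row shares one scale and zero-point, so the 4-bit codes only need to cover that group's range. This is plain round-to-nearest for illustration, not the repo's actual implementation (GPTQ additionally applies Hessian-guided error compensation, and `--act-order` changes the column quantization order):

```python
# Round-to-nearest group-wise 4-bit quantization; illustrative only.
import torch

def quantize_groupwise(w: torch.Tensor, groupsize: int = 128, bits: int = 4):
    # Split each row into contiguous groups of `groupsize` weights.
    rows, cols = w.shape
    g = w.reshape(rows, cols // groupsize, groupsize)
    wmin = g.min(dim=-1, keepdim=True).values
    wmax = g.max(dim=-1, keepdim=True).values
    scale = (wmax - wmin) / (2**bits - 1)   # one fp scale per group
    zero = torch.round(-wmin / scale)       # one integer zero-point per group
    q = torch.clamp(torch.round(g / scale) + zero, 0, 2**bits - 1)
    return q.to(torch.uint8), scale, zero

def dequantize_groupwise(q, scale, zero):
    # Reconstruct approximate fp weights from the 4-bit codes.
    return ((q.float() - zero) * scale).reshape(q.shape[0], -1)

w = torch.randn(256, 1024)
q, scale, zero = quantize_groupwise(w)
w_hat = dequantize_groupwise(q, scale, zero)
print(f"mean abs reconstruction error: {(w - w_hat).abs().mean():.4f}")
```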

Benchmark command (reloads the quantized checkpoint and evaluates it on MMLU):

```
python t5.py ../full-models/flan-ul2 wikitext2 --load ../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq.pt --wbits 4 --groupsize 128 --benchmark --benchmark_mode mmlu
```
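
The `--benchmark_mode mmlu` flag makes `t5.py` run the MMLU evaluation. As a rough illustration of how such multiple-choice scoring typically works for a seq2seq model, the sketch below scores each answer letter by its first-token logit; it is written against plain `transformers` on the full-precision model and is an assumption about the general technique, not the repo's code:

```python
# Hypothetical MMLU-style multiple-choice scoring; the repo's t5.py
# benchmark may differ in prompt format and aggregation.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tok = AutoTokenizer.from_pretrained("google/flan-ul2")
model = T5ForConditionalGeneration.from_pretrained(
    "google/flan-ul2", torch_dtype=torch.bfloat16, device_map="auto"
)

def answer(question: str, choices: list[str]) -> str:
    # Present the question with lettered choices and score one decoder step.
    prompt = question + "\n" + "\n".join(
        f"{letter}. {c}" for letter, c in zip("ABCD", choices)
    ) + "\nAnswer:"
    enc = tok(prompt, return_tensors="pt").to(model.device)
    start = torch.tensor([[model.config.decoder_start_token_id]], device=model.device)
    logits = model(**enc, decoder_input_ids=start).logits[0, -1]
    # Pick the answer letter whose first token has the highest logit.
    letter_ids = [tok(l, add_special_tokens=False).input_ids[0] for l in "ABCD"]
    return "ABCD"[int(torch.stack([logits[i] for i in letter_ids]).argmax())]

print(answer("What is 2 + 2?", ["3", "4", "5", "6"]))
```

The harness aggregates per-question correctness into the per-subject and category averages shown below.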

Results:

```
Average accuracy 0.289 - math
Average accuracy 0.562 - health
Average accuracy 0.416 - physics
Average accuracy 0.780 - business
Average accuracy 0.610 - biology
Average accuracy 0.446 - chemistry
Average accuracy 0.461 - computer science
Average accuracy 0.513 - economics
Average accuracy 0.538 - engineering
Average accuracy 0.455 - philosophy
Average accuracy 0.622 - other
Average accuracy 0.703 - history
Average accuracy 0.707 - geography
Average accuracy 0.718 - politics
Average accuracy 0.653 - psychology
Average accuracy 0.711 - culture
Average accuracy 0.447 - law
Average accuracy 0.416 - STEM
Average accuracy 0.501 - humanities
Average accuracy 0.643 - social sciences
Average accuracy 0.613 - other (business, health, misc.)
MMLU Average accuracy: 0.540
```