---
language:
- en
- fr
- ro
- de
- multilingual
thumbnail: "url to a thumbnail used in social sharing"
license: apache-2.0
metrics:
- mmlu
---

# flan-ul2 4-bit 128-groupsize GPTQ

Quantized using qwopqwop200's GPTQ-for-Llama repo on the t5 branch.
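
For readers unfamiliar with the naming, "4-bit 128-groupsize" means each weight is stored as a 4-bit integer code and every run of 128 weights shares its own scale and zero point. The sketch below illustrates that idea only, using plain round-to-nearest on a toy list of floats. It is not the GPTQ algorithm, which additionally uses calibration data to compensate for rounding error, and it is not the storage format GPTQ-for-Llama writes to disk. The function names `quantize_groupwise` and `dequantize_groupwise` are hypothetical.

```python
import random

def quantize_groupwise(weights, wbits=4, groupsize=128):
    """Round-to-nearest asymmetric quantization with one (scale, zero) pair per group."""
    levels = 2 ** wbits - 1  # 15 for 4 bits, so integer codes lie in 0..15
    codes, params = [], []
    for start in range(0, len(weights), groupsize):
        group = weights[start:start + groupsize]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / levels or 1.0  # fall back to 1.0 for a constant group
        codes.extend(round((w - lo) / scale) for w in group)
        params.append((scale, lo))
    return codes, params

def dequantize_groupwise(codes, params, groupsize=128):
    """Map integer codes back to floats using each group's scale and zero point."""
    return [params[i // groupsize][1] + c * params[i // groupsize][0]
            for i, c in enumerate(codes)]

random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(512)]  # toy stand-in for one weight row
codes, params = quantize_groupwise(row)
restored = dequantize_groupwise(codes, params)
max_err = max(abs(a - b) for a, b in zip(row, restored))
print(len(params), "groups, max abs error", max_err)
```

A smaller group size gives each scale fewer weights to cover, which usually lowers the rounding error at the cost of storing more scale and zero-point values.
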

Original model can be found here: [Google/flan-ul2](https://huggingface.co/google/flan-ul2)

Quantization command:

```
PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512 python t5.py ../full-models/flan-ul2 wikitext2 --nsamples 256 --wbits 4 --act-order --groupsize 128 --save ../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq.pt
```

Benchmark command:

```
python t5.py ../full-models/flan-ul2 wikitext2 --load ../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq2.pt --wbits 4 --groupsize 128 --benchmark --benchmark_mode mmlu
```

Results:

```
Average accuracy 0.289 - math
Average accuracy 0.562 - health
Average accuracy 0.416 - physics
Average accuracy 0.780 - business
Average accuracy 0.610 - biology
Average accuracy 0.446 - chemistry
Average accuracy 0.461 - computer science
Average accuracy 0.513 - economics
Average accuracy 0.538 - engineering
Average accuracy 0.455 - philosophy
Average accuracy 0.622 - other
Average accuracy 0.703 - history
Average accuracy 0.707 - geography
Average accuracy 0.718 - politics
Average accuracy 0.653 - psychology
Average accuracy 0.711 - culture
Average accuracy 0.447 - law
Average accuracy 0.416 - STEM
Average accuracy 0.501 - humanities
Average accuracy 0.643 - social sciences
Average accuracy 0.613 - other (business, health, misc.)
MMLU Average accuracy: 0.540
```
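
To compare these numbers against another run, it can help to parse the benchmark output into a dictionary. The snippet below is a small sketch that assumes the output keeps the `Average accuracy <value> - <subject>` line format shown above. It embeds only three of the reported lines for brevity, and the variable names are illustrative.

```python
import re

# Three lines copied from the results above; paste the full output to rank every subject.
output = """\
Average accuracy 0.289 - math
Average accuracy 0.780 - business
Average accuracy 0.447 - law
MMLU Average accuracy: 0.540
"""

# Match per-subject lines only; the final "MMLU Average accuracy" line does not
# start with "Average accuracy", so it is skipped.
pattern = re.compile(r"^Average accuracy (\d\.\d+) - (.+)$", re.MULTILINE)
scores = {name: float(acc) for acc, name in pattern.findall(output)}

# Print subjects from strongest to weakest.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<10} {acc:.3f}")
```
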