---
library_name: transformers
tags: []
---

# yujiepan/llama-3-tiny-random-gptq-w4

4-bit weight-only quantization by AutoGPTQ of [yujiepan/llama-3-tiny-random](https://huggingface.co/yujiepan/llama-3-tiny-random).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig
import torch

model_id = "yujiepan/llama-3-tiny-random"
tokenizer = AutoTokenizer.from_pretrained(model_id)
quantization_config = GPTQConfig(
    bits=4,
    group_size=-1,
    dataset="c4",
    tokenizer=tokenizer,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=quantization_config,
)
```
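As a rough intuition for what the config above requests: `bits=4` stores each weight as a 4-bit integer, and `group_size=-1` uses a single quantization scale per output channel rather than one scale per small group of weights. The toy sketch below (plain NumPy, not AutoGPTQ itself, and using simple round-to-nearest rather than GPTQ's error-compensating updates) illustrates that per-channel 4-bit scheme on a random weight matrix:

```python
import numpy as np

# Toy illustration of symmetric 4-bit weight-only quantization with one
# scale per output channel (what group_size=-1 asks for: a single group
# spanning the whole row). This is NOT the GPTQ algorithm, which also
# compensates quantization error column by column.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)  # [out_channels, in_channels]

# One scale per output channel; signed 4-bit codes span [-8, 7].
scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)  # stored 4-bit codes
w_hat = q * scale  # dequantized weights used at inference time

print("max abs reconstruction error:", np.abs(w - w_hat).max())
```

Since the scale is chosen so the largest weight in each row maps to code 7, no value clips, and the per-element reconstruction error stays within half a quantization step (`scale / 2`) per channel.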