---
base_model: facebook/opt-350m
language:
- en
license: other
model_name: opt-350m
pipeline_tag: text-generation
quantized_by: iproskurina
tags:
- gptq
- 4-bit
base_model_relation: quantized
inference: false
model_creator: facebook
model_type: opt
---

# OPT-350M - GPTQ

- Model creator: [Meta AI](https://huggingface.co/facebook)
- Original model: [OPT-350M](https://huggingface.co/facebook/opt-350m)

The model published in this repo was quantized to 4-bit using [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ).

**Quantization details**

**All quantization parameters were taken from the [GPTQ paper](https://arxiv.org/abs/2210.17323).**

The GPTQ calibration data consisted of 128 random 2048-token segments from the [C4 dataset](https://huggingface.co/datasets/c4).

The group size used for quantization is 128.

## How to use this GPTQ model from Python code

### Install the necessary packages

Requires: Transformers 4.33.0 or later, Optimum 1.12.0 or later, and AutoGPTQ 0.4.2 or later.

```shell
pip3 install --upgrade transformers optimum
# If using PyTorch 2.1 + CUDA 12.x:
pip3 install --upgrade auto-gptq
# or, if using PyTorch 2.1 + CUDA 11.x:
pip3 install --upgrade auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/
```

If you are using PyTorch 2.0, you will need to install AutoGPTQ from source. If you have problems with the pre-built wheels, try building from source as well:

```shell
pip3 uninstall -y auto-gptq
git clone https://github.com/PanQiWei/AutoGPTQ
cd AutoGPTQ
git checkout v0.5.1
pip3 install .
```

### You can then use the following code

```python
from transformers import AutoTokenizer, TextGenerationPipeline
from auto_gptq import AutoGPTQForCausalLM

pretrained_model_dir = "iproskurina/opt-350m-gptq-4bit"

# Load the tokenizer and the 4-bit quantized weights onto the first GPU.
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_dir, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    pretrained_model_dir,
    device="cuda:0",
    model_basename="model",
)

# Wrap the model in a text-generation pipeline and complete a short prompt.
pipeline = TextGenerationPipeline(model=model, tokenizer=tokenizer)
print(pipeline("auto-gptq is")[0]["generated_text"])
```

[**LICENSE**](https://huggingface.co/facebook/opt-350m/blob/main/LICENSE.md)
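
To make the "4-bit" and "group size 128" settings mentioned in the quantization details more concrete, here is a minimal sketch of group-wise quantization in plain Python. It uses simple round-to-nearest rounding, which is an assumption made for illustration only: it is not the GPTQ algorithm and not how AutoGPTQ stores weights internally. The helper names (`quantize_group`, `dequantize_group`, `quantize_row`) and the random weight row are hypothetical.

```python
import random

BITS = 4          # bit width stated in this model card
GROUP_SIZE = 128  # group size stated in this model card


def quantize_group(weights, bits=BITS):
    """Asymmetric round-to-nearest quantization of one group (illustrative, not GPTQ)."""
    qmax = 2**bits - 1                      # 15 distinct steps above zero for 4-bit
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / qmax or 1.0   # fall back to 1.0 if the group is constant
    codes = [min(qmax, max(0, round((w - w_min) / scale))) for w in weights]
    return codes, scale, w_min


def dequantize_group(codes, scale, w_min):
    """Map integer codes back to approximate float weights."""
    return [c * scale + w_min for c in codes]


def quantize_row(row, group_size=GROUP_SIZE):
    """Split a weight row into consecutive groups; each group gets its own scale and offset."""
    return [quantize_group(row[i:i + group_size]) for i in range(0, len(row), group_size)]


random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(512)]  # hypothetical weight row, not real OPT weights
groups = quantize_row(row)
restored = [w for codes, scale, w_min in groups for w in dequantize_group(codes, scale, w_min)]
max_err = max(abs(a - b) for a, b in zip(row, restored))

print("groups:", len(groups))
print("max reconstruction error:", max_err)
```

In this sketch a 512-weight row is split into 4 groups, and each group keeps one scale and one offset alongside its 128 integer codes. Smaller groups would track the local weight range more closely at the cost of storing more scales and offsets. GPTQ itself goes further by choosing the rounding with the help of calibration data, which this sketch does not attempt.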