Model Details: INT8 GPT-J 6B

GPT-J 6B is a transformer model trained using Ben Wang's Mesh Transformer JAX. "GPT-J" refers to the class of model, while "6B" represents the number of trainable parameters.

This INT8 ONNX model was generated with neural-compressor. The FP32 model it is derived from can be exported with the command below:

python -m transformers.onnx --model=EleutherAI/gpt-j-6B onnx_gptj/ --framework pt --opset 13 --feature=causal-lm-with-past
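
To give a feel for what "dynamic INT8" means numerically, here is a toy, pure-Python sketch. It assumes symmetric per-tensor scaling, with weights quantized ahead of time and activations quantized at call time. This is an illustration only: it is not neural-compressor's implementation, and the real toolchain may use different schemes (for example asymmetric or per-channel scaling). All function names are hypothetical.

```python
# Toy illustration of dynamic INT8 quantization arithmetic.
# Assumption: symmetric per-tensor scaling; not the actual neural-compressor code.

def quantize_symmetric(values, num_bits=8):
    """Map floats to signed integers using a single scale for the whole tensor."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def int8_dot(weights_q, w_scale, activations):
    """Dot product in which activations are quantized at call time ("dynamic")."""
    acts_q, a_scale = quantize_symmetric(activations)
    acc = sum(w * a for w, a in zip(weights_q, acts_q))  # integer accumulation
    return acc * w_scale * a_scale  # rescale back to floating point


weights = [0.12, -0.5, 0.33, 0.9]
activations = [1.5, -2.0, 0.25, 0.75]

# Weights are quantized once, offline.
weights_q, w_scale = quantize_symmetric(weights)

exact = sum(w * a for w, a in zip(weights, activations))
approx = int8_dot(weights_q, w_scale, activations)
print(f"fp32 result: {exact:.4f}, int8 result: {approx:.4f}")
```

The two results differ slightly because of rounding to integers, which is the source of the small accuracy gap between FP32 and INT8 models.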

| Model Detail | Description |
| --- | --- |
| Model Authors - Company | Intel |
| Date | April 10, 2022 |
| Version | 1 |
| Type | Text Generation |
| Paper or Other Resources | - |
| License | Apache 2.0 |
| Questions or Comments | Community Tab |

| Intended Use | Description |
| --- | --- |
| Primary intended uses | You can use the raw model for text generation inference |
| Primary intended users | Anyone doing text generation inference |
| Out-of-scope uses | This model in most cases will need to be fine-tuned for your particular task. The model should not be used to intentionally create hostile or alienating environments for people. |

How to use

Download the model and script by cloning the repository:

git clone https://huggingface.co/Intel/gpt-j-6B-int8-dynamic

Then run inference with the model using the notebook 'evaluation.ipynb' included in the repository.
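
The notebook itself is not reproduced here. As a rough sketch of the kind of accuracy computation an evaluation script for a text generation model might perform, the snippet below scores last-token predictions against reference token ids. It assumes the model's scores for the final position are already available as plain lists; `last_token_accuracy` and the toy data are hypothetical and are not taken from 'evaluation.ipynb'.

```python
# Sketch of a last-token accuracy computation on toy data.
# Assumption: each example is (scores over the vocabulary, reference token id).

def argmax(scores):
    """Return the index of the highest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


def last_token_accuracy(examples):
    """Fraction of examples whose top-scoring token equals the reference."""
    hits = 0
    total = 0
    for scores, reference_id in examples:
        hits += int(argmax(scores) == reference_id)
        total += 1
    return hits / total if total else 0.0


# Toy vocabulary of three tokens; three of the four predictions are correct.
toy_examples = [
    ([0.1, 2.3, -0.4], 1),
    ([1.7, 0.2, 0.9], 0),
    ([0.3, 0.1, 0.8], 1),
    ([-1.0, 0.0, 4.2], 2),
]
accuracy = last_token_accuracy(toy_examples)
print(f"accuracy: {accuracy:.2f}")
```

In a real run the scores would come from the ONNX model's output logits rather than hard-coded lists.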

Metrics (Model Performance):

| Model | Model Size (GB) | LAMBADA Accuracy |
| --- | --- | --- |
| FP32 | 23 | 0.7954 |
| INT8 | 6 | 0.7926 |
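
To put the reported figures in perspective, the short calculation below derives the size reduction and the accuracy drop from the FP32 and INT8 rows. The numbers are copied from the metrics above; the variable names are illustrative only.

```python
# Compare the reported FP32 and INT8 metrics.
fp32 = {"size_gb": 23, "lambada_acc": 0.7954}
int8 = {"size_gb": 6, "lambada_acc": 0.7926}

size_ratio = fp32["size_gb"] / int8["size_gb"]
abs_drop = fp32["lambada_acc"] - int8["lambada_acc"]
rel_drop_pct = 100 * abs_drop / fp32["lambada_acc"]

print(f"size reduction: {size_ratio:.2f}x")
print(f"absolute accuracy drop: {abs_drop:.4f}")
print(f"relative accuracy drop: {rel_drop_pct:.2f}%")
```

On these figures the INT8 model is roughly 3.8 times smaller, at a relative accuracy cost of about 0.35%.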