
EleutherAI-gpt-neox-20b-ov-int8

This is the EleutherAI/gpt-neox-20b model converted to the OpenVINO IR format for accelerated inference. The model weights are compressed to INT8 using NNCF weight compression.
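As a rough sketch, a comparable conversion and INT8 weight compression can be reproduced with optimum-intel; the output directory name below is illustrative, not necessarily the one used for this repository.

```python
# Sketch: export EleutherAI/gpt-neox-20b to OpenVINO IR with INT8 weight compression.
from optimum.intel import OVModelForCausalLM

model = OVModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-neox-20b",
    export=True,        # convert the PyTorch checkpoint to OpenVINO IR
    load_in_8bit=True,  # apply NNCF INT8 weight compression during export
)
model.save_pretrained("gpt-neox-20b-ov-int8")  # illustrative output directory
```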

Use optimum-intel to run inference (see the optimum-intel documentation).
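A minimal inference sketch with optimum-intel is shown below; the `model_id` value is a placeholder and should be replaced with this repository's id or a local path to the exported model.

```python
# Sketch: text generation with the converted model via optimum-intel.
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "gpt-neox-20b-ov-int8"  # placeholder: repo id or local directory
model = OVModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("OpenVINO is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```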
