metadata
pipeline_tag: text-generation
inference: true
widget:
  - text: Hello!
    example_title: Hello world
    group: Python
library_name: transformers

yujiepan/opt-6.7b-w8a8-unstructured50

This model is a w8a8-quantized (8-bit weights and activations), unstructured-sparse export of facebook/opt-6.7b, produced with OpenVINO.

This model is not tuned for accuracy.

  • Quantization: 8-bit symmetric for weights & activations
  • Unstructured sparsity in transformer block linear layers: 50%
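To make the two settings above concrete, here is a minimal pure-Python sketch of what "8-bit symmetric" quantization and "50% unstructured" sparsity mean on a toy weight vector. It is not the export code used for this model. The function names are hypothetical, and the pruning criterion (smallest magnitude first), the single per-tensor scale, and the [-127, 127] code range are all assumptions made for illustration; the actual OpenVINO toolchain may differ on each point.

```python
def sparsify_unstructured(weights, sparsity=0.5):
    """Zero the given fraction of weights, smallest magnitude first.

    Magnitude-based selection is an assumption for this sketch; "unstructured"
    only means zeros may fall anywhere, with no block or row pattern.
    """
    n_zero = int(len(weights) * sparsity)
    # Indices ordered by absolute value; the first n_zero get pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_zero])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


def quantize_symmetric_int8(values):
    """Map floats to integer codes in [-127, 127] with zero-point fixed at 0.

    Uses one scale for the whole list (per-tensor), which is an assumption.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


# Toy weights, not taken from the real model.
weights = [0.02, -1.27, 0.40, -0.05, 0.90, 0.10, -0.64, 0.01]
sparse = sparsify_unstructured(weights, sparsity=0.5)
codes, scale = quantize_symmetric_int8(sparse)
restored = dequantize(codes, scale)
```

In this toy case half of the eight entries are set to zero, and because the quantization is symmetric, a pruned weight maps to the integer code 0 and stays exactly zero after dequantization.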

Code for export: https://gist.github.com/yujiepan-work/1e6dd9f9c2aac0e9ecaf2ed4d82d1158