Seed-Coder-8B-Instruct — OpenVINO int4 (channel-wise symmetric)

ByteDance-Seed/Seed-Coder-8B-Instruct converted to the OpenVINO™ IR format with weights compressed to INT4 by NNCF.

Quantization recipe

optimum-cli export openvino \
  --model ByteDance-Seed/Seed-Coder-8B-Instruct \
  --task text-generation-with-past \
  --weight-format int4 --sym --group-size -1 --ratio 1.0 \
  --awq --scale-estimation --dataset wikitext2 \
  Seed-Coder-8B-Instruct-int4-cw-ov

  • Channel-wise symmetric int4 (--sym --group-size -1): one scale per output channel and no zero point. This keeps the model eligible for the OpenVINO NPU plugin, which requires symmetric int4 weights.
  • AWQ and scale estimation (--awq --scale-estimation), calibrated on wikitext2.
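
To make "channel-wise symmetric int4" concrete, here is a minimal pure-Python sketch of the idea: each output channel (row) gets a single scale, values are rounded into the signed 4-bit range [-8, 7], and there is no zero point. This is an illustration only, not NNCF's actual algorithm. The helper names, the toy weights, and the choice of dividing by 7 are assumptions made for the example, and AWQ and scale estimation are not modelled.

```python
def quantize_row_sym_int4(row):
    """Quantize one output channel with a single shared scale and no zero point.

    Hypothetical helper for illustration; NNCF's real scale computation differs.
    """
    max_abs = max(abs(w) for w in row)
    # Map the largest magnitude to 7 so every value fits in [-8, 7].
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(w / scale))) for w in row]
    return q, scale


def dequantize_row(q, scale):
    """Recover approximate float weights from int4 codes and the row scale."""
    return [v * scale for v in q]


# Toy weight matrix: two output channels with very different magnitudes.
weights = [
    [0.70, -0.30, 0.10, 0.00],
    [-0.02, 0.01, 0.04, -0.03],
]

for row in weights:
    q, scale = quantize_row_sym_int4(row)
    restored = dequantize_row(q, scale)
    max_err = max(abs(a - b) for a, b in zip(row, restored))
    print(q, f"scale={scale:.4f}", f"max_err={max_err:.4f}")
```

Because each row has its own scale, the small-magnitude second channel is not crushed by the large values in the first. A single per-tensor scale would lose that property, while group-wise quantization (for example --group-size 128) would go further and use several scales per row.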

Use with OpenVINO GenAI

import openvino_genai as ov_genai

pipe = ov_genai.LLMPipeline("Seed-Coder-8B-Instruct-int4-cw-ov", "GPU")  # or "CPU" / "NPU"
print(pipe.generate("def fibonacci(n):", max_new_tokens=128))

Model notes

  • Architecture: LlamaForCausalLM, 32K context, GQA (8 KV heads). Supports native Fill-in-the-Middle.
  • License: MIT, inherited from the base model.
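
As a rough illustration of why the 8 KV heads matter at 32K context, the sketch below estimates the KV cache size. Only the KV head count and the context length come from this card. The layer count (32), head dimension (128), query head count (32) and 16-bit cache entries are assumptions for the example, so check them against the model's config.json and your runtime settings before relying on the numbers.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   bytes_per_value=2, batch_size=1):
    """Estimate KV cache size: one key and one value vector per layer, head, and token."""
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * bytes_per_value * batch_size)


NUM_KV_HEADS = 8        # from the model notes (GQA)
SEQ_LEN = 32 * 1024     # 32K context, from the model notes
NUM_LAYERS = 32         # ASSUMPTION: not stated in this card
HEAD_DIM = 128          # ASSUMPTION: not stated in this card
NUM_QUERY_HEADS = 32    # ASSUMPTION: used only for the no-GQA comparison

gqa = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, SEQ_LEN)
mha = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, SEQ_LEN)

print(f"KV cache at full context with GQA: {gqa / 2**30:.1f} GiB")
print(f"Same model without GQA:            {mha / 2**30:.1f} GiB")
```

The actual footprint depends on the device and on the KV cache precision the runtime selects, so treat this as an order-of-magnitude estimate only.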