LFM2.5-1.2B-Instruct - OpenVINO INT4

OpenVINO IR conversion of LiquidAI/LFM2.5-1.2B-Instruct (hybrid conv+attention, on-device class), quantized to INT4 asymmetric, group size 128 (data-free / NNCF defaults) for Intel Core Ultra iGPUs.

optimum-cli export openvino -m LiquidAI/LFM2.5-1.2B-Instruct \
  --task text-generation-with-past --weight-format int4 --group-size 128 --ratio 1.0 \
  LFM2.5-1.2B-Instruct-int4-ov

Usage (OpenVINO GenAI)

import openvino_genai as ov_genai

pipe = ov_genai.LLMPipeline("LFM2.5-1.2B-Instruct-int4-ov", "GPU", CACHE_DIR="./.ovcache")
print(pipe.generate("Write a Python function that reverses a string.", max_new_tokens=128))

Fast, lightweight autocomplete/chat backend; tested end-to-end as an OpenAI-compatible Continue.dev backend via core-ultra-llm-server.

Provenance

  • Base model: LiquidAI/LFM2.5-1.2B-Instruct (Liquid AI)
  • Recipe: INT4 asymmetric, group_size 128, ratio 1.0, data-free (NNCF defaults)
  • No finetuning - weights are a direct quantization of the original
Downloads last month
15
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for HarmenWessels/LFM2.5-1.2B-Instruct-int4-ov

Quantized
(60)
this model