CodeFormulaV2-mlx-q8

8-bit quantized MLX conversion of docling-project/CodeFormulaV2, produced with mlx_vlm.convert --quantize --q-bits 8. The text decoder linear layers are quantized to 8 bits per weight; the vision encoder stays at bf16 (the mlx-vlm convention via skip_multimodal_module).
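To make "8 bits per weight" more concrete, here is a minimal pure-Python sketch of group-wise affine quantization, the general style of scheme used for quantized linear layers. It is not the mlx-vlm implementation. The group size of 64 and the 16-bit scale and bias stored per group are assumptions, since this card only states the bit width. Under those assumptions the effective storage cost is slightly above 8 bits per weight because of the per-group metadata.

```python
# Illustrative sketch only: pure Python, no MLX required.
# ASSUMPTIONS (not stated in this card): group size 64, and one 16-bit scale
# plus one 16-bit bias stored per group.

def quantize_group(weights, bits=8):
    """Map one group of floats to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels or 1.0  # fall back to 1.0 for a constant group
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate floats from integer codes."""
    return [c * scale + bias for c in codes]


def effective_bits_per_weight(bits=8, group_size=64, meta_bits=16):
    """Code bits plus the per-group scale and bias, amortised over the group."""
    return bits + 2 * meta_bits / group_size


group = [i / 63 - 0.5 for i in range(64)]  # 64 evenly spaced toy weights
codes, scale, bias = quantize_group(group)
restored = dequantize_group(codes, scale, bias)
max_error = max(abs(a - b) for a, b in zip(group, restored))

print(effective_bits_per_weight())  # 8.5 under the assumptions above
print(max_error <= scale / 2 + 1e-9)  # rounding error is at most half a step
```

The round-trip error is bounded by half a quantization step, which is why 8-bit weights usually track the bf16 weights closely.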

Architecture, training data, and intended use are described on the upstream model page. This repo only re-encodes the weights at lower precision; no retraining or modification of behaviour was performed.

A bf16 variant is also available: mlx-community/CodeFormulaV2-mlx-bf16.

Usage

pip install mlx-vlm

from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template

model, processor = load("mlx-community/CodeFormulaV2-mlx-q8")
prompt = apply_chat_template(processor, model.config, "<formula>", num_images=1)
result = generate(
    model, processor,
    prompt=prompt,
    image="path/to/image.png",
    temperature=0.0,
)
print(result.text)

Use "<formula>" as the prompt for a math-expression image, "<code>" for a code-block image, per the upstream model card.
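The two prompt tokens can be wrapped in a small helper so that callers pick a task by name. This is a sketch, not part of mlx-vlm: `PROMPTS`, `prompt_token`, and `transcribe` are hypothetical names introduced here, and the `load`, `apply_chat_template`, and `generate` calls simply mirror the usage snippet above.

```python
# Hypothetical helper names; only load, generate, apply_chat_template and the
# two prompt tokens are taken from this model card.
PROMPTS = {"formula": "<formula>", "code": "<code>"}


def prompt_token(kind):
    """Return the prompt token for an image kind: 'formula' or 'code'."""
    try:
        return PROMPTS[kind]
    except KeyError:
        raise ValueError(
            f"unknown kind {kind!r}; expected one of {sorted(PROMPTS)}"
        ) from None


def transcribe(image_path, kind, repo="mlx-community/CodeFormulaV2-mlx-q8"):
    """Run the model on one image and return the generated text."""
    # Imported lazily so prompt_token() works without mlx-vlm installed.
    from mlx_vlm import load, generate
    from mlx_vlm.prompt_utils import apply_chat_template

    model, processor = load(repo)
    prompt = apply_chat_template(
        processor, model.config, prompt_token(kind), num_images=1
    )
    result = generate(
        model, processor,
        prompt=prompt,
        image=image_path,
        temperature=0.0,
    )
    return result.text
```

For example, `transcribe("path/to/image.png", "code")` would send the "<code>" prompt. Note that this sketch reloads the model on every call, so for batches it would make sense to load once and reuse `model` and `processor`.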

License and attribution

This is a derivative of docling-project/CodeFormulaV2, redistributed under the same Community Data License Agreement – Permissive 2.0 (CDLA-Permissive-2.0).

Please cite the upstream work:

@techreport{Docling,
  author = {Deep Search Team},
  month = {8},
  title = {{Docling Technical Report}},
  url = {https://arxiv.org/abs/2408.09869},
  eprint = {2408.09869},
  year = {2024}
}