maia3-79m-fp16

This repository contains an fp16 reduced-precision release derived from the original UofTCSSLab/Maia3-79M checkpoint.

Base model relationship

The repository metadata declares UofTCSSLab/Maia3-79M as the base model, which registers this release as a derived model on the Hugging Face Hub.

The Hub labels this relation type "quantized", but this release applies floating-point precision reduction, not integer quantization.

The original checkpoint weights were converted from float32 (fp32) to float16 (fp16).

Conversion:

float32 → float16
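As an illustration of the conversion step, here is a minimal sketch of how such a cast could be done with PyTorch. It is not the script used to produce this release. The helper names `cast_floats_to_fp16` and `convert_checkpoint` are hypothetical, and the sketch assumes the checkpoint is a flat mapping of names to tensors.

```python
def cast_floats_to_fp16(state_dict):
    """Return a copy of the mapping with floating-point tensors cast to float16.

    Integer tensors and non-tensor entries are passed through unchanged.
    """
    converted = {}
    for name, value in state_dict.items():
        is_float_tensor = (
            hasattr(value, "is_floating_point") and value.is_floating_point()
        )
        converted[name] = value.half() if is_float_tensor else value
    return converted


def convert_checkpoint(src_path, dst_path):
    """Load a checkpoint, cast its floating-point weights, and save the result."""
    # Imported here so cast_floats_to_fp16 stays usable without PyTorch installed.
    import torch

    state_dict = torch.load(src_path, map_location="cpu")
    torch.save(cast_floats_to_fp16(state_dict), dst_path)
```

A call might look like `convert_checkpoint("maia3-79m-fp32.pt", "maia3-79m-fp16.pt")`, where the fp32 file name is a placeholder. Only floating-point tensors are cast, so integer buffers keep their original dtype.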

Characteristics:

Reduced model size (~50% relative to fp32)
Lower memory usage during loading and inference
Increased throughput on hardware with efficient half-precision support
Minimal expected quality degradation compared to the original checkpoint
Weights remain floating-point values rather than integer-quantized representations
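The size and precision claims above can be checked with a little arithmetic. The sketch below uses only the Python standard library. The parameter count of 79 million is an assumption read off the "79M" in the model name, and the sample weight is an arbitrary value, not one taken from the checkpoint.

```python
import struct


def round_trip(x, fmt):
    """Round x to the nearest value representable in the given struct format."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


# Assumption: roughly 79 million parameters, inferred from the model name.
NUM_PARAMS = 79_000_000

fp32_bytes = NUM_PARAMS * 4  # float32 stores 4 bytes per weight
fp16_bytes = NUM_PARAMS * 2  # float16 stores 2 bytes per weight
size_ratio = fp16_bytes / fp32_bytes

# Arbitrary example weight: first as float32 ("<f"), then cast to float16 ("<e").
w32 = round_trip(0.1234567, "<f")
w16 = round_trip(w32, "<e")
rel_error = abs(w16 - w32) / abs(w32)

print(f"fp32 size: {fp32_bytes / 1e6:.0f} MB, fp16 size: {fp16_bytes / 1e6:.0f} MB")
print(f"size ratio: {size_ratio}")
print(f"relative rounding error for one weight: {rel_error:.2e}")
```

For values in the normal float16 range, the relative rounding error of a single cast is bounded by 2^-11 (about 0.05%). This is one reason quality degradation is expected to be small, although it is not a guarantee about end-to-end model behavior.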

This release does not use integer quantization.

Loading:

import torch

# Load the checkpoint onto the CPU; torch.load preserves the stored tensor dtypes.
state_dict = torch.load(
    "maia3-79m-fp16.pt",
    map_location="cpu",
)
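After loading, it can be useful to confirm that the weights really are stored in half precision. The following is a minimal sketch with a hypothetical helper, `summarize_dtypes`. It assumes the loaded object is a flat mapping of names to tensors, and it relies only on the `dtype` attribute and `numel()` method that `torch.Tensor` provides.

```python
from collections import Counter


def summarize_dtypes(state_dict):
    """Count the number of elements per dtype among tensor-like values.

    Entries without both a `dtype` attribute and a `numel()` method
    (for example plain Python numbers) are skipped.
    """
    counts = Counter()
    for value in state_dict.values():
        if hasattr(value, "dtype") and hasattr(value, "numel"):
            counts[str(value.dtype)] += value.numel()
    return dict(counts)
```

Calling `summarize_dtypes(state_dict)` on the checkpoint loaded as above would be expected to report `torch.float16` for the weights. Any integer buffers would appear under their own dtype.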
