VisualEars FastConformer FA Full AB - QAT W2 Materialized

This repo contains the current 2-bit QAT materialized NeMo checkpoint from the VisualEars FastConformer Persian ASR quantization experiments.

Important: this is a materialized float NeMo checkpoint, not yet a packed 2-bit deployment artifact. The selected QAT linear weights are projected onto their 2-bit values and saved back into ordinary NeMo/torch float tensors, so the file size remains close to that of the full-precision .nemo file.
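The difference between "materialized" and "packed" can be illustrated with a small pure-Python sketch. The quantization grid used here (four evenly spaced symmetric levels scaled by the largest absolute weight) and the helper names `project_to_2bit`, `materialize`, and `pack_codes` are assumptions for illustration only; the grid actually used in the QAT run is not documented here.

```python
import struct


def project_to_2bit(weights):
    """Snap each weight to the nearest of 4 levels (assumed symmetric grid)."""
    scale = max(abs(w) for w in weights) or 1.0
    levels = [-scale, -scale / 3, scale / 3, scale]  # hypothetical level grid
    codes = [min(range(4), key=lambda i: abs(w - levels[i])) for w in weights]
    return codes, levels


def materialize(codes, levels):
    """Write the quantized values back as ordinary floats (what this repo stores)."""
    return [levels[c] for c in codes]


def pack_codes(codes):
    """Pack four 2-bit codes into each byte (what a deployment artifact would store)."""
    packed = bytearray()
    for start in range(0, len(codes), 4):
        byte = 0
        for offset, code in enumerate(codes[start:start + 4]):
            byte |= code << (2 * offset)
        packed.append(byte)
    return bytes(packed)


weights = [0.9, -0.25, 0.31, -0.88, 0.05, 0.7, -0.7, 0.1]
codes, levels = project_to_2bit(weights)
materialized = materialize(codes, levels)

# Materialized storage still costs 4 bytes per float32 weight.
float_bytes = len(struct.pack(f"{len(materialized)}f", *materialized))
# Packed storage costs 2 bits per weight (plus the level table, ignored here).
packed_bytes = len(pack_codes(codes))
```

In this toy case the materialized tensor holds at most four distinct values but still occupies the same number of bytes as the original float32 weights, which is why the checkpoint does not shrink.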

Source

Base model: Reza2kn/visualears-fastconformer-fa-full-ab

Training run: onebit_cotraining_fast_no_urls_lower3_20260612_161154

Quantized modules: feed-forward linear layers in encoder layers 0-2. The decoder, CTC head, attention, and convolution modules are left unquantized.
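As a rough sketch of how such a selection rule could be expressed, the filter below picks feed-forward linear layers in encoder layers 0-2 by module name. The naming scheme `encoder.layers.<idx>.feed_forward<k>.linear<k>` and the helper `is_quantized` are assumptions for illustration; the actual module names and selection logic of the training run may differ.

```python
import re

# Assumed (hypothetical) module naming: encoder.layers.<idx>.feed_forward<k>.linear<k>
FF_LINEAR = re.compile(r"^encoder\.layers\.(\d+)\.feed_forward[12]\.linear[12]$")


def is_quantized(module_name, last_layer=2):
    """Return True for feed-forward linear layers in encoder layers 0..last_layer."""
    match = FF_LINEAR.match(module_name)
    return match is not None and int(match.group(1)) <= last_layer


# Example names, invented to exercise the filter.
names = [
    "encoder.layers.0.feed_forward1.linear1",
    "encoder.layers.2.feed_forward2.linear2",
    "encoder.layers.3.feed_forward1.linear1",  # layer index too high
    "encoder.layers.1.self_attn.linear_q",     # attention, left unquantized
    "encoder.layers.1.conv.pointwise_conv1",   # convolution, left unquantized
    "ctc_decoder.decoder_layers.0",            # CTC head, left unquantized
]
selected = [name for name in names if is_quantized(name)]
```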

VisualEars269 score

Default decoding:

| Model | WER | CER |
|---|---|---|
| FP baseline | 32.64% | 13.30% |
| W2 materialized | 37.47% | 16.69% |

Forced CTC:

| Model | WER | CER |
|---|---|---|
| FP baseline | 34.96% | 14.36% |
| W2 materialized | 40.65% | 17.78% |

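This is useful as an experimental checkpoint, but it has not yet reached parity with the FP baseline: WER is 4.83 points higher under default decoding and 5.69 points higher under forced CTC, and CER is 3.39 and 3.42 points higher respectively. It is also not yet compressed or bit-packed.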
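The gap to the FP baseline can be recomputed directly from the scores reported above. The sketch below derives absolute (percentage-point) and relative degradation; the `SCORES` layout and the `gap` helper are illustrative names, not part of any evaluation tooling for this repo.

```python
# (WER %, CER %) as reported in the VisualEars269 tables.
SCORES = {
    "default": {"fp": (32.64, 13.30), "w2": (37.47, 16.69)},
    "forced_ctc": {"fp": (34.96, 14.36), "w2": (40.65, 17.78)},
}


def gap(fp, w2):
    """Return (absolute gap in points, relative gap in percent of the FP score)."""
    absolute = round(w2 - fp, 2)
    relative = round(100 * (w2 - fp) / fp, 1)
    return absolute, relative


report = {}
for decoding, rows in SCORES.items():
    (fp_wer, fp_cer), (w2_wer, w2_cer) = rows["fp"], rows["w2"]
    report[decoding] = {"wer": gap(fp_wer, w2_wer), "cer": gap(fp_cer, w2_cer)}

for decoding, metrics in report.items():
    print(decoding, metrics)
```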


Model tree

Upstream base model: nvidia/stt_fa_fastconformer_hybrid_large

Fine-tuned: Reza2kn/visualears-fastconformer-fa-full-ab

Fine-tuned (this model): Reza2kn/visualears-fastconformer-fa-full-ab-qat-w2-materialized