LFM2.5-Audio-1.5B-JP GGUF

GGUF quantizations of LiquidAI/LFM2.5-Audio-1.5B-JP for CrispASR.

LFM2.5-Audio is Liquid AI's end-to-end multimodal speech model. A single 1.5B-parameter model handles ASR (speech-to-text), TTS (text-to-speech), and speech-to-speech. This is the Japanese variant.

Architecture

| Component | Details |
|---|---|
| Encoder | 17-layer FastConformer (512-dim, 8 heads, rel-pos attention, dw-striding 8x subsampling) |
| Adapter | LayerNorm + Linear(512->2048) + GELU + Linear(2048->2048) |
| Backbone | 16-layer LFM2 hybrid conv+attention (2048-dim, 32 heads / 8 KV heads, RoPE theta=1M) |
| Depthformer | 6-layer transformer (1024-dim) with 8-codebook Mimi audio token generation |
| Audio codec | Mimi (8 codebooks, 24 kHz) |
| Parameters | 1.5B total |
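
The dimensions in the table are enough to work out a few derived figures. The sketch below is back-of-the-envelope arithmetic, not code from the model: it assumes the adapter's Linear layers carry biases, that the LayerNorm is affine, and that the backbone head dimension is simply width divided by head count.

```python
# Dimensions copied from the architecture table.
ENC_DIM = 512        # FastConformer encoder width
BACKBONE_DIM = 2048  # LFM2 backbone width
N_HEADS = 32         # backbone query heads
N_KV_HEADS = 8       # backbone key/value heads


def linear_params(d_in: int, d_out: int, bias: bool = True) -> int:
    """Weight count of a dense layer, plus the bias vector if present."""
    return d_in * d_out + (d_out if bias else 0)


def adapter_params(bias: bool = True) -> int:
    """LayerNorm + Linear(512->2048) + GELU + Linear(2048->2048).

    Assumption: LayerNorm has a scale and a shift over the encoder width.
    GELU has no parameters.
    """
    layer_norm = 2 * ENC_DIM
    return (layer_norm
            + linear_params(ENC_DIM, BACKBONE_DIM, bias)
            + linear_params(BACKBONE_DIM, BACKBONE_DIM, bias))


# Assumption: head_dim = width / heads (some models decouple the two).
head_dim = BACKBONE_DIM // N_HEADS
queries_per_kv = N_HEADS // N_KV_HEADS

print(f"adapter parameters: {adapter_params():,}")
print(f"head dim: {head_dim}, query heads per KV head: {queries_per_kv}")
```

Under these assumptions the adapter is a small fraction of the 1.5B total, and each KV head is shared by four query heads.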

Available quantizations

| File | Quant | Size | Notes |
|---|---|---|---|
| lfm2-audio-1.5b-jp-f16.gguf | F16 | ~3.1 GB | Unquantized 16-bit reference |
| lfm2-audio-1.5b-jp-q8_0.gguf | Q8_0 | ~1.7 GB | High quality |
| lfm2-audio-1.5b-jp-q5_k.gguf | Q5_K | ~1.6 GB | Good quality |
| lfm2-audio-1.5b-jp-q4_k.gguf | Q4_K | ~1.5 GB | Recommended (verified identical output on Japanese audio) |
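
To fetch one of these files from a script, something like the following may work. It is a sketch using only the Python standard library. The repository ID `cstr/lfm2-audio-1.5b-jp-GGUF`, the `main` revision, and the `gguf_url` and `download` helpers are assumptions made for illustration; the URL follows the usual Hugging Face `resolve` pattern.

```python
import shutil
import urllib.request
from pathlib import Path

REPO_ID = "cstr/lfm2-audio-1.5b-jp-GGUF"  # assumed hosting repository

# File names as listed in the quantization table.
FILES = {
    "f16": "lfm2-audio-1.5b-jp-f16.gguf",
    "q8_0": "lfm2-audio-1.5b-jp-q8_0.gguf",
    "q5_k": "lfm2-audio-1.5b-jp-q5_k.gguf",
    "q4_k": "lfm2-audio-1.5b-jp-q4_k.gguf",
}


def gguf_url(quant: str, revision: str = "main") -> str:
    """Build the download URL for one quantization (revision is assumed)."""
    return f"https://huggingface.co/{REPO_ID}/resolve/{revision}/{FILES[quant]}"


def download(quant: str, dest_dir: str = ".") -> Path:
    """Stream the file to dest_dir, skipping the download if it already exists."""
    dest = Path(dest_dir) / FILES[quant]
    if not dest.exists():
        with urllib.request.urlopen(gguf_url(quant)) as resp, open(dest, "wb") as out:
            shutil.copyfileobj(resp, out)
    return dest


if __name__ == "__main__":
    # Only prints the URL; call download("q4_k") to actually fetch ~1.5 GB.
    print(gguf_url("q4_k"))
```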

Usage with CrispASR

# Transcribe Japanese audio
./crispasr -m lfm2-audio-1.5b-jp-q4_k.gguf -f audio.wav -l ja

# Or with auto-download
./crispasr --backend lfm2-audio -m auto -f audio.wav
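
For more than a handful of files, the first command can be wrapped in a short script. This is a minimal sketch that uses only the `-m`, `-f`, and `-l` flags shown above; the helper names are hypothetical, and it assumes `crispasr` writes the transcript to standard output, which is not confirmed here.

```python
import subprocess
from pathlib import Path


def crispasr_cmd(wav, model="lfm2-audio-1.5b-jp-q4_k.gguf",
                 lang="ja", binary="./crispasr"):
    """Build the argument list for one transcription run."""
    return [binary, "-m", str(model), "-f", str(wav), "-l", lang]


def transcribe_dir(audio_dir, **kwargs):
    """Run crispasr on every .wav file in audio_dir.

    Returns {file name: captured stdout}. Assumption: the transcript
    is printed to stdout. Raises CalledProcessError on a failed run.
    """
    results = {}
    for wav in sorted(Path(audio_dir).glob("*.wav")):
        proc = subprocess.run(crispasr_cmd(wav, **kwargs),
                              capture_output=True, text=True, check=True)
        results[wav.name] = proc.stdout.strip()
    return results


if __name__ == "__main__":
    # Show the command that would be run for a single file.
    print(" ".join(crispasr_cmd("audio.wav")))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.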

Conversion

Converted from the original safetensors using:

python models/convert-lfm2-audio-to-gguf.py \
    --input LiquidAI/LFM2.5-Audio-1.5B-JP \
    --output lfm2-audio-1.5b-jp-f16.gguf

# Quantize
./crispasr-quantize lfm2-audio-1.5b-jp-f16.gguf lfm2-audio-1.5b-jp-q4_k.gguf q4_k
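
After converting or quantizing, a quick sanity check is to read the GGUF header of the output file. The sketch below assumes a GGUF version 2 or later file, where the header is the 4-byte magic `GGUF`, a little-endian uint32 version, a uint64 tensor count, and a uint64 metadata key/value count. The function names are illustrative and not part of CrispASR.

```python
import struct

HEADER_SIZE = 24  # 4 (magic) + 4 (version) + 8 (tensors) + 8 (metadata KVs)


def parse_gguf_header(data: bytes) -> dict:
    """Decode the fixed-size GGUF header from the first 24 bytes of a file."""
    if len(data) < HEADER_SIZE or data[:4] != b"GGUF":
        raise ValueError("not a GGUF file (bad magic or truncated header)")
    version, n_tensors, n_kv = struct.unpack("<IQQ", data[4:HEADER_SIZE])
    return {"version": version, "tensors": n_tensors, "metadata_kv": n_kv}


def read_gguf_header(path: str) -> dict:
    """Read and decode the header of the GGUF file at path."""
    with open(path, "rb") as f:
        return parse_gguf_header(f.read(HEADER_SIZE))


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        print(read_gguf_header(sys.argv[1]))
```

A file that fails this check was most likely truncated or is not GGUF at all. A file that passes can still contain wrong tensors, so it does not replace a test transcription.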

License

LFM Open License v1.0 - Commercial use permitted for entities with annual revenue under $10M USD. See the upstream license for full terms.

Upstream components carry their own licenses: Apache-2.0 (NVIDIA NeMo), MIT (Kyutai Moshi), CC-BY-4.0 (Canary checkpoint).

Credits
