LFM2.5-Audio-1.5B-JP GGUF
GGUF quantizations of LiquidAI/LFM2.5-Audio-1.5B-JP for CrispASR.
LFM2.5-Audio is Liquid AI's end-to-end multimodal speech model. It supports ASR (speech-to-text), TTS (text-to-speech), and speech-to-speech in a single 1.5B-parameter model. This is the Japanese variant.
Architecture
| Component | Details |
|---|---|
| Encoder | 17-layer FastConformer (512-dim, 8 heads, rel-pos attention, dw-striding 8x subsampling) |
| Adapter | LayerNorm + Linear(512->2048) + GELU + Linear(2048->2048) |
| Backbone | 16-layer LFM2 hybrid conv+attention (2048-dim, 32 heads / 8 KV heads, RoPE theta=1M) |
| Depthformer | 6-layer transformer (1024-dim) with 8-codebook Mimi audio token generation |
| Audio codec | Mimi (8 codebooks, 24 kHz) |
| Parameters | 1.5B total |
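A few quantities follow directly from the table above. The sketch below derives the backbone's per-head width, its grouped-query ratio, and the encoder's output frame duration. The 10 ms feature hop and the 2-byte (F16) KV cache entries are assumptions that this card does not state, and all function names are illustrative.

```python
# Derived quantities from the architecture table; all names are illustrative.
BACKBONE_DIM = 2048   # backbone hidden size
N_HEADS = 32          # query heads
N_KV_HEADS = 8        # key/value heads (grouped-query attention)
SUBSAMPLING = 8       # dw-striding factor of the encoder
HOP_MS = 10.0         # ASSUMPTION: feature hop in ms, not stated in this card


def head_dim(model_dim: int, n_heads: int) -> int:
    """Width of one attention head."""
    if model_dim % n_heads != 0:
        raise ValueError("model_dim must be divisible by n_heads")
    return model_dim // n_heads


def gqa_group_size(n_heads: int, n_kv_heads: int) -> int:
    """Number of query heads that share one KV head."""
    if n_heads % n_kv_heads != 0:
        raise ValueError("n_heads must be divisible by n_kv_heads")
    return n_heads // n_kv_heads


def encoder_frame_ms(hop_ms: float, subsampling: int) -> float:
    """Duration of audio covered by one encoder output frame."""
    return hop_ms * subsampling


def encoder_frames(duration_s: float, hop_ms: float = HOP_MS,
                   subsampling: int = SUBSAMPLING) -> int:
    """Approximate number of encoder output frames for a clip (ignores padding)."""
    return int(duration_s * 1000 // encoder_frame_ms(hop_ms, subsampling))


def kv_bytes_per_token(n_kv_heads: int, dim_per_head: int,
                       bytes_per_value: int = 2) -> int:
    """KV cache bytes per token for ONE attention layer (keys plus values).

    ASSUMPTION: 2-byte (F16) cache entries.
    """
    return 2 * n_kv_heads * dim_per_head * bytes_per_value


d = head_dim(BACKBONE_DIM, N_HEADS)
print("head dim:", d)
print("query heads per KV head:", gqa_group_size(N_HEADS, N_KV_HEADS))
print("encoder frame (ms):", encoder_frame_ms(HOP_MS, SUBSAMPLING))
print("encoder frames for 10 s:", encoder_frames(10.0))
print("KV bytes per token per attention layer:", kv_bytes_per_token(N_KV_HEADS, d))
```

The KV figure is per attention layer only. The table does not say how many of the 16 hybrid backbone layers use attention rather than convolution, so a total cache size is deliberately not computed here.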
Available quantizations
| File | Quant | Size | Notes |
|---|---|---|---|
| lfm2-audio-1.5b-jp-f16.gguf | F16 | ~3.1 GB | Full precision reference |
| lfm2-audio-1.5b-jp-q8_0.gguf | Q8_0 | ~1.7 GB | High quality |
| lfm2-audio-1.5b-jp-q5_k.gguf | Q5_K | ~1.6 GB | Good quality |
| lfm2-audio-1.5b-jp-q4_k.gguf | Q4_K | ~1.5 GB | Recommended (verified identical output on Japanese audio) |
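To compare the files on a common scale, the listed sizes can be converted to an average number of bits per parameter. This is a rough sketch: it assumes 1 GB = 10^9 bytes, uses the rounded sizes from the table and the 1.5B parameter count, and ignores GGUF metadata overhead. The helper names are hypothetical.

```python
# Average bits per parameter implied by the approximate file sizes above.
PARAMS = 1.5e9  # "1.5B total" from the architecture table

# Sizes in GB as listed in the table (approximate values).
FILES_GB = {"F16": 3.1, "Q8_0": 1.7, "Q5_K": 1.6, "Q4_K": 1.5}


def bits_per_weight(size_gb: float, params: float = PARAMS) -> float:
    """Average bits per parameter, ASSUMING 1 GB = 1e9 bytes."""
    return size_gb * 1e9 * 8 / params


def largest_fitting(budget_gb: float, files: dict = FILES_GB):
    """Return the largest listed quant whose file fits the budget, or None."""
    fitting = {quant: size for quant, size in files.items() if size <= budget_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


for quant, size in FILES_GB.items():
    print(f"{quant}: {bits_per_weight(size):.1f} bits/weight")

print("fits in 1.65 GB:", largest_fitting(1.65))
```

On these numbers the Q4_K file averages roughly 8 bits per parameter rather than 4 to 5, which suggests that part of the model is stored at higher precision. That is an inference from rounded sizes, not something this card states.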
Usage with CrispASR
# Transcribe Japanese audio
./crispasr -m lfm2-audio-1.5b-jp-q4_k.gguf -f audio.wav -l ja
# Or with auto-download
./crispasr --backend lfm2-audio -m auto -f audio.wav
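For batch jobs it can help to build these command lines from a script. The following is a minimal sketch that uses only the flags shown above (`-m`, `-f`, `-l`, `--backend`). The function name `build_transcribe_cmd` and its defaults are hypothetical, and the sketch assumes the `crispasr` binary sits in the current directory.

```python
import shlex


def build_transcribe_cmd(audio_path, model="lfm2-audio-1.5b-jp-q4_k.gguf",
                         language="ja", backend=None, binary="./crispasr"):
    """Build an argument list mirroring the commands shown above.

    Only flags that appear in this README are used. Nothing is executed here.
    """
    cmd = [binary]
    if backend is not None:
        cmd += ["--backend", backend]
    cmd += ["-m", model, "-f", audio_path]
    if language is not None:
        cmd += ["-l", language]
    return cmd


# Explicit model file with Japanese as the language.
local_cmd = build_transcribe_cmd("audio.wav")
print(shlex.join(local_cmd))

# Auto-download variant: backend named explicitly, model set to "auto".
auto_cmd = build_transcribe_cmd("audio.wav", model="auto",
                                language=None, backend="lfm2-audio")
print(shlex.join(auto_cmd))

# To run a command (requires the crispasr binary to be present):
#   import subprocess
#   subprocess.run(local_cmd, check=True)
```

Passing an argument list to `subprocess.run` instead of a shell string avoids quoting problems with file names that contain spaces or non-ASCII characters.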
Conversion
Converted from the original safetensors using:
python models/convert-lfm2-audio-to-gguf.py \
    --input LiquidAI/LFM2.5-Audio-1.5B-JP \
    --output lfm2-audio-1.5b-jp-f16.gguf
# Quantize
./crispasr-quantize lfm2-audio-1.5b-jp-f16.gguf lfm2-audio-1.5b-jp-q4_k.gguf q4_k
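The quantize step can be repeated for each file listed under "Available quantizations". The sketch below only builds the command lines and does not run them. The `q4_k` type name comes from the command above; `q8_0` and `q5_k` are assumed to follow the same naming because they match the published file names, and the helper names are hypothetical.

```python
F16_FILE = "lfm2-audio-1.5b-jp-f16.gguf"

# "q4_k" appears in the command above. "q8_0" and "q5_k" are ASSUMED to be
# accepted type names because they match the published file names.
QUANT_TYPES = ["q8_0", "q5_k", "q4_k"]


def quantized_name(f16_name: str, quant: str) -> str:
    """Derive the output file name by swapping the '-f16' suffix."""
    suffix = "f16.gguf"
    if not f16_name.endswith("-" + suffix):
        raise ValueError("expected a file name ending in -f16.gguf")
    return f16_name[: -len(suffix)] + quant + ".gguf"


def quantize_cmds(f16_name: str = F16_FILE, quants=QUANT_TYPES,
                  binary: str = "./crispasr-quantize"):
    """One argument list per quant type: binary, input, output, type."""
    return [[binary, f16_name, quantized_name(f16_name, q), q] for q in quants]


for cmd in quantize_cmds():
    print(" ".join(cmd))

# To run them (requires the crispasr-quantize binary and the F16 file):
#   import subprocess
#   for cmd in quantize_cmds():
#       subprocess.run(cmd, check=True)
```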
License
LFM Open License v1.0 - Commercial use permitted for entities with annual revenue under $10M USD. See the upstream license for full terms.
Component licenses: Apache-2.0 (NVIDIA NeMo), MIT (Kyutai Moshi), and CC-BY-4.0 (Canary checkpoint).
Credits
Model tree for cstr/lfm2-audio-1.5b-jp-GGUF: LiquidAI/LFM2-1.2B (base model), finetuned into LiquidAI/LFM2.5-Audio-1.5B, finetuned into LiquidAI/LFM2.5-Audio-1.5B-JP.