Fish-Audio-S2-Pro-MLX-fp16

This is an fp16 artifact export of fishaudio/s2-pro for Apple Silicon runtime evaluation.

This model card does not imply runtime support. The bundle preserves upstream key names and is intended for a Swift/MLX port in speech-swift.

Model

| Field | Value |
| --- | --- |
| Source | fishaudio/s2-pro |
| Source revision | 1de9996b6be38b745688de084d87a5633f714e4e |
| Format | MLX fp16 safetensors |
| License posture | research/non-commercial |
| Readiness | benchmark-only |
| Sample rate | See upstream/runtime implementation. |
| Voice conditioning | Fish Speech reference / speaker conditioning stack |
| Runtime status | benchmark artifact; not for default product integration |

Emotion Control

| Field | Value |
| --- | --- |
| Marker syntax | free-form inline bracket tags |
| Supported markers | [pause], [emphasis], [laughing], [excited], [angry], [whisper], [screaming], [shouting], [surprised], [sad] |
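
Because markers are written inline with the text, a runtime or test harness will likely need to separate them from the spoken words. The following is a minimal sketch of one way to do that, not the upstream tokenizer's behavior: the regular expression and the helper names `split_markers` and `unlisted_markers` are assumptions made for illustration, and only the marker names come from the table above.

```python
import re

# Marker names copied from the "Supported markers" row above.
SUPPORTED_MARKERS = {
    "pause", "emphasis", "laughing", "excited", "angry",
    "whisper", "screaming", "shouting", "surprised", "sad",
}

# Assumption: a marker is any non-nested run of characters inside [ ].
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def split_markers(text):
    """Split text into ("text", str) and ("marker", name) segments, in order."""
    segments = []
    cursor = 0
    for match in TAG_PATTERN.finditer(text):
        plain = text[cursor:match.start()].strip()
        if plain:
            segments.append(("text", plain))
        segments.append(("marker", match.group(1).strip().lower()))
        cursor = match.end()
    tail = text[cursor:].strip()
    if tail:
        segments.append(("text", tail))
    return segments


def unlisted_markers(text):
    """Return marker names found in text that are not in the listed set."""
    return [
        name
        for kind, name in split_markers(text)
        if kind == "marker" and name not in SUPPORTED_MARKERS
    ]
```

Since the marker syntax is described as free-form, a name missing from the list is not necessarily an error. `unlisted_markers` only flags tags that fall outside the set documented here, which can be useful when building a benchmark prompt set.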

Files

  • config.json - root config for Hugging Face download tracking and runtime metadata
  • soniqo_manifest.json - export manifest with source, marker, readiness, and file metadata
  • *.safetensors - fp16-converted model weights with upstream key names preserved
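
To check that a downloaded weight file really contains fp16 tensors with the expected key names, the safetensors header can be read without loading any weights. The sketch below uses only the Python standard library and relies on the safetensors layout (an 8-byte little-endian header length followed by a JSON header). The function names are illustrative, and the file name in the usage note is hypothetical, since this card lists the weights only as `*.safetensors`.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def dtype_counts(path):
    """Count tensors per dtype, skipping the optional __metadata__ entry."""
    header = read_safetensors_header(path)
    return Counter(
        entry["dtype"]
        for name, entry in header.items()
        if name != "__metadata__"
    )
```

For example, `dtype_counts("model.safetensors")` (hypothetical file name) should report only `F16` entries for an fp16 export, and the keys of `read_safetensors_header(...)` can be compared against the upstream key names.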

Notes

  • The model is a strong benchmark candidate for free-form inline prosody and emotion tags.
  • Public weights are research/non-commercial; commercial use requires a separate license.
  • The codec checkpoint is converted from PyTorch pickle to safetensors when this model is exported.

Usage

These artifacts are for runtime implementation and evaluation. They are not a drop-in Transformers checkpoint.

git clone https://huggingface.co/aufklarer/Fish-Audio-S2-Pro-MLX-fp16

Model size: 5B params. Tensor type: F16 (safetensors, MLX).