How to use marcosremar2/iaratts-sft-v1 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-to-speech", model="marcosremar2/iaratts-sft-v1", trust_remote_code=True)
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("marcosremar2/iaratts-sft-v1", trust_remote_code=True, dtype="auto")

Brazilian Portuguese TTS: full SFT of MOSS-TTS-Nano-100M on the Erinome dataset.
| Model | WER (50-prompt holdout) | N |
|---|---|---|
| Baseline MOSS-TTS-Nano-100M | 0.5316 | 50 |
| IaraTTS-SFT-v1 (this checkpoint) | 0.1537 | 50 |
| Δ | −0.3779 (−71% relative) | |
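WER is measured by round-trip evaluation: the generated audio is transcribed with Whisper-base ASR (language=pt, fp16) and the transcript is scored against the prompt text with jiwer.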
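To make the metric and the Δ row concrete, here is a minimal sketch of word error rate as a word-level edit distance, plus the relative change computed from the two table values. It is a pure-Python stand-in for illustration, not the evaluation script used for this card: the function names `word_error_rate` and `relative_change` are hypothetical, and the lowercasing step is an assumed normalization that may differ from what jiwer was configured to do.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumption: lowercase + whitespace split as the only normalization.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_change(baseline: float, new: float) -> float:
    """Signed change of `new` relative to `baseline`."""
    return (new - baseline) / baseline


# Values taken from the results table above.
baseline_wer = 0.5316
sft_wer = 0.1537
delta = sft_wer - baseline_wer
rel = relative_change(baseline_wer, sft_wer)
print(f"delta = {delta:.4f}, relative = {rel:.1%}")
```

With one substituted word in a five-word prompt, `word_error_rate` returns 0.2; the holdout figures in the table are presumably aggregated over all 50 prompts.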
| Hyperparam | Value |
|---|---|
| Base model | OpenMOSS-Team/MOSS-TTS-Nano-100M |
| Codec | OpenMOSS-Team/MOSS-Audio-Tokenizer-Nano |
| Dataset | marcosremar2/gemini-dataset-erinome (4929 valid text+wav pairs) |
| per_device_batch_size | 8 |
| gradient_accumulation_steps | 4 |
| global_batch_size | 32 |
| epochs | 3 (465 steps) |
| learning_rate | 5e-5 cosine, warmup 5% |
| mixed_precision | bf16 |
| attn_implementation | sdpa |
| GPU | RTX 4090 (Vast.ai) |
| Wall time | ~10 min |
| Loss | 5.5 → 4.7 |
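The batch-size and step counts in the table can be cross-checked with a little arithmetic, and the learning-rate row can be sketched as a function of the step. This is an illustrative reconstruction under stated assumptions, not the actual training script: it assumes a single GPU, that the last partial batch of each epoch is kept, that warmup is linear and its length is rounded down, and that the cosine decay ends at zero. The helper name `lr_at` is hypothetical.

```python
import math

# Values from the hyperparameter table.
num_examples = 4929
per_device_batch_size = 8
gradient_accumulation_steps = 4
epochs = 3
peak_lr = 5e-5
warmup_ratio = 0.05
num_gpus = 1  # assumption: one RTX 4090

global_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus
# Assumption: the final partial batch counts as a step (ceil, not floor).
steps_per_epoch = math.ceil(num_examples / global_batch_size)
total_steps = steps_per_epoch * epochs
# Assumption: warmup length is rounded down.
warmup_steps = int(total_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero (assumed shape)."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(global_batch_size, steps_per_epoch, total_steps, warmup_steps)
# prints: 32 155 465 23
```

Under these assumptions the numbers agree with the table: 8 × 4 = 32 samples per optimizer step, and 155 steps per epoch over 3 epochs gives 465 steps.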
This is Phase 2 of a multi-phase IaraTTS roadmap targeting browser deployment at ≤150M params. Subsequent phases (in companion repo iaratts-demo):
<laugh>/<sigh>) + IndexTTS2 instruction LM.

See the full roadmap in the companion repo.
from transformers import AutoModel, AutoTokenizer
model = AutoModel.from_pretrained("marcosremar2/iaratts-sft-v1", trust_remote_code=True)
# Use upstream MOSS-TTS-Nano `infer.py` with this checkpoint:
# python infer.py --checkpoint ./iaratts-sft-v1 \
# --audio-tokenizer-pretrained-name-or-path OpenMOSS-Team/MOSS-Audio-Tokenizer-Nano \
# --text "Hoje a tarde está ensolarada." \
# --output-audio-path out.wav --mode continuation --seed 42
MIT, same as upstream MOSS-TTS-Nano.
@misc{iaratts-sft-v1,
author = {marcosremar2},
title = {IaraTTS SFT v1: pt-BR fine-tune of MOSS-TTS-Nano-100M on Erinome},
year = {2026},
url = {https://huggingface.co/marcosremar2/iaratts-sft-v1}
}
Base model
OpenMOSS-Team/MOSS-TTS-Nano-100M