Moss-TTS-Nano.AXERA

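On-board inference demo of MOSS-TTS-Nano for AX650. It includes the on-board Python inference scripts, configuration files, ONNX reference models, and axmodel files that run on AX650.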

Current status

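Built-in voices are supported.
Voice cloning is not included in the current inference pipeline, because the voice cloning path loses too much quality after quantization.
Inference stability still needs to be verified. The current on-board RTF is about 3, which is well short of real time. Streaming step output cannot yet be aligned in quality, and its RTF is still around 1.5-2. Adding voice cloning model inference later will increase inference time further.
Audio decoding currently uses a full codec decode that reconstructs the audio in one pass.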
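
For reference, the RTF (real-time factor) figures here appear to be inference time divided by the duration of the generated audio, so a value above 1 means synthesis is slower than playback. A minimal sketch under that assumption; the function name `real_time_factor` is illustrative and not part of this repository:

```python
def real_time_factor(inference_seconds: float, audio_seconds: float) -> float:
    """Return inference time divided by generated audio duration (RTF)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return inference_seconds / audio_seconds


if __name__ == "__main__":
    # Rounded timings taken from the Chinese example run in this README.
    print(f"RTF: {real_time_factor(9.34, 3.20):.2f}")
```

With the rounded timings from the example run this gives roughly 2.92, in line with the logged RTF; the small difference is presumably because the script divides unrounded timings.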

Directory structure

Moss-TTS-Nano.AXERA/
├── configs/
├── models/
│   ├── axmodels/
│   └── onnxmodels/
├── scripts/
├── infer_board_axmodel_decode.py
├── infer_board_onnx_decode.py
├── requirements.txt
└── README.md

Environment

Install pyaxengine

Download the .whl file for the matching version from the pyaxengine Releases page, then install it:

pip install axengine-x.x.x-py3-none-any.whl

Python dependencies:

pip install -r requirements.txt
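
After installing, a quick check that the runtime can be found may save a failed run on the board. This is a minimal sketch; it assumes the pyaxengine wheel installs a module importable as `axengine`, and the helper name `is_installed` is illustrative:

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "axengine" is the assumed import name of the pyaxengine wheel.
    name = "axengine"
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```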

Inference

Chinese

python3 infer_board_axmodel_decode.py \
  --text "你好，今天是美好的一天" \
  --voice Junhao \
  --seed 1234 \
  --max-new-frames 128 \
  --output-audio-path outputs/zh.wav

Result (输出 = output path, 音频时长 = audio duration, 推理耗时 = inference time):

=======================================================
 板端全 AXModel 推理结果
=======================================================
  输出:         /path/to/Moss-TTS-Nano.AXERA/outputs/zh.wav
  音频时长:     3.20s
  推理耗时:     9.34s
  RTF:          2.9180
=======================================================

English

python3 infer_board_axmodel_decode.py \
  --text "hello,today is a good day." \
  --voice Ava \
  --seed 1234 \
  --max-new-frames 128 \
  --output-audio-path outputs/en.wav

Result:

=======================================================
 板端全 AXModel 推理结果
=======================================================
  输出:         /path/to/Moss-TTS-Nano.AXERA/outputs/en.wav
  音频时长:     3.04s
  推理耗时:     8.77s
  RTF:          2.8861
=======================================================

--voice accepts any of the 18 built-in voices (Junhao, Zhiming, and so on).

Built-in voices

Name	Description	Type
Junhao	CN Welcome, follow 模思智能	Chinese Male
Zhiming	CN Beijing-accent hutong chat	Chinese Male
Weiguo	CN Storytelling	Chinese Male
Xiaoyu	CN Celebrity	Chinese Female
Yuewen	CN Jiche (机车)	Chinese Female
Lingyu	CN Late-night radio	Chinese Female
Trump	EN Trump	English Male
Ava	EN The Bitter Lesson	English Female
Bella	EN A Gentle Reminder	English Female
Adam	EN English News	English Male
Nathan	EN The Quiet Motion of the World	English Male
Soyo	JP Soyo	Japanese Female
Saki	JP Saki	Japanese Female
Mortis	JP Mortis	Japanese Female
Umiri	JP Umiri	Japanese Female
Mei	JP Togawa	Japanese Female
Anon	JP Anon	Japanese Female
Arisa	JP Arisa	Japanese Female
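
To try several voices from the table in one go, the command line can be assembled programmatically. This is a sketch that only reuses the flags shown in the examples in this README; the helper name `build_command` and the output file naming are illustrative assumptions:

```python
import shlex

SCRIPT = "infer_board_axmodel_decode.py"


def build_command(voice: str, text: str, output_path: str,
                  seed: int = 1234, max_new_frames: int = 128) -> list[str]:
    """Build the argument list for one inference run with the given voice."""
    return [
        "python3", SCRIPT,
        "--text", text,
        "--voice", voice,
        "--seed", str(seed),
        "--max-new-frames", str(max_new_frames),
        "--output-audio-path", output_path,
    ]


if __name__ == "__main__":
    # Voice and text pairs taken from the examples in this README.
    samples = {
        "Junhao": "你好，今天是美好的一天",
        "Ava": "hello,today is a good day.",
    }
    for voice, text in samples.items():
        cmd = build_command(voice, text, f"outputs/{voice}.wav")
        # Print the command; on the board, pass cmd to subprocess.run(cmd, check=True).
        print(shlex.join(cmd))
```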

ONNX decode_step comparison inference

Note: infer_board_onnx_decode.py differs from infer_board_axmodel_decode.py only in that decode_step runs the ONNX model, which gives somewhat better quality. All other models still run as axmodel.

Run:

python3 infer_board_onnx_decode.py \
  --text "你好，今天是美好的一天" \
  --voice Junhao \
  --seed 1234 \
  --max-new-frames 128 \
  --output-audio-path outputs/onnx_decode_step.wav
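
As a coarse first check when comparing the two runs, the durations of the generated files can be read with the standard library. This is a sketch that assumes both scripts write PCM WAV files readable by Python's `wave` module; the helper name `wav_duration_seconds` is illustrative, and equal durations say nothing about audio quality:

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


if __name__ == "__main__":
    # Output paths used by the example commands in this README.
    for path in ("outputs/zh.wav", "outputs/onnx_decode_step.wav"):
        try:
            print(f"{path}: {wav_duration_seconds(path):.2f}s")
        except (FileNotFoundError, wave.Error) as exc:
            print(f"{path}: could not read ({exc})")
```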
