library_name: pytorch
license: other
tags:
  - real_time
  - android
pipeline_tag: text-to-audio

MeloTTS-ES: Optimized for Qualcomm Devices

MeloTTS is a high-quality multilingual text-to-speech library that supports English, Chinese, and Spanish.

This model is based on the implementation of MeloTTS-ES found here. This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the Qualcomm® AI Hub Models library to export the model with custom configurations. More details on model performance across various devices can be found here.

Qualcomm AI Hub Models uses Qualcomm AI Hub Workbench to compile, profile, and evaluate this model. Sign up to run these models on a hosted Qualcomm® device.

Getting Started

There are two ways to deploy this model on your device:

Option 1: Download Pre-Exported Models

Below are pre-exported model assets ready for deployment.

| Runtime | Precision | Chipset | SDK Versions | Download |
|---|---|---|---|---|
| VOICE_AI | mixed_with_float | Snapdragon® 8 Elite Gen 5 Mobile | QAIRT 2.45 | Download |
| VOICE_AI | mixed_with_float | Snapdragon® X2 Elite | QAIRT 2.45 | Download |
| VOICE_AI | mixed_with_float | Snapdragon® X Elite | QAIRT 2.45 | Download |
| VOICE_AI | mixed_with_float | Snapdragon® 8 Gen 3 Mobile | QAIRT 2.45 | Download |
| VOICE_AI | mixed_with_float | Qualcomm® QCS8275 (Proxy) | QAIRT 2.45 | Download |
| VOICE_AI | mixed_with_float | Qualcomm® QCS8550 (Proxy) | QAIRT 2.45 | Download |
| VOICE_AI | mixed_with_float | Qualcomm® SA8775P | QAIRT 2.45 | Download |
| VOICE_AI | mixed_with_float | Snapdragon® 8 Elite For Galaxy Mobile | QAIRT 2.45 | Download |
| VOICE_AI | mixed_with_float | Qualcomm® SA7255P | QAIRT 2.45 | Download |
| VOICE_AI | mixed_with_float | Qualcomm® SA8295P | QAIRT 2.45 | Download |
| VOICE_AI | mixed_with_float | Qualcomm® QCS9075 | QAIRT 2.45 | Download |
| VOICE_AI | mixed_with_float | Qualcomm® QCS8450 (Proxy) | QAIRT 2.45 | Download |

For more device-specific assets and performance metrics, visit MeloTTS-ES on Qualcomm® AI Hub.

Option 2: Export with Custom Configurations

Use the Qualcomm® AI Hub Models Python library to compile and export the model with your own:

  • Custom weights (e.g., fine-tuned checkpoints)
  • Custom input shapes
  • Target device and runtime configurations

This option is ideal if you need to customize the model beyond the default configuration provided here.

See our repository for MeloTTS-ES on GitHub for usage instructions.
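
As a rough illustration of what a custom export invocation might look like, the sketch below only assembles a command line and prints it; it does not run anything. The module path `qai_hub_models.models.melotts_es.export`, the `--device` flag, and the device name are assumptions modeled on the pattern used by other Qualcomm® AI Hub Models, and `build_export_command` is a hypothetical helper. Check the GitHub instructions for the exact command and options.

```python
import shlex


def build_export_command(model_id="melotts_es", device=None, extra_args=()):
    """Assemble (but do not run) an export command line.

    Assumptions, not confirmed by this README:
      - the export entry point is `qai_hub_models.models.<model_id>.export`
      - the target device is selected with a `--device` flag
    """
    cmd = ["python", "-m", f"qai_hub_models.models.{model_id}.export"]
    if device is not None:
        cmd += ["--device", device]
    # Any additional options are passed through unchanged.
    cmd += list(extra_args)
    return cmd


# Hypothetical device name, used only for illustration.
print(shlex.join(build_export_command(device="Samsung Galaxy S24")))
```

Building the command as a list keeps device names that contain spaces intact when the list is later handed to a process launcher.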

Model Details

Model Type: Audio generation

Model Stats:

  • Model checkpoint: myshell-ai/MeloTTS-Spanish
  • Max decoded sequence length: 512 tokens
  • Number of parameters (encoder): 8.36M
  • Model size (encoder) (float): 32.0 MB
  • Number of parameters (flow): 20.1M
  • Model size (flow) (float): 76.9 MB
  • Number of parameters (decoder): 14.5M
  • Model size (decoder) (float): 55.5 MB
  • Number of parameters (t5_encoder): 15.1M
  • Model size (t5_encoder) (float): 57.5 MB
  • Number of parameters (t5_decoder): 5.72M
  • Model size (t5_decoder) (float): 21.8 MB
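
The per-component figures above can be combined into totals, and the listed sizes can be sanity-checked against the parameter counts. The sketch below assumes that "float" means 32-bit weights (4 bytes per parameter) and that sizes are reported in units of 1024 × 1024 bytes; neither assumption is stated in this README, so treat the estimate as approximate.

```python
# Parameter counts (millions) and float model sizes (MB), copied from the stats above.
COMPONENTS = {
    "encoder": (8.36, 32.0),
    "flow": (20.1, 76.9),
    "decoder": (14.5, 55.5),
    "t5_encoder": (15.1, 57.5),
    "t5_decoder": (5.72, 21.8),
}

BYTES_PER_PARAM = 4          # assumption: float32 weights
BYTES_PER_MB = 1024 * 1024   # assumption: sizes are reported in MiB


def estimated_size_mb(params_millions):
    """Estimate the on-disk size of float32 weights from a parameter count."""
    return params_millions * 1e6 * BYTES_PER_PARAM / BYTES_PER_MB


total_params = sum(params for params, _ in COMPONENTS.values())
total_size = sum(size for _, size in COMPONENTS.values())

for name, (params, listed) in COMPONENTS.items():
    print(f"{name}: listed {listed} MB, estimated {estimated_size_mb(params):.1f} MB")
print(f"total: {total_params:.2f}M parameters, {total_size:.1f} MB")
```

Under these assumptions the estimates land within a fraction of a megabyte of the listed sizes, and the five components together come to roughly 63.78M parameters and 243.7 MB.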

Performance Summary

| Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit |
|---|---|---|---|---|---|---|
| decoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite Gen 5 Mobile | 43.603 | 0 - 10 | NPU |
| decoder | VOICE_AI | mixed_with_float | Snapdragon® X2 Elite | 40.761 | 0 - 0 | NPU |
| decoder | VOICE_AI | mixed_with_float | Snapdragon® X Elite | 82.709 | 0 - 0 | NPU |
| decoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Gen 3 Mobile | 61.145 | 0 - 8 | NPU |
| decoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8275 (Proxy) | 134.528 | 0 - 9 | NPU |
| decoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8550 (Proxy) | 85.581 | 0 - 2 | NPU |
| decoder | VOICE_AI | mixed_with_float | Qualcomm® SA8775P | 83.845 | 0 - 9 | NPU |
| decoder | VOICE_AI | mixed_with_float | Qualcomm® QCS9075 | 83.281 | 0 - 2 | NPU |
| decoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8450 (Proxy) | 113.955 | 1 - 9 | NPU |
| decoder | VOICE_AI | mixed_with_float | Qualcomm® SA7255P | 134.528 | 0 - 9 | NPU |
| decoder | VOICE_AI | mixed_with_float | Qualcomm® SA8295P | 100.024 | 0 - 6 | NPU |
| decoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite For Galaxy Mobile | 47.962 | 0 - 9 | NPU |
| encoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite Gen 5 Mobile | 19.517 | 4 - 13 | NPU |
| encoder | VOICE_AI | mixed_with_float | Snapdragon® X2 Elite | 20.57 | 4 - 4 | NPU |
| encoder | VOICE_AI | mixed_with_float | Snapdragon® X Elite | 32.975 | 4 - 4 | NPU |
| encoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Gen 3 Mobile | 25.204 | 4 - 11 | NPU |
| encoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8275 (Proxy) | 52.553 | 2 - 10 | NPU |
| encoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8550 (Proxy) | 34.626 | 4 - 5 | NPU |
| encoder | VOICE_AI | mixed_with_float | Qualcomm® SA8775P | 36.213 | 2 - 11 | NPU |
| encoder | VOICE_AI | mixed_with_float | Qualcomm® QCS9075 | 35.679 | 4 - 9 | NPU |
| encoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8450 (Proxy) | 44.062 | 4 - 13 | NPU |
| encoder | VOICE_AI | mixed_with_float | Qualcomm® SA7255P | 52.553 | 2 - 10 | NPU |
| encoder | VOICE_AI | mixed_with_float | Qualcomm® SA8295P | 40.778 | 0 - 5 | NPU |
| encoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite For Galaxy Mobile | 20.302 | 2 - 15 | NPU |
| flow | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite Gen 5 Mobile | 64.613 | 2 - 11 | NPU |
| flow | VOICE_AI | mixed_with_float | Snapdragon® X2 Elite | 61.925 | 2 - 2 | NPU |
| flow | VOICE_AI | mixed_with_float | Snapdragon® X Elite | 122.177 | 2 - 2 | NPU |
| flow | VOICE_AI | mixed_with_float | Snapdragon® 8 Gen 3 Mobile | 90.362 | 2 - 10 | NPU |
| flow | VOICE_AI | mixed_with_float | Qualcomm® QCS8275 (Proxy) | 235.037 | 2 - 11 | NPU |
| flow | VOICE_AI | mixed_with_float | Qualcomm® QCS8550 (Proxy) | 123.207 | 3 - 5 | NPU |
| flow | VOICE_AI | mixed_with_float | Qualcomm® SA8775P | 121.083 | 2 - 11 | NPU |
| flow | VOICE_AI | mixed_with_float | Qualcomm® QCS9075 | 120.456 | 4 - 8 | NPU |
| flow | VOICE_AI | mixed_with_float | Qualcomm® QCS8450 (Proxy) | 205.615 | 2 - 12 | NPU |
| flow | VOICE_AI | mixed_with_float | Qualcomm® SA7255P | 235.037 | 2 - 11 | NPU |
| flow | VOICE_AI | mixed_with_float | Qualcomm® SA8295P | 150.571 | 0 - 5 | NPU |
| flow | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite For Galaxy Mobile | 79.131 | 2 - 11 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite Gen 5 Mobile | 0.256 | 0 - 9 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Snapdragon® X2 Elite | 0.328 | 1 - 1 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Snapdragon® X Elite | 0.429 | 1 - 1 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Gen 3 Mobile | 0.305 | 0 - 8 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8275 (Proxy) | 0.983 | 0 - 9 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8550 (Proxy) | 0.401 | 1 - 2 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Qualcomm® SA8775P | 0.667 | 0 - 10 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Qualcomm® QCS9075 | 0.512 | 1 - 3 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8450 (Proxy) | 0.578 | 1 - 10 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Qualcomm® SA7255P | 0.983 | 0 - 9 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Qualcomm® SA8295P | 0.796 | 0 - 5 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite For Galaxy Mobile | 0.271 | 0 - 9 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite Gen 5 Mobile | 0.482 | 0 - 10 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Snapdragon® X2 Elite | 0.654 | 0 - 0 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Snapdragon® X Elite | 1.052 | 0 - 0 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Gen 3 Mobile | 0.636 | 0 - 7 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8275 (Proxy) | 2.839 | 0 - 9 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8550 (Proxy) | 0.877 | 0 - 1 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Qualcomm® SA8775P | 1.275 | 0 - 9 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Qualcomm® QCS9075 | 1.118 | 0 - 2 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8450 (Proxy) | 1.358 | 0 - 9 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Qualcomm® SA7255P | 2.839 | 0 - 9 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Qualcomm® SA8295P | 1.721 | 0 - 5 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite For Galaxy Mobile | 0.523 | 0 - 9 | NPU |
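
Because the model is split into five components, a rough per-chipset figure can be obtained by adding the component inference times. The sketch below does this for three chipsets from the performance table. It assumes each component runs once per synthesis pass, except the T5 decoder, whose call count is exposed as a hypothetical `t5_decoder_steps` parameter. It also ignores text preprocessing, data movement between components, and audio post-processing, so the result is a lower bound on compute time, not a measured end-to-end latency.

```python
# Inference times in ms, copied from the performance table for three chipsets.
LATENCY_MS = {
    "Snapdragon® 8 Elite Gen 5 Mobile": {
        "t5_encoder": 0.482, "t5_decoder": 0.256,
        "encoder": 19.517, "flow": 64.613, "decoder": 43.603,
    },
    "Snapdragon® X Elite": {
        "t5_encoder": 1.052, "t5_decoder": 0.429,
        "encoder": 32.975, "flow": 122.177, "decoder": 82.709,
    },
    "Qualcomm® QCS8275 (Proxy)": {
        "t5_encoder": 2.839, "t5_decoder": 0.983,
        "encoder": 52.553, "flow": 235.037, "decoder": 134.528,
    },
}


def pipeline_latency_ms(chipset, t5_decoder_steps=1):
    """Sum component times for one pass.

    Assumption: every component runs once, except t5_decoder, which runs
    `t5_decoder_steps` times (a hypothetical knob, not taken from this README).
    """
    t = LATENCY_MS[chipset]
    return (
        t["t5_encoder"]
        + t5_decoder_steps * t["t5_decoder"]
        + t["encoder"]
        + t["flow"]
        + t["decoder"]
    )


for chipset in LATENCY_MS:
    print(f"{chipset}: {pipeline_latency_ms(chipset):.3f} ms")
```

On every chipset the flow and decoder components dominate the sum, while the two T5 components contribute only a few milliseconds at most.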

License

  • The license for the original implementation of MeloTTS-ES can be found here.
