Fasee7 Najdi Small - فصيح نجدي
Model Summary
Fasee7 Najdi Small is the first public release of the Fasee7 model family, developed by Wittify.ai.
It is an open Arabic Text-to-Speech (TTS) model that generates natural-sounding Najdi (Saudi) dialect speech, trained on a high-quality in-house Najdi Arabic dataset.
Unlike many Arabic TTS systems that focus exclusively on Modern Standard Arabic (MSA), Fasee7 Najdi Small is designed and optimized for real conversational Najdi speech.
The model supports Arabic text with diacritics (تشكيل) to improve pronunciation accuracy and naturalness.
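Since diacritized input is meant to help pronunciation, it can be useful to check how much of a text actually carries diacritics before synthesis. The following is a minimal sketch, not part of the Fasee7 tooling: the function name `tashkeel_ratio` is hypothetical, and it counts only the eight common marks in the Unicode range U+064B to U+0652.

```python
# Common Arabic diacritics (tashkeel): fathatan through sukun, U+064B..U+0652.
TASHKEEL = {chr(cp) for cp in range(0x064B, 0x0653)}


def is_arabic_letter(ch):
    """True for basic Arabic letters, excluding tatweel (U+0640)."""
    return "\u0621" <= ch <= "\u063A" or "\u0641" <= ch <= "\u064A"


def tashkeel_ratio(text):
    """Number of diacritic marks divided by number of Arabic letters.

    Hypothetical helper for illustration: 0.0 means no diacritics at all,
    and values near 1.0 suggest the text is mostly diacritized.
    """
    letters = sum(1 for ch in text if is_arabic_letter(ch))
    marks = sum(1 for ch in text if ch in TASHKEEL)
    return marks / letters if letters else 0.0
```

A low ratio does not mean the model will fail; it only indicates that the text carries little explicit pronunciation guidance.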
Model Details
| Field | Value |
|---|---|
| Model Name | Fasee7 Najdi Small |
| Developer | Wittify.ai |
| Task | Text-to-Speech (TTS) |
| Language | Arabic Dialects |
| Dialect | Najdi Arabic (Saudi Arabia) |
| Architecture | Based on the Chatterbox Multilingual TTS architecture, implemented via VoxCPM2 + LoRA adapter |
| Base Model | openbmb/VoxCPM2 (2B parameters) |
| Fine-tuning | LoRA |
| Training Data | High-quality in-house Najdi Arabic dataset |
Audio Samples
Sample 1
Sample 2
Usage
Try the model in Google Colab, no local setup required:
The notebook downloads the base model and LoRA adapter, runs Najdi voice-cloning inference, and lets you listen to and download the generated audio.
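For readers who want to reproduce the download step outside Colab, the sketch below shows one possible way to fetch both repositories with `huggingface_hub.snapshot_download`. It is not the notebook's actual code. The local folder layout (`weights/base`, `weights/adapter`) and the helper names are assumptions, as is the idea that the LoRA adapter files live in the `Wittify/Fasee7-Najdi-Small` repository.

```python
from pathlib import Path

BASE_REPO = "openbmb/VoxCPM2"                # base model named in this card
ADAPTER_REPO = "Wittify/Fasee7-Najdi-Small"  # assumed location of the LoRA adapter


def target_dirs(root="weights"):
    """Map each repository to a local folder (hypothetical layout)."""
    root = Path(root)
    return {BASE_REPO: root / "base", ADAPTER_REPO: root / "adapter"}


def download_weights(root="weights"):
    """Download both repositories from the Hugging Face Hub.

    Requires `pip install huggingface_hub` and network access.
    Returns a dict mapping repository id to the local download path.
    """
    from huggingface_hub import snapshot_download

    return {
        repo: snapshot_download(repo_id=repo, local_dir=str(path))
        for repo, path in target_dirs(root).items()
    }
```

Calling `download_weights()` only fetches the files; loading the adapter onto the base model and running inference depend on the VoxCPM2 tooling and are not shown here.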
Known Limitations
Repetition
In some cases the model may repeat words or phrases. To reduce this:
- Increase `--inference_timesteps` (e.g. 20–30) for more stable output
- Adjust `--cfg_value` (recommended `2.0`; lower values allow more variation, higher values follow conditioning more strictly)
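One way to apply this advice is to retry a repetitive utterance with progressively more inference timesteps while keeping the guidance value at the recommended setting. The sketch below only builds the flag lists. The helper name `retry_flag_sets` and the step size of 5 are assumptions, and the inference script that accepts these flags is not named here because this card does not specify it.

```python
def retry_flag_sets(start_steps=20, max_steps=30, step=5, cfg_value=2.0):
    """Return CLI flag lists with increasing --inference_timesteps.

    The 20-30 range and cfg_value of 2.0 follow the recommendations above;
    the step size is an arbitrary choice for illustration.
    """
    return [
        ["--inference_timesteps", str(steps), "--cfg_value", str(cfg_value)]
        for steps in range(start_steps, max_steps + 1, step)
    ]


# Print one flag set per retry attempt, from fastest to most stable.
for flags in retry_flag_sets():
    print(" ".join(flags))
```

Each list can be appended to whatever inference command is in use. More timesteps generally mean slower synthesis, so trying the lower settings first keeps the common case fast.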
Dialect Coverage
This release is fine-tuned for Najdi Arabic only. Other dialects or heavily MSA-style text may sound unnatural or inconsistent.
Voice Cloning Quality
Output quality depends on the reference audio. For best results:
- Use a clean Najdi reference clip (3–10 seconds)
- Provide an exact transcript of the reference audio
- Avoid noisy, clipped, or heavily compressed files
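Some of these checks can be automated before a clip is used as a reference. The following is a rough sketch using only the Python standard library. It assumes a 16-bit PCM WAV file, the function name `check_reference_clip` and the clipping threshold are hypothetical, and it does not detect background noise, lossy compression, or transcript mismatches.

```python
import array
import sys
import wave

MIN_SECONDS, MAX_SECONDS = 3.0, 10.0  # duration range recommended above


def check_reference_clip(path, clip_ratio_limit=0.001):
    """Return a list of warnings for a 16-bit PCM WAV reference clip.

    An empty list means the clip passed the duration and clipping checks.
    clip_ratio_limit is an illustrative threshold, not a tuned value.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            return ["expected 16-bit PCM WAV"]
        n_frames = wav.getnframes()
        rate = wav.getframerate()
        frames = wav.readframes(n_frames)

    warnings = []
    duration = n_frames / rate
    if not MIN_SECONDS <= duration <= MAX_SECONDS:
        warnings.append(
            f"duration {duration:.1f}s is outside the "
            f"{MIN_SECONDS:g}-{MAX_SECONDS:g}s range"
        )

    # WAV samples are little-endian signed 16-bit, channels interleaved.
    samples = array.array("h", frames)
    if sys.byteorder == "big":
        samples.byteswap()
    if samples:
        full_scale = sum(1 for s in samples if s >= 32767 or s <= -32768)
        if full_scale / len(samples) > clip_ratio_limit:
            warnings.append("audio looks clipped (many full-scale samples)")
    return warnings
```

Listening to the clip remains the most reliable check; this helper only catches the mechanical problems.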
Mixed Text
Arabic/English mixed text, numbers, abbreviations, and unusual spellings may produce inconsistent pronunciation.
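A simple pre-check can flag the parts of an input most likely to cause this, so they can be spelled out in Arabic by hand before synthesis. The sketch below is a minimal illustration, not part of the model's pipeline. The name `risky_spans` is hypothetical, and it only looks for runs of Latin letters and of Western or Arabic-Indic digits; it does not recognize abbreviations or unusual spellings written in Arabic script.

```python
import re

# Runs of Latin letters, or runs of Western (0-9) / Arabic-Indic digits.
RISKY = re.compile(r"(?P<latin>[A-Za-z]+)|(?P<digits>[0-9\u0660-\u0669]+)")


def risky_spans(text):
    """Return (kind, substring) pairs in order of appearance.

    kind is "latin" or "digits"; an empty list means nothing was flagged.
    """
    return [(m.lastgroup, m.group()) for m in RISKY.finditer(text)]
```

For example, an input containing a time such as `5 PM` would yield one `digits` span and one `latin` span, both of which could be rewritten as Arabic words.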
Catastrophic Forgetting
The base VoxCPM2 model was pre-trained on multilingual data. This fine-tune was trained exclusively on Arabic dialect data, with no multilingual data included in the training mix. As a result, the model may suffer from catastrophic forgetting: its ability to synthesize speech in languages other than Arabic has likely degraded significantly compared to the base model.
Intended Use
Suitable for:
- Najdi Arabic text-to-speech synthesis
- Voice cloning with a Najdi reference speaker
- Research and prototyping for Saudi Arabic voice applications
Not suitable for:
- Impersonation, fraud, or any use without voice-owner consent
- Safety-critical applications without human review
License
Released under the Apache 2.0 license, consistent with the base VoxCPM2 model.
Acknowledgements