Fasee7 Najdi Small - فصيح نجدي

Model Summary

Fasee7 Najdi Small is the first public release of the Fasee7 model family, developed by Wittify.ai.

It is an open Arabic Text-to-Speech (TTS) model that generates natural-sounding Najdi (Saudi) dialect speech, trained on a high-quality in-house Najdi Arabic dataset.

Unlike many Arabic TTS systems that focus exclusively on Modern Standard Arabic (MSA), Fasee7 Najdi Small is designed and optimized for real conversational Najdi speech.

The model supports Arabic text with diacritics (تشكيل, tashkeel) to improve pronunciation accuracy and naturalness.
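Because diacritized input affects pronunciation, it can help to check how much tashkeel a sentence carries before synthesis. The snippet below is a minimal sketch, not part of the Fasee7 tooling: the helper name `tashkeel_ratio` is hypothetical, and treating U+064B to U+0652 plus U+0670 as "diacritics" is an assumption made for illustration.

```python
# Arabic short-vowel and related marks: U+064B..U+0652, plus U+0670
# (superscript alef). This exact set is an assumption of the sketch.
TASHKEEL = {chr(cp) for cp in range(0x064B, 0x0653)} | {"\u0670"}


def tashkeel_ratio(text: str) -> float:
    """Return diacritic marks per Arabic letter (0.0 if there are no letters)."""
    # Basic Arabic letters live in U+0621..U+064A; U+0640 (tatweel) is a
    # stretching character, not a letter, so it is skipped.
    letters = sum(
        1 for ch in text if "\u0621" <= ch <= "\u064A" and ch != "\u0640"
    )
    marks = sum(1 for ch in text if ch in TASHKEEL)
    return marks / letters if letters else 0.0
```

A ratio near zero suggests the text is undiacritized, which may be worth flagging when pronunciation accuracy matters.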


Model Details

Model Name: Fasee7 Najdi Small
Developer: Wittify.ai
Task: Text-to-Speech (TTS)
Language: Arabic Dialects
Dialect: Najdi Arabic (Saudi Arabia)
Architecture: Based on the Chatterbox Multilingual TTS architecture, implemented via VoxCPM2 + LoRA adapter
Base Model: openbmb/VoxCPM2 (2B parameters)
Fine-tuning: LoRA
Training Data: High-quality in-house Najdi Arabic dataset

Audio Samples

Sample 1

Sample 2


Usage

Try the model in Google Colab, with no local setup required:

Open In Colab

The notebook downloads the base model and LoRA adapter, runs Najdi voice-cloning inference, and lets you listen to and download the generated audio.
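For readers who prefer a local run, the download step can be sketched as follows. This is an illustration, not the notebook's actual code: it assumes the `huggingface_hub` package and its `snapshot_download` function, and it assumes the LoRA adapter is hosted in the `Wittify/Fasee7-Najdi-Small` repository. The helper name `fetch_weights` is hypothetical. Loading the adapter and running inference are left out because this card does not document that API.

```python
from typing import Callable, Dict, Optional

BASE_REPO = "openbmb/VoxCPM2"                # base model named in this card
ADAPTER_REPO = "Wittify/Fasee7-Najdi-Small"  # assumed location of the LoRA adapter


def fetch_weights(download: Optional[Callable[..., str]] = None) -> Dict[str, str]:
    """Download both repositories and return their local directories."""
    if download is None:
        # Imported lazily so this module loads without huggingface_hub installed.
        from huggingface_hub import snapshot_download
        download = snapshot_download
    return {
        "base": download(repo_id=BASE_REPO),
        "adapter": download(repo_id=ADAPTER_REPO),
    }
```

Calling `fetch_weights()` with no arguments performs the real downloads; passing a custom `download` callable makes the function easy to exercise offline.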


Known Limitations

Repetition

In some cases the model may repeat words or phrases. To reduce this:

  • Increase --inference_timesteps (e.g. 20–30) for more stable output
  • Adjust --cfg_value (recommended 2.0; lower values allow more variation, higher values follow conditioning more strictly)
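One way to apply the two settings above is to retry a repetitive sentence with progressively more timesteps. The sketch below only builds the argument fragments; the inference script that accepts these flags is not named in this card, so the helper names `sampling_args` and `retry_schedule`, and the step size of 5, are assumptions for illustration.

```python
from typing import List


def sampling_args(inference_timesteps: int = 20, cfg_value: float = 2.0) -> List[str]:
    """Build the two sampling flags as an argv fragment."""
    if inference_timesteps < 1:
        raise ValueError("inference_timesteps must be a positive integer")
    return [
        "--inference_timesteps", str(inference_timesteps),
        "--cfg_value", str(cfg_value),
    ]


def retry_schedule(start: int = 20, stop: int = 30, step: int = 5) -> List[List[str]]:
    """Return one argv fragment per retry, stepping timesteps from start to stop."""
    return [sampling_args(t) for t in range(start, stop + 1, step)]
```

Each fragment can be appended to whatever command line runs inference in your setup, moving to the next entry only if the previous output still repeats.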

Dialect Coverage

This release is fine-tuned for Najdi Arabic only. Other dialects or heavily MSA-style text may sound unnatural or inconsistent.

Voice Cloning Quality

Output quality depends on the reference audio. For best results:

  • Use a clean Najdi reference clip (3–10 seconds)
  • Provide an exact transcript of the reference audio
  • Avoid noisy, clipped, or heavily compressed files
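Some of these checks can be automated before a clip is used as a reference. The following is a rough sketch using only the Python standard library. It assumes a 16-bit PCM WAV file, and the function name `check_reference_clip` and the 0.1% clipping threshold are illustrative choices rather than values from the Fasee7 authors. It does not detect background noise, compression artifacts, or a mismatched transcript.

```python
import array
import sys
import wave
from typing import List


def check_reference_clip(path: str, min_s: float = 3.0, max_s: float = 10.0,
                         clip_fraction: float = 0.001) -> List[str]:
    """Return a list of warnings for a 16-bit PCM WAV reference clip."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            return ["expected 16-bit PCM audio"]
        n_frames = wav.getnframes()
        duration = n_frames / wav.getframerate()
        samples = array.array("h", wav.readframes(n_frames))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    warnings = []
    if not min_s <= duration <= max_s:
        warnings.append(
            f"duration {duration:.1f}s is outside {min_s:.0f}-{max_s:.0f}s"
        )
    # Samples pinned at full scale are a common sign of clipping.
    clipped = sum(1 for s in samples if s >= 32767 or s <= -32768)
    if samples and clipped / len(samples) > clip_fraction:
        warnings.append("audio appears clipped")
    return warnings
```

An empty list means the clip passed these two checks only; listening to the reference is still the most reliable test.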

Mixed Text

Arabic/English mixed text, numbers, abbreviations, and unusual spellings may produce inconsistent pronunciation.
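A simple pre-check can surface the spans most likely to be mispronounced so they can be spelled out in Arabic first. This is a minimal sketch with a hypothetical helper name, `flag_risky_spans`. It only finds Latin-letter runs and digit runs (ASCII and Arabic-Indic), and it does not attempt to detect abbreviations or unusual spellings.

```python
import re
from typing import Dict, List

LATIN = re.compile(r"[A-Za-z]+")
# ASCII digits plus Arabic-Indic digits (U+0660..U+0669).
DIGITS = re.compile(r"[0-9\u0660-\u0669]+")


def flag_risky_spans(text: str) -> Dict[str, List[str]]:
    """Return the Latin-letter runs and digit runs found in the input text."""
    return {
        "latin": LATIN.findall(text),
        "digits": DIGITS.findall(text),
    }
```

If either list is non-empty, rewriting those spans as Arabic words before synthesis may give more consistent results.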

Catastrophic Forgetting

The base VoxCPM2 model was pre-trained on multilingual data. This fine-tune was trained exclusively on Arabic dialect data, with no multilingual data included in the training mix. As a result, the model may suffer from catastrophic forgetting: its ability to synthesize speech in languages other than Arabic has likely degraded significantly compared to the base model.


Intended Use

Suitable for:

  • Najdi Arabic text-to-speech synthesis
  • Voice cloning with a Najdi reference speaker
  • Research and prototyping for Saudi Arabic voice applications

Not suitable for:

  • Impersonation, fraud, or any use without voice-owner consent
  • Safety-critical applications without human review

License

Released under the Apache 2.0 license, consistent with the base VoxCPM2 model.


Acknowledgements

Built on VoxCPM2 by OpenBMB.


Model tree for Wittify/Fasee7-Najdi-Small: fine-tuned from the base model openbmb/VoxCPM2.