
Akjava

add images

0aa6a4a 4 days ago


metadata

title: AIGamingVoice Japanese
emoji: 🐠
colorFrom: gray
colorTo: purple
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
pinned: false
license: mit
short_description: TTS voice for AI (Currently Matcha-TTS)

AIGamingVoice - Japanese

High-quality, lightweight Japanese text-to-speech tuned specifically for AI gaming characters. It runs on ONNX Runtime for fast inference.

🌟 Features

- ⚡ Fast & Lightweight: Pure ONNX Runtime implementation.
- 🖼️ Visual Speaker Selection: Select speakers (characters) intuitively from an image gallery.
- 🇯🇵 Japanese Optimization: Uses pyopenjtalk for accurate Japanese phoneme generation.

🛠️ Installation & Local Usage

Clone the repository
```
git clone https://huggingface.co/spaces/YOUR_USERNAME/AIGamingVoice-Japanese
cd AIGamingVoice-Japanese
```
Install dependencies
```
pip install -r requirements.txt
```
Note: cmake must be installed to build pyopenjtalk.

Prepare models: Place your .onnx model files in the models/ directory.

Prepare speaker images (optional): Place images (0.jpg, 1.jpg, ...) in the imgs/ directory to enable the visual speaker selector.

Run the application
```
python app.py
```
Then open http://localhost:7860 in your browser.
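
Before launching the app, it can help to confirm that the models/ and imgs/ directories are laid out as described above. The following is a minimal sketch and is not part of this repository: the helper names `find_models` and `find_speaker_images` are hypothetical, and the sketch assumes that the numeric file name of each image (0.jpg, 1.jpg, ...) corresponds to a Speaker ID.

```python
from pathlib import Path


def find_models(models_dir="models"):
    """Return the .onnx files found directly in models_dir, sorted by name."""
    return sorted(Path(models_dir).glob("*.onnx"))


def find_speaker_images(imgs_dir="imgs"):
    """Map an assumed speaker ID to its image path for files named 0.jpg, 1.jpg, ..."""
    images = {}
    for path in Path(imgs_dir).glob("*.jpg"):
        stem = path.stem
        # Keep only purely numeric names such as "0" or "12"; skip e.g. "cover.jpg".
        if stem.isascii() and stem.isdigit():
            images[int(stem)] = path
    return dict(sorted(images.items()))


if __name__ == "__main__":
    models = find_models()
    images = find_speaker_images()
    print(f"{len(models)} model file(s): {[p.name for p in models]}")
    if images:
        print(f"{len(images)} speaker image(s) for IDs {list(images)}")
    else:
        print("No numbered speaker images found in imgs/.")
```

Run from the repository root, this prints what the app would be able to find. A missing directory is simply reported as empty rather than raising an error.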

🎮 How to Use

- Select Model: Choose a voice model from the dropdown.
- Select Speaker: Click a character image or enter the Speaker ID.
- Input Text: Enter the Japanese text to synthesize.
- Adjust Settings: Tweak Temperature (randomness) and Speaking Rate (speed).
- Synthesize: Click the button to generate audio.
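
The inputs collected in the steps above (model, speaker, text, temperature, speaking rate) can be thought of as a single request. The sketch below is illustrative only and is not taken from app.py: the class name `SynthesisRequest`, the default values, and the validation rules are all assumptions chosen for the example, and the app's actual defaults and limits may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthesisRequest:
    """Hypothetical container for the settings chosen in the UI."""

    model_name: str               # file name of the selected .onnx model
    speaker_id: int               # ID picked via the image gallery or typed in
    text: str                     # Japanese text to synthesize
    temperature: float = 0.667    # placeholder default (randomness)
    speaking_rate: float = 1.0    # placeholder default (speed)

    def __post_init__(self):
        # Assumed sanity checks; the real app may accept other ranges.
        if not self.text.strip():
            raise ValueError("text must not be empty")
        if self.speaker_id < 0:
            raise ValueError("speaker_id must be non-negative")
        if self.temperature < 0:
            raise ValueError("temperature must be non-negative")
        if self.speaking_rate <= 0:
            raise ValueError("speaking_rate must be positive")


if __name__ == "__main__":
    request = SynthesisRequest(
        model_name="example.onnx",  # hypothetical file name
        speaker_id=0,
        text="こんにちは",
    )
    print(request)
```

Validating the settings in one place keeps bad input (empty text, a negative speaker ID) from reaching the inference step.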

🤝 Credits

Matcha-TTS: Architecture based on Matcha-TTS.
ONNX Runtime: Inference engine.
pyopenjtalk: Japanese text processing frontend.

Created for the AI Gaming Voice Project