File size: 6,093 Bytes
306e52a |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 |
# π ebook2audiobook
Convert eBooks to audiobooks with chapters and metadata using Calibre and Coqui XTTS. Supports optional voice cloning and multiple languages!
## π Features
- π Converts eBooks to text format with Calibre.
- π Splits eBook into chapters for organized audio.
- ποΈ High-quality text-to-speech with Coqui XTTS.
- π£οΈ Optional voice cloning with your own voice file.
- π Supports multiple languages (English by default).
- π₯οΈ Designed to run on 4GB RAM.
## π οΈ Requirements
- Python 3.x
- `coqui-tts` Python package
- Calibre (for eBook conversion)
- FFmpeg (for audiobook creation)
- Optional: Custom voice file for voice cloning
### π§ Installation Instructions
1. **Install Python 3.x** from [Python.org](https://www.python.org/downloads/).
2. **Install Calibre**:
- **Ubuntu**: `sudo apt-get install -y calibre`
- **macOS**: `brew install calibre`
- **Windows** (Admin Powershell): `choco install calibre`
3. **Install FFmpeg**:
- **Ubuntu**: `sudo apt-get install -y ffmpeg`
- **macOS**: `brew install ffmpeg`
- **Windows** (Admin Powershell): `choco install ffmpeg`
4. **Optional: Install Mecab** (for non-Latin languages):
- **Ubuntu**: `sudo apt-get install -y mecab libmecab-dev mecab-ipadic-utf8`
- **macOS**: `brew install mecab`, `brew install mecab-ipadic`
- **Windows**: [mecab-website-to-install-manually](https://taku910.github.io/mecab/#download) (Note: Japanese support is limited)
5. **Install Python packages**:
```bash
pip install tts==0.21.3 pydub nltk beautifulsoup4 ebooklib tqdm
python -m nltk.downloader punkt
```
**For non-Latin languages**:
```bash
pip install mecab mecab-python3 unidic
python -m unidic download
```
## π Supported Languages
- **English (en)**
- **Spanish (es)**
- **French (fr)**
- **German (de)**
- **Italian (it)**
- **Portuguese (pt)**
- **Polish (pl)**
- **Turkish (tr)**
- **Russian (ru)**
- **Dutch (nl)**
- **Czech (cs)**
- **Arabic (ar)**
- **Chinese (zh-cn)**
- **Japanese (ja)**
- **Hungarian (hu)**
- **Korean (ko)**
Specify the language code when running the script.
## π Usage
### π₯οΈ Gradio Web Interface
1. **Run the Script**:
```bash
python custom_model_ebook2audiobookXTTS_gradio.py
```
2. **Open the Web App**: Click the URL provided in the terminal to access the web app and convert eBooks.
### π Basic Usage
```bash
python ebook2audiobook.py <path_to_ebook_file> [path_to_voice_file] [language_code]
```
- **<path_to_ebook_file>**: Path to your eBook file.
- **[path_to_voice_file]**: Optional for voice cloning.
- **[language_code]**: Optional to specify language.
### 𧩠Custom XTTS Model
```bash
python custom_model_ebook2audiobookXTTS.py <ebook_file_path> <target_voice_file_path> <language> <custom_model_path> <custom_config_path> <custom_vocab_path>
```
- **<ebook_file_path>**: Path to your eBook file.
- **<target_voice_file_path>**: Optional for voice cloning.
- **<language>**: Optional to specify language.
- **<custom_model_path>**: Path to `model.pth`.
- **<custom_config_path>**: Path to `config.json`.
- **<custom_vocab_path>**: Path to `vocab.json`.
### π³ Using Docker
You can also use Docker to run the eBook to Audiobook converter. This method ensures consistency across different environments and simplifies setup.
#### π Running the Docker Container
To run the Docker container and start the Gradio interface, use the following command:
-Run with CPU only
```powershell
docker run -it --rm -p 7860:7860 --platform=linux/amd64 athomasson2/ebook2audiobookxtts:huggingface python app.py
```
-Run with GPU Speedup (Nvida graphics cards only)
```powershell
docker run -it --rm --gpus all -p 7860:7860 --platform=linux/amd64 athomasson2/ebook2audiobookxtts:huggingface python app.py
```
This command will start the Gradio interface on port 7860.(localhost:7860)
#### π₯οΈ Docker GUI
<img width="1401" alt="Screenshot 2024-08-25 at 10 08 40β―AM" src="https://github.com/user-attachments/assets/78cfd33e-cd46-41cc-8128-3820160a5e40">
<img width="1406" alt="Screenshot 2024-08-25 at 10 08 51β―AM" src="https://github.com/user-attachments/assets/dbfad9f6-e6e5-4cad-b248-adb76c5434f3">
### π οΈ For Custom Xtts Models
Models built to be better at a specific voice. Check out my Hugging Face page [here](https://huggingface.co/drewThomasson).
To use a custom model, paste the link of the `Finished_model_files.zip` file like this:
[David Attenborough fine tuned Finished_model_files.zip](https://huggingface.co/drewThomasson/xtts_David_Attenborough_fine_tune/resolve/main/Finished_model_files.zip?download=true)
More details can be found at the [Dockerfile Hub Page]([https://github.com/DrewThomasson/ebook2audiobookXTTS](https://hub.docker.com/repository/docker/athomasson2/ebook2audiobookxtts/general)).
## π Fine Tuned Xtts models
To find already fine-tuned XTTS models, visit [this Hugging Face link](https://huggingface.co/drewThomasson) π. Search for models that include "xtts fine tune" in their names.
## π₯ Demos
https://github.com/user-attachments/assets/8486603c-38b1-43ce-9639-73757dfb1031
## π€ [Huggingface space demo](https://huggingface.co/spaces/drewThomasson/ebook2audiobookXTTS)
- Huggingface space is running on free cpu tier so expect very slow or timeout lol, just don't give it giant files is all
- Best to duplicate space or run locally.
## π Supported eBook Formats
- `.epub`, `.pdf`, `.mobi`, `.txt`, `.html`, `.rtf`, `.chm`, `.lit`, `.pdb`, `.fb2`, `.odt`, `.cbr`, `.cbz`, `.prc`, `.lrf`, `.pml`, `.snb`, `.cbc`, `.rb`, `.tcr`
- **Best results**: `.epub` or `.mobi` for automatic chapter detection
## π Output
- Creates an `.m4b` file with metadata and chapters.
- **Example Output**: ![Example](https://github.com/DrewThomasson/VoxNovel/blob/dc5197dff97252fa44c391dc0596902d71278a88/readme_files/example_in_app.jpeg)
## π Special Thanks
- **Coqui TTS**: [Coqui TTS GitHub](https://github.com/coqui-ai/TTS)
- **Calibre**: [Calibre Website](https://calibre-ebook.com)
|