Update README.md

by tiansz - opened Apr 10, 2023

base: refs/heads/main

←

from: refs/pr/1


tiansz

Apr 10, 2023

The original code has a small problem, so I made a slight change. A notebook that runs the updated example is here: https://www.kaggle.com/code/tiansztianszs/facebook-tts-transformer-zh-cv7-css10/notebook

Update README.md (commit 588532ec)

Viking714

May 31, 2023

Hi, I'm very glad to see your post. It works, thank you very much. I have a new question and hope you can help: how can I save the wav and rate to a WAV or MP3 file? I found the call "torchaudio.save(filepath="test.wav", src=wav, sample_rate=rate)", but it raises "RuntimeError: Input tensor has to be 2D." I am not very familiar with this. Could you help me? Thank you, and have a great day.

tiansz

May 31, 2023

Hello, please add the following code after the example I gave:

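import torchaudio

# Convert the 1-D tensor to a 2-D tensor of shape (1, num_samples),
# since torchaudio.save expects (channels, frames)
wav = wav.unsqueeze(0)

# Save as a WAV file
torchaudio.save('audio_save.wav', wav, rate)

# Save as an MP3 file
torchaudio.save('audio_save.mp3', wav, rate, format='mp3')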
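If torchaudio's save backend is not available, a mono waveform can also be written with Python's standard library alone. The following is a minimal sketch, not the method from this thread: `save_wav` is a hypothetical helper, and it assumes the samples are floats in [-1.0, 1.0] passed as a plain sequence (for example `wav.tolist()` taken from the 1-D tensor before the `unsqueeze` call).

```python
import struct
import wave


def save_wav(path, samples, sample_rate):
    """Write mono float samples as a 16-bit PCM WAV file (hypothetical helper)."""
    frames = bytearray()
    for s in samples:
        # Clip to the assumed [-1.0, 1.0] range before scaling to int16.
        s = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)               # mono
        f.setsampwidth(2)               # 2 bytes per sample = 16-bit
        f.setframerate(int(sample_rate))
        f.writeframes(bytes(frames))
```

Usage would look like `save_wav("audio_save.wav", wav.tolist(), rate)`. This only covers WAV; MP3 output still needs an encoder such as the one torchaudio relies on.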

Viking714

Jun 2, 2023

Hi, thank you very much. By the way, I recently ran into a new problem, and I don't know whether anyone else has seen it. If my input text is a little long, say several tens of Chinese characters, the output speech is wrong: toward the end it repeats some parts and skips others.
Thank you once more. Have a great day.
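One workaround sometimes tried for long inputs is to split the text into short chunks at punctuation, synthesize each chunk separately, and join the resulting waveforms. The sketch below covers only the splitting step. `split_text`, the punctuation set, and the `max_chars` default are illustrative assumptions, and nothing in this thread confirms that chunking fixes the repetition for this model.

```python
import re

# Clause-ending punctuation, Chinese and ASCII (an illustrative choice).
_CLAUSE = re.compile(r"[^。！？；，,.!?;]+[。！？；，,.!?;]*")


def split_text(text, max_chars=20):
    """Split text into chunks of at most max_chars, breaking at punctuation.

    Hypothetical helper. Punctuation at the very start of the text is dropped.
    """
    chunks, current = [], ""
    for clause in _CLAUSE.findall(text):
        clause = clause.strip()
        if not clause:
            continue
        # Start a new chunk if adding this clause would exceed the limit.
        if current and len(current) + len(clause) > max_chars:
            chunks.append(current)
            current = ""
        current += clause
        # Hard-split a clause that is longer than max_chars on its own.
        while len(current) > max_chars:
            chunks.append(current[:max_chars])
            current = current[max_chars:]
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then go through the same synthesis call as in the README example, with the per-chunk waveforms concatenated before saving.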

tiansz

Jun 2, 2023

Yes, the model works poorly. My recommendation is to use the following models instead (ranked by performance):


Ready to merge

This branch is ready to get merged automatically.
