Spaces: mrfakename/E2-F5-TTS

Why did it become gibberish?

#48

by PiyushDeshmukh - opened Jan 15

Discussion

PiyushDeshmukh

Jan 15

Why is the synthesized voice gibberish after 10-15 tries? It sounds like it is combining the prompts I gave it previously.

mrfakename

Owner Jan 15

Hi, sorry about that! Would you mind attaching the reference audio you used, the text, and the generated audio?
Thanks!

PiyushDeshmukh changed discussion status to closed Jan 17

PiyushDeshmukh changed discussion status to open Jan 17

PiyushDeshmukh

Jan 17

Sorry for the late reply. The good news is that it's fixed now somehow (maybe it was just a glitch). I can provide the reference audio and the text, but I don't have the generated audio, since I had no reason to download the glitched output 🤷‍♀️.

(There are moments in history when one person’s decision changes everything... When loyalty is tested... And the truth, is exposed... This, is the story of how a CIA agent betrayed the government... and revealed the deepest secrets of the U.S. surveillance state...)

That said, I just want to say how well this thing works. Thank you for this service.

bigbrotherr

Jan 17

When I started using this TTS, it was wonderful but now it's no longer working at all. It's gibberish and garbage (13 tries). What happened?

mrfakename

Owner Jan 17

@bigbrotherr would you mind attaching audio samples? Thanks!

gmirsky2

4 days ago

Hey @MrFakeName, I'm experiencing the same type of issue.

I’m using the same audio and text that worked before, but after not using it since 1/10/25, the output is now just gibberish.

Example:

Input:

Output:
"Stocks slip as selling pressure continues on Wall Street"

System Info:

Torch: 2.6.0+cu124, CUDA Available: True  
torch.version.cuda: 12.4  
TorchAudio: 2.6.0+cu124  
CUDA Path: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\bin\nvcc.exe  
CUDA Compilation Tools: release 12.4, V12.4.131  
Driver Version: 566.36, CUDA Runtime Version: 12.7  
GPU: NVIDIA GeForce GTX 1660 Ti (6GB VRAM)  
GPU Utilization: 8%
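
For anyone comparing setups in this thread, version details like the ones above can be collected without importing the libraries themselves. The following is a minimal sketch using only the Python standard library; the helper name `installed_versions` and the package names in the example are illustrative assumptions, not something F5-TTS provides.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names are assumptions; adjust them to what is in your environment.
    report = installed_versions(["torch", "torchaudio", "f5-tts"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

Running it in both a working and a non-working environment gives two small reports that can be compared line by line.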

I’ve tested both methods for running the TTS generation and still end up with gibberish output. Here's how I'm doing it:

1. CLI with TOML Configuration File:

f5-tts_infer-cli -c F5.toml

F5.toml configuration

# F5-TTS | E2-TTS Configuration File

model = "F5-TTS"
remove_silence = false
output_dir = "output"
output_file = "output.wav"
gen_file = "output\\output.txt"

[voices.Kevin]
ref_audio = "wavs\\15-Kevin.wav"
ref_text = "Canadians are in the British parliamentary system, so I have a very short fuse here. The Liberals, of which Trudeau was the leader, had a majority mandate a decade ago. They're about to get wiped out, and they're going to have sixty days to find a new leader."

2. Using a Python Script to Generate the Audio:

import os
import subprocess


def generate_voiceover(text, output_dir="voiceover_output"):
    """Run f5-tts_infer-cli on `text` and return the expected output path."""
    os.makedirs(output_dir, exist_ok=True)

    # Locate the reference audio and its transcript relative to this script
    script_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    ref_audio_path = os.path.join(script_dir, "src", "ref.wav")
    ref_text_path = os.path.join(script_dir, "src", "ref.txt")
    if not os.path.exists(ref_audio_path):
        raise FileNotFoundError(f"Reference audio file not found: {ref_audio_path}")

    # -s (--ref_text) takes the transcript itself rather than a file path,
    # so read ref.txt and pass its contents
    with open(ref_text_path, encoding="utf-8") as f:
        ref_text = f.read().strip()

    output_file = "output.wav"
    output_path = os.path.join(output_dir, output_file)

    cmd = [
        'f5-tts_infer-cli',
        '-m', 'F5-TTS',
        '-r', ref_audio_path,
        '-s', ref_text,
        '-t', text,
        '-o', output_dir,
        '-w', output_file,
        '--vocoder_name', 'bigvgan'
    ]

    print(f"Running: {subprocess.list2cmdline(cmd)}")

    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.stdout:
            print(result.stdout)
        if result.stderr:
            # stderr can also carry warnings and progress output, not only errors
            print(f"stderr: {result.stderr}")
        if result.returncode != 0:
            print(f"Command failed with exit code {result.returncode}")
    except Exception as e:
        print(f"Exception during TTS generation: {e}")

    # Verify audio was created (this checks existence only, not audio quality)
    if not os.path.exists(output_path):
        print("Warning: Output audio file not created!")
    else:
        print(f"Audio generated successfully: {output_path}")

    return output_path
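
The script above only checks that the output file exists. As a hedged extension, the sketch below reads the WAV header to confirm the file is a readable audio file with a plausible length. It assumes the output is an uncompressed PCM WAV, which is what Python's standard `wave` module can read; `describe_wav` is a hypothetical helper name, not part of F5-TTS.

```python
import wave


def describe_wav(path):
    """Return basic properties of a PCM WAV file, read from its header."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_rate": rate,
            # Duration in seconds; guard against a zero sample rate
            "duration_s": frames / rate if rate else 0.0,
        }
```

A duration of zero, or one far from what the text length suggests, points to a failed run. This check cannot tell whether the speech is intelligible, so listening to the file is still needed to spot gibberish.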

What I've Tried:

Generating the reference text automatically
Providing the reference text manually
Using three different reference audios
Reinstalling F5-TTS from scratch in a new folder
Switching between Vocos and BigVGAN as the vocoder

No matter what I do, the output is just gibberish—it either sounds like a different language or maybe multiple prompts are being combined.

Did something change? Is there something I should check?

mrfakename

Owner 4 days ago

Hi, just to confirm: are you running the model locally (not on the demo)? If so, did you redownload the F5 repo on 1/10?

gmirsky2

4 days ago

•

edited 4 days ago

Hi, thanks for following up! I just wanted to apologize for posting here instead of on GitHub—I find GitHub a bit overwhelming, and I’m still trying to fully understand how to navigate it. I found this post about my issue, so I posted here.

To clarify, I am running the model locally (not on the demo). I originally downloaded the F5 repository in November/December 2024. Windows shows the folder's 'last used' date as Jan 10, 2025, but I did not update or redownload the model until March 10, 2025. On March 10, I created a new folder and attempted to get it working again, but no matter what I do, even when reverting to what I know was working, it outputs gibberish.

I also tried using the demo, and it works perfectly with my input WAV file. So it seems the issue is related to my local setup rather than my input files.

mrfakename

Owner 3 days ago

Hi,
No worries about reporting the bug here - just a note: I'm not the author of F5-TTS itself, just the demo.
Could you try re-cloning the repo? In a fresh directory + venv/env:

git clone https://github.com/SWivid/F5-TTS.git
cd F5-TTS
pip install -e .
f5-tts_infer-cli

Thanks!
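
When testing in a fresh environment as suggested above, it can help to confirm which interpreter and which copy of the CLI are actually being used, since an older install elsewhere on PATH can shadow the new one. This is a minimal sketch using only the standard library; the function names are illustrative, and the venv check covers `venv`/`virtualenv` only (it does not detect conda environments).

```python
import shutil
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv/virtualenv (not conda)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def locate_cli(name="f5-tts_infer-cli"):
    """Return the full path of the command found on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside venv:", in_virtualenv())
    print("CLI on PATH:", locate_cli() or "not found")
```

If the reported CLI path lies outside the new environment's folder, the command being run is probably from an earlier installation.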
