unable to load the fine-tuned model

#1
by ti3x - opened

Error:

tortoise-tts# python tortoise/do_tts.py --text "hello world." --voice stephanie --preset high_quality --model_dir=./tortoise-tts-finetuned-lj/models/
Traceback (most recent call last):
  File "tortoise/do_tts.py", line 27, in <module>
    tts = TextToSpeech(models_dir=args.model_dir)
  File "/home/ubuntu/transfer-tts/tortoise-tts/tortoise/api.py", line 231, in __init__
    self.autoregressive.load_state_dict(torch.load(get_model_path('autoregressive.pth', models_dir)))
  File "/opt/conda/envs/tts/lib/python3.8/site-packages/torch/serialization.py", line 795, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/opt/conda/envs/tts/lib/python3.8/site-packages/torch/serialization.py", line 1002, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

Also, I would love to learn whether there are instructions on how to fine-tune the model. Would it improve the quality? The improvement I am looking for is whether it can produce less TTS-like, clear pronunciation and learn special speech patterns if given more data (for example, someone who pronounces certain words in a non-standard way because of an accent).

Hey Tim,
I've been doing most maintenance (that I can find time for) on GitHub - I'd recommend you pull the code from that version. This repo is for storing the weights.

Regarding fine-tuning - it is extremely effective with this model, to the point where you can almost perfectly reproduce anyone's voice with ~1 hour or so of source material. For this reason, I have withheld a few details you would need to do this. I do not intend to release this information, apologies.

ti3x changed discussion status to closed

Ah, I apologize! I didn't realize this was the LJ repo. This should work. Can you confirm you are using the code from GitHub?

@jbetker yes. I was not able to find an LJ repo on GitHub, so I am using the latest code from https://github.com/neonbjb/tortoise-tts, and I did not change anything. My goal is just to try these released weights and test the difference.

Just tested these weights and they work fine on my computer. Maybe your download of the AR weights got corrupted somehow? An easy test is:

import torch
weights = torch.load('autoregressive.pth', map_location='cpu')  # load on CPU; no GPU needed for this check
print(len(weights))

That should yield a dict of weight values of len=470.
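If `torch.load` itself fails, looking at the first bytes of the file can narrow down what was actually downloaded. One common cause of `invalid load key, 'v'` is that the file on disk is a Git LFS pointer (a small text file beginning with `version https://git-lfs.github.com/spec/v1`) rather than the real weights. The thread does not confirm that this is what happened here, so treat the following as a minimal sketch under that assumption. The helper name `diagnose_checkpoint` and its return labels are hypothetical, not part of tortoise-tts or PyTorch.

```python
# Hypothetical helper (not part of tortoise-tts or PyTorch): guess what kind
# of file a checkpoint path points to by inspecting its first bytes.

# Git LFS pointer files are small text files that start with this line.
LFS_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def diagnose_checkpoint(path):
    """Return 'lfs-pointer', 'zip', 'pickle', or 'unknown' for the file at path."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_PREFIX))

    if head.startswith(LFS_PREFIX):
        # The download fetched the pointer, not the weights themselves.
        return "lfs-pointer"
    if head.startswith(b"PK"):
        # Zip archive signature: the default torch.save format since PyTorch 1.6.
        return "zip"
    if head.startswith(b"\x80"):
        # Pickle protocol 2+ opcode: consistent with the legacy torch.save format.
        return "pickle"
    # Empty, truncated, or some other content (for example an HTML error page).
    return "unknown"
```

Calling `diagnose_checkpoint('autoregressive.pth')` and getting `'lfs-pointer'` or `'unknown'` would suggest re-downloading the file; `'zip'` or `'pickle'` only means the header looks plausible, not that the rest of the file is intact.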

@jbetker thank you for the quick response, data download seems to have been the issue. Thank you again for the amazing work.
