can't save the audio, AttributeError: 'torch.dtype' object has no attribute 'kind'
#1 by hellos - opened
scipy.io.wavfile.write("techno.wav", rate=model.config.sampling_rate, data=output)
Solved with:
import numpy as np
import scipy.io.wavfile
# Convert the PyTorch tensor to a NumPy array; squeeze() drops a batch dimension of size 1, if present
output_np = output.detach().cpu().float().numpy().squeeze()
# Peak-normalize the audio to the range [-1, 1] (skipped for silent audio to avoid dividing by zero)
peak = np.abs(output_np).max()
if peak > 0:
    output_np = output_np / peak
# Specify the output file path
output_file = "techno.wav"
# Convert the audio data to 16-bit PCM
output_np = (output_np * 32767).astype(np.int16)
# Write the WAV file using SciPy
scipy.io.wavfile.write(output_file, rate=model.config.sampling_rate, data=output_np)
hellos
changed discussion title from "can't save the audio" to "can't save the audio, AttributeError: 'torch.dtype' object has no attribute 'kind'"
Resolved in https://huggingface.co/facebook/mms-tts-hin/commit/1d83b223ec78e30b944f7d96bd117eb3d7023303. In short, the audio output needs to be converted from a PyTorch tensor to a NumPy array before saving it.
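For anyone landing here later: the error seems to come from scipy.io.wavfile.write reading data.dtype.kind, which NumPy dtypes have and torch.dtype objects do not. Below is a minimal sketch of the conversion step on its own, without the 16-bit rescaling. It is not the code from the linked commit. The helper name to_wav_array, the stand-in fake_output, the file name, and the 16000 Hz rate are all hypothetical; with the real model you would pass the generated waveform and model.config.sampling_rate instead.

```python
import numpy as np
import scipy.io.wavfile


def to_wav_array(waveform):
    """Return a 1-D float32 NumPy array from a tensor-like or array-like input."""
    # PyTorch tensors expose detach()/cpu()/float(); plain arrays skip this step.
    if hasattr(waveform, "detach"):
        waveform = waveform.detach().cpu().float().numpy()
    # Drop any axes of size 1, e.g. a batch axis: shape (1, n) -> (n,).
    return np.asarray(waveform, dtype=np.float32).squeeze()


# Stand-in for the model output: one second of a 440 Hz tone with a batch axis.
sampling_rate = 16000  # assumed value; use model.config.sampling_rate in practice
t = np.arange(sampling_rate) / sampling_rate
fake_output = 0.5 * np.sin(2 * np.pi * 440 * t)[np.newaxis, :]

audio = to_wav_array(fake_output)
# float32 data is written as an IEEE-float WAV, so no int16 rescaling is needed.
scipy.io.wavfile.write("techno_float32.wav", rate=sampling_rate, data=audio)
```

Writing float32 keeps the samples as generated. Converting to int16 as in the earlier reply gives a more widely supported PCM file, so which one to use depends on where the file will be played.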
sanchit-gandhi changed discussion status to closed