error: ushort format requires 0 <= number <= 65535

#42
by pod9199 - opened

I tried to run the second cell from the tutorial on an Apple Silicon Mac and got this error:
```
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Setting pad_token_id to eos_token_id:10000 for open-end generation.

error Traceback (most recent call last)
Cell In[5], line 8
4 synthesiser = pipeline("text-to-speech", "suno/bark")
6 speech = synthesiser("Hello, my dog is cooler than you!", forward_params={"do_sample": True})
----> 8 scipy.io.wavfile.write("bark_out.wav", rate=speech["sampling_rate"], data=speech["audio"])

File /opt/homebrew/lib/python3.11/site-packages/scipy/io/wavfile.py:796, in write(filename, rate, data)
793 bytes_per_second = fs*(bit_depth // 8)*channels
794 block_align = channels * (bit_depth // 8)
--> 796 fmt_chunk_data = struct.pack('<HHIIHH', format_tag, channels, fs,
797 bytes_per_second, block_align, bit_depth)
798 if not (dkind == 'i' or dkind == 'u'):
799 # add cbSize field for non-PCM files
800 fmt_chunk_data += b'\x00\x00'

error: ushort format requires 0 <= number <= 65535
```

I was getting that error too. Doing this fixed it for me:

```
import scipy.io.wavfile
from transformers import AutoProcessor, BarkModel

# Load the processor and model directly instead of going through pipeline()
processor = AutoProcessor.from_pretrained("suno/bark")
model = BarkModel.from_pretrained("suno/bark")

inputs = processor("Hello how are you", voice_preset="v2/en_speaker_5")
audio_array = model.generate(**inputs)

# squeeze() drops the size-1 batch dimension so the array is 1-D (mono)
audio_array = audio_array.cpu().numpy().squeeze()
sample_rate = model.generation_config.sample_rate
scipy.io.wavfile.write("bark_out.wav", rate=sample_rate, data=audio_array)
```
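A guess at why the `squeeze()` matters, based on the `struct.pack('<HHIIHH', ...)` line in the traceback: scipy treats a 2-D array as `(Nsamples, Nchannels)`, so audio shaped `(1, N)` would be read as N channels, and the channel count has to fit in an unsigned 16-bit field. I have not checked what shape the pipeline actually returns, so treat this as a sketch under that assumption. The helper names, the 24000 Hz rate, and the array length are made up for illustration, and only the standard library is used:

```python
import struct

def channels_for_shape(shape):
    """Channel count for a 1-D (mono) or 2-D (Nsamples, Nchannels) array shape."""
    return 1 if len(shape) == 1 else shape[1]

def pack_fmt_chunk(channels, rate, bit_depth=32, format_tag=3):
    """Pack a WAV 'fmt ' chunk with the same layout as the traceback's struct.pack call.

    format_tag=3 is IEEE float; bit_depth=32 matches float32 audio.
    """
    bytes_per_second = rate * (bit_depth // 8) * channels
    block_align = channels * (bit_depth // 8)
    return struct.pack('<HHIIHH', format_tag, channels, rate,
                       bytes_per_second, block_align, bit_depth)

rate = 24000           # illustrative sample rate (assumed, not read from the model)
n_samples = 240000     # illustrative length: 10 seconds at that rate

# Squeezed, 1-D audio: one channel, packs fine.
mono = pack_fmt_chunk(channels_for_shape((n_samples,)), rate)
print(len(mono))

# Unsqueezed (1, N) audio: N is taken as the channel count and overflows 'H'.
try:
    pack_fmt_chunk(channels_for_shape((1, n_samples)), rate)
except struct.error as exc:
    print("struct.error:", exc)
```

If that is what is happening, calling `.squeeze()` on `speech["audio"]` before `scipy.io.wavfile.write` in the original pipeline cell might be enough on its own, but I have not verified that.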
