The voice for pipeline batch call is not consistent

#4
by slavik004 - opened

The voice returned for each string in a pipeline call is not consistent.
Sometimes it's male, sometimes it's female.
Is there an example of how to set voice_preset in the pipeline?

Hey, you can use the following snippet:

from transformers import AutoProcessor, pipeline

# Load the processor (used here only to look up the voice preset) and the pipeline.
processor = AutoProcessor.from_pretrained("suno/bark")
# https://huggingface.co/docs/transformers/v4.32.1/en/main_classes/pipelines#transformers.TextToAudioPipeline
# device=0 selects the first GPU.
pipe = pipeline(task="text-to-audio", model="suno/bark-small", framework="pt", device=0)

# Run the processor on a throwaway string to load the voice preset,
# then pull out the speaker prompt it attaches to the result.
blank_inp = processor("blank", voice_preset="v2/en_speaker_1")
history_prompt = blank_inp["history_prompt"]

forward_params = {
    "history_prompt": history_prompt,
    # You can add other generate parameters here.
    "num_beams": 6,
    "do_sample": True,
}
output = pipe("I'd like to see what diversity I can get using beam search", forward_params=forward_params)
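
For the batch case from the question, one way to keep the voice the same is to build the forward parameters once and reuse them for every string. This is only a rough sketch: `synthesize_batch` is a hypothetical helper name, not part of transformers, and it assumes `pipe` and `processor` are the objects created in the snippet above.

```python
def synthesize_batch(pipe, processor, texts, voice_preset="v2/en_speaker_1"):
    """Hypothetical helper: run every text through `pipe` with one voice preset."""
    # Only the speaker prompt is needed, so the text passed here is a throwaway.
    history_prompt = processor("blank", voice_preset=voice_preset)["history_prompt"]
    forward_params = {"history_prompt": history_prompt}
    # One pipeline call per string, each conditioned on the same preset.
    return [pipe(text, forward_params=forward_params) for text in texts]
```

You would call it as `outputs = synthesize_batch(pipe, processor, ["First sentence.", "Second sentence."])`. Each item should then be the usual pipeline result for that string.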
