Transcription and Translation in the same call
#81 opened by saalnlp
Hi,
I am trying to find out if there is any way I can have Whisper translate (to English only) and transcribe in the same call for the same audio.
I was using the OpenAI APIs with async and everything was good. However, after moving offline, it looks like I need to wait twice as long for the same audio to get both the transcription and the translation.
Here is what I have now:
```
from transformers import pipeline

pipe = pipeline(
    "automatic-speech-recognition",
    model=whisper_model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    max_new_tokens=128,
    chunk_length_s=30,
    batch_size=16,
    return_timestamps=True,
    torch_dtype=torch_dtype,
    device=SECOND_DEVICE,
)

# The same audio is run through the pipeline twice, once per task.
transcribe = pipe(mp3_audio_path, generate_kwargs={"task": "transcribe"})
translate = pipe(mp3_audio_path, generate_kwargs={"task": "translate"})
```
The only solution I can see now is to duplicate the model on 2 GPUs, but I only have 1 GPU and it is already loaded with other models.
Also, is there any way I can return the detected language from the transcribe pipeline?
Thank you
You can put both requests in a batch and run batched inference.
You will have to use the `model.generate` method and manually pass the decoder input ids.
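In case it helps, here is a minimal sketch of that idea (not the pipeline, and not tested against every transformers version): the audio features are encoded once and duplicated in a single batch, and each row gets its own hand-built decoder prompt (`<|startoftranscript|> <|lang|> <|task|> <|notimestamps|>`). The checkpoint name, the `audio_array` variable and the `<|fr|>` source-language token are placeholders you would replace with your own.

```
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model_id = "openai/whisper-large-v3"  # placeholder checkpoint
processor = WhisperProcessor.from_pretrained(model_id)
model = WhisperForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16
).to("cuda")

# `audio_array` is assumed to be a 16 kHz mono waveform already loaded in memory.
features = processor(
    audio_array, sampling_rate=16000, return_tensors="pt"
).input_features.to("cuda", torch.float16)

# Duplicate the same features so one batch carries both tasks:
# row 0 -> transcription, row 1 -> translation.
batched_features = torch.cat([features, features], dim=0)

def prompt_ids(task):
    # <|fr|> is a placeholder; use the actual source language token.
    tokens = ["<|startoftranscript|>", "<|fr|>", f"<|{task}|>", "<|notimestamps|>"]
    return processor.tokenizer.convert_tokens_to_ids(tokens)

decoder_input_ids = torch.tensor(
    [prompt_ids("transcribe"), prompt_ids("translate")], device="cuda"
)

generated = model.generate(
    batched_features,
    decoder_input_ids=decoder_input_ids,
    max_new_tokens=128,
)
transcription, translation = processor.batch_decode(generated, skip_special_tokens=True)
```

This way the encoder and decoder run once over a batch of two instead of twice sequentially, so on a single GPU the extra cost of the second task is much smaller than a full second pass.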