torch
torchaudio
transformers
librosa>=0.8
pip>=23.2
gradio_client==0.2.7
invisible_watermark
safetensors
soundfile
accelerate
mdtex2html
transformers_stream_generator
einops
tiktoken
openai-whisper