numpy
scipy
lxml
pydub
fastapi
soundfile
pyrubberband
pypinyin
pandas
vector_quantize_pytorch
einops
omegaconf~=2.3.0
tqdm
huggingface_hub>=0.22.2,<1.0
vocos==0.0.1
transformers==4.41.2
torch
torchvision
torchaudio
gradio
emojiswitch
python-dotenv
zhon
mistune==3.0.2
cn2an
python-box