metadata
description: Set up conda environment for speech-to-text fine-tuning
tags:
- python
- conda
- stt
- whisper
- speech
- ai
- fine-tuning
- project
- gitignored
You are helping the user set up a conda environment for speech-to-text (STT) fine-tuning.
Process
Create base environment
conda create -n stt-finetune python=3.11 -y
conda activate stt-finetune

Install PyTorch with ROCm
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0

Install Whisper and related libraries
pip install openai-whisper
pip install faster-whisper  # Optimized inference
pip install whisperx  # Advanced features

Install Hugging Face libraries
pip install transformers
pip install datasets
pip install accelerate
pip install evaluate
pip install peft  # For LoRA fine-tuning

Install audio processing libraries
pip install librosa  # Audio analysis
pip install soundfile  # Audio I/O
pip install pydub  # Audio manipulation
pip install sox  # Audio processing
conda install -c conda-forge ffmpeg -y  # Audio conversion

Install speech-specific tools
pip install jiwer  # Word Error Rate calculation
pip install speechbrain  # Speech toolkit
pip install pyannote.audio  # Speaker diarization

Install data processing tools
pip install pandas
pip install numpy
pip install scipy
pip install matplotlib
pip install seaborn  # Visualization

Install monitoring and experimentation
pip install wandb  # Experiment tracking
pip install tensorboard

Install Jupyter for interactive work
conda install -c conda-forge jupyter jupyterlab ipywidgets -y

Test installation
import torch
import whisper
import librosa
from transformers import WhisperProcessor, WhisperForConditionalGeneration
print(f"PyTorch: {torch.__version__}")
print(f"GPU available: {torch.cuda.is_available()}")
print("All libraries imported successfully!")
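The import test stops at the first missing package. As a complementary sketch, the check below uses only the standard library to report which packages can be found, without importing them. The `LIBRARIES` grouping and the `is_installed` and `check_libraries` helpers are hypothetical names introduced for illustration, and the module names (for example `whisper` for openai-whisper, `pyannote` for pyannote.audio) are assumed to be the usual top-level import names.

```python
from importlib.util import find_spec

# Hypothetical grouping by purpose; module names are assumed
# top-level import names, not taken from the install commands.
LIBRARIES = {
    "PyTorch": ["torch", "torchvision", "torchaudio"],
    "Whisper": ["whisper", "faster_whisper", "whisperx"],
    "Hugging Face": ["transformers", "datasets", "accelerate", "evaluate", "peft"],
    "Audio processing": ["librosa", "soundfile", "pydub", "sox"],
    "Speech tools": ["jiwer", "speechbrain", "pyannote"],
}


def is_installed(module_name):
    """Return True if the module can be located, without importing it."""
    try:
        return find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


def check_libraries(groups):
    """Map each purpose to a {module_name: found} dictionary."""
    return {
        purpose: {name: is_installed(name) for name in names}
        for purpose, names in groups.items()
    }


if __name__ == "__main__":
    for purpose, status in check_libraries(LIBRARIES).items():
        print(purpose)
        for name, found in status.items():
            print(f"  {name}: {'ok' if found else 'MISSING'}")
```

Run inside the activated environment, this prints one block per group, which can feed the "installed libraries grouped by purpose" part of the summary.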
- Suggest common datasets
  - Common Voice (Mozilla)
  - LibriSpeech
  - TED-LIUM
  - Custom datasets
- Create example script
  - Offer to create ~/scripts/whisper-finetune-example.py with basic setup
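One piece the example script could include is an evaluation helper. The jiwer package installed earlier computes word error rate; the pure-Python sketch below only illustrates what that metric measures (word-level edit distance divided by the number of reference words) and is not jiwer's implementation. The function name `word_error_rate` is a hypothetical choice, and the sketch assumes whitespace tokenization with no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length.

    Illustrative only: splits on whitespace and applies no
    normalization (casing, punctuation), which real evaluations need.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current

    return previous[-1] / len(ref_words)


if __name__ == "__main__":
    print(word_error_rate("the cat sat", "the dog sat"))
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.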
Output
Provide a summary showing:
- Environment name and setup status
- Installed libraries grouped by purpose
- GPU detection status
- Available VRAM for training
- Suggested datasets for fine-tuning
- Example commands for testing
- Links to documentation/tutorials
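For the GPU detection and VRAM items in the summary, a minimal sketch along these lines could gather the values. It assumes the ROCm build of PyTorch exposes the GPU through the `torch.cuda` API, and it reports total device memory rather than memory currently free. The helper names `format_gib` and `gpu_summary` are hypothetical.

```python
def format_gib(num_bytes):
    """Format a byte count as gibibytes with one decimal place."""
    return f"{num_bytes / 1024**3:.1f} GiB"


def gpu_summary():
    """Return a short, human-readable GPU status string."""
    try:
        import torch  # imported lazily so the check degrades gracefully
    except ImportError:
        return "PyTorch is not installed"

    if not torch.cuda.is_available():
        return "No GPU detected"

    lines = []
    for index in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(index)
        lines.append(
            f"GPU {index}: {props.name}, {format_gib(props.total_memory)} total VRAM"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(gpu_summary())
```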