---
title: Test
emoji: 😻
colorFrom: blue
colorTo: gray
sdk: streamlit
sdk_version: 1.21.0
app_file: app.py
pinned: false
---

# Translate and Transcribe Audio

The "Translate and Transcribe Audio" program is a Streamlit web application that allows users to transcribe audio files, translate the transcribed English text to Hindi, and listen to the original and translated audio.

## Requirements

Before running the application, make sure you have the following dependencies installed. You can install them using pip and the provided requirements.txt file:

- transformers
- git+https://github.com/openai/whisper.git
- sentencepiece
- pydub
- whisper
- streamlit
- sounddevice
- soundfile
- audio-recorder-streamlit

## Running the Application

To run the "Translate and Transcribe Audio" application, follow these steps:

1. Save the app.py and requirements.txt files in a directory on your machine.
2. Install the required dependencies using the requirements.txt file, as mentioned in the Requirements section.
3. Open a terminal or command prompt and navigate to the directory containing app.py and requirements.txt.
4. Run the Streamlit application using the following command: `streamlit run app.py`

A local development server will start, and the application will be accessible in your web browser at http://localhost:8501.

## Usage

The "Translate and Transcribe Audio" application provides two ways to transcribe and translate audio:

- Audio Recording: Click on the "Mic" button and start speaking. The audio will be recorded and transcribed. The original English text and its translation in Hindi will be displayed.
- Upload Audio File: You can also upload a WAV format audio file. The application will transcribe the audio and provide the corresponding translation.

## How It Works

The application uses the "audio-recorder-streamlit" library to record audio through the computer's microphone. The recorded audio is transcribed using the "whisper" library, which is an automatic speech recognition (ASR) system.
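
The transcription step can be illustrated with a minimal sketch. It assumes the openai-whisper package and its `load_model` and `transcribe` calls; the function name `transcribe_file`, the `"base"` model size, and the optional `model` argument are hypothetical choices for illustration and are not taken from app.py.

```python
def transcribe_file(path, model_name="base", model=None):
    """Return the transcribed text for the audio file at `path`.

    `model_name` and `model` are illustrative parameters (assumptions),
    not settings known to be used by app.py.
    """
    if model is None:
        # Lazy import so this module loads even where whisper is not installed.
        import whisper

        model = whisper.load_model(model_name)
    # whisper's transcribe() returns a dict whose "text" key holds the transcript.
    result = model.transcribe(path)
    return result["text"].strip()
```

Passing an already loaded model through `model` avoids reloading the weights on every call, which can matter in a Streamlit app because the script reruns on each interaction.
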
The transcribed English text is then translated to Hindi using the "transformers" library with the Helsinki-NLP "opus-mt-en-hi" machine translation model. The translated text and the original audio are displayed to the user.

## Note

- The "whisper" library is used for speech recognition, and its performance varies with the audio quality and the speaker's accent.
- The "transformers" library uses machine learning models for translation. The translation will not always be accurate, but it should be reasonable in most cases.
- The program saves the audio to a file named "audio.mp3" for the recording option and "uploaded_audio.wav" for the upload option. These files are temporary and are deleted after processing.

## Limitations

- The performance of ASR and machine translation depends heavily on the quality of the audio and the complexity of the language being translated.
- The results may contain errors, although the libraries and models used in this application aim to provide reasonable output.
- For best results, use clear, high-quality audio files with the upload option.

## Conclusion

The "Translate and Transcribe Audio" application demonstrates how ASR and machine translation can be combined to convert spoken English into written Hindi. It can be used for language learning, accessibility, and communication across language barriers.

This application is provided as a demonstration and may require further optimization or customization for specific use cases. For any issues or suggestions, please feel free to reach out to the developer.

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
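
The translation step and the temporary-file handling described in this README can be sketched as follows. This is an illustration under stated assumptions, not the code in app.py: the function names `translate_en_to_hi` and `process_upload` are hypothetical, and the sketch writes to a uniquely named temporary file instead of the fixed names "audio.mp3" and "uploaded_audio.wav". The `pipeline("translation", model=...)` call is the standard transformers entry point for this model.

```python
import os
import tempfile


def translate_en_to_hi(text, translator=None):
    """Translate English text to Hindi.

    `translator` is an optional injected callable (an assumption for
    illustration); by default the opus-mt-en-hi pipeline is loaded.
    """
    if translator is None:
        # Lazy import: transformers is only needed when no translator is passed in.
        from transformers import pipeline

        translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-hi")
    # Translation pipelines return a list of dicts keyed by "translation_text".
    return translator(text)[0]["translation_text"]


def process_upload(audio_bytes, handler, suffix=".wav"):
    """Write bytes to a temporary file, pass its path to `handler`, then delete it."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(audio_bytes)
        return handler(path)
    finally:
        # Remove the file even if the handler raises.
        os.remove(path)
```

The `try`/`finally` block is what guarantees that the temporary audio file is deleted after processing, including when transcription or translation fails.
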