---
title: KevinGeng/Laronix_voice_quality_checking_system
emoji: 📊
colorFrom: pink
colorTo: gray
sdk: gradio
sdk_version: 4.26.0
app_file: app.py
license: apache-2.0
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Laronix_voice_quality_checking_system

This is the Laronix voice quality checking system for SLP evaluation.

# How to use

The system has 3 blocks: the **Input Block**, the **Output Block** and the **Example Block**.

## Input Block

In this block, you can input audio files and their corresponding text files. The text files serve as the reference against which the system evaluates the audio files. The evaluation results appear in the **Output Block**.

Please note that .wav and .mp3 files are recommended. .mp4 files are also accepted, but the system uses only the audio part of the video, and this is not guaranteed to work properly.

## Output Block

In this block you will see 3 text boxes showing the naturalness, intelligibility and recognition hypothesis of the audio file.

### Naturalness:

The naturalness score of the audio file. The scores are in the range of 0 to 5, and the higher the better. This system uses the U-Tokyo MOS model to evaluate the naturalness of the audio file. For more information, please refer to [this paper](https://arxiv.org/abs/2204.02152).

### Intelligibility:

The intelligibility score of the audio file. The scores are in the range of 1 to 100, and the higher the better. Intelligibility is calculated from the WER (Word Error Rate) between the given audio and text. The ASR model used in this system is based on the whisper-medium model, fine-tuned on the Laronix dataset. For more information, please refer to this [Hugging Face model hub page](https://huggingface.co/KevinGeng/whipser_medium_en_PAL300_step25).

> Please note that, to make the scores easier to read, the system projects them onto the range of 1 to 100.
> + Projection for Naturalness Score: ![Plot](./local/nat2avaMOS.png)
> + Projection for Intelligibility Score: ![Plot](./local/WER2INTELI.png)

## Example Block

In this block, you can see example audio files and text files. Click a row in the example list to load it and generate the evaluation results.
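
## Note on WER

To illustrate how a WER-based intelligibility score like the one in the **Output Block** can be derived, here is a minimal Python sketch. It is not taken from `app.py`: the function names `word_error_rate` and `project_to_intelligibility` are hypothetical, the text normalization (lowercasing and whitespace splitting) is an assumption, and the linear projection is only a stand-in for the actual curve shown in `WER2INTELI.png`, which may well differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only one previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def project_to_intelligibility(wer: float) -> float:
    """Hypothetical linear projection of WER onto the range 1 to 100.

    This is an illustrative assumption, not the system's real mapping.
    """
    clipped = min(max(wer, 0.0), 1.0)  # WER can exceed 1.0, so clip it
    return 100.0 - 99.0 * clipped


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.3f}, projected score: {project_to_intelligibility(wer):.1f}")
```

Here the reference text plays the role of the uploaded text file, and the hypothesis plays the role of the recognition hypothesis produced by the ASR model.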