---
license: apache-2.0
metrics:
- accuracy
pipeline_tag: audio-classification
---

## Voice Detection

This model targets sounds that occur in phone-call scenarios, such as:

| sound | description |
| :------- | :------- |
| bell | Ringing |
| music | Music |
| mute | Silence (no sound at all) |
| noise | Noise (relatively loud noise) |
| noise_mute | Ambient sound (also noise, but quieter) |
| voice | Speech (the user speaking; far-field speech is treated as ambient sound) |
| voicemail | Voicemail (the voicemail announcement played by the carrier) |
| white_noise | White noise (a humming sound, usually caused by the phone line) |

The model groups the sounds above into two labels, "non_voice" and "voice", as follows:

| sound | label |
| :------- | :------- |
| bell | non_voice |
| music | non_voice |
| mute | non_voice |
| noise | non_voice |
| noise_mute | non_voice |
| voice | voice |
| voicemail | voice |
| white_noise | voice |

Accuracy:

| label | accuracy |
| :------- | :------- |
| non_voice | 95.27% |
| voice | 95.48% |
| total | 95.35% |
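The sound-to-label grouping can be written as a small lookup table, which may be useful when relabeling fine-grained annotations into the two classes. This is a minimal sketch, not code shipped with the model: the names `SOUND_TO_LABEL` and `to_binary_label` are hypothetical, and the mapping is copied directly from the label table in this card.

```python
# Hypothetical helper: the mapping mirrors the sound/label table in this card.
SOUND_TO_LABEL = {
    "bell": "non_voice",
    "music": "non_voice",
    "mute": "non_voice",
    "noise": "non_voice",
    "noise_mute": "non_voice",
    "voice": "voice",
    "voicemail": "voice",
    "white_noise": "voice",
}


def to_binary_label(sound: str) -> str:
    """Return "voice" or "non_voice" for one of the eight sound categories."""
    try:
        return SOUND_TO_LABEL[sound]
    except KeyError:
        # Fail loudly on categories that the card does not list.
        raise ValueError(f"unknown sound category: {sound!r}") from None


if __name__ == "__main__":
    for name in ("bell", "voicemail"):
        print(name, "->", to_binary_label(name))
```

Raising on unknown categories, rather than defaulting to one of the two labels, keeps annotation typos from silently skewing the class balance.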