---
license: apache-2.0
metrics:
- accuracy
pipeline_tag: audio-classification
---
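## Speech Detection
This model targets the sounds that occur in phone-call scenarios, such as:
| sound | description |
| :------- | :------- |
| bell | Ringing |
| music | Music |
| mute | Silence (no sound at all) |
| noise | Noise (relatively loud noise) |
| noise_mute | Ambient sound (also noise, but quieter) |
| voice | Speech (the user speaking; far-field speech is treated as ambient sound) |
| voicemail | Voicemail (the voicemail announcement played by the carrier) |
| white_noise | White noise (usually caused by the phone line; a humming sound) |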
This model classifies the sounds above into two classes, "non_voice" and "voice", as follows:
| sound | label |
| :------- | :------- |
| bell | non_voice |
| music | non_voice |
| mute | non_voice |
| noise | non_voice |
| noise_mute | non_voice |
| voice | voice |
| voicemail | voice |
| white_noise | voice |
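
The mapping in the table above can be written as a small lookup. This is a minimal sketch that only mirrors the table; the names `SOUND_TO_LABEL` and `to_binary_label` are hypothetical and are not part of the model's own code.

```python
# Hypothetical helper: collapse the fine-grained sound categories
# into the two labels listed in the table above.
SOUND_TO_LABEL = {
    "bell": "non_voice",
    "music": "non_voice",
    "mute": "non_voice",
    "noise": "non_voice",
    "noise_mute": "non_voice",
    "voice": "voice",
    "voicemail": "voice",
    "white_noise": "voice",
}


def to_binary_label(sound: str) -> str:
    """Return "voice" or "non_voice" for a fine-grained sound category."""
    try:
        return SOUND_TO_LABEL[sound]
    except KeyError:
        # Fail loudly on categories that the table does not define.
        raise ValueError(f"unknown sound category: {sound!r}") from None
```

For example, `to_binary_label("voicemail")` returns `"voice"`, while an unlisted category raises `ValueError` instead of being silently mislabeled.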
Accuracy:
| label | accuracy |
| :------- | :------- |
| non_voice | 95.27% |
| voice | 95.48% |
| total | 95.35% |
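
As an illustration of how figures like these are commonly computed, the sketch below derives per-label and overall accuracy from lists of true and predicted labels. It assumes that the accuracy for a label is the fraction of samples with that true label that were predicted correctly; the model card does not describe its evaluation procedure, so this is an assumption. The function name `per_label_accuracy` and the toy data are hypothetical.

```python
from collections import Counter


def per_label_accuracy(y_true, y_pred):
    """Return accuracy per true label, plus the overall accuracy under "total"."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    # Number of samples per true label, and number of those predicted correctly.
    support = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    result = {label: correct[label] / n for label, n in support.items()}
    result["total"] = sum(correct.values()) / len(y_true)
    return result


# Toy data, for illustration only (not the model's evaluation set).
y_true = ["voice", "voice", "non_voice", "non_voice", "non_voice"]
y_pred = ["voice", "non_voice", "non_voice", "non_voice", "non_voice"]
scores = per_label_accuracy(y_true, y_pred)
```

Under this definition the overall accuracy is a sample-weighted average of the per-label values, which is why it lies between them.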