---
license: apache-2.0
metrics:
- accuracy
pipeline_tag: audio-classification
---
## Voice Detection

This model targets the sounds that occur in phone-call scenarios, such as:

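|    sound    |   description   |
|  :-------   |    :-------     |
|    bell     | Ringing tone |
|    music    | Music |
|    mute     | Silence (no sound at all) |
|    noise    | Noise (relatively loud) |
|  noise_mute | Ambient sound (also noise, but quieter) |
|    voice    | Speech (the user speaking; far-field speech is treated as ambient sound) |
|  voicemail  | Voicemail (the voicemail announcement played by the carrier) |
| white_noise | White noise (a humming sound, usually caused by the phone line) |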

This model groups the sounds above into two labels, "non_voice" and "voice", as follows:

|    sound    |    label   |
|  :-------   |  :-------  |
|    bell     | non_voice  |
|    music    | non_voice  |
|    mute     | non_voice  |
|    noise    | non_voice  |
|  noise_mute | non_voice  |
|    voice    |   voice    |
|  voicemail  |   voice    |
| white_noise |   voice    |
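
The mapping in the table above can be expressed as a small lookup, which may be handy when relabeling a dataset annotated with the eight sound categories. This is only a sketch: the names `SOUND_TO_LABEL` and `to_binary_label` are hypothetical and are not part of the model or any library; the entries are copied from the table as given.

```python
# Sound category -> binary label, copied from the table above.
# SOUND_TO_LABEL and to_binary_label are hypothetical names used for illustration.
SOUND_TO_LABEL = {
    "bell": "non_voice",
    "music": "non_voice",
    "mute": "non_voice",
    "noise": "non_voice",
    "noise_mute": "non_voice",
    "voice": "voice",
    "voicemail": "voice",
    "white_noise": "voice",
}


def to_binary_label(sound: str) -> str:
    """Return the binary label for one of the eight sound categories."""
    try:
        return SOUND_TO_LABEL[sound]
    except KeyError:
        # Fail loudly on categories that are not listed in the table.
        raise ValueError(f"unknown sound category: {sound!r}") from None


print(to_binary_label("voicemail"))  # voice
```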

Accuracy:

|    label    |  accuracy  |
|  :-------   |  :-------  |
|  non_voice  |   95.27%   |
|    voice    |   95.48%   |
|    total    |   95.35%   |