marcus-daily committed on
Commit f766f81 · 1 Parent(s): 16c8130

Smart Turn v3.2

benchmarks/smart-turn-v3.0.md CHANGED
@@ -2,60 +2,64 @@
  **Model:** `/data/smart-turn-v3.0.onnx`

- **Generated:** 2025-12-03 16:04:09 UTC

  ## Accuracy Results

- **Total Samples:** 31,473

  **Unique Languages:** 🇸🇦 Arabic, 🇧🇩 Bengali, 🇩🇰 Danish, 🇩🇪 German, 🇬🇧 🇺🇸 English, 🇫🇮 Finnish, 🇫🇷 French, 🇮🇳 Hindi, 🇮🇩 Indonesian, 🇮🇹 Italian, 🇯🇵 Japanese, 🇰🇷 Korean, 🇮🇳 Marathi, 🇳🇱 Dutch, 🇳🇴 Norwegian, 🇵🇱 Polish, 🇵🇹 Portuguese, 🇷🇺 Russian, 🇪🇸 Spanish, 🇹🇷 Turkish, 🇺🇦 Ukrainian, 🇻🇳 Vietnamese, 🇨🇳 Chinese

- **Unique Datasets:** chirp3_1, chirp3_2, human_5, human_convcollector_1, liva_1, midcentury_1, mundo_1, orpheus_endfiller_1, orpheus_grammar_1, orpheus_midfiller_1, rime_2

  ### Overall Performance
- | Metric | Sample Count | Accuracy (%) | False Positives (%) | False Negatives (%) |
- |--------|--------------|--------------|---------------------|---------------------|
- | Overall | 31,473 | 91.60 | 4.68 | 3.72 |

  ### Performance by Language
- | Language | Sample Count | Accuracy (%) | False Positives (%) | False Negatives (%) |
- |----------|--------------|--------------|---------------------|---------------------|
- | 🇹🇷 Turkish | 966 | 97.10 | 1.66 | 1.24 |
- | 🇯🇵 Japanese | 834 | 96.88 | 1.92 | 1.20 |
- | 🇰🇷 Korean | 890 | 96.74 | 1.12 | 2.13 |
- | 🇩🇪 German | 1,322 | 96.22 | 2.42 | 1.36 |
- | 🇫🇷 French | 1,253 | 96.17 | 1.52 | 2.31 |
- | 🇳🇱 Dutch | 1,401 | 96.15 | 2.00 | 1.86 |
- | 🇵🇹 Portuguese | 1,398 | 95.42 | 2.79 | 1.79 |
- | 🇮🇹 Italian | 782 | 94.88 | 3.07 | 2.05 |
- | 🇫🇮 Finnish | 1,010 | 94.85 | 3.17 | 1.98 |
- | 🇮🇩 Indonesian | 971 | 94.54 | 4.12 | 1.34 |
- | 🇺🇦 Ukrainian | 929 | 94.51 | 2.80 | 2.69 |
- | 🇵🇱 Polish | 976 | 94.47 | 2.87 | 2.66 |
- | 🇳🇴 Norwegian | 1,014 | 93.98 | 3.55 | 2.47 |
- | 🇷🇺 Russian | 1,470 | 93.54 | 3.33 | 3.13 |
- | 🇮🇳 Hindi | 1,295 | 93.36 | 4.40 | 2.24 |
- | 🇩🇰 Danish | 779 | 93.07 | 4.88 | 2.05 |
- | 🇸🇦 Arabic | 947 | 88.60 | 6.97 | 4.44 |
- | 🇨🇳 Chinese | 945 | 88.57 | 4.76 | 6.67 |
- | 🇬🇧 🇺🇸 English | 7,722 | 88.31 | 6.00 | 5.70 |
- | 🇮🇳 Marathi | 774 | 87.47 | 8.53 | 4.01 |
- | 🇪🇸 Spanish | 1,791 | 86.71 | 4.69 | 8.60 |
- | 🇧🇩 Bengali | 1,000 | 84.10 | 10.90 | 5.00 |
- | 🇻🇳 Vietnamese | 1,004 | 81.57 | 14.94 | 3.49 |

  ### Performance by Dataset
- | Dataset | Sample Count | Accuracy (%) | False Positives (%) | False Negatives (%) |
- |---------|--------------|--------------|---------------------|---------------------|
- | rime_2 | 396 | 99.75 | 0.00 | 0.25 |
- | human_5 | 402 | 96.27 | 1.00 | 2.74 |
- | chirp3_1 | 16,300 | 94.53 | 2.93 | 2.53 |
- | orpheus_endfiller_1 | 182 | 94.51 | 0.00 | 5.49 |
- | orpheus_grammar_1 | 163 | 92.64 | 3.68 | 3.68 |
- | orpheus_midfiller_1 | 140 | 91.43 | 3.57 | 5.00 |
- | human_convcollector_1 | 90 | 91.11 | 3.33 | 5.56 |
- | chirp3_2 | 8,428 | 90.27 | 6.68 | 3.05 |
- | midcentury_1 | 1,044 | 85.44 | 11.78 | 2.78 |
- | liva_1 | 3,832 | 84.68 | 6.92 | 8.40 |
- | mundo_1 | 496 | 72.78 | 5.24 | 21.98 |
 
 
  **Model:** `/data/smart-turn-v3.0.onnx`

+ **Generated:** 2026-01-07 17:31:01 UTC

  ## Accuracy Results

+ **Total Samples:** 31,527

  **Unique Languages:** 🇸🇦 Arabic, 🇧🇩 Bengali, 🇩🇰 Danish, 🇩🇪 German, 🇬🇧 🇺🇸 English, 🇫🇮 Finnish, 🇫🇷 French, 🇮🇳 Hindi, 🇮🇩 Indonesian, 🇮🇹 Italian, 🇯🇵 Japanese, 🇰🇷 Korean, 🇮🇳 Marathi, 🇳🇱 Dutch, 🇳🇴 Norwegian, 🇵🇱 Polish, 🇵🇹 Portuguese, 🇷🇺 Russian, 🇪🇸 Spanish, 🇹🇷 Turkish, 🇺🇦 Ukrainian, 🇻🇳 Vietnamese, 🇨🇳 Chinese

+ **Unique Datasets:** chirp3_1, chirp3_2, chirp3_3_short, human_5, human_convcollector_1, liva_1, midcentury_1, mundo_1, orpheus_endfiller_1, orpheus_grammar_1, orpheus_midfiller_1, rime_2

  ### Overall Performance
+
+ | Metric | Sample Count | Accuracy (%) | Precision | Recall | F1 | FPR (%) | FNR (%) |
+ | :------ | -----------: | -----------: | --------: | -----: | ----: | ------: | ------: |
+ | Overall | 31,527 | 88.97 | 0.858 | 0.933 | 0.894 | 7.70 | 3.33 |

  ### Performance by Language
+
+ | Language | Sample Count | Accuracy (%) | Precision | Recall | F1 | FPR (%) | FNR (%) |
+ | :------------ | -----------: | -----------: | --------: | -----: | ----: | ------: | ------: |
+ | 🇰🇷 Korean | 889 | 95.39 | 0.947 | 0.962 | 0.954 | 2.70 | 1.91 |
+ | 🇹🇷 Turkish | 966 | 95.34 | 0.927 | 0.983 | 0.954 | 3.83 | 0.83 |
+ | 🇩🇪 German | 1,322 | 95.23 | 0.928 | 0.980 | 0.954 | 3.78 | 0.98 |
+ | 🇵🇹 Portuguese | 1,398 | 94.49 | 0.927 | 0.962 | 0.944 | 3.65 | 1.86 |
+ | 🇯🇵 Japanese | 834 | 94.36 | 0.925 | 0.967 | 0.945 | 3.96 | 1.68 |
+ | 🇳🇱 Dutch | 1,398 | 94.28 | 0.925 | 0.968 | 0.946 | 4.08 | 1.65 |
+ | 🇫🇷 French | 1,252 | 94.25 | 0.945 | 0.942 | 0.944 | 2.80 | 2.96 |
+ | 🇮🇹 Italian | 782 | 93.22 | 0.898 | 0.974 | 0.935 | 5.50 | 1.28 |
+ | 🇵🇱 Polish | 974 | 93.02 | 0.904 | 0.955 | 0.929 | 4.83 | 2.16 |
+ | 🇷🇺 Russian | 1,468 | 92.03 | 0.891 | 0.966 | 0.927 | 6.20 | 1.77 |
+ | 🇺🇦 Ukrainian | 929 | 91.82 | 0.881 | 0.954 | 0.916 | 6.03 | 2.15 |
+ | 🇮🇩 Indonesian | 971 | 90.73 | 0.854 | 0.979 | 0.912 | 8.24 | 1.03 |
+ | 🇮🇳 Hindi | 1,284 | 89.95 | 0.856 | 0.970 | 0.910 | 8.49 | 1.56 |
+ | 🇩🇰 Danish | 779 | 89.73 | 0.841 | 0.982 | 0.906 | 9.37 | 0.90 |
+ | 🇳🇴 Norwegian | 1,014 | 89.64 | 0.869 | 0.938 | 0.903 | 7.20 | 3.16 |
+ | 🇨🇳 Chinese | 929 | 89.34 | 0.861 | 0.943 | 0.900 | 7.75 | 2.91 |
+ | 🇫🇮 Finnish | 1,010 | 87.52 | 0.813 | 0.974 | 0.886 | 11.19 | 1.29 |
+ | 🇸🇦 Arabic | 947 | 86.17 | 0.810 | 0.950 | 0.875 | 11.30 | 2.53 |
+ | 🇬🇧 🇺🇸 English | 7,820 | 85.49 | 0.836 | 0.876 | 0.855 | 8.41 | 6.10 |
+ | 🇪🇸 Spanish | 1,783 | 84.46 | 0.848 | 0.831 | 0.839 | 7.29 | 8.24 |
+ | 🇮🇳 Marathi | 774 | 81.01 | 0.751 | 0.936 | 0.834 | 15.76 | 3.23 |
+ | 🇻🇳 Vietnamese | 1,004 | 79.58 | 0.721 | 0.960 | 0.824 | 18.43 | 1.99 |
+ | 🇧🇩 Bengali | 1,000 | 78.70 | 0.717 | 0.935 | 0.811 | 18.10 | 3.20 |

  ### Performance by Dataset
+
+ | Dataset | Sample Count | Accuracy (%) | Precision | Recall | F1 | FPR (%) | FNR (%) |
+ | :-------------------- | -----------: | -----------: | --------: | -----: | ----: | ------: | ------: |
+ | rime_2 | 394 | 95.94 | 0.981 | 0.922 | 0.951 | 0.76 | 3.30 |
+ | orpheus_endfiller_1 | 181 | 95.03 | 1.000 | 0.902 | 0.949 | 0.00 | 4.97 |
+ | human_5 | 402 | 94.03 | 0.948 | 0.916 | 0.931 | 2.24 | 3.73 |
+ | chirp3_1 | 16,254 | 92.84 | 0.903 | 0.961 | 0.931 | 5.19 | 1.96 |
+ | human_convcollector_1 | 90 | 91.11 | 0.895 | 0.895 | 0.895 | 4.44 | 4.44 |
+ | orpheus_midfiller_1 | 140 | 90.00 | 0.877 | 0.905 | 0.891 | 5.71 | 4.29 |
+ | orpheus_grammar_1 | 163 | 87.12 | 0.848 | 0.918 | 0.881 | 8.59 | 4.29 |
+ | chirp3_2 | 8,428 | 86.08 | 0.802 | 0.956 | 0.872 | 11.75 | 2.17 |
+ | liva_1 | 3,831 | 83.74 | 0.839 | 0.838 | 0.838 | 8.09 | 8.17 |
+ | midcentury_1 | 1,044 | 77.39 | 0.704 | 0.917 | 0.797 | 18.58 | 4.02 |
+ | chirp3_3_short | 104 | 74.04 | 0.818 | 0.562 | 0.667 | 5.77 | 20.19 |
+ | mundo_1 | 496 | 67.34 | 0.741 | 0.524 | 0.614 | 9.07 | 23.59 |
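
The regenerated tables replace the two error columns with Precision, Recall, F1, FPR (%) and FNR (%). In the rows above, Accuracy + FPR + FNR adds up to 100, which suggests (this is an inference from the numbers, not something the report states) that FPR and FNR are shares of all samples rather than conventional per-class rates. The Python sketch below, with hypothetical names `BenchmarkRow` and `derived_metrics`, recomputes accuracy, F1 and recall from the other columns under that assumption so a row can be sanity-checked.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BenchmarkRow:
    """One row of the new-format table (field names are illustrative)."""
    accuracy_pct: float
    precision: float
    recall: float
    f1: float
    fp_pct: float  # "FPR (%)" column, read here as FP / all samples
    fn_pct: float  # "FNR (%)" column, read here as FN / all samples


def derived_metrics(row: BenchmarkRow) -> dict:
    """Recompute accuracy, F1 and recall from the remaining columns."""
    # Assumption: every sample is either correct, a false positive,
    # or a false negative, so the three shares sum to 100.
    accuracy = 100.0 - row.fp_pct - row.fn_pct
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * row.precision * row.recall / (row.precision + row.recall)
    # precision = TP / (TP + FP)  =>  TP = precision * FP / (1 - precision)
    recall: Optional[float] = None
    if row.precision < 1.0:  # a precision of 1.000 leaves TP undetermined
        tp_pct = row.precision * row.fp_pct / (1.0 - row.precision)
        recall = tp_pct / (tp_pct + row.fn_pct)
    return {"accuracy_pct": accuracy, "f1": f1, "recall": recall}


# "Overall" row of the regenerated smart-turn-v3.0 report.
overall_v30 = BenchmarkRow(88.97, 0.858, 0.933, 0.894, 7.70, 3.33)
print(derived_metrics(overall_v30))
```

Because the published precision and recall are rounded to three decimals, the recomputed values should only be expected to agree with the table to within rounding.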
 
benchmarks/smart-turn-v3.1-cpu.md CHANGED
@@ -2,60 +2,64 @@
  **Model:** `/data/smart-turn-v3.1-cpu.onnx`

- **Generated:** 2025-12-03 16:13:05 UTC

  ## Accuracy Results

- **Total Samples:** 31,473

  **Unique Languages:** 🇸🇦 Arabic, 🇧🇩 Bengali, 🇩🇰 Danish, 🇩🇪 German, 🇬🇧 🇺🇸 English, 🇫🇮 Finnish, 🇫🇷 French, 🇮🇳 Hindi, 🇮🇩 Indonesian, 🇮🇹 Italian, 🇯🇵 Japanese, 🇰🇷 Korean, 🇮🇳 Marathi, 🇳🇱 Dutch, 🇳🇴 Norwegian, 🇵🇱 Polish, 🇵🇹 Portuguese, 🇷🇺 Russian, 🇪🇸 Spanish, 🇹🇷 Turkish, 🇺🇦 Ukrainian, 🇻🇳 Vietnamese, 🇨🇳 Chinese

- **Unique Datasets:** chirp3_1, chirp3_2, human_5, human_convcollector_1, liva_1, midcentury_1, mundo_1, orpheus_endfiller_1, orpheus_grammar_1, orpheus_midfiller_1, rime_2

  ### Overall Performance
- | Metric | Sample Count | Accuracy (%) | False Positives (%) | False Negatives (%) |
- |--------|--------------|--------------|---------------------|---------------------|
- | Overall | 31,473 | 93.02 | 3.07 | 3.91 |

  ### Performance by Language
- | Language | Sample Count | Accuracy (%) | False Positives (%) | False Negatives (%) |
- |----------|--------------|--------------|---------------------|---------------------|
- | 🇯🇵 Japanese | 834 | 97.36 | 1.44 | 1.20 |
- | 🇰🇷 Korean | 890 | 96.97 | 1.01 | 2.02 |
- | 🇹🇷 Turkish | 966 | 96.27 | 2.07 | 1.66 |
- | 🇳🇱 Dutch | 1,401 | 95.93 | 1.57 | 2.50 |
- | 🇫🇷 French | 1,253 | 95.45 | 1.84 | 2.71 |
- | 🇩🇪 German | 1,322 | 95.39 | 2.27 | 2.34 |
- | 🇮🇹 Italian | 782 | 95.14 | 2.30 | 2.56 |
- | 🇵🇱 Polish | 976 | 95.08 | 3.07 | 1.84 |
- | 🇫🇮 Finnish | 1,010 | 94.85 | 2.48 | 2.67 |
- | 🇵🇹 Portuguese | 1,398 | 94.71 | 1.57 | 3.72 |
- | 🇬🇧 🇺🇸 English | 7,722 | 94.69 | 2.24 | 3.07 |
- | 🇮🇩 Indonesian | 971 | 94.34 | 2.47 | 3.19 |
- | 🇩🇰 Danish | 779 | 94.09 | 3.21 | 2.70 |
- | 🇷🇺 Russian | 1,470 | 93.61 | 2.93 | 3.47 |
- | 🇺🇦 Ukrainian | 929 | 93.33 | 3.01 | 3.66 |
- | 🇳🇴 Norwegian | 1,014 | 93.00 | 3.06 | 3.94 |
- | 🇮🇳 Hindi | 1,295 | 92.90 | 4.09 | 3.01 |
- | 🇪🇸 Spanish | 1,791 | 90.12 | 4.97 | 4.91 |
- | 🇸🇦 Arabic | 947 | 88.60 | 5.60 | 5.81 |
- | 🇮🇳 Marathi | 774 | 87.34 | 7.24 | 5.43 |
- | 🇨🇳 Chinese | 945 | 85.93 | 3.49 | 10.58 |
- | 🇧🇩 Bengali | 1,000 | 84.90 | 8.10 | 7.00 |
- | 🇻🇳 Vietnamese | 1,004 | 77.39 | 6.47 | 16.14 |

  ### Performance by Dataset
- | Dataset | Sample Count | Accuracy (%) | False Positives (%) | False Negatives (%) |
- |---------|--------------|--------------|---------------------|---------------------|
- | midcentury_1 | 1,044 | 99.04 | 0.10 | 0.86 |
- | orpheus_endfiller_1 | 182 | 98.90 | 0.00 | 1.10 |
- | rime_2 | 396 | 98.23 | 0.25 | 1.52 |
- | human_5 | 402 | 97.76 | 0.50 | 1.74 |
- | liva_1 | 3,832 | 94.18 | 2.71 | 3.11 |
- | chirp3_1 | 16,300 | 94.07 | 2.55 | 3.38 |
- | chirp3_2 | 8,428 | 89.68 | 4.60 | 5.72 |
- | orpheus_grammar_1 | 163 | 89.57 | 6.13 | 4.29 |
- | human_convcollector_1 | 90 | 88.89 | 2.22 | 8.89 |
- | orpheus_midfiller_1 | 140 | 88.57 | 2.86 | 8.57 |
- | mundo_1 | 496 | 86.69 | 7.66 | 5.65 |
 
 
  **Model:** `/data/smart-turn-v3.1-cpu.onnx`

+ **Generated:** 2026-01-07 17:38:46 UTC

  ## Accuracy Results

+ **Total Samples:** 31,527

  **Unique Languages:** 🇸🇦 Arabic, 🇧🇩 Bengali, 🇩🇰 Danish, 🇩🇪 German, 🇬🇧 🇺🇸 English, 🇫🇮 Finnish, 🇫🇷 French, 🇮🇳 Hindi, 🇮🇩 Indonesian, 🇮🇹 Italian, 🇯🇵 Japanese, 🇰🇷 Korean, 🇮🇳 Marathi, 🇳🇱 Dutch, 🇳🇴 Norwegian, 🇵🇱 Polish, 🇵🇹 Portuguese, 🇷🇺 Russian, 🇪🇸 Spanish, 🇹🇷 Turkish, 🇺🇦 Ukrainian, 🇻🇳 Vietnamese, 🇨🇳 Chinese

+ **Unique Datasets:** chirp3_1, chirp3_2, chirp3_3_short, human_5, human_convcollector_1, liva_1, midcentury_1, mundo_1, orpheus_endfiller_1, orpheus_grammar_1, orpheus_midfiller_1, rime_2

  ### Overall Performance
+
+ | Metric | Sample Count | Accuracy (%) | Precision | Recall | F1 | FPR (%) | FNR (%) |
+ | :------ | -----------: | -----------: | --------: | -----: | ----: | ------: | ------: |
+ | Overall | 31,527 | 90.13 | 0.883 | 0.924 | 0.903 | 6.08 | 3.79 |

  ### Performance by Language
+
+ | Language | Sample Count | Accuracy (%) | Precision | Recall | F1 | FPR (%) | FNR (%) |
+ | :------------ | -----------: | -----------: | --------: | -----: | ----: | ------: | ------: |
+ | 🇯🇵 Japanese | 834 | 95.32 | 0.936 | 0.974 | 0.954 | 3.36 | 1.32 |
+ | 🇫🇷 French | 1,252 | 94.89 | 0.949 | 0.951 | 0.950 | 2.64 | 2.48 |
+ | 🇳🇱 Dutch | 1,398 | 94.71 | 0.943 | 0.956 | 0.949 | 3.00 | 2.29 |
+ | 🇩🇪 German | 1,322 | 94.02 | 0.928 | 0.954 | 0.941 | 3.71 | 2.27 |
+ | 🇵🇱 Polish | 974 | 93.43 | 0.910 | 0.957 | 0.933 | 4.52 | 2.05 |
+ | 🇰🇷 Korean | 889 | 93.03 | 0.944 | 0.914 | 0.929 | 2.70 | 4.27 |
+ | 🇵🇹 Portuguese | 1,398 | 92.85 | 0.946 | 0.904 | 0.925 | 2.50 | 4.65 |
+ | 🇹🇷 Turkish | 966 | 92.75 | 0.900 | 0.960 | 0.929 | 5.28 | 1.97 |
+ | 🇮🇩 Indonesian | 971 | 92.17 | 0.912 | 0.931 | 0.921 | 4.43 | 3.40 |
+ | 🇮🇹 Italian | 782 | 91.94 | 0.881 | 0.969 | 0.923 | 6.52 | 1.53 |
+ | 🇷🇺 Russian | 1,468 | 91.42 | 0.901 | 0.940 | 0.920 | 5.45 | 3.13 |
+ | 🇬🇧 🇺🇸 English | 7,820 | 90.66 | 0.885 | 0.930 | 0.907 | 5.92 | 3.41 |
+ | 🇩🇰 Danish | 779 | 90.63 | 0.876 | 0.949 | 0.911 | 6.80 | 2.57 |
+ | 🇮🇳 Hindi | 1,284 | 90.58 | 0.897 | 0.925 | 0.911 | 5.53 | 3.89 |
+ | 🇺🇦 Ukrainian | 929 | 89.99 | 0.869 | 0.927 | 0.897 | 6.57 | 3.44 |
+ | 🇳🇴 Norwegian | 1,014 | 88.95 | 0.855 | 0.944 | 0.897 | 8.19 | 2.86 |
+ | 🇫🇮 Finnish | 1,010 | 88.32 | 0.831 | 0.960 | 0.891 | 9.70 | 1.98 |
+ | 🇪🇸 Spanish | 1,783 | 87.49 | 0.854 | 0.898 | 0.875 | 7.52 | 4.99 |
+ | 🇸🇦 Arabic | 947 | 85.96 | 0.831 | 0.909 | 0.868 | 9.40 | 4.65 |
+ | 🇨🇳 Chinese | 929 | 85.25 | 0.833 | 0.888 | 0.860 | 9.04 | 5.71 |
+ | 🇮🇳 Marathi | 774 | 82.30 | 0.792 | 0.883 | 0.835 | 11.76 | 5.94 |
+ | 🇧🇩 Bengali | 1,000 | 80.30 | 0.774 | 0.845 | 0.808 | 12.10 | 7.60 |
+ | 🇻🇳 Vietnamese | 1,004 | 77.99 | 0.805 | 0.735 | 0.769 | 8.86 | 13.15 |

  ### Performance by Dataset
+
+ | Dataset | Sample Count | Accuracy (%) | Precision | Recall | F1 | FPR (%) | FNR (%) |
+ | :-------------------- | -----------: | -----------: | --------: | -----: | ----: | ------: | ------: |
+ | orpheus_endfiller_1 | 181 | 95.58 | 1.000 | 0.913 | 0.955 | 0.00 | 4.42 |
+ | liva_1 | 3,831 | 93.45 | 0.929 | 0.942 | 0.935 | 3.63 | 2.92 |
+ | human_5 | 402 | 93.03 | 0.931 | 0.910 | 0.920 | 2.99 | 3.98 |
+ | rime_2 | 394 | 92.89 | 0.932 | 0.898 | 0.915 | 2.79 | 4.31 |
+ | chirp3_1 | 16,254 | 92.09 | 0.907 | 0.938 | 0.923 | 4.82 | 3.09 |
+ | orpheus_grammar_1 | 163 | 89.57 | 0.895 | 0.906 | 0.901 | 5.52 | 4.91 |
+ | orpheus_midfiller_1 | 140 | 89.29 | 0.875 | 0.889 | 0.882 | 5.71 | 5.00 |
+ | chirp3_2 | 8,428 | 86.24 | 0.838 | 0.897 | 0.866 | 8.64 | 5.13 |
+ | human_convcollector_1 | 90 | 85.56 | 0.791 | 0.895 | 0.840 | 10.00 | 4.44 |
+ | mundo_1 | 496 | 82.06 | 0.815 | 0.825 | 0.820 | 9.27 | 8.67 |
+ | midcentury_1 | 1,044 | 81.32 | 0.742 | 0.940 | 0.829 | 15.80 | 2.87 |
+ | chirp3_3_short | 104 | 78.85 | 0.842 | 0.667 | 0.744 | 5.77 | 15.38 |
 
benchmarks/smart-turn-v3.1-gpu.md CHANGED
@@ -2,60 +2,64 @@
  **Model:** `/data/smart-turn-v3.1-gpu.onnx`

- **Generated:** 2025-12-03 16:21:25 UTC

  ## Accuracy Results

- **Total Samples:** 31,473

  **Unique Languages:** 🇸🇦 Arabic, 🇧🇩 Bengali, 🇩🇰 Danish, 🇩🇪 German, 🇬🇧 🇺🇸 English, 🇫🇮 Finnish, 🇫🇷 French, 🇮🇳 Hindi, 🇮🇩 Indonesian, 🇮🇹 Italian, 🇯🇵 Japanese, 🇰🇷 Korean, 🇮🇳 Marathi, 🇳🇱 Dutch, 🇳🇴 Norwegian, 🇵🇱 Polish, 🇵🇹 Portuguese, 🇷🇺 Russian, 🇪🇸 Spanish, 🇹🇷 Turkish, 🇺🇦 Ukrainian, 🇻🇳 Vietnamese, 🇨🇳 Chinese

- **Unique Datasets:** chirp3_1, chirp3_2, human_5, human_convcollector_1, liva_1, midcentury_1, mundo_1, orpheus_endfiller_1, orpheus_grammar_1, orpheus_midfiller_1, rime_2

  ### Overall Performance
- | Metric | Sample Count | Accuracy (%) | False Positives (%) | False Negatives (%) |
- |--------|--------------|--------------|---------------------|---------------------|
- | Overall | 31,473 | 93.98 | 3.21 | 2.81 |

  ### Performance by Language
- | Language | Sample Count | Accuracy (%) | False Positives (%) | False Negatives (%) |
- |----------|--------------|--------------|---------------------|---------------------|
- | 🇯🇵 Japanese | 834 | 98.08 | 0.84 | 1.08 |
- | 🇰🇷 Korean | 890 | 97.98 | 0.79 | 1.24 |
- | 🇹🇷 Turkish | 966 | 97.52 | 1.24 | 1.24 |
- | 🇳🇱 Dutch | 1,401 | 96.79 | 1.57 | 1.64 |
- | 🇩🇪 German | 1,322 | 96.37 | 2.42 | 1.21 |
- | 🇫🇷 French | 1,253 | 96.09 | 1.68 | 2.23 |
- | 🇵🇹 Portuguese | 1,398 | 95.99 | 2.15 | 1.86 |
- | 🇫🇮 Finnish | 1,010 | 95.74 | 2.18 | 2.08 |
- | 🇵🇱 Polish | 976 | 95.59 | 2.56 | 1.84 |
- | 🇮🇩 Indonesian | 971 | 95.57 | 2.47 | 1.96 |
- | 🇬🇧 🇺🇸 English | 7,722 | 95.55 | 2.55 | 1.90 |
- | 🇮🇹 Italian | 782 | 95.52 | 2.81 | 1.66 |
- | 🇷🇺 Russian | 1,470 | 94.15 | 2.93 | 2.93 |
- | 🇩🇰 Danish | 779 | 94.09 | 3.34 | 2.57 |
- | 🇺🇦 Ukrainian | 929 | 93.97 | 2.58 | 3.44 |
- | 🇳🇴 Norwegian | 1,014 | 93.69 | 3.25 | 3.06 |
- | 🇮🇳 Hindi | 1,295 | 93.36 | 3.17 | 3.47 |
- | 🇪🇸 Spanish | 1,791 | 90.95 | 5.75 | 3.29 |
- | 🇨🇳 Chinese | 945 | 88.99 | 4.34 | 6.67 |
- | 🇸🇦 Arabic | 947 | 88.91 | 6.44 | 4.65 |
- | 🇮🇳 Marathi | 774 | 88.24 | 6.20 | 5.56 |
- | 🇧🇩 Bengali | 1,000 | 85.10 | 7.10 | 7.80 |
- | 🇻🇳 Vietnamese | 1,004 | 81.87 | 9.86 | 8.27 |

  ### Performance by Dataset
- | Dataset | Sample Count | Accuracy (%) | False Positives (%) | False Negatives (%) |
- |---------|--------------|--------------|---------------------|---------------------|
- | midcentury_1 | 1,044 | 99.52 | 0.10 | 0.38 |
- | human_5 | 402 | 98.51 | 0.50 | 1.00 |
- | orpheus_endfiller_1 | 182 | 98.35 | 0.00 | 1.65 |
- | rime_2 | 396 | 97.98 | 0.25 | 1.77 |
- | liva_1 | 3,832 | 95.12 | 2.97 | 1.91 |
- | chirp3_1 | 16,300 | 95.10 | 2.58 | 2.32 |
- | orpheus_grammar_1 | 163 | 93.87 | 4.29 | 1.84 |
- | chirp3_2 | 8,428 | 90.76 | 4.84 | 4.40 |
- | human_convcollector_1 | 90 | 88.89 | 6.67 | 4.44 |
- | orpheus_midfiller_1 | 140 | 87.86 | 5.00 | 7.14 |
- | mundo_1 | 496 | 85.69 | 8.87 | 5.44 |
 
 
  **Model:** `/data/smart-turn-v3.1-gpu.onnx`

+ **Generated:** 2026-01-07 17:45:59 UTC

  ## Accuracy Results

+ **Total Samples:** 31,527

  **Unique Languages:** 🇸🇦 Arabic, 🇧🇩 Bengali, 🇩🇰 Danish, 🇩🇪 German, 🇬🇧 🇺🇸 English, 🇫🇮 Finnish, 🇫🇷 French, 🇮🇳 Hindi, 🇮🇩 Indonesian, 🇮🇹 Italian, 🇯🇵 Japanese, 🇰🇷 Korean, 🇮🇳 Marathi, 🇳🇱 Dutch, 🇳🇴 Norwegian, 🇵🇱 Polish, 🇵🇹 Portuguese, 🇷🇺 Russian, 🇪🇸 Spanish, 🇹🇷 Turkish, 🇺🇦 Ukrainian, 🇻🇳 Vietnamese, 🇨🇳 Chinese

+ **Unique Datasets:** chirp3_1, chirp3_2, chirp3_3_short, human_5, human_convcollector_1, liva_1, midcentury_1, mundo_1, orpheus_endfiller_1, orpheus_grammar_1, orpheus_midfiller_1, rime_2

  ### Overall Performance
+
+ | Metric | Sample Count | Accuracy (%) | Precision | Recall | F1 | FPR (%) | FNR (%) |
+ | :------ | -----------: | -----------: | --------: | -----: | ----: | ------: | ------: |
+ | Overall | 31,527 | 91.64 | 0.894 | 0.944 | 0.918 | 5.55 | 2.81 |

  ### Performance by Language
+
+ | Language | Sample Count | Accuracy (%) | Precision | Recall | F1 | FPR (%) | FNR (%) |
+ | :------------ | -----------: | -----------: | --------: | -----: | ----: | ------: | ------: |
+ | 🇯🇵 Japanese | 834 | 95.68 | 0.944 | 0.971 | 0.958 | 2.88 | 1.44 |
+ | 🇳🇱 Dutch | 1,398 | 95.42 | 0.950 | 0.963 | 0.956 | 2.65 | 1.93 |
+ | 🇹🇷 Turkish | 966 | 95.34 | 0.935 | 0.973 | 0.954 | 3.31 | 1.35 |
+ | 🇫🇷 French | 1,252 | 95.29 | 0.950 | 0.958 | 0.954 | 2.56 | 2.16 |
+ | 🇰🇷 Korean | 889 | 95.16 | 0.965 | 0.937 | 0.951 | 1.69 | 3.15 |
+ | 🇩🇪 German | 1,322 | 95.16 | 0.936 | 0.970 | 0.952 | 3.33 | 1.51 |
+ | 🇵🇹 Portuguese | 1,398 | 94.85 | 0.942 | 0.953 | 0.947 | 2.86 | 2.29 |
+ | 🇮🇹 Italian | 782 | 94.50 | 0.922 | 0.972 | 0.946 | 4.09 | 1.41 |
+ | 🇵🇱 Polish | 974 | 94.35 | 0.921 | 0.963 | 0.942 | 3.90 | 1.75 |
+ | 🇮🇩 Indonesian | 971 | 93.10 | 0.905 | 0.960 | 0.932 | 4.94 | 1.96 |
+ | 🇷🇺 Russian | 1,468 | 92.64 | 0.911 | 0.953 | 0.932 | 4.90 | 2.45 |
+ | 🇮🇳 Hindi | 1,284 | 92.52 | 0.919 | 0.939 | 0.929 | 4.28 | 3.19 |
+ | 🇺🇦 Ukrainian | 929 | 92.03 | 0.900 | 0.933 | 0.917 | 4.84 | 3.12 |
+ | 🇬🇧 🇺🇸 English | 7,820 | 91.94 | 0.889 | 0.954 | 0.921 | 5.82 | 2.24 |
+ | 🇩🇰 Danish | 779 | 91.14 | 0.880 | 0.954 | 0.916 | 6.55 | 2.31 |
+ | 🇫🇮 Finnish | 1,010 | 90.50 | 0.859 | 0.968 | 0.910 | 7.92 | 1.58 |
+ | 🇳🇴 Norwegian | 1,014 | 89.84 | 0.865 | 0.950 | 0.905 | 7.59 | 2.56 |
+ | 🇪🇸 Spanish | 1,783 | 89.62 | 0.871 | 0.924 | 0.897 | 6.67 | 3.70 |
+ | 🇨🇳 Chinese | 929 | 88.37 | 0.850 | 0.937 | 0.891 | 8.40 | 3.23 |
+ | 🇸🇦 Arabic | 947 | 87.01 | 0.838 | 0.923 | 0.878 | 9.08 | 3.91 |
+ | 🇮🇳 Marathi | 774 | 84.88 | 0.833 | 0.878 | 0.855 | 8.91 | 6.20 |
+ | 🇧🇩 Bengali | 1,000 | 81.20 | 0.801 | 0.820 | 0.810 | 10.00 | 8.80 |
+ | 🇻🇳 Vietnamese | 1,004 | 81.08 | 0.780 | 0.862 | 0.819 | 12.05 | 6.87 |

  ### Performance by Dataset
+
+ | Dataset | Sample Count | Accuracy (%) | Precision | Recall | F1 | FPR (%) | FNR (%) |
+ | :-------------------- | -----------: | -----------: | --------: | -----: | ----: | ------: | ------: |
+ | orpheus_endfiller_1 | 181 | 97.24 | 1.000 | 0.946 | 0.972 | 0.00 | 2.76 |
+ | rime_2 | 394 | 97.21 | 0.959 | 0.976 | 0.967 | 1.78 | 1.02 |
+ | human_5 | 402 | 95.02 | 0.939 | 0.949 | 0.944 | 2.74 | 2.24 |
+ | liva_1 | 3,831 | 94.23 | 0.929 | 0.959 | 0.944 | 3.68 | 2.09 |
+ | chirp3_1 | 16,254 | 93.53 | 0.919 | 0.955 | 0.937 | 4.22 | 2.25 |
+ | orpheus_grammar_1 | 163 | 89.57 | 0.878 | 0.929 | 0.903 | 6.75 | 3.68 |
+ | orpheus_midfiller_1 | 140 | 89.29 | 0.853 | 0.921 | 0.885 | 7.14 | 3.57 |
+ | chirp3_2 | 8,428 | 87.81 | 0.850 | 0.916 | 0.882 | 8.03 | 4.15 |
+ | chirp3_3_short | 104 | 85.58 | 0.867 | 0.812 | 0.839 | 5.77 | 8.65 |
+ | mundo_1 | 496 | 84.68 | 0.840 | 0.854 | 0.847 | 8.06 | 7.26 |
+ | human_convcollector_1 | 90 | 84.44 | 0.761 | 0.921 | 0.833 | 12.22 | 3.33 |
+ | midcentury_1 | 1,044 | 84.39 | 0.766 | 0.974 | 0.858 | 14.37 | 1.25 |
 
benchmarks/smart-turn-v3.2-cpu.md ADDED
@@ -0,0 +1,65 @@
+ # Endpointing Model Benchmark Report
+
+ **Model:** `/data/smart-turn-v3.2-cpu.onnx`
+
+ **Generated:** 2026-01-07 17:53:34 UTC
+
+ ## Accuracy Results
+
+ **Total Samples:** 31,527
+
+ **Unique Languages:** 🇸🇦 Arabic, 🇧🇩 Bengali, 🇩🇰 Danish, 🇩🇪 German, 🇬🇧 🇺🇸 English, 🇫🇮 Finnish, 🇫🇷 French, 🇮🇳 Hindi, 🇮🇩 Indonesian, 🇮🇹 Italian, 🇯🇵 Japanese, 🇰🇷 Korean, 🇮🇳 Marathi, 🇳🇱 Dutch, 🇳🇴 Norwegian, 🇵🇱 Polish, 🇵🇹 Portuguese, 🇷🇺 Russian, 🇪🇸 Spanish, 🇹🇷 Turkish, 🇺🇦 Ukrainian, 🇻🇳 Vietnamese, 🇨🇳 Chinese
+
+ **Unique Datasets:** chirp3_1, chirp3_2, chirp3_3_short, human_5, human_convcollector_1, liva_1, midcentury_1, mundo_1, orpheus_endfiller_1, orpheus_grammar_1, orpheus_midfiller_1, rime_2
+
+ ### Overall Performance
+
+ | Metric | Sample Count | Accuracy (%) | Precision | Recall | F1 | FPR (%) | FNR (%) |
+ | :------ | -----------: | -----------: | --------: | -----: | ----: | ------: | ------: |
+ | Overall | 31,527 | 92.63 | 0.909 | 0.947 | 0.927 | 4.73 | 2.64 |
+
+ ### Performance by Language
+
+ | Language | Sample Count | Accuracy (%) | Precision | Recall | F1 | FPR (%) | FNR (%) |
+ | :------------ | -----------: | -----------: | --------: | -----: | ----: | ------: | ------: |
+ | 🇰🇷 Korean | 889 | 96.96 | 0.956 | 0.984 | 0.970 | 2.25 | 0.79 |
+ | 🇹🇷 Turkish | 966 | 96.79 | 0.955 | 0.981 | 0.968 | 2.28 | 0.93 |
+ | 🇩🇪 German | 1,322 | 96.37 | 0.947 | 0.982 | 0.964 | 2.72 | 0.91 |
+ | 🇯🇵 Japanese | 834 | 96.16 | 0.962 | 0.962 | 0.962 | 1.92 | 1.92 |
+ | 🇳🇱 Dutch | 1,398 | 95.92 | 0.954 | 0.968 | 0.961 | 2.43 | 1.65 |
+ | 🇵🇱 Polish | 974 | 95.38 | 0.948 | 0.955 | 0.952 | 2.46 | 2.16 |
+ | 🇮🇩 Indonesian | 971 | 95.16 | 0.939 | 0.964 | 0.951 | 3.09 | 1.75 |
+ | 🇮🇹 Italian | 782 | 94.50 | 0.930 | 0.961 | 0.946 | 3.58 | 1.92 |
+ | 🇵🇹 Portuguese | 1,398 | 94.49 | 0.934 | 0.954 | 0.944 | 3.29 | 2.22 |
+ | 🇬🇧 🇺🇸 English | 7,820 | 94.26 | 0.926 | 0.959 | 0.942 | 3.75 | 1.99 |
+ | 🇺🇦 Ukrainian | 929 | 94.19 | 0.924 | 0.954 | 0.939 | 3.66 | 2.15 |
+ | 🇫🇮 Finnish | 1,010 | 94.16 | 0.930 | 0.954 | 0.942 | 3.56 | 2.28 |
+ | 🇫🇷 French | 1,252 | 94.09 | 0.924 | 0.964 | 0.943 | 4.07 | 1.84 |
+ | 🇷🇺 Russian | 1,468 | 93.53 | 0.920 | 0.960 | 0.940 | 4.36 | 2.11 |
+ | 🇩🇰 Danish | 779 | 92.81 | 0.906 | 0.957 | 0.931 | 5.01 | 2.18 |
+ | 🇳🇴 Norwegian | 1,014 | 91.81 | 0.896 | 0.950 | 0.922 | 5.62 | 2.56 |
+ | 🇮🇳 Hindi | 1,284 | 90.11 | 0.856 | 0.975 | 0.911 | 8.57 | 1.32 |
+ | 🇸🇦 Arabic | 947 | 89.76 | 0.872 | 0.936 | 0.903 | 6.97 | 3.27 |
+ | 🇪🇸 Spanish | 1,783 | 89.57 | 0.867 | 0.929 | 0.897 | 6.95 | 3.48 |
+ | 🇨🇳 Chinese | 929 | 85.79 | 0.894 | 0.818 | 0.854 | 4.95 | 9.26 |
+ | 🇧🇩 Bengali | 1,000 | 83.80 | 0.800 | 0.892 | 0.844 | 10.90 | 5.30 |
+ | 🇮🇳 Marathi | 774 | 82.43 | 0.762 | 0.952 | 0.846 | 15.12 | 2.45 |
+ | 🇻🇳 Vietnamese | 1,004 | 79.38 | 0.811 | 0.764 | 0.786 | 8.86 | 11.75 |
+
+ ### Performance by Dataset
+
+ | Dataset | Sample Count | Accuracy (%) | Precision | Recall | F1 | FPR (%) | FNR (%) |
+ | :-------------------- | -----------: | -----------: | --------: | -----: | ----: | ------: | ------: |
+ | midcentury_1 | 1,044 | 98.47 | 0.978 | 0.990 | 0.984 | 1.05 | 0.48 |
+ | rime_2 | 394 | 96.95 | 0.970 | 0.958 | 0.964 | 1.27 | 1.78 |
+ | orpheus_endfiller_1 | 181 | 96.69 | 1.000 | 0.935 | 0.966 | 0.00 | 3.31 |
+ | human_5 | 402 | 96.02 | 0.945 | 0.966 | 0.956 | 2.49 | 1.49 |
+ | liva_1 | 3,831 | 93.97 | 0.924 | 0.960 | 0.941 | 3.99 | 2.04 |
+ | chirp3_1 | 16,254 | 93.78 | 0.923 | 0.955 | 0.939 | 3.97 | 2.25 |
+ | orpheus_grammar_1 | 163 | 91.41 | 0.890 | 0.953 | 0.920 | 6.13 | 2.45 |
+ | chirp3_3_short | 104 | 91.35 | 0.933 | 0.875 | 0.903 | 2.88 | 5.77 |
+ | chirp3_2 | 8,428 | 89.31 | 0.870 | 0.923 | 0.896 | 6.85 | 3.84 |
+ | orpheus_midfiller_1 | 140 | 87.14 | 0.846 | 0.873 | 0.859 | 7.14 | 5.71 |
+ | human_convcollector_1 | 90 | 86.67 | 0.810 | 0.895 | 0.850 | 8.89 | 4.44 |
+ | mundo_1 | 496 | 84.27 | 0.796 | 0.919 | 0.853 | 11.69 | 4.03 |
+
benchmarks/smart-turn-v3.2-gpu.md ADDED
@@ -0,0 +1,65 @@
+ # Endpointing Model Benchmark Report
+
+ **Model:** `/data/smart-turn-v3.2-gpu.onnx`
+
+ **Generated:** 2026-01-07 17:59:39 UTC
+
+ ## Accuracy Results
+
+ **Total Samples:** 31,527
+
+ **Unique Languages:** 🇸🇦 Arabic, 🇧🇩 Bengali, 🇩🇰 Danish, 🇩🇪 German, 🇬🇧 🇺🇸 English, 🇫🇮 Finnish, 🇫🇷 French, 🇮🇳 Hindi, 🇮🇩 Indonesian, 🇮🇹 Italian, 🇯🇵 Japanese, 🇰🇷 Korean, 🇮🇳 Marathi, 🇳🇱 Dutch, 🇳🇴 Norwegian, 🇵🇱 Polish, 🇵🇹 Portuguese, 🇷🇺 Russian, 🇪🇸 Spanish, 🇹🇷 Turkish, 🇺🇦 Ukrainian, 🇻🇳 Vietnamese, 🇨🇳 Chinese
+
+ **Unique Datasets:** chirp3_1, chirp3_2, chirp3_3_short, human_5, human_convcollector_1, liva_1, midcentury_1, mundo_1, orpheus_endfiller_1, orpheus_grammar_1, orpheus_midfiller_1, rime_2
+
+ ### Overall Performance
+
+ | Metric | Sample Count | Accuracy (%) | Precision | Recall | F1 | FPR (%) | FNR (%) |
+ | :------ | -----------: | -----------: | --------: | -----: | ----: | ------: | ------: |
+ | Overall | 31,527 | 93.71 | 0.931 | 0.944 | 0.937 | 3.51 | 2.78 |
+
+ ### Performance by Language
+
+ | Language | Sample Count | Accuracy (%) | Precision | Recall | F1 | FPR (%) | FNR (%) |
+ | :------------ | -----------: | -----------: | --------: | -----: | ----: | ------: | ------: |
+ | 🇰🇷 Korean | 889 | 97.64 | 0.977 | 0.975 | 0.976 | 1.12 | 1.24 |
+ | 🇯🇵 Japanese | 834 | 97.12 | 0.974 | 0.969 | 0.971 | 1.32 | 1.56 |
+ | 🇹🇷 Turkish | 966 | 97.00 | 0.967 | 0.973 | 0.970 | 1.66 | 1.35 |
+ | 🇳🇱 Dutch | 1,398 | 96.92 | 0.966 | 0.975 | 0.970 | 1.79 | 1.29 |
+ | 🇩🇪 German | 1,322 | 96.60 | 0.957 | 0.976 | 0.966 | 2.19 | 1.21 |
+ | 🇵🇹 Portuguese | 1,398 | 95.49 | 0.948 | 0.960 | 0.954 | 2.58 | 1.93 |
+ | 🇮🇩 Indonesian | 971 | 95.47 | 0.939 | 0.971 | 0.955 | 3.09 | 1.44 |
+ | 🇫🇮 Finnish | 1,010 | 95.25 | 0.950 | 0.954 | 0.952 | 2.48 | 2.28 |
+ | 🇵🇱 Polish | 974 | 95.17 | 0.946 | 0.952 | 0.949 | 2.57 | 2.26 |
+ | 🇺🇦 Ukrainian | 929 | 95.05 | 0.943 | 0.952 | 0.947 | 2.69 | 2.26 |
+ | 🇮🇹 Italian | 782 | 95.01 | 0.949 | 0.951 | 0.950 | 2.56 | 2.43 |
+ | 🇫🇷 French | 1,252 | 94.73 | 0.941 | 0.956 | 0.949 | 3.04 | 2.24 |
+ | 🇬🇧 🇺🇸 English | 7,820 | 94.71 | 0.940 | 0.953 | 0.946 | 2.98 | 2.31 |
+ | 🇷🇺 Russian | 1,468 | 94.41 | 0.937 | 0.958 | 0.947 | 3.41 | 2.18 |
+ | 🇩🇰 Danish | 779 | 93.58 | 0.930 | 0.944 | 0.937 | 3.59 | 2.82 |
+ | 🇳🇴 Norwegian | 1,014 | 93.00 | 0.929 | 0.934 | 0.932 | 3.65 | 3.35 |
+ | 🇮🇳 Hindi | 1,284 | 92.76 | 0.930 | 0.931 | 0.931 | 3.66 | 3.58 |
+ | 🇪🇸 Spanish | 1,783 | 91.53 | 0.908 | 0.920 | 0.914 | 4.54 | 3.93 |
+ | 🇨🇳 Chinese | 929 | 90.53 | 0.899 | 0.918 | 0.908 | 5.27 | 4.20 |
+ | 🇸🇦 Arabic | 947 | 89.12 | 0.869 | 0.925 | 0.896 | 7.07 | 3.80 |
+ | 🇮🇳 Marathi | 774 | 88.11 | 0.870 | 0.901 | 0.885 | 6.85 | 5.04 |
+ | 🇧🇩 Bengali | 1,000 | 85.10 | 0.847 | 0.849 | 0.848 | 7.50 | 7.40 |
+ | 🇻🇳 Vietnamese | 1,004 | 82.47 | 0.814 | 0.840 | 0.826 | 9.56 | 7.97 |
+
+ ### Performance by Dataset
+
+ | Dataset | Sample Count | Accuracy (%) | Precision | Recall | F1 | FPR (%) | FNR (%) |
+ | :-------------------- | -----------: | -----------: | --------: | -----: | ----: | ------: | ------: |
+ | midcentury_1 | 1,044 | 98.85 | 0.992 | 0.984 | 0.988 | 0.38 | 0.77 |
+ | rime_2 | 394 | 98.22 | 0.982 | 0.976 | 0.979 | 0.76 | 1.02 |
+ | human_5 | 402 | 97.01 | 0.977 | 0.955 | 0.966 | 1.00 | 1.99 |
+ | orpheus_endfiller_1 | 181 | 95.58 | 0.988 | 0.924 | 0.955 | 0.55 | 3.87 |
+ | chirp3_1 | 16,254 | 94.80 | 0.943 | 0.954 | 0.948 | 2.89 | 2.31 |
+ | liva_1 | 3,831 | 94.49 | 0.934 | 0.958 | 0.946 | 3.39 | 2.11 |
+ | orpheus_grammar_1 | 163 | 92.02 | 0.919 | 0.929 | 0.924 | 4.29 | 3.68 |
+ | chirp3_3_short | 104 | 91.35 | 0.933 | 0.875 | 0.903 | 2.88 | 5.77 |
+ | chirp3_2 | 8,428 | 90.76 | 0.898 | 0.918 | 0.908 | 5.17 | 4.07 |
+ | human_convcollector_1 | 90 | 90.00 | 0.837 | 0.947 | 0.889 | 7.78 | 2.22 |
+ | orpheus_midfiller_1 | 140 | 87.86 | 0.859 | 0.873 | 0.866 | 6.43 | 5.71 |
+ | mundo_1 | 496 | 87.70 | 0.871 | 0.882 | 0.877 | 6.45 | 5.85 |
+
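
Each report gives an Overall row alongside per-dataset rows. If the overall accuracy is a sample-weighted mean of the per-dataset accuracies (an assumption, though the smart-turn-v3.2-gpu numbers are consistent with it), the two can be cross-checked. The sketch below copies the dataset rows from the table above; `DATASET_ROWS` and `weighted_accuracy` are illustrative names, not part of the benchmark tooling.

```python
# (dataset, sample count, accuracy %) copied from the
# smart-turn-v3.2-gpu "Performance by Dataset" table.
DATASET_ROWS = [
    ("midcentury_1", 1044, 98.85),
    ("rime_2", 394, 98.22),
    ("human_5", 402, 97.01),
    ("orpheus_endfiller_1", 181, 95.58),
    ("chirp3_1", 16254, 94.80),
    ("liva_1", 3831, 94.49),
    ("orpheus_grammar_1", 163, 92.02),
    ("chirp3_3_short", 104, 91.35),
    ("chirp3_2", 8428, 90.76),
    ("human_convcollector_1", 90, 90.00),
    ("orpheus_midfiller_1", 140, 87.86),
    ("mundo_1", 496, 87.70),
]


def weighted_accuracy(rows):
    """Sample-weighted mean of per-dataset accuracy percentages."""
    total = sum(count for _, count, _ in rows)
    return sum(count * acc for _, count, acc in rows) / total


total_samples = sum(count for _, count, _ in DATASET_ROWS)
print(total_samples, round(weighted_accuracy(DATASET_ROWS), 2))
```

The sample counts add up to the 31,527 total, and the weighted mean lands on the 93.71 reported in the Overall row, which also shows how strongly chirp3_1 and chirp3_2 (about 78% of the samples) dominate the headline figure.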
smart-turn-v3.2-cpu.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2bb026316b14a660486a75b1733cd3fbab8c2fd0314dc9af7be49f8cca967e4f
+ size 8679182
smart-turn-v3.2-gpu.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ab8dc64b88713f90b571c15b714bd1330e6c883cad8763dacf65c9376dc539be
+ size 32411198
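
The two `.onnx` entries are Git LFS pointer files, so the commit itself records only a SHA-256 digest and a byte size for each model. A downloaded model can be checked against its pointer with the standard library alone. The following is a minimal sketch: the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path in the commented usage line is an assumption.

```python
import hashlib
from pathlib import Path

# Pointer text for smart-turn-v3.2-cpu.onnx, copied from the diff above
# (without the leading "+" diff markers).
POINTER_V32_CPU = """\
version https://git-lfs.github.com/spec/v1
oid sha256:2bb026316b14a660486a75b1733cd3fbab8c2fd0314dc9af7be49f8cca967e4f
size 8679182
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a pointer file into its hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    if Path(path).stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Hash in 1 MiB chunks so large models are not read into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_V32_CPU)
print(pointer["algo"], pointer["size"])
# Hypothetical usage, assuming the model was downloaded next to this script:
# print(matches_pointer("smart-turn-v3.2-cpu.onnx", pointer))
```

Comparing the size first is a cheap early exit; the digest comparison is what actually confirms the file contents.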