wq2012 committed on
Commit
58c6c62
1 Parent(s): 161d456

Update README.md

  1. README.md +50 -3
README.md CHANGED
@@ -1,3 +1,50 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ tags:
+ - speech
+ - audio
+ - lang-id
+ - langid
+ ---
+
+ # Conformer-based spoken language identification model
+
+ ## Summary
+
+ This is a conformer-based streaming language identification model with attentive temporal pooling.
+
+ The model was trained with public data only.
+
+ The paper: https://arxiv.org/abs/2202.12163
+
+ ```
+ @inproceedings{wang2022attentive,
+ title={Attentive Temporal Pooling for Conformer-based Streaming Language Identification in Long-form Speech},
+ author={Quan Wang and Yang Yu and Jason Pelecanos and Yiling Huang and Ignacio Lopez Moreno},
+ booktitle={Odyssey: The Speaker and Language Recognition Workshop},
+ year={2022}
+ }
+ ```
+
+ ## Usage
+
+ To use this model, you will need the `sidlingvo` library: https://github.com/google/speaker-id/tree/master/lingvo
+
+ Since lingvo does not support Python 3.11 yet, make sure your Python version is 3.10 or earlier.
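
The version constraint above can be checked before installing. This is a minimal sketch that assumes only the bound stated here (Python 3.10 or earlier). The helper name `python_is_supported` is hypothetical and is not part of `sidlingvo`.

```python
import sys

# Assumption taken from the note above: lingvo has no Python 3.11 support yet,
# so 3.11 is treated as the first unsupported version.
FIRST_UNSUPPORTED = (3, 11)


def python_is_supported(version_info=None):
    """Return True if the (major, minor) version is below 3.11."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) < FIRST_UNSUPPORTED


if __name__ == "__main__":
    if python_is_supported():
        print("Python version looks compatible.")
    else:
        print("Python 3.11 or later detected; use 3.10 or earlier.")
```

If the bound changes in a later lingvo release, only `FIRST_UNSUPPORTED` would need updating.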
+
+ Install the library:
+
+ ```
+ pip install sidlingvo
+ ```
+
+ Example usage:
+
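+ ```Python
+ from sidlingvo import wav_to_lang
+
+ wav_file = "your_wav_file.wav"
+ # Create the runner that wraps the language identification model.
+ runner = wav_to_lang.WavToLangRunner()
+ # Only the top predicted language is used; the second return value is ignored.
+ top_lang, _ = runner.wav_to_lang(wav_file)
+ print("Predicted language:", top_lang)
+ ```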
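
To label several recordings, it may be convenient to create one runner and reuse it. This is a minimal sketch, assuming the runner can be reused across calls and that `wav_to_lang` returns a pair whose first element is the top language, as in the example above. The helper `predict_languages` is hypothetical and is not part of `sidlingvo`.

```python
def predict_languages(runner, wav_files):
    """Map each WAV path to the top language predicted by `runner`.

    `runner` is any object with a `wav_to_lang(path)` method that returns a
    pair whose first element is the predicted language (an assumption based
    on the single-file example).
    """
    results = {}
    for wav_file in wav_files:
        # Keep only the top language; the second return value is ignored.
        top_lang, _ = runner.wav_to_lang(wav_file)
        results[wav_file] = top_lang
    return results
```

With a `WavToLangRunner` created as in the example above, `predict_languages(runner, ["a.wav", "b.wav"])` would return a dict keyed by file path; the file names here are placeholders.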