DrishtiSharma committed on
Commit
034f3c2
1 Parent(s): b8d87ae

Upload NOTES.md

  1. NOTES.md +65 -0
NOTES.md ADDED
@@ -0,0 +1,65 @@
+
+
+ # Things that might be relevant
+
+ ## Trained models
+
+ ESPnet model for Yoloxochitl Mixtec
+ - Huggingface Hub page https://huggingface.co/espnet/ftshijt_espnet2_asr_yolo_mixtec_transformer
+ - Model source code https://github.com/espnet/espnet/tree/master/egs/yoloxochitl_mixtec/asr1
+ - Colab notebook to set up and apply the model https://colab.research.google.com/drive/1ieoW2b3ERydjaaWuhVPBP_v2QqqWsC1Q?usp=sharing
+
+ Coqui model for Yoloxochitl Mixtec
+ - Huggingface Hub page
+ - Coqui page https://coqui.ai/mixtec/jemeyer/v1.0.0
+ - Colab notebook to set up and apply the model https://colab.research.google.com/drive/1b1SujEGC_F3XhvUCuUyZK_tyUkEaFZ7D?usp=sharing#scrollTo=6IvRFke4Ckpz
+
+ Spanish ASR models
+ - XLS-R model based on CV8 with LM https://huggingface.co/jonatasgrosman/wav2vec2-xls-r-1b-spanish
+ - XLSR model based on CV6 with LM https://huggingface.co/jonatasgrosman/wav2vec2-large-xlsr-53-spanish
+ - XLSR model based on Librispeech https://huggingface.co/IIC/wav2vec2-spanish-multilibrispeech
+
+ Speechbrain Language identification on Common Language (from Common Voice 6/7?)
+ - source code https://github.com/speechbrain/speechbrain/tree/develop/recipes/CommonLanguage
+ - HF Hub model page https://huggingface.co/speechbrain/lang-id-commonlanguage_ecapa
+ - HF Hub space https://huggingface.co/spaces/akhaliq/Speechbrain-audio-classification
+
+ Speechbrain Language identification on VoxLingua
+ - source code https://github.com/speechbrain/speechbrain/tree/develop/recipes/VoxLingua107/lang_id
+ - HF Hub model page https://huggingface.co/speechbrain/lang-id-voxlingua107-ecapa
+
+
+ ## Corpora
+
+ OpenSLR89 https://www.openslr.org/89/
+
+ Common Language https://huggingface.co/datasets/common_language
+
+ VoxLingua http://bark.phon.ioc.ee/voxlingua107/
+
+ Multilibrispeech https://huggingface.co/datasets/multilingual_librispeech
+
+
+ # Possible demos
+
+ ## Simple categorization of utterances
+
+ A few example files are provided for each language, and the user can record their own.
+ The predicted confidence of each class label is shown.
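As a rough illustration of showing a confidence per class label, the sketch below assumes the language-id model yields one raw score per label and turns those scores into values that sum to 1 with a softmax. The function name `label_confidences` and the example labels are hypothetical; the notes do not say how the chosen model reports its scores, so that step may be unnecessary in practice.

```python
import math


def label_confidences(scores):
    """Map raw per-label scores to confidences that sum to 1, highest first.

    `scores` is a dict of label -> raw score (an assumption for this sketch,
    not the documented output of any particular model).
    """
    top = max(scores.values())
    # Subtract the maximum before exponentiating for numerical stability.
    exps = {label: math.exp(score - top) for label, score in scores.items()}
    total = sum(exps.values())
    ranked = [(label, value / total) for label, value in exps.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for label, conf in label_confidences({"Spanish": 2.0, "Mixtec": 1.0, "English": 0.0}):
        print(f"{label}: {conf:.3f}")
```

The sorted list of `(label, confidence)` pairs could be handed directly to whatever widget displays the class labels.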
+
+ ## Segmentation and identification
+
+ Recordings with alternating languages in a single audio file; examples are provided, or the user can record their own.
+ Voice activity detection splits the audio, then the language of each piece is predicted.
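The split-then-predict idea can be sketched as follows. The notes do not name a voice activity detection method, so a plain frame-energy threshold stands in for it here; a real demo would likely use a trained VAD model. The names `split_on_silence` and `identify_segments`, the default parameters, and the `classify` callable are all assumptions made for illustration.

```python
def split_on_silence(samples, frame_len=160, threshold=0.01, min_frames=3):
    """Return (start, end) sample indices of stretches whose frame energy
    stays at or above `threshold`. A crude stand-in for real VAD."""
    segments = []
    start = None  # index of the first frame of the current loud stretch
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy >= threshold:
            if start is None:
                start = i
        elif start is not None:
            # The loud stretch just ended; keep it only if it is long enough.
            if i - start >= min_frames:
                segments.append((start * frame_len, i * frame_len))
            start = None
    if start is not None and n_frames - start >= min_frames:
        segments.append((start * frame_len, n_frames * frame_len))
    return segments


def identify_segments(samples, classify, **vad_kwargs):
    """Apply `classify` (a hypothetical language-id callable) to each segment."""
    return [
        (start, end, classify(samples[start:end]))
        for start, end in split_on_silence(samples, **vad_kwargs)
    ]
```

Passing the classifier in as a callable keeps the segmentation logic independent of whichever lang-id model ends up being used.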
+
+ ## Identification and transcription
+
+ Example files for each language separately.
+ The lang-id model predicts what language it is.
+ The corresponding ASR model produces a transcript.
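One way to wire "lang-id first, then the matching ASR model" is a small dispatch table keyed by the predicted label. This is only a sketch: `identify_then_transcribe`, the `identify` callable and the `transcribers` mapping are hypothetical names, and the label strings would have to match whatever the chosen lang-id model actually emits.

```python
def identify_then_transcribe(audio, identify, transcribers):
    """Predict the language of `audio`, then run the ASR callable registered for it.

    `identify` maps audio -> language label; `transcribers` maps
    language label -> callable(audio) -> transcript. Both are assumptions
    of this sketch rather than any library's API.
    """
    language = identify(audio)
    transcribe = transcribers.get(language)
    if transcribe is None:
        # No ASR model is registered for this language: report the label only.
        return language, None
    return language, transcribe(audio)
```

Returning `None` for the transcript lets the demo still show the predicted language when it falls outside the set of languages with an ASR model.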
+
+ ## Segmentation, identification and transcription
+
+ Recordings with alternating languages in a single audio file.
+ Use voice activity detection to split the audio, then predict the language of each piece.
+ Use the corresponding ASR model to produce a transcript of each piece to display.
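For the display step, a possible sketch is to render each piece as a timestamped line. It assumes each result is a tuple of start sample, end sample, language label and transcript, and that the audio is sampled at 16 kHz; the function name `format_segment_results` and that tuple layout are illustrative choices, not something the notes specify.

```python
def format_segment_results(results, sample_rate=16000):
    """Render (start_sample, end_sample, language, transcript) tuples as text lines.

    The 16 kHz default is an assumption; pass the real rate of the recording.
    """
    lines = []
    for start, end, language, transcript in results:
        start_s = start / sample_rate
        end_s = end / sample_rate
        # Fall back to a placeholder when no transcript is available.
        text = transcript if transcript else "(no transcript)"
        lines.append(f"[{start_s:.2f}-{end_s:.2f} s] {language}: {text}")
    return "\n".join(lines)
```

Keeping the formatting separate from the models makes it easy to swap the plain-text output for a table or highlighted text in the demo UI.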