DrishtiSharma/stt-test

DrishtiSharma committed on May 3, 2022

Commit 034f3c2 • 1 parent: b8d87ae

Upload NOTES.md

NOTES.md +65 -0

	@@ -0,0 +1,65 @@

+# Things that might be relevant
+## Trained models
+ESPnet model for Yoloxochitl Mixtec
+ - Huggingface Hub page https://huggingface.co/espnet/ftshijt_espnet2_asr_yolo_mixtec_transformer
+ - Model source code https://github.com/espnet/espnet/tree/master/egs/yoloxochitl_mixtec/asr1
+ - Colab notebook to set up and apply the model https://colab.research.google.com/drive/1ieoW2b3ERydjaaWuhVPBP_v2QqqWsC1Q?usp=sharing
+Coqui model for Yoloxochitl Mixtec
+ - Huggingface Hub page
+ - Coqui page https://coqui.ai/mixtec/jemeyer/v1.0.0
+ - Colab notebook to set up and apply the model https://colab.research.google.com/drive/1b1SujEGC_F3XhvUCuUyZK_tyUkEaFZ7D?usp=sharing#scrollTo=6IvRFke4Ckpz
+Spanish ASR models
+ - XLS-R model based on CV8 with LM https://huggingface.co/jonatasgrosman/wav2vec2-xls-r-1b-spanish
+ - XLSR model based on CV6 with LM https://huggingface.co/jonatasgrosman/wav2vec2-large-xlsr-53-spanish
+ - XLSR model based on Multilingual LibriSpeech https://huggingface.co/IIC/wav2vec2-spanish-multilibrispeech
+SpeechBrain language identification on Common Language (from Common Voice 6/7?)
+ - source code https://github.com/speechbrain/speechbrain/tree/develop/recipes/CommonLanguage
+ - HF Hub model page https://huggingface.co/speechbrain/lang-id-commonlanguage_ecapa
+ - HF Hub space https://huggingface.co/spaces/akhaliq/Speechbrain-audio-classification
+SpeechBrain language identification on VoxLingua107
+ - source code https://github.com/speechbrain/speechbrain/tree/develop/recipes/VoxLingua107/lang_id
+ - HF Hub model page https://huggingface.co/speechbrain/lang-id-voxlingua107-ecapa
+## Corpora
+OpenSLR89 https://www.openslr.org/89/
+Common Language https://huggingface.co/datasets/common_language
+VoxLingua107 http://bark.phon.ioc.ee/voxlingua107/
+Multilingual LibriSpeech https://huggingface.co/datasets/multilingual_librispeech
+# Possible demos
+## Simple categorization of utterances
+A few example files are provided for each language, and the user can record their own.
+The predicted confidence of each class label is shown.
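
A minimal sketch of how per-label confidences could be derived for display, assuming the lang-id model exposes one raw score per class. The function name `label_confidences` and the example scores are hypothetical, not taken from any of the models listed above.

```python
import math


def label_confidences(scores):
    """Turn raw per-label scores into softmax confidences, highest first."""
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    exps = {label: math.exp(score - top) for label, score in scores.items()}
    total = sum(exps.values())
    ranked = sorted(exps.items(), key=lambda item: item[1], reverse=True)
    return [(label, value / total) for label, value in ranked]


# Hypothetical raw scores for one utterance; a real model would supply these.
example_scores = {"Spanish": 2.0, "Mixtec": 1.0, "Other": 0.0}
for label, confidence in label_confidences(example_scores):
    print(f"{label}: {confidence:.1%}")
```

The confidences sum to 1, so they can be shown directly as a ranked label list in the demo.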
+## Segmentation and identification
+Recordings with alternating languages in a single audio file; examples are provided, or the user can record their own.
+Use voice activity detection to split the audio, then predict the language of each piece.
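
A rough sketch of the split-then-identify idea, using a simple energy threshold as a stand-in for real voice activity detection. This is an assumption for illustration only; a trained VAD model would likely replace it in practice. The names `split_on_silence` and `identify_segments`, the frame length, and the threshold are all hypothetical.

```python
def split_on_silence(samples, frame_len, threshold):
    """Return (start, end) sample ranges whose mean absolute amplitude
    exceeds the threshold. Consecutive active frames are merged."""
    segments = []
    start = None
    for offset in range(0, len(samples), frame_len):
        frame = samples[offset:offset + frame_len]
        energy = sum(abs(x) for x in frame) / len(frame)
        if energy > threshold:
            if start is None:
                start = offset  # a new active segment begins here
        elif start is not None:
            segments.append((start, offset))  # silence closes the segment
            start = None
    if start is not None:
        segments.append((start, len(samples)))
    return segments


def identify_segments(samples, classify, frame_len=4, threshold=0.1):
    """Label each detected segment with the language returned by `classify`."""
    return [
        (start, end, classify(samples[start:end]))
        for start, end in split_on_silence(samples, frame_len, threshold)
    ]


# Toy signal: silence, one burst, silence, another burst.
toy = [0.0] * 4 + [0.5] * 8 + [0.0] * 4 + [-0.9] * 4
# Placeholder classifier; a real lang-id model would go here.
print(identify_segments(toy, lambda piece: "A" if piece[0] > 0 else "B"))
```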
+## Identification and transcription
+Example files for each language separately.
+The lang-id model predicts what language it is.
+The corresponding ASR model produces a transcript.
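
One way the routing step might look, assuming the lang-id model and each ASR model are wrapped as plain callables. The function `transcribe_with_lang_id`, the language keys, and the stand-in models below are hypothetical placeholders, not the APIs of the models linked above.

```python
def transcribe_with_lang_id(audio, identify_language, asr_models):
    """Predict the language, then run the matching ASR model.

    Returns (language, transcript). The transcript is None when no ASR
    model is registered for the predicted language.
    """
    language = identify_language(audio)
    transcribe = asr_models.get(language)
    transcript = transcribe(audio) if transcribe is not None else None
    return language, transcript


# Stand-in callables; real wrappers around the lang-id and ASR models go here.
fake_lang_id = lambda audio: "es" if audio.startswith(b"es") else "mix"
fake_asr = {
    "es": lambda audio: "<Spanish transcript>",
    "mix": lambda audio: "<Mixtec transcript>",
}
print(transcribe_with_lang_id(b"es-example", fake_lang_id, fake_asr))
```

Keeping the ASR models in a dictionary keyed by language means another language can be added without changing the routing logic.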
+## Segmentation, identification and transcription
+Recordings with alternating languages in a single audio file.
+Use voice activity detection to split the audio, then predict the language of each piece.
+Use the corresponding ASR model to produce a transcript of each piece to display.
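
The full chain could be sketched as below, assuming segmentation, language identification, and transcription are each supplied as callables. The name `process_recording` and the toy inputs are hypothetical; the output is one row per piece, which a demo could render as a table.

```python
def process_recording(samples, segment, identify_language, asr_models):
    """Split a recording, then identify and transcribe each piece."""
    rows = []
    for start, end in segment(samples):
        chunk = samples[start:end]
        language = identify_language(chunk)
        transcribe = asr_models.get(language)
        rows.append({
            "start": start,
            "end": end,
            "language": language,
            # Empty transcript when no ASR model matches the language.
            "transcript": transcribe(chunk) if transcribe else "",
        })
    return rows


# Toy stand-ins: each "sample" is already tagged with its language.
toy = ["es", "es", "", "mix", "mix", "mix"]
rows = process_recording(
    toy,
    segment=lambda s: [(0, 2), (3, 6)],      # placeholder for VAD
    identify_language=lambda chunk: chunk[0],  # placeholder for lang-id
    asr_models={
        "es": lambda chunk: f"<{len(chunk)} Spanish frames>",
        "mix": lambda chunk: f"<{len(chunk)} Mixtec frames>",
    },
)
for row in rows:
    print(row)
```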