.. _tutorials:

Tutorials
=========

The best way to get started with NeMo is to work through one of our tutorials.

Most NeMo tutorials can be run on `Google's Colab <https://colab.research.google.com/notebooks/intro.ipynb>`_.

To run a tutorial:

#. Click the **Colab** link (see table below).
#. Connect to an instance with a GPU. For example, click **Runtime** > **Change runtime type** and select **GPU** for the hardware accelerator.
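
After switching the runtime type, it can help to confirm that a GPU is actually visible before running a notebook. The following is a minimal sketch, not part of the NeMo tutorials themselves: it assumes the ``nvidia-smi`` command-line tool is available in the runtime, and the helper names ``count_gpus`` and ``gpu_visible`` are hypothetical, chosen only for this illustration.

.. code-block:: python

   import shutil
   import subprocess


   def count_gpus(listing: str) -> int:
       """Count the GPU entries in the text printed by ``nvidia-smi -L``."""
       return sum(1 for line in listing.splitlines() if line.startswith("GPU "))


   def gpu_visible() -> bool:
       """Return True if ``nvidia-smi`` is on PATH and lists at least one GPU."""
       exe = shutil.which("nvidia-smi")
       if exe is None:
           # Assumption: no nvidia-smi means no NVIDIA GPU is attached.
           return False
       try:
           result = subprocess.run(
               [exe, "-L"], capture_output=True, text=True, timeout=30
           )
       except (OSError, subprocess.TimeoutExpired):
           return False
       return result.returncode == 0 and count_gpus(result.stdout) > 0


   if gpu_visible():
       print("GPU runtime detected.")
   else:
       print("No GPU detected; check the runtime type.")

If the check reports no GPU, revisit the runtime settings described in the steps above and reconnect before continuing.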

.. list-table:: **Tutorials**
   :widths: 15 25 25
   :header-rows: 1

   * - Domain
     - Title
     - GitHub URL
   * - General
     - Getting Started: Exploring NeMo Fundamentals
     - `NeMo Fundamentals <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/00_NeMo_Primer.ipynb>`_
   * - General
     - Getting Started: Sample Conversational AI Application
     - `Audio translator example <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/AudioTranslationSample.ipynb>`_
   * - General
     - Getting Started: Voice Swap Application
     - `Voice swap example <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/VoiceSwapSample.ipynb>`_
   * - General
     - Exploring NeMo Model Construction
     - `NeMo Models <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/01_NeMo_Models.ipynb>`_
   * - General
     - Exploring NeMo Adapters
     - `NeMo Adapters <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/02_NeMo_Adapters.ipynb>`_
   * - General
     - Publishing NeMo Models on Hugging Face Hub
     - `NeMo Models on HF Hub <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/Publish_NeMo_Model_On_Hugging_Face_Hub.ipynb>`_
   * - ASR
     - ASR with NeMo
     - `ASR with NeMo <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/ASR_with_NeMo.ipynb>`_
   * - ASR
     - ASR with Subword Tokenization
     - `ASR with Subword Tokenization <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/ASR_with_Subword_Tokenization.ipynb>`_
   * - ASR
     - Offline ASR Inference with Beam Search and External Language Model Rescoring
     - `Offline ASR <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Offline_ASR.ipynb>`_
   * - ASR
     - Online ASR Inference with Microphone
     - `Online ASR Microphone <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/asr/Online_ASR_Microphone_Demo.ipynb>`_
   * - ASR
     - Fine-tuning CTC Models on New Languages
     - `ASR CTC Language Fine-Tuning <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/ASR_CTC_Language_Finetuning.ipynb>`_
   * - ASR
     - Intro to Transducers
     - `Intro to Transducers <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Intro_to_Transducers.ipynb>`_
   * - ASR
     - ASR with Transducers
     - `ASR with Transducers <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/ASR_with_Transducers.ipynb>`_
   * - ASR
     - ASR with Adapters
     - `ASR with Adapters <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/asr_adapters/ASR_with_Adapters.ipynb>`_
   * - ASR
     - Speech Commands
     - `Speech Commands <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Speech_Commands.ipynb>`_
   * - ASR
     - Online and Offline Speech Commands Inference
     - `Online Offline Microphone Speech Commands <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/asr/Online_Offline_Speech_Commands_Demo.ipynb>`_
   * - ASR
     - Voice Activity Detection (VAD)
     - `Voice Activity Detection <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Voice_Activity_Detection.ipynb>`_
   * - ASR
     - Online and Offline VAD Inference
     - `Online Offline Microphone VAD <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/asr/Online_Offline_Microphone_VAD_Demo.ipynb>`_
   * - ASR
     - Speaker Recognition and Verification
     - `Speaker Recognition and Verification <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/speaker_tasks/Speaker_Identification_Verification.ipynb>`_
   * - ASR
     - Speaker Diarization Inference
     - `Speaker Diarization Inference <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/speaker_tasks/Speaker_Diarization_Inference.ipynb>`_
   * - ASR
     - ASR with Speaker Diarization
     - `ASR with Speaker Diarization <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/speaker_tasks/ASR_with_SpeakerDiarization.ipynb>`_
   * - ASR
     - Online Noise Augmentation
     - `Online Noise Augmentation <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Online_Noise_Augmentation.ipynb>`_
   * - ASR
     - ASR for Telephony Speech
     - `ASR for Telephony Speech <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/asr/ASR_for_telephony_speech.ipynb>`_
   * - ASR
     - Streaming Inference for ASR
     - `Streaming inference <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/asr/Streaming_ASR.ipynb>`_
   * - ASR
     - Buffered Transducer Inference for ASR
     - `Buffered Transducer inference <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Buffered_Transducer_Inference.ipynb>`_
   * - ASR
     - Buffered Transducer Inference with LCS Merge Algorithm
     - `Buffered Transducer inference with LCS Merge <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Buffered_Transducer_Inference_with_LCS_Merge.ipynb>`_
   * - ASR
     - Offline ASR with VAD for CTC Models
     - `Offline ASR with VAD for CTC models <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Offline_ASR_with_VAD_for_CTC_models.ipynb>`_
   * - ASR
     - Self-supervised Pre-training for ASR
     - `Self-supervised Pre-training for ASR <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Self_Supervised_Pre_Training.ipynb>`_
   * - ASR
     - Multi-lingual ASR
     - `Multi-lingual ASR <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Multilang_ASR.ipynb>`_
   * - NLP
     - Using Pretrained Language Models for Downstream Tasks
     - `Pretrained Language Models for Downstream Tasks <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/01_Pretrained_Language_Models_for_Downstream_Tasks.ipynb>`_
   * - NLP
     - Exploring NeMo NLP Tokenizers
     - `NLP Tokenizers <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/02_NLP_Tokenizers.ipynb>`_
   * - NLP
     - Text Classification (Sentiment Analysis) with BERT
     - `Text Classification (Sentiment Analysis) <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Text_Classification_Sentiment_Analysis.ipynb>`_
   * - NLP
     - Question Answering
     - `Question Answering <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Question_Answering.ipynb>`_
   * - NLP
     - Token Classification (Named Entity Recognition)
     - `Token Classification: Named Entity Recognition <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Token_Classification_Named_Entity_Recognition.ipynb>`_
   * - NLP
     - Joint Intent Classification and Slot Filling
     - `Joint Intent and Slot Classification <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Joint_Intent_and_Slot_Classification.ipynb>`_
   * - NLP
     - GLUE Benchmark
     - `GLUE Benchmark <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/GLUE_Benchmark.ipynb>`_
   * - NLP
     - Punctuation and Capitalization
     - `Punctuation and Capitalization <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Punctuation_and_Capitalization.ipynb>`_
   * - NLP
     - Entity Linking
     - `Entity Linking <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Entity_Linking_Medical.ipynb>`_
   * - NLP
     - Named Entity Recognition - BioMegatron
     - `Named Entity Recognition - BioMegatron <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Token_Classification-BioMegatron.ipynb>`_
   * - NLP
     - Relation Extraction - BioMegatron
     - `Relation Extraction - BioMegatron <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Relation_Extraction-BioMegatron.ipynb>`_
   * - NLP
     - P-Tuning/Prompt-Tuning
     - `P-Tuning/Prompt-Tuning <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/nlp/Multitask_Prompt_and_PTuning.ipynb>`_
   * - NLP
     - Synthetic Tabular Data Generation
     - `Synthetic Tabular Data Generation <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/nlp/Megatron_Synthetic_Tabular_Data_Generation.ipynb>`_
   * - TTS
     - NeMo TTS Primer
     - `NeMo TTS Primer <https://colab.research.google.com/github/NVIDIA/NeMo/tree/stable/tutorials/tts/NeMo_TTS_Primer.ipynb>`_
   * - TTS
     - TTS Speech/Text Aligner Inference
     - `TTS Speech/Text Aligner Inference <https://colab.research.google.com/github/NVIDIA/NeMo/tree/stable/tutorials/tts/Aligner_Inference_Examples.ipynb>`_
   * - TTS
     - FastPitch and MixerTTS Model Training
     - `FastPitch and MixerTTS Model Training <https://colab.research.google.com/github/NVIDIA/NeMo/tree/stable/tutorials/tts/FastPitch_MixerTTS_Training.ipynb>`_
   * - TTS
     - FastPitch Finetuning
     - `FastPitch Finetuning <https://colab.research.google.com/github/NVIDIA/NeMo/tree/stable/tutorials/tts/FastPitch_Finetuning.ipynb>`_
   * - TTS
     - FastPitch and HiFiGAN Model Training for German
     - `FastPitch and HiFiGAN Model Training for German <https://colab.research.google.com/github/NVIDIA/NeMo/tree/stable/tutorials/tts/FastPitch_GermanTTS_Training.ipynb>`_
   * - TTS
     - Tacotron2 Model Training
     - `Tacotron2 Model Training <https://colab.research.google.com/github/NVIDIA/NeMo/tree/stable/tutorials/tts/Tacotron2_Training.ipynb>`_
   * - TTS
     - FastPitch Duration and Pitch Control
     - `FastPitch Duration and Pitch Control <https://colab.research.google.com/github/NVIDIA/NeMo/tree/stable/tutorials/tts/Inference_DurationPitchControl.ipynb>`_
   * - TTS
     - FastPitch Speaker Interpolation
     - `FastPitch Speaker Interpolation <https://colab.research.google.com/github/NVIDIA/NeMo/tree/stable/tutorials/tts/FastPitch_Speaker_Interpolation.ipynb>`_
   * - TTS
     - Inference and Model Selection
     - `TTS Inference and Model Selection <https://colab.research.google.com/github/NVIDIA/NeMo/tree/stable/tutorials/tts/Inference_ModelSelect.ipynb>`_
   * - TTS
     - Pronunciation Customization
     - `TTS Pronunciation Customization <https://colab.research.google.com/github/NVIDIA/NeMo/tree/stable/tutorials/tts/Pronunciation_customization.ipynb>`_
   * - Tools
     - CTC Segmentation
     - `CTC Segmentation <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/tools/CTC_Segmentation_Tutorial.ipynb>`_
   * - Text Processing (TN/ITN)
     - Text Normalization and Inverse Normalization for ASR and TTS
     - `Text Normalization <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/text_processing/Text_(Inverse)_Normalization.ipynb>`_
   * - Text Processing (TN/ITN)
     - Inverse Text Normalization for ASR - Thutmose Tagger
     - `Inverse Text Normalization with Thutmose Tagger <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/text_processing/ITN_with_Thutmose_Tagger.ipynb>`_
   * - Text Processing (TN/ITN)
     - Constructing Normalization Grammars with WFSTs
     - `WFST Tutorial <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/text_processing/WFST_Tutorial.ipynb>`_