.. _tutorials:

Tutorials
=========

The best way to get started with NeMo is to work through one of our tutorials.

Most NeMo tutorials can be run on `Google Colab <https://colab.research.google.com/notebooks/intro.ipynb>`_.

To run a tutorial:

#. Click the **Colab** link for the tutorial (see the table below).
#. Connect to an instance with a GPU. For example, click **Runtime** > **Change runtime type** and select **GPU** for the hardware accelerator.

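
After connecting to a GPU runtime, it can help to confirm that a GPU is actually visible before running a tutorial. The cell below is a minimal sketch, not part of any NeMo tutorial: ``gpu_visible`` is a hypothetical helper name, and it assumes the runtime exposes the NVIDIA driver's ``nvidia-smi`` tool on the ``PATH``.

.. code-block:: python

   import shutil
   import subprocess


   def gpu_visible() -> bool:
       """Return True if ``nvidia-smi`` is available and lists at least one GPU."""
       exe = shutil.which("nvidia-smi")
       if exe is None:
           # No NVIDIA driver tooling on PATH: most likely a CPU-only runtime.
           return False
       try:
           # ``nvidia-smi -L`` prints one line per GPU, e.g. "GPU 0: ...".
           result = subprocess.run(
               [exe, "-L"], capture_output=True, text=True, timeout=30, check=False
           )
       except (OSError, subprocess.TimeoutExpired):
           return False
       return result.returncode == 0 and "GPU " in result.stdout


   if gpu_visible():
       print("GPU runtime detected.")
   else:
       print("No GPU detected; check the runtime type before running the tutorial.")

If the check reports no GPU, repeat step 2 and re-run the cell.
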

.. list-table:: **Tutorials**
   :widths: 15 25 25
   :header-rows: 1

   * - Domain
     - Title
     - Notebook Link
   * - General
     - Getting Started: Exploring NeMo Fundamentals
     - `NeMo Fundamentals <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/00_NeMo_Primer.ipynb>`_
   * - General
     - Getting Started: Sample Conversational AI Application
     - `Audio translator example <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/AudioTranslationSample.ipynb>`_
   * - General
     - Getting Started: Voice Swap Application
     - `Voice swap example <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/VoiceSwapSample.ipynb>`_
   * - General
     - Exploring NeMo Model Construction
     - `NeMo Models <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/01_NeMo_Models.ipynb>`_
   * - General
     - Exploring NeMo Adapters
     - `NeMo Adapters <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/02_NeMo_Adapters.ipynb>`_
   * - General
     - Publishing NeMo Models on Hugging Face Hub
     - `NeMo Models on HF Hub <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/Publish_NeMo_Model_On_Hugging_Face_Hub.ipynb>`_
   * - ASR
     - ASR with NeMo
     - `ASR with NeMo <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/ASR_with_NeMo.ipynb>`_
   * - ASR
     - ASR with Subword Tokenization
     - `ASR with Subword Tokenization <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/ASR_with_Subword_Tokenization.ipynb>`_
   * - ASR
     - Offline ASR Inference with Beam Search and External Language Model Rescoring
     - `Offline ASR <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Offline_ASR.ipynb>`_
   * - ASR
     - Online ASR Inference with Microphone
     - `Online ASR Microphone <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/asr/Online_ASR_Microphone_Demo.ipynb>`_
   * - ASR
     - Fine-tuning CTC Models on New Languages
     - `ASR CTC Language Fine-Tuning <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/ASR_CTC_Language_Finetuning.ipynb>`_
   * - ASR
     - Intro to Transducers
     - `Intro to Transducers <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Intro_to_Transducers.ipynb>`_
   * - ASR
     - ASR with Transducers
     - `ASR with Transducers <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/ASR_with_Transducers.ipynb>`_
   * - ASR
     - ASR with Adapters
     - `ASR with Adapters <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/asr_adapters/ASR_with_Adapters.ipynb>`_
   * - ASR
     - Speech Commands
     - `Speech Commands <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Speech_Commands.ipynb>`_
   * - ASR
     - Online and Offline Speech Commands Inference
     - `Online Offline Microphone Speech Commands <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/asr/Online_Offline_Speech_Commands_Demo.ipynb>`_
   * - ASR
     - Voice Activity Detection (VAD)
     - `Voice Activity Detection <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Voice_Activity_Detection.ipynb>`_
   * - ASR
     - Online and Offline VAD Inference
     - `Online Offline Microphone VAD <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/asr/Online_Offline_Microphone_VAD_Demo.ipynb>`_
   * - ASR
     - Speaker Recognition and Verification
     - `Speaker Recognition and Verification <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/speaker_tasks/Speaker_Identification_Verification.ipynb>`_
   * - ASR
     - Speaker Diarization Inference
     - `Speaker Diarization Inference <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/speaker_tasks/Speaker_Diarization_Inference.ipynb>`_
   * - ASR
     - ASR with Speaker Diarization
     - `ASR with Speaker Diarization <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/speaker_tasks/ASR_with_SpeakerDiarization.ipynb>`_
   * - ASR
     - Online Noise Augmentation
     - `Online Noise Augmentation <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Online_Noise_Augmentation.ipynb>`_
   * - ASR
     - ASR for Telephony Speech
     - `ASR for Telephony Speech <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/asr/ASR_for_telephony_speech.ipynb>`_
   * - ASR
     - Streaming Inference for ASR
     - `Streaming inference <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/asr/Streaming_ASR.ipynb>`_
   * - ASR
     - Buffered Transducer Inference for ASR
     - `Buffered Transducer inference <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Buffered_Transducer_Inference.ipynb>`_
   * - ASR
     - Buffered Transducer Inference with LCS Merge Algorithm
     - `Buffered Transducer inference with LCS Merge <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Buffered_Transducer_Inference_with_LCS_Merge.ipynb>`_
   * - ASR
     - Offline ASR with VAD for CTC Models
     - `Offline ASR with VAD for CTC models <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Offline_ASR_with_VAD_for_CTC_models.ipynb>`_
   * - ASR
     - Self-Supervised Pre-Training for ASR
     - `Self-supervised Pre-training for ASR <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Self_Supervised_Pre_Training.ipynb>`_
   * - ASR
     - Multi-lingual ASR
     - `Multi-lingual ASR <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Multilang_ASR.ipynb>`_
   * - NLP
     - Using Pretrained Language Models for Downstream Tasks
     - `Pretrained Language Models for Downstream Tasks <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/01_Pretrained_Language_Models_for_Downstream_Tasks.ipynb>`_
   * - NLP
     - Exploring NeMo NLP Tokenizers
     - `NLP Tokenizers <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/02_NLP_Tokenizers.ipynb>`_
   * - NLP
     - Text Classification (Sentiment Analysis) with BERT
     - `Text Classification (Sentiment Analysis) <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Text_Classification_Sentiment_Analysis.ipynb>`_
   * - NLP
     - Question Answering
     - `Question Answering <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Question_Answering.ipynb>`_
   * - NLP
     - Token Classification (Named Entity Recognition)
     - `Token Classification: Named Entity Recognition <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Token_Classification_Named_Entity_Recognition.ipynb>`_
   * - NLP
     - Joint Intent Classification and Slot Filling
     - `Joint Intent and Slot Classification <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Joint_Intent_and_Slot_Classification.ipynb>`_
   * - NLP
     - GLUE Benchmark
     - `GLUE Benchmark <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/GLUE_Benchmark.ipynb>`_
   * - NLP
     - Punctuation and Capitalization
     - `Punctuation and Capitalization <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Punctuation_and_Capitalization.ipynb>`_
   * - NLP
     - Entity Linking
     - `Entity Linking <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Entity_Linking_Medical.ipynb>`_
   * - NLP
     - Named Entity Recognition - BioMegatron
     - `Named Entity Recognition - BioMegatron <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Token_Classification-BioMegatron.ipynb>`_
   * - NLP
     - Relation Extraction - BioMegatron
     - `Relation Extraction - BioMegatron <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Relation_Extraction-BioMegatron.ipynb>`_
   * - NLP
     - P-Tuning/Prompt-Tuning
     - `P-Tuning/Prompt-Tuning <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/nlp/Multitask_Prompt_and_PTuning.ipynb>`_
   * - NLP
     - Synthetic Tabular Data Generation
     - `Synthetic Tabular Data Generation <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/nlp/Megatron_Synthetic_Tabular_Data_Generation.ipynb>`_
   * - TTS
     - NeMo TTS Primer
     - `NeMo TTS Primer <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/tts/NeMo_TTS_Primer.ipynb>`_
   * - TTS
     - TTS Speech/Text Aligner Inference
     - `TTS Speech/Text Aligner Inference <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/tts/Aligner_Inference_Examples.ipynb>`_
   * - TTS
     - FastPitch and MixerTTS Model Training
     - `FastPitch and MixerTTS Model Training <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/tts/FastPitch_MixerTTS_Training.ipynb>`_
   * - TTS
     - FastPitch Finetuning
     - `FastPitch Finetuning <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/tts/FastPitch_Finetuning.ipynb>`_
   * - TTS
     - FastPitch and HiFiGAN Model Training for German
     - `FastPitch and HiFiGAN Model Training for German <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/tts/FastPitch_GermanTTS_Training.ipynb>`_
   * - TTS
     - Tacotron2 Model Training
     - `Tacotron2 Model Training <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/tts/Tacotron2_Training.ipynb>`_
   * - TTS
     - FastPitch Duration and Pitch Control
     - `FastPitch Duration and Pitch Control <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/tts/Inference_DurationPitchControl.ipynb>`_
   * - TTS
     - FastPitch Speaker Interpolation
     - `FastPitch Speaker Interpolation <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/tts/FastPitch_Speaker_Interpolation.ipynb>`_
   * - TTS
     - Inference and Model Selection
     - `TTS Inference and Model Selection <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/tts/Inference_ModelSelect.ipynb>`_
   * - TTS
     - Pronunciation Customization
     - `TTS Pronunciation Customization <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/tts/Pronunciation_customization.ipynb>`_
   * - Tools
     - CTC Segmentation
     - `CTC Segmentation <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/tools/CTC_Segmentation_Tutorial.ipynb>`_
   * - Text Processing (TN/ITN)
     - Text Normalization and Inverse Normalization for ASR and TTS
     - `Text Normalization <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/text_processing/Text_(Inverse)_Normalization.ipynb>`_
   * - Text Processing (TN/ITN)
     - Inverse Text Normalization for ASR - Thutmose Tagger
     - `Inverse Text Normalization with Thutmose Tagger <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/text_processing/ITN_with_Thutmose_Tagger.ipynb>`_
   * - Text Processing (TN/ITN)
     - Constructing Normalization Grammars with WFSTs
     - `WFST Tutorial <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/text_processing/WFST_Tutorial.ipynb>`_
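
Most Colab links in the table share one pattern: the Colab prefix, then the GitHub repository, ``blob``, the branch, and the notebook path. The sketch below rebuilds such a link from a notebook path, which may be handy for opening a notebook from another branch. ``colab_url`` is a hypothetical helper for illustration only, and using a branch other than ``stable`` is an assumption, since the table only lists ``stable`` links.

.. code-block:: python

   from urllib.parse import quote

   COLAB_PREFIX = "https://colab.research.google.com/github"


   def colab_url(notebook_path: str, repo: str = "NVIDIA/NeMo", branch: str = "stable") -> str:
       """Build a Colab link for a notebook stored in a GitHub repository."""
       # Keep "/" and parentheses unescaped so paths such as
       # "Text_(Inverse)_Normalization.ipynb" match the links in the table.
       path = quote(notebook_path.lstrip("/"), safe="/()")
       return f"{COLAB_PREFIX}/{repo}/blob/{branch}/{path}"


   print(colab_url("tutorials/asr/ASR_with_NeMo.ipynb"))
   # https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/ASR_with_NeMo.ipynb

Rows that link directly to GitHub, such as the microphone demos, are listed that way in the table; opening them in Colab through this pattern is not covered here.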
