---
library_name: transformers
tags: []
---

# Tokenizer

A tokenizer with a vocabulary size of 50k, built for Homework 4 (Language Modelling and Automatic Speech Recognition) of [Intro to Deep Learning](https://deeplearning.cs.cmu.edu/F24/index.html).

The tokenizer was trained on the [LibriSpeech LM text](https://www.openslr.org/11/).
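
A minimal sketch of how the tokenizer might be loaded and used. The front matter lists `transformers` as the library, so this assumes the files load through `AutoTokenizer.from_pretrained`; this card does not confirm that. The helper names `load_tokenizer` and `encode` are illustrative, and the repository id is not stated here, so it has to be supplied by the caller.

```python
def load_tokenizer(repo_id: str):
    """Load a tokenizer from the Hugging Face Hub.

    Assumption: the repository is loadable with AutoTokenizer.
    Requires the `transformers` package and network access.
    """
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(repo_id)


def encode(tokenizer, text: str) -> list:
    """Return the list of token ids the tokenizer produces for `text`."""
    return tokenizer(text)["input_ids"]
```

With the real repository id substituted for a placeholder, `tok = load_tokenizer("<repo-id>")` followed by `encode(tok, "some text")` returns the token ids, `tok.decode(ids)` maps them back to text, and `len(tok)` reports the vocabulary size including any added special tokens.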