reazonspeech-espnet-v1 is an ESPnet model trained for Japanese automatic speech recognition (ASR).

  • This model was trained on 15,000 hours of the ReazonSpeech corpus.
  • Make sure that your audio file is sampled at 16 kHz when using this model.
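
The 16 kHz requirement can be checked before running recognition. The following is a minimal sketch, assuming the input is an uncompressed WAV file; it uses only Python's standard `wave` module, and the helper names `wav_sample_rate` and `is_16khz` are hypothetical, not part of ESPnet or ReazonSpeech.

```python
import wave

# Sample rate required by the model, in Hz (16 kHz).
EXPECTED_RATE = 16000


def wav_sample_rate(path):
    """Return the sample rate (in Hz) recorded in a WAV file's header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def is_16khz(path):
    """Return True if the WAV file at `path` is sampled at 16 kHz."""
    return wav_sample_rate(path) == EXPECTED_RATE
```

If `is_16khz` returns False, the file should be resampled to 16 kHz with an audio tool of your choice before it is passed to the model. Other formats (MP3, FLAC, and so on) are not readable by the `wave` module and would need a different check.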

For more details, please visit the official project page.

