---
tags:
- espnet
- audio
- audio-to-audio
- vocoder
language:
- en
datasets:
- vctk
license: cc-by-4.0
---

## Vocoder model - HifiGAN - English

https://github.com/kan-bayashi/ParallelWaveGAN

**No support given.**

### Details

```
batch_size: 16
discriminator_params:
  follow_official_norm: true
  period_discriminator_params:
    bias: true
    channels: 32
    downsample_scales:
    - 3
    - 3
    - 3
    - 3
    - 1
    in_channels: 1
    kernel_sizes:
    - 5
    - 3
    max_downsample_channels: 1024
    nonlinear_activation: LeakyReLU
    nonlinear_activation_params:
      negative_slope: 0.1
    out_channels: 1
    use_spectral_norm: false
    use_weight_norm: true
  periods:
  - 2
  - 3
  - 5
  - 7
  - 11
```
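
To make the `periods` and `downsample_scales` values more concrete, the sketch below shows how a period discriminator would typically view a waveform. It assumes the usual HiFi-GAN behavior: the 1-D signal is padded to a multiple of the period and folded into a 2-D array of shape `(rows, period)`, and the strided convolutions then shrink the `rows` axis. The helper names `folded_shape` and `total_stride` and the 8192-sample input length are illustrative only and do not come from this repository.

```python
import math

# Values copied from the config above.
PERIODS = [2, 3, 5, 7, 11]
DOWNSAMPLE_SCALES = [3, 3, 3, 3, 1]


def folded_shape(n_samples, period):
    """Shape of the waveform after padding it to a multiple of `period`
    and folding it into (rows, period). Hypothetical helper."""
    pad = (-n_samples) % period  # samples needed to reach the next multiple
    return ((n_samples + pad) // period, period)


def total_stride(scales):
    """Combined stride of the stacked strided convolutions along the rows axis."""
    return math.prod(scales)


if __name__ == "__main__":
    n_samples = 8192  # assumed example length, not taken from the config
    for p in PERIODS:
        rows, cols = folded_shape(n_samples, p)
        print(f"period={p:2d}: folded to {rows} x {cols}")
    print("combined stride along rows:", total_stride(DOWNSAMPLE_SCALES))
```

Because the periods are prime, each sub-discriminator sees a different regrouping of the same samples, which is the motivation usually given for this choice.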