About Wav2Vec2 pre-training

#3 by huutuongtu

Hello @patrickvonplaten , I am pre-training wav2vec2 on my own unlabeled dataset (about 300 hours of audio), following this code: https://github.com/huggingface/transformers/tree/main/examples/pytorch/speech-pretraining
My settings (sketched below):
max_train_steps = 120000
num_warmup_steps= 32000
lr = 0.001
batch_size = 2
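
For reference, this is roughly how I understand those settings map onto the optimizer and learning-rate schedule (a minimal sketch assuming the AdamW + linear-warmup setup the example script uses; the bare `Wav2Vec2Config()` is only a placeholder for my real config):

```python
import torch
from transformers import (
    Wav2Vec2Config,
    Wav2Vec2ForPreTraining,
    get_linear_schedule_with_warmup,
)

# placeholder config; the real run initializes from the base model's config
model = Wav2Vec2ForPreTraining(Wav2Vec2Config())

# lr = 0.001, warmed up linearly over 32k steps, then decayed to 0 at 120k steps
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
lr_scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=32_000,
    num_training_steps=120_000,
)
```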
When I train, the contrastive loss and grad_norm decrease very quickly and reach zero (within about 400-500 steps). Do you have any idea how to fix this? Thank you.
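
To show what I mean, here is a minimal sketch of the check I can run for codebook collapse (assuming the standard `Wav2Vec2ForPreTraining` API in transformers; the random input and default config are placeholders, not my real data). If `codevector_perplexity` drops toward 1, the quantizer is picking the same codevector everywhere, which would make the contrastive task trivial:

```python
import torch
from transformers import Wav2Vec2Config, Wav2Vec2ForPreTraining
from transformers.models.wav2vec2.modeling_wav2vec2 import (
    _compute_mask_indices,
    _sample_negative_indices,
)

config = Wav2Vec2Config()  # placeholder; the real run loads the base config
model = Wav2Vec2ForPreTraining(config).eval()

# dummy batch: 2 clips of 1 s of 16 kHz audio, standing in for a real batch
input_values = torch.randn(2, 16_000)
seq_len = int(model._get_feat_extract_output_lengths(input_values.shape[-1]))

# sample masked positions and negatives, as in the pre-training script
mask_time_indices = _compute_mask_indices(
    (2, seq_len),
    mask_prob=config.mask_time_prob,
    mask_length=config.mask_time_length,
    min_masks=2,
)
sampled_negative_indices = _sample_negative_indices(
    (2, seq_len), config.num_negatives, mask_time_indices=mask_time_indices
)
mask_time_indices = torch.tensor(mask_time_indices, dtype=torch.bool)
sampled_negative_indices = torch.tensor(sampled_negative_indices, dtype=torch.long)

with torch.no_grad():
    outputs = model(
        input_values,
        mask_time_indices=mask_time_indices,
        sampled_negative_indices=sampled_negative_indices,
    )

# perplexity near 1 would mean the quantizer always picks the same codevector
print(outputs.contrastive_loss, outputs.diversity_loss, outputs.codevector_perplexity)
```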
