# python3 -m espnet2.bin.asr_train --use_preprocessor true --bpemodel data/fr_token_list/bpe_unigram350/bpe.model --token_type bpe --token_list data/fr_token_list/bpe_unigram350/tokens.txt --non_linguistic_symbols none --cleaner none --g2p none --valid_data_path_and_name_and_type dump/raw/dev_fr/wav.scp,speech,sound --valid_data_path_and_name_and_type dump/raw/dev_fr/text,text,text --valid_shape_file exp/asr_stats_raw_fr_bpe350_sp/valid/speech_shape --valid_shape_file exp/asr_stats_raw_fr_bpe350_sp/valid/text_shape.bpe --resume true --init_param --ignore_init_mismatch false --fold_length 80000 --fold_length 150 --output_dir exp/asr_oxford_French_config_raw_fr_bpe350_sp --config conf/tuning/oxford_French_config.yaml --frontend_conf fs=16k --train_data_path_and_name_and_type dump/raw/train_fr_sp/wav.scp,speech,sound --train_data_path_and_name_and_type dump/raw/train_fr_sp/text,text,text --train_shape_file exp/asr_stats_raw_fr_bpe350_sp/train/speech_shape --train_shape_file exp/asr_stats_raw_fr_bpe350_sp/train/text_shape.bpe --ngpu 3 --multiprocessing_distributed True
# Started at Sat Jun 11 13:30:34 EDT 2022
#
[islpc50:0/3] 2022-06-11 13:30:50,303 (distributed_c10d:217) INFO: Added key: store_based_barrier_key:1 to store for rank: 0
[islpc50:0/3] 2022-06-11 13:30:50,303 (distributed_c10d:251) INFO: Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 3 nodes.
[islpc50:0/3] 2022-06-11 13:30:50,358 (asr:411) INFO: Vocabulary size: 350
[islpc50:0/3] 2022-06-11 13:30:50,944 (filelock:274) INFO: Lock 139715714756560 acquired on ./hub/s3prl_cache/1c76d6e88090f01736036b28dc995fef583f47f42662d55286332557f957609f.lock
[islpc50:0/3] 2022-06-11 13:30:50,945 (filelock:318) INFO: Lock 139715714756560 released on ./hub/s3prl_cache/1c76d6e88090f01736036b28dc995fef583f47f42662d55286332557f957609f.lock
[Featurizer] - The selected feature last_hidden_state's downsample rate is 320
[islpc50:0/3] 2022-06-11 13:31:02,326 (s3prl:159) INFO: Pretrained S3PRL frontend model parameters reloaded!
[islpc50:0/3] 2022-06-11 13:31:06,195 (abs_task:1157) INFO: pytorch.version=1.10.1+cu111, cuda.available=True, cudnn.version=8005, cudnn.benchmark=False, cudnn.deterministic=True
[islpc50:0/3] 2022-06-11 13:31:06,200 (abs_task:1158) INFO: Model structure:
ESPnetASRModel(
  (frontend): S3prlFrontend(
    (upstream): UpstreamExpert(
      (model): Wav2Vec2Model(
        (feature_extractor): ConvFeatureExtractionModel(
          (conv_layers): ModuleList(
            (0): Sequential(
              (0): Conv1d(1, 512, kernel_size=(10,), stride=(5,))
              (1): Dropout(p=0.0, inplace=False)
              (2): Sequential(
                (0): TransposeLast()
                (1): Fp32LayerNorm((512,), eps=1e-05, elementwise_affine=True)
                (2): TransposeLast()
              )
              (3): GELU()
            )
            (1-4): 4 x Sequential(
              (0): Conv1d(512, 512, kernel_size=(3,), stride=(2,))
              (1): Dropout(p=0.0, inplace=False)
              (2): Sequential(
                (0): TransposeLast()
                (1): Fp32LayerNorm((512,), eps=1e-05, elementwise_affine=True)
                (2): TransposeLast()
              )
              (3): GELU()
            )
            (5-6): 2 x Sequential(
              (0): Conv1d(512, 512, kernel_size=(2,), stride=(2,))
              (1): Dropout(p=0.0, inplace=False)
              (2): Sequential(
                (0): TransposeLast()
                (1): Fp32LayerNorm((512,), eps=1e-05, elementwise_affine=True)
                (2): TransposeLast()
              )
              (3): GELU()
            )
          )
        )
        (post_extract_proj): Linear(in_features=512, out_features=1024, bias=True)
        (dropout_input): Dropout(p=0.1, inplace=False)
        (dropout_features): Dropout(p=0.1, inplace=False)
        (quantizer): GumbelVectorQuantizer(
          (weight_proj): Linear(in_features=512, out_features=640, bias=True)
        )
        (project_q): Linear(in_features=768, out_features=768, bias=True)
        (encoder): TransformerEncoder(
          (pos_conv): Sequential(
            (0): Conv1d(1024, 1024, kernel_size=(128,), stride=(1,), padding=(64,), groups=16)
            (1): SamePad()
            (2): GELU()
          )
          (layers): ModuleList(
            (0-2): 3 x AdapterTransformerSentenceEncoderLayer(
              (self_attn): MultiheadAttention(
                (dropout_module): FairseqDropout()
                (k_proj): Linear(in_features=1024, out_features=1024, bias=True)
                (v_proj): Linear(in_features=1024, out_features=1024, bias=True)
                (q_proj): Linear(in_features=1024, out_features=1024, bias=True)
                (out_proj): Linear(in_features=1024, out_features=1024, bias=True)
              )
              (dropout1): Dropout(p=0.0, inplace=False)
              (dropout2): Dropout(p=0.0, inplace=False)
              (dropout3): Dropout(p=0.0, inplace=False)
              (self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (fc1): Linear(in_features=1024, out_features=4096, bias=True)
              (fc2): Linear(in_features=4096, out_features=1024, bias=True)
              (final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (adapter1): Adapter(
                (down_projection): Linear(in_features=1024, out_features=192, bias=True)
                (up_projection): Linear(in_features=192, out_features=1024, bias=True)
              )
              (adapter2): Adapter(
                (down_projection): Linear(in_features=1024, out_features=192, bias=True)
                (up_projection): Linear(in_features=192, out_features=1024, bias=True)
              )
            )
            (3-23): 21 x TransformerSentenceEncoderLayer(
              (self_attn): MultiheadAttention(
                (dropout_module): FairseqDropout()
                (k_proj): Linear(in_features=1024, out_features=1024, bias=True)
                (v_proj): Linear(in_features=1024, out_features=1024, bias=True)
                (q_proj): Linear(in_features=1024, out_features=1024, bias=True)
                (out_proj): Linear(in_features=1024, out_features=1024, bias=True)
              )
              (dropout1): Dropout(p=0.0, inplace=False)
              (dropout2): Dropout(p=0.0, inplace=False)
              (dropout3): Dropout(p=0.0, inplace=False)
              (self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (fc1): Linear(in_features=1024, out_features=4096, bias=True)
              (fc2): Linear(in_features=4096, out_features=1024, bias=True)
              (final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
            )
          )
          (layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        )
        (layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
        (final_proj): Linear(in_features=1024, out_features=768, bias=True)
      )
    )
    (featurizer): Featurizer()
  )
  (normalize): UtteranceMVN(norm_means=True, norm_vars=False)
  (encoder): RNNEncoder(
    (enc): ModuleList(
      (0): RNNP(
        (birnn0): LSTM(1024, 320, batch_first=True, bidirectional=True)
        (bt0): Linear(in_features=640, out_features=320, bias=True)
        (birnn1): LSTM(320, 320, batch_first=True, bidirectional=True)
        (bt1): Linear(in_features=640, out_features=320, bias=True)
        (birnn2): LSTM(320, 320, batch_first=True, bidirectional=True)
        (bt2): Linear(in_features=640, out_features=320, bias=True)
        (birnn3): LSTM(320, 320, batch_first=True, bidirectional=True)
        (bt3): Linear(in_features=640, out_features=320, bias=True)
      )
    )
  )
  (criterion_att): LabelSmoothingLoss(
    (criterion): KLDivLoss()
  )
  (ctc): CTC(
    (ctc_lo): Linear(in_features=320, out_features=350, bias=True)
    (ctc_loss): CTCLoss()
  )
)

Model summary:
    Class Name: ESPnetASRModel
    Total Number of model parameters: 329.07 M
    Number of trainable parameters: 11.68 M (3.5%)
    Size: 46.7 MB
    Type: torch.float32
[islpc50:0/3] 2022-06-11 13:31:06,200 (abs_task:1161) INFO: Optimizer:
Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    initial_lr: 0.0002
    lr: 5e-09
    weight_decay: 0
)
[islpc50:0/3] 2022-06-11 13:31:06,200 (abs_task:1162) INFO: Scheduler: WarmupLR(warmup_steps=40000)
[islpc50:0/3] 2022-06-11 13:31:06,281 (abs_task:1171) INFO: Saving the configuration in
exp/asr_oxford_French_config_raw_fr_bpe350_sp/config.yaml [islpc50:0/3] 2022-06-11 13:31:16,878 (abs_task:1525) INFO: [train] dataset: ESPnetDataset( speech: {"path": "dump/raw/train_fr_sp/wav.scp", "type": "sound"} text: {"path": "dump/raw/train_fr_sp/text", "type": "text"} preprocess: ) [islpc50:0/3] 2022-06-11 13:31:16,878 (abs_task:1526) INFO: [train] Batch sampler: FoldedBatchSampler(N-batch=81666, batch_size=20, shape_files=['exp/asr_stats_raw_fr_bpe350_sp/train/speech_shape', 'exp/asr_stats_raw_fr_bpe350_sp/train/text_shape.bpe'], sort_in_batch=descending, sort_batch=descending) [islpc50:0/3] 2022-06-11 13:31:16,888 (abs_task:1527) INFO: [train] mini-batch sizes summary: N-batch=81666, mean=14.0, min=4, max=20 [islpc50:0/3] 2022-06-11 13:31:17,197 (abs_task:1525) INFO: [valid] dataset: ESPnetDataset( speech: {"path": "dump/raw/dev_fr/wav.scp", "type": "sound"} text: {"path": "dump/raw/dev_fr/text", "type": "text"} preprocess: ) [islpc50:0/3] 2022-06-11 13:31:17,197 (abs_task:1526) INFO: [valid] Batch sampler: FoldedBatchSampler(N-batch=1254, batch_size=20, shape_files=['exp/asr_stats_raw_fr_bpe350_sp/valid/speech_shape', 'exp/asr_stats_raw_fr_bpe350_sp/valid/text_shape.bpe'], sort_in_batch=descending, sort_batch=descending) [islpc50:0/3] 2022-06-11 13:31:17,198 (abs_task:1527) INFO: [valid] mini-batch sizes summary: N-batch=1254, mean=12.5, min=6, max=20 [islpc50:0/3] 2022-06-11 13:31:17,275 (abs_task:1525) INFO: [plot_att] dataset: ESPnetDataset( speech: {"path": "dump/raw/dev_fr/wav.scp", "type": "sound"} text: {"path": "dump/raw/dev_fr/text", "type": "text"} preprocess: ) [islpc50:0/3] 2022-06-11 13:31:17,276 (abs_task:1526) INFO: [plot_att] Batch sampler: UnsortedBatchSampler(N-batch=15621, batch_size=1, key_file=exp/asr_stats_raw_fr_bpe350_sp/valid/speech_shape, [islpc50:0/3] 2022-06-11 13:31:17,276 (abs_task:1527) INFO: [plot_att] mini-batch sizes summary: N-batch=3, mean=1.0, min=1, max=1 islpc50:2486597:2486597 [0] NCCL INFO Bootstrap : Using 
bond0:128.2.205.9<0> islpc50:2486597:2486597 [0] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation islpc50:2486597:2486597 [0] NCCL INFO NET/IB : No device found. islpc50:2486597:2486597 [0] NCCL INFO NET/Socket : Using [0]bond0:128.2.205.9<0> islpc50:2486597:2486597 [0] NCCL INFO Using network Socket NCCL version 2.10.3+cuda11.1 islpc50:2486598:2486598 [1] NCCL INFO Bootstrap : Using bond0:128.2.205.9<0> islpc50:2486599:2486599 [2] NCCL INFO Bootstrap : Using bond0:128.2.205.9<0> islpc50:2486598:2486598 [1] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation islpc50:2486599:2486599 [2] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation islpc50:2486598:2486598 [1] NCCL INFO NET/IB : No device found. islpc50:2486599:2486599 [2] NCCL INFO NET/IB : No device found. islpc50:2486598:2486598 [1] NCCL INFO NET/Socket : Using [0]bond0:128.2.205.9<0> islpc50:2486599:2486599 [2] NCCL INFO NET/Socket : Using [0]bond0:128.2.205.9<0> islpc50:2486598:2486598 [1] NCCL INFO Using network Socket islpc50:2486599:2486599 [2] NCCL INFO Using network Socket islpc50:2486597:2486682 [0] NCCL INFO Could not enable P2P between dev 1(=1a000) and dev 0(=19000) islpc50:2486597:2486682 [0] NCCL INFO Could not enable P2P between dev 0(=19000) and dev 1(=1a000) islpc50:2486597:2486682 [0] NCCL INFO Could not enable P2P between dev 1(=1a000) and dev 0(=19000) islpc50:2486597:2486682 [0] NCCL INFO Could not enable P2P between dev 0(=19000) and dev 1(=1a000) islpc50:2486598:2486683 [1] NCCL INFO Could not enable P2P between dev 1(=1a000) and dev 0(=19000) islpc50:2486598:2486683 [1] NCCL INFO Could not enable P2P between dev 0(=19000) and dev 1(=1a000) islpc50:2486599:2486684 [2] NCCL INFO Could not enable P2P between dev 1(=1a000) and dev 0(=19000) islpc50:2486599:2486684 [2] NCCL INFO Could not enable P2P between dev 0(=19000) and dev 1(=1a000) islpc50:2486598:2486683 [1] NCCL INFO Could 
not enable P2P between dev 1(=1a000) and dev 0(=19000) islpc50:2486598:2486683 [1] NCCL INFO Could not enable P2P between dev 0(=19000) and dev 1(=1a000) islpc50:2486599:2486684 [2] NCCL INFO Could not enable P2P between dev 1(=1a000) and dev 0(=19000) islpc50:2486599:2486684 [2] NCCL INFO Could not enable P2P between dev 0(=19000) and dev 1(=1a000) islpc50:2486599:2486684 [2] NCCL INFO Trees [0] -1/-1/-1->2->1 [1] -1/-1/-1->2->1 islpc50:2486598:2486683 [1] NCCL INFO Trees [0] 2/-1/-1->1->0 [1] 2/-1/-1->1->0 islpc50:2486599:2486684 [2] NCCL INFO Setting affinity for GPU 2 to ffffff islpc50:2486597:2486682 [0] NCCL INFO Channel 00/02 : 0 1 2 islpc50:2486598:2486683 [1] NCCL INFO Setting affinity for GPU 1 to ffffff islpc50:2486597:2486682 [0] NCCL INFO Channel 01/02 : 0 1 2 islpc50:2486597:2486682 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 islpc50:2486597:2486682 [0] NCCL INFO Setting affinity for GPU 0 to ffffff islpc50:2486598:2486683 [1] NCCL INFO Could not enable P2P between dev 1(=1a000) and dev 0(=19000) islpc50:2486598:2486683 [1] NCCL INFO Could not enable P2P between dev 1(=1a000) and dev 0(=19000) islpc50:2486597:2486682 [0] NCCL INFO Could not enable P2P between dev 0(=19000) and dev 1(=1a000) islpc50:2486599:2486684 [2] NCCL INFO Channel 00 : 2[67000] -> 0[19000] via direct shared memory islpc50:2486598:2486683 [1] NCCL INFO Channel 00 : 1[1a000] -> 2[67000] via direct shared memory islpc50:2486597:2486682 [0] NCCL INFO Channel 00 : 0[19000] -> 1[1a000] via direct shared memory islpc50:2486597:2486682 [0] NCCL INFO Could not enable P2P between dev 0(=19000) and dev 1(=1a000) islpc50:2486599:2486684 [2] NCCL INFO Channel 01 : 2[67000] -> 0[19000] via direct shared memory islpc50:2486598:2486683 [1] NCCL INFO Channel 01 : 1[1a000] -> 2[67000] via direct shared memory islpc50:2486597:2486682 [0] NCCL INFO Channel 01 : 0[19000] -> 1[1a000] via direct shared memory islpc50:2486599:2486684 [2] NCCL INFO Connected all rings 
islpc50:2486598:2486683 [1] NCCL INFO Connected all rings
islpc50:2486599:2486684 [2] NCCL INFO Channel 00 : 2[67000] -> 1[1a000] via direct shared memory
islpc50:2486599:2486684 [2] NCCL INFO Channel 01 : 2[67000] -> 1[1a000] via direct shared memory
islpc50:2486597:2486682 [0] NCCL INFO Connected all rings
islpc50:2486597:2486682 [0] NCCL INFO Could not enable P2P between dev 0(=19000) and dev 1(=1a000)
islpc50:2486598:2486683 [1] NCCL INFO Could not enable P2P between dev 1(=1a000) and dev 0(=19000)
islpc50:2486598:2486683 [1] NCCL INFO Channel 00 : 1[1a000] -> 0[19000] via direct shared memory
islpc50:2486598:2486683 [1] NCCL INFO Could not enable P2P between dev 1(=1a000) and dev 0(=19000)
islpc50:2486598:2486683 [1] NCCL INFO Channel 01 : 1[1a000] -> 0[19000] via direct shared memory
islpc50:2486597:2486682 [0] NCCL INFO Could not enable P2P between dev 0(=19000) and dev 1(=1a000)
islpc50:2486599:2486684 [2] NCCL INFO Connected all trees
islpc50:2486599:2486684 [2] NCCL INFO threadThresholds 8/8/64 | 24/8/64 | 8/8/512
islpc50:2486599:2486684 [2] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer
islpc50:2486597:2486682 [0] NCCL INFO Connected all trees
islpc50:2486597:2486682 [0] NCCL INFO threadThresholds 8/8/64 | 24/8/64 | 8/8/512
islpc50:2486597:2486682 [0] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer
islpc50:2486598:2486683 [1] NCCL INFO Connected all trees
islpc50:2486598:2486683 [1] NCCL INFO threadThresholds 8/8/64 | 24/8/64 | 8/8/512
islpc50:2486598:2486683 [1] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer
islpc50:2486597:2486682 [0] NCCL INFO comm 0x7f10d4002fb0 rank 0 nranks 3 cudaDev 0 busId 19000 - Init COMPLETE
islpc50:2486598:2486683 [1] NCCL INFO comm 0x7f2888002fb0 rank 1 nranks 3 cudaDev 1 busId 1a000 - Init COMPLETE
islpc50:2486597:2486597 [0] NCCL INFO Launch mode Parallel
islpc50:2486599:2486684 [2] NCCL INFO comm 0x7f385c002fb0 rank 2 nranks 3 cudaDev 2 busId 67000 - Init COMPLETE
[islpc50:0/3] 2022-06-11 13:31:17,827 (trainer:280) INFO: 1/30epoch started
[s3prl.upstream.experts] Warning: can not import s3prl.upstream.byol_a.expert: No module named 'easydict'. Pass.
[s3prl.hub] Warning: can not import s3prl.upstream.byol_a.hubconf: No module named 'easydict'. Please see upstream/byol_a/README.md
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.quesst14_dtw.expert: No module named 'dtw'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.separation_stft.expert: No module named 'asteroid'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.enhancement_stft.expert: No module named 'asteroid'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.speech_commands.expert: No module named 'catalyst'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.a2a-vc-vctk.expert: No module named 'resemblyzer'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.voxceleb2_ge2e.expert: No module named 'sox'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.sv_voxceleb1.expert: No module named 'sox'. Pass.
Using cache found in ./hub/s3prl_cache/1c76d6e88090f01736036b28dc995fef583f47f42662d55286332557f957609f for https://dl.fbaipublicfiles.com/fairseq/wav2vec/wav2vec_vox_new.pt
>> inserted adapters to the following layers: 0, 1, 2
* original model weights: 317,390,592
* new model weights - all: 319,757,184
* new model weights - trainable: 2,366,592 ( 0.75% of original model)
[s3prl.upstream.experts] Warning: can not import s3prl.upstream.byol_a.expert: No module named 'easydict'. Pass.
[s3prl.hub] Warning: can not import s3prl.upstream.byol_a.hubconf: No module named 'easydict'. Please see upstream/byol_a/README.md
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.quesst14_dtw.expert: No module named 'dtw'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.separation_stft.expert: No module named 'asteroid'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.enhancement_stft.expert: No module named 'asteroid'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.speech_commands.expert: No module named 'catalyst'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.a2a-vc-vctk.expert: No module named 'resemblyzer'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.voxceleb2_ge2e.expert: No module named 'sox'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.sv_voxceleb1.expert: No module named 'sox'. Pass.
Using cache found in ./hub/s3prl_cache/1c76d6e88090f01736036b28dc995fef583f47f42662d55286332557f957609f for https://dl.fbaipublicfiles.com/fairseq/wav2vec/wav2vec_vox_new.pt
>> inserted adapters to the following layers: 0, 1, 2
* original model weights: 317,390,592
* new model weights - all: 319,757,184
* new model weights - trainable: 2,366,592 ( 0.75% of original model)
[s3prl.upstream.experts] Warning: can not import s3prl.upstream.byol_a.expert: No module named 'easydict'. Pass.
[s3prl.hub] Warning: can not import s3prl.upstream.byol_a.hubconf: No module named 'easydict'. Please see upstream/byol_a/README.md
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.quesst14_dtw.expert: No module named 'dtw'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.separation_stft.expert: No module named 'asteroid'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.enhancement_stft.expert: No module named 'asteroid'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.speech_commands.expert: No module named 'catalyst'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.a2a-vc-vctk.expert: No module named 'resemblyzer'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.voxceleb2_ge2e.expert: No module named 'sox'. Pass.
[s3prl.downstream.experts] Warning: can not import s3prl.downstream.sv_voxceleb1.expert: No module named 'sox'. Pass.
Using cache found in ./hub/s3prl_cache/1c76d6e88090f01736036b28dc995fef583f47f42662d55286332557f957609f for https://dl.fbaipublicfiles.com/fairseq/wav2vec/wav2vec_vox_new.pt
>> inserted adapters to the following layers: 0, 1, 2
* original model weights: 317,390,592
* new model weights - all: 319,757,184
* new model weights - trainable: 2,366,592 ( 0.75% of original model)
[islpc50:0/3] 2022-06-11 13:31:27,680 (distributed:874) INFO: Reducer buckets have been rebuilt in this iteration.
[islpc50:0/3] 2022-06-11 13:53:02,423 (trainer:678) INFO: 1epoch:train:1-4083batch: iter_time=3.909e-04, forward_time=0.147, loss_ctc=189.481, loss=189.481, backward_time=0.054, optim_step_time=0.003, optim0_lr0=5.110e-06, train_time=0.639
[islpc50:0/3] 2022-06-11 14:14:41,914 (trainer:678) INFO: 1epoch:train:4084-8166batch: iter_time=3.290e-04, forward_time=0.147, loss_ctc=145.488, loss=145.488, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.532e-05, train_time=0.636
[islpc50:0/3] 2022-06-11 14:36:20,011 (trainer:678) INFO: 1epoch:train:8167-12249batch: iter_time=2.461e-04, forward_time=0.147, loss_ctc=142.253, loss=142.253, backward_time=0.053, optim_step_time=0.003, optim0_lr0=2.553e-05, train_time=0.636
[islpc50:0/3] 2022-06-11 14:58:00,521 (trainer:678) INFO: 1epoch:train:12250-16332batch: iter_time=0.002, forward_time=0.147, loss_ctc=140.910, loss=140.910, backward_time=0.054, optim_step_time=0.003, optim0_lr0=3.573e-05, train_time=0.637
[islpc50:0/3] 2022-06-11 15:19:38,050 (trainer:678) INFO: 1epoch:train:16333-20415batch: iter_time=1.437e-04, forward_time=0.147, loss_ctc=140.260, loss=140.260, backward_time=0.053, optim_step_time=0.003, optim0_lr0=4.594e-05, train_time=0.635
[islpc50:0/3] 2022-06-11 15:41:22,708 (trainer:678) INFO: 1epoch:train:20416-24498batch: iter_time=1.936e-04, forward_time=0.148, loss_ctc=136.974, loss=136.974, backward_time=0.053, optim_step_time=0.003, optim0_lr0=5.615e-05, train_time=0.639
[islpc50:0/3] 2022-06-11 16:03:11,332 (trainer:678) INFO: 1epoch:train:24499-28581batch: iter_time=0.002, forward_time=0.147, loss_ctc=130.515, loss=130.515, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.635e-05, train_time=0.641
[islpc50:0/3] 2022-06-11 16:24:48,997 (trainer:678) INFO: 1epoch:train:28582-32664batch: iter_time=4.174e-04, forward_time=0.147, loss_ctc=122.994, loss=122.994, backward_time=0.053, optim_step_time=0.003, optim0_lr0=7.656e-05, train_time=0.635
[islpc50:0/3] 2022-06-11 16:46:26,818 (trainer:678) INFO: 1epoch:train:32665-36747batch: iter_time=3.312e-04, forward_time=0.147, loss_ctc=112.917, loss=112.917, backward_time=0.053, optim_step_time=0.003, optim0_lr0=8.677e-05, train_time=0.636
[islpc50:0/3] 2022-06-11 17:08:01,044 (trainer:678) INFO: 1epoch:train:36748-40830batch: iter_time=5.898e-04, forward_time=0.146, loss_ctc=101.946, loss=101.946, backward_time=0.053, optim_step_time=0.003, optim0_lr0=9.698e-05, train_time=0.634
[islpc50:0/3] 2022-06-11 17:29:50,972 (trainer:678) INFO: 1epoch:train:40831-44913batch: iter_time=0.003, forward_time=0.145, loss_ctc=92.496, loss=92.496, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.072e-04, train_time=0.642
[islpc50:0/3] 2022-06-11 17:51:42,040 (trainer:678) INFO: 1epoch:train:44914-48996batch: iter_time=0.005, forward_time=0.146, loss_ctc=85.058, loss=85.058, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.174e-04, train_time=0.642
[islpc50:0/3] 2022-06-11 18:13:48,849 (trainer:678) INFO: 1epoch:train:48997-53079batch: iter_time=0.009, forward_time=0.147, loss_ctc=78.277, loss=78.277, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.276e-04, train_time=0.650
[islpc50:0/3] 2022-06-11 18:35:35,708 (trainer:678) INFO: 1epoch:train:53080-57162batch: iter_time=0.005, forward_time=0.146, loss_ctc=72.239, loss=72.239, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.378e-04, train_time=0.640
[islpc50:0/3] 2022-06-11 18:57:13,576 (trainer:678) INFO: 1epoch:train:57163-61245batch: iter_time=5.353e-04, forward_time=0.147, loss_ctc=67.370, loss=67.370, backward_time=0.054, optim_step_time=0.003, optim0_lr0=1.480e-04, train_time=0.636
[islpc50:0/3] 2022-06-11 19:18:56,772 (trainer:678) INFO: 1epoch:train:61246-65328batch: iter_time=0.002, forward_time=0.147, loss_ctc=63.521, loss=63.521, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.582e-04, train_time=0.638
[islpc50:0/3] 2022-06-11 19:40:31,111 (trainer:678) INFO: 1epoch:train:65329-69411batch: iter_time=0.002, forward_time=0.146, loss_ctc=59.617, loss=59.617, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.684e-04, train_time=0.634
[islpc50:0/3] 2022-06-11 20:02:15,391 (trainer:678) INFO: 1epoch:train:69412-73494batch: iter_time=0.003, forward_time=0.147, loss_ctc=56.982, loss=56.982, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.786e-04, train_time=0.639
[islpc50:0/3] 2022-06-11 20:23:56,681 (trainer:678) INFO: 1epoch:train:73495-77577batch: iter_time=0.001, forward_time=0.147, loss_ctc=53.999, loss=53.999, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.888e-04, train_time=0.637
[islpc50:0/3] 2022-06-11 20:45:36,537 (trainer:678) INFO: 1epoch:train:77578-81660batch: iter_time=7.228e-04, forward_time=0.147, loss_ctc=52.154, loss=52.154, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.978e-04, train_time=0.636
[islpc50:0/3] 2022-06-11 20:50:16,926 (trainer:334) INFO: 1epoch results: [train] iter_time=0.002, forward_time=0.147, loss_ctc=102.281, loss=102.281, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.020e-04, train_time=0.638, time=7 hours, 14 minutes and 23.47 seconds, total_count=81666, gpu_max_cached_mem_GB=9.197, [valid] loss_ctc=59.490, cer_ctc=0.288, loss_att=nan, acc=nan, cer=nan, wer=nan, loss=59.490, time=4 minutes and 21.4 seconds, total_count=1254, gpu_max_cached_mem_GB=9.197, [att_plot] time=14.19 seconds, total_count=0, gpu_max_cached_mem_GB=9.197
[islpc50:0/3] 2022-06-11 20:51:12,141 (trainer:382) INFO: The best model has been updated: train.loss, valid.loss, valid.acc
[islpc50:0/3] 2022-06-11 20:51:12,141 (trainer:268) INFO: 2/30epoch started. Estimated time to finish: 1 week, 1 day and 20 hours
[islpc50:0/3] 2022-06-11 21:36:31,133 (trainer:678) INFO: 2epoch:train:1-4083batch: iter_time=0.245, forward_time=0.146, loss_ctc=49.090, loss=49.090, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.955e-04, train_time=1.332
[islpc50:0/3] 2022-06-11 22:18:42,024 (trainer:678) INFO: 2epoch:train:4084-8166batch: iter_time=0.172, forward_time=0.145, loss_ctc=46.330, loss=46.330, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.909e-04, train_time=1.239
[islpc50:0/3] 2022-06-11 22:43:13,266 (trainer:678) INFO: 2epoch:train:8167-12249batch: iter_time=0.017, forward_time=0.147, loss_ctc=46.015, loss=46.015, backward_time=0.054, optim_step_time=0.004, optim0_lr0=1.866e-04, train_time=0.720
[islpc50:0/3] 2022-06-11 23:05:37,732 (trainer:678) INFO: 2epoch:train:12250-16332batch: iter_time=0.005, forward_time=0.147, loss_ctc=44.383, loss=44.383, backward_time=0.055, optim_step_time=0.004, optim0_lr0=1.826e-04, train_time=0.658
[islpc50:0/3] 2022-06-11 23:27:53,559 (trainer:678) INFO: 2epoch:train:16333-20415batch: iter_time=0.004, forward_time=0.147, loss_ctc=42.847, loss=42.847, backward_time=0.054, optim_step_time=0.004, optim0_lr0=1.789e-04, train_time=0.654
[islpc50:0/3] 2022-06-11 23:49:48,474 (trainer:678) INFO: 2epoch:train:20416-24498batch: iter_time=0.001, forward_time=0.147, loss_ctc=41.620, loss=41.620, backward_time=0.054, optim_step_time=0.004, optim0_lr0=1.753e-04, train_time=0.644
[islpc50:0/3] 2022-06-12 00:11:27,537 (trainer:678) INFO: 2epoch:train:24499-28581batch: iter_time=5.394e-04, forward_time=0.147, loss_ctc=40.160, loss=40.160, backward_time=0.054, optim_step_time=0.004, optim0_lr0=1.720e-04, train_time=0.636
[islpc50:0/3] 2022-06-12 00:33:33,292 (trainer:678) INFO: 2epoch:train:28582-32664batch: iter_time=0.008, forward_time=0.147, loss_ctc=39.611, loss=39.611, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.688e-04, train_time=0.649
[islpc50:0/3] 2022-06-12 00:55:13,015 (trainer:678) INFO: 2epoch:train:32665-36747batch: iter_time=0.001, forward_time=0.147, loss_ctc=38.389, loss=38.389, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.658e-04, train_time=0.636
[islpc50:0/3] 2022-06-12 01:17:00,288 (trainer:678) INFO: 2epoch:train:36748-40830batch: iter_time=0.003, forward_time=0.147, loss_ctc=37.184, loss=37.184, backward_time=0.054, optim_step_time=0.003, optim0_lr0=1.630e-04, train_time=0.640
[islpc50:0/3] 2022-06-12 01:38:44,583 (trainer:678) INFO: 2epoch:train:40831-44913batch: iter_time=5.495e-04, forward_time=0.147, loss_ctc=36.951, loss=36.951, backward_time=0.054, optim_step_time=0.003, optim0_lr0=1.603e-04, train_time=0.639
[islpc50:0/3] 2022-06-12 02:01:03,078 (trainer:678) INFO: 2epoch:train:44914-48996batch: iter_time=0.012, forward_time=0.147, loss_ctc=36.285, loss=36.285, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.577e-04, train_time=0.655
[islpc50:0/3] 2022-06-12 02:22:49,049 (trainer:678) INFO: 2epoch:train:48997-53079batch: iter_time=0.002, forward_time=0.147, loss_ctc=35.380, loss=35.380, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.553e-04, train_time=0.639
[islpc50:0/3] 2022-06-12 02:45:10,808 (trainer:678) INFO: 2epoch:train:53080-57162batch: iter_time=0.012, forward_time=0.147, loss_ctc=34.490, loss=34.490, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.530e-04, train_time=0.657
[islpc50:0/3] 2022-06-12 03:06:50,413 (trainer:678) INFO: 2epoch:train:57163-61245batch: iter_time=0.002, forward_time=0.147, loss_ctc=34.311, loss=34.311, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.507e-04, train_time=0.636
[islpc50:0/3] 2022-06-12 03:28:47,687 (trainer:678) INFO: 2epoch:train:61246-65328batch: iter_time=0.008, forward_time=0.146, loss_ctc=33.544, loss=33.544, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.486e-04, train_time=0.645 [islpc50:0/3] 2022-06-12 03:52:25,143 (trainer:678) INFO: 2epoch:train:65329-69411batch: iter_time=0.034, forward_time=0.146, loss_ctc=33.036, loss=33.036, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.465e-04, train_time=0.694 [islpc50:0/3] 2022-06-12 04:14:28,713 (trainer:678) INFO: 2epoch:train:69412-73494batch: iter_time=0.007, forward_time=0.147, loss_ctc=32.686, loss=32.686, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.446e-04, train_time=0.648 [islpc50:0/3] 2022-06-12 04:36:12,295 (trainer:678) INFO: 2epoch:train:73495-77577batch: iter_time=0.003, forward_time=0.146, loss_ctc=32.204, loss=32.204, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.427e-04, train_time=0.638 [islpc50:0/3] 2022-06-12 04:57:59,330 (trainer:678) INFO: 2epoch:train:77578-81660batch: iter_time=0.003, forward_time=0.147, loss_ctc=31.782, loss=31.782, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.409e-04, train_time=0.640 [islpc50:0/3] 2022-06-12 05:02:29,118 (trainer:334) INFO: 2epoch results: [train] iter_time=0.027, forward_time=0.147, loss_ctc=38.306, loss=38.306, backward_time=0.054, optim_step_time=0.003, optim0_lr0=1.640e-04, train_time=0.715, time=8 hours, 6 minutes and 51.67 seconds, total_count=163332, gpu_max_cached_mem_GB=9.197, [valid] loss_ctc=38.605, cer_ctc=0.182, loss_att=nan, acc=nan, cer=nan, wer=nan, loss=38.605, time=4 minutes and 15.26 seconds, total_count=2508, gpu_max_cached_mem_GB=9.197, [att_plot] time=10.04 seconds, total_count=0, gpu_max_cached_mem_GB=9.197 [islpc50:0/3] 2022-06-12 05:03:19,443 (trainer:382) INFO: The best model has been updated: train.loss, valid.loss [islpc50:0/3] 2022-06-12 05:03:19,444 (trainer:268) INFO: 3/30epoch started. 
Estimated time to finish: 1 week, 2 days and 1 hour [islpc50:0/3] 2022-06-12 05:25:15,888 (trainer:678) INFO: 3epoch:train:1-4083batch: iter_time=5.122e-04, forward_time=0.148, loss_ctc=30.432, loss=30.432, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.391e-04, train_time=0.645 [islpc50:0/3] 2022-06-12 05:46:58,470 (trainer:678) INFO: 3epoch:train:4084-8166batch: iter_time=7.430e-04, forward_time=0.147, loss_ctc=30.042, loss=30.042, backward_time=0.054, optim_step_time=0.003, optim0_lr0=1.374e-04, train_time=0.638 [islpc50:0/3] 2022-06-12 06:08:39,806 (trainer:678) INFO: 3epoch:train:8167-12249batch: iter_time=4.349e-04, forward_time=0.147, loss_ctc=29.845, loss=29.845, backward_time=0.054, optim_step_time=0.003, optim0_lr0=1.358e-04, train_time=0.637 [islpc50:0/3] 2022-06-12 06:30:21,386 (trainer:678) INFO: 3epoch:train:12250-16332batch: iter_time=7.696e-04, forward_time=0.147, loss_ctc=29.395, loss=29.395, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.342e-04, train_time=0.637 [islpc50:0/3] 2022-06-12 06:52:00,319 (trainer:678) INFO: 3epoch:train:16333-20415batch: iter_time=0.002, forward_time=0.146, loss_ctc=28.876, loss=28.876, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.327e-04, train_time=0.636 [islpc50:0/3] 2022-06-12 07:13:44,818 (trainer:678) INFO: 3epoch:train:20416-24498batch: iter_time=7.595e-04, forward_time=0.147, loss_ctc=29.208, loss=29.208, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.312e-04, train_time=0.639 [islpc50:0/3] 2022-06-12 07:35:23,347 (trainer:678) INFO: 3epoch:train:24499-28581batch: iter_time=3.038e-04, forward_time=0.147, loss_ctc=28.503, loss=28.503, backward_time=0.054, optim_step_time=0.003, optim0_lr0=1.298e-04, train_time=0.636 [islpc50:0/3] 2022-06-12 07:57:02,809 (trainer:678) INFO: 3epoch:train:28582-32664batch: iter_time=6.722e-04, forward_time=0.147, loss_ctc=28.412, loss=28.412, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.284e-04, train_time=0.636 [islpc50:0/3] 
2022-06-12 08:24:53,997 (trainer:678) INFO: 3epoch:train:32665-36747batch: iter_time=0.063, forward_time=0.147, loss_ctc=28.261, loss=28.261, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.271e-04, train_time=0.818 [islpc50:0/3] 2022-06-12 09:02:20,601 (trainer:678) INFO: 3epoch:train:36748-40830batch: iter_time=0.139, forward_time=0.146, loss_ctc=28.267, loss=28.267, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.258e-04, train_time=1.100 [islpc50:0/3] 2022-06-12 10:10:16,955 (trainer:678) INFO: 3epoch:train:40831-44913batch: iter_time=0.425, forward_time=0.144, loss_ctc=27.702, loss=27.702, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.246e-04, train_time=1.997 [islpc50:0/3] 2022-06-12 10:43:13,550 (trainer:678) INFO: 3epoch:train:44914-48996batch: iter_time=0.111, forward_time=0.147, loss_ctc=27.709, loss=27.709, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.234e-04, train_time=0.968 [islpc50:0/3] 2022-06-12 11:05:16,766 (trainer:678) INFO: 3epoch:train:48997-53079batch: iter_time=0.006, forward_time=0.147, loss_ctc=27.699, loss=27.699, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.222e-04, train_time=0.648 [islpc50:0/3] 2022-06-12 11:27:18,026 (trainer:678) INFO: 3epoch:train:53080-57162batch: iter_time=0.002, forward_time=0.147, loss_ctc=27.181, loss=27.181, backward_time=0.054, optim_step_time=0.004, optim0_lr0=1.210e-04, train_time=0.647 [islpc50:0/3] 2022-06-12 11:50:23,966 (trainer:678) INFO: 3epoch:train:57163-61245batch: iter_time=0.017, forward_time=0.147, loss_ctc=26.846, loss=26.846, backward_time=0.054, optim_step_time=0.004, optim0_lr0=1.199e-04, train_time=0.679 [islpc50:0/3] 2022-06-12 12:16:52,496 (trainer:678) INFO: 3epoch:train:61246-65328batch: iter_time=0.036, forward_time=0.146, loss_ctc=26.744, loss=26.744, backward_time=0.054, optim_step_time=0.003, optim0_lr0=1.188e-04, train_time=0.778 [islpc50:0/3] 2022-06-12 12:38:35,434 (trainer:678) INFO: 3epoch:train:65329-69411batch: 
iter_time=0.002, forward_time=0.146, loss_ctc=26.442, loss=26.442, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.178e-04, train_time=0.638 [islpc50:0/3] 2022-06-12 13:20:12,831 (trainer:678) INFO: 3epoch:train:69412-73494batch: iter_time=0.192, forward_time=0.146, loss_ctc=26.421, loss=26.421, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.167e-04, train_time=1.223 [islpc50:0/3] 2022-06-12 14:10:00,036 (trainer:678) INFO: 3epoch:train:73495-77577batch: iter_time=0.273, forward_time=0.146, loss_ctc=26.133, loss=26.133, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.157e-04, train_time=1.461 [islpc50:0/3] 2022-06-12 15:05:48,750 (trainer:678) INFO: 3epoch:train:77578-81660batch: iter_time=0.333, forward_time=0.147, loss_ctc=26.070, loss=26.070, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.148e-04, train_time=1.642 [islpc50:0/3] 2022-06-12 15:10:22,347 (trainer:334) INFO: 3epoch results: [train] iter_time=0.080, forward_time=0.147, loss_ctc=28.009, loss=28.009, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.258e-04, train_time=0.885, time=10 hours, 2 minutes and 33.97 seconds, total_count=244998, gpu_max_cached_mem_GB=9.197, [valid] loss_ctc=33.220, cer_ctc=0.159, loss_att=nan, acc=nan, cer=nan, wer=nan, loss=33.220, time=4 minutes and 17.05 seconds, total_count=3762, gpu_max_cached_mem_GB=9.197, [att_plot] time=11.88 seconds, total_count=0, gpu_max_cached_mem_GB=9.197 [islpc50:0/3] 2022-06-12 15:10:59,862 (trainer:382) INFO: The best model has been updated: train.loss, valid.loss [islpc50:0/3] 2022-06-12 15:10:59,863 (trainer:268) INFO: 4/30epoch started. 
Estimated time to finish: 1 week, 2 days and 14 hours [islpc50:0/3] 2022-06-12 15:36:43,052 (trainer:678) INFO: 4epoch:train:1-4083batch: iter_time=0.032, forward_time=0.146, loss_ctc=24.473, loss=24.473, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.138e-04, train_time=0.756 [islpc50:0/3] 2022-06-12 15:58:26,436 (trainer:678) INFO: 4epoch:train:4084-8166batch: iter_time=2.295e-04, forward_time=0.147, loss_ctc=24.445, loss=24.445, backward_time=0.054, optim_step_time=0.003, optim0_lr0=1.129e-04, train_time=0.638 [islpc50:0/3] 2022-06-12 16:20:06,957 (trainer:678) INFO: 4epoch:train:8167-12249batch: iter_time=0.001, forward_time=0.147, loss_ctc=24.365, loss=24.365, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.120e-04, train_time=0.637 [islpc50:0/3] 2022-06-12 16:41:53,848 (trainer:678) INFO: 4epoch:train:12250-16332batch: iter_time=0.002, forward_time=0.147, loss_ctc=24.288, loss=24.288, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.111e-04, train_time=0.640 [islpc50:0/3] 2022-06-12 17:03:37,790 (trainer:678) INFO: 4epoch:train:16333-20415batch: iter_time=3.871e-04, forward_time=0.147, loss_ctc=24.116, loss=24.116, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.102e-04, train_time=0.639 [islpc50:0/3] 2022-06-12 17:25:19,801 (trainer:678) INFO: 4epoch:train:20416-24498batch: iter_time=0.001, forward_time=0.148, loss_ctc=24.180, loss=24.180, backward_time=0.052, optim_step_time=0.004, optim0_lr0=1.094e-04, train_time=0.638 [islpc50:0/3] 2022-06-12 17:47:05,914 (trainer:678) INFO: 4epoch:train:24499-28581batch: iter_time=0.003, forward_time=0.148, loss_ctc=23.946, loss=23.946, backward_time=0.052, optim_step_time=0.004, optim0_lr0=1.086e-04, train_time=0.640 [islpc50:0/3] 2022-06-12 18:09:07,674 (trainer:678) INFO: 4epoch:train:28582-32664batch: iter_time=0.007, forward_time=0.148, loss_ctc=23.945, loss=23.945, backward_time=0.051, optim_step_time=0.004, optim0_lr0=1.078e-04, train_time=0.647 [islpc50:0/3] 2022-06-12 
18:33:32,346 (trainer:678) INFO: 4epoch:train:32665-36747batch: iter_time=0.035, forward_time=0.149, loss_ctc=23.532, loss=23.532, backward_time=0.052, optim_step_time=0.004, optim0_lr0=1.070e-04, train_time=0.717 [islpc50:0/3] 2022-06-12 19:02:07,429 (trainer:678) INFO: 4epoch:train:36748-40830batch: iter_time=0.086, forward_time=0.149, loss_ctc=23.718, loss=23.718, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.062e-04, train_time=0.840 [islpc50:0/3] 2022-06-12 19:30:04,243 (trainer:678) INFO: 4epoch:train:40831-44913batch: iter_time=0.076, forward_time=0.149, loss_ctc=23.173, loss=23.173, backward_time=0.052, optim_step_time=0.004, optim0_lr0=1.054e-04, train_time=0.821 [islpc50:0/3] 2022-06-12 19:52:05,544 (trainer:678) INFO: 4epoch:train:44914-48996batch: iter_time=0.004, forward_time=0.150, loss_ctc=23.818, loss=23.818, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.047e-04, train_time=0.647 [islpc50:0/3] 2022-06-12 20:32:04,024 (trainer:678) INFO: 4epoch:train:48997-53079batch: iter_time=0.229, forward_time=0.149, loss_ctc=23.292, loss=23.292, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.040e-04, train_time=1.175 [islpc50:0/3] 2022-06-12 21:21:33,701 (trainer:678) INFO: 4epoch:train:53080-57162batch: iter_time=0.323, forward_time=0.146, loss_ctc=22.869, loss=22.869, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.033e-04, train_time=1.454 [islpc50:0/3] 2022-06-12 21:43:10,815 (trainer:678) INFO: 4epoch:train:57163-61245batch: iter_time=8.433e-04, forward_time=0.146, loss_ctc=22.760, loss=22.760, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.026e-04, train_time=0.635 [islpc50:0/3] 2022-06-12 22:04:50,780 (trainer:678) INFO: 4epoch:train:61246-65328batch: iter_time=3.822e-04, forward_time=0.147, loss_ctc=22.995, loss=22.995, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.019e-04, train_time=0.637 [islpc50:0/3] 2022-06-12 22:26:30,612 (trainer:678) INFO: 4epoch:train:65329-69411batch: 
iter_time=0.001, forward_time=0.147, loss_ctc=22.948, loss=22.948, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.012e-04, train_time=0.637 [islpc50:0/3] 2022-06-12 22:48:20,098 (trainer:678) INFO: 4epoch:train:69412-73494batch: iter_time=0.006, forward_time=0.146, loss_ctc=22.627, loss=22.627, backward_time=0.053, optim_step_time=0.003, optim0_lr0=1.006e-04, train_time=0.641 [islpc50:0/3] 2022-06-12 23:11:17,224 (trainer:678) INFO: 4epoch:train:73495-77577batch: iter_time=0.022, forward_time=0.147, loss_ctc=22.641, loss=22.641, backward_time=0.053, optim_step_time=0.004, optim0_lr0=9.992e-05, train_time=0.674 [islpc50:0/3] 2022-06-12 23:33:14,133 (trainer:678) INFO: 4epoch:train:77578-81660batch: iter_time=0.007, forward_time=0.147, loss_ctc=22.712, loss=22.712, backward_time=0.053, optim_step_time=0.004, optim0_lr0=9.929e-05, train_time=0.645 [islpc50:0/3] 2022-06-12 23:37:52,836 (trainer:334) INFO: 4epoch results: [train] iter_time=0.042, forward_time=0.147, loss_ctc=23.540, loss=23.540, backward_time=0.053, optim_step_time=0.004, optim0_lr0=1.061e-04, train_time=0.738, time=8 hours, 22 minutes and 19.3 seconds, total_count=326664, gpu_max_cached_mem_GB=9.197, [valid] loss_ctc=30.503, cer_ctc=0.144, loss_att=nan, acc=nan, cer=nan, wer=nan, loss=30.503, time=4 minutes and 24.59 seconds, total_count=5016, gpu_max_cached_mem_GB=9.197, [att_plot] time=9.08 seconds, total_count=0, gpu_max_cached_mem_GB=9.197 [islpc50:0/3] 2022-06-12 23:38:32,333 (trainer:382) INFO: The best model has been updated: train.loss, valid.loss [islpc50:0/3] 2022-06-12 23:38:32,335 (trainer:268) INFO: 5/30epoch started. 
Estimated time to finish: 1 week, 2 days and 5 hours
[islpc50:0/3] 2022-06-13 00:00:20,787 (trainer:678) INFO: 5epoch:train:1-4083batch: iter_time=3.201e-04, forward_time=0.147, loss_ctc=21.178, loss=21.178, backward_time=0.053, optim_step_time=0.003, optim0_lr0=9.867e-05, train_time=0.641
[islpc50:0/3] 2022-06-13 00:21:54,875 (trainer:678) INFO: 5epoch:train:4084-8166batch: iter_time=1.799e-04, forward_time=0.147, loss_ctc=21.245, loss=21.245, backward_time=0.053, optim_step_time=0.004, optim0_lr0=9.806e-05, train_time=0.634
[islpc50:0/3] 2022-06-13 00:43:33,338 (trainer:678) INFO: 5epoch:train:8167-12249batch: iter_time=2.541e-04, forward_time=0.147, loss_ctc=21.105, loss=21.105, backward_time=0.053, optim_step_time=0.004, optim0_lr0=9.746e-05, train_time=0.636
[islpc50:0/3] 2022-06-13 01:05:12,985 (trainer:678) INFO: 5epoch:train:12250-16332batch: iter_time=3.423e-04, forward_time=0.147, loss_ctc=21.305, loss=21.305, backward_time=0.053, optim_step_time=0.004, optim0_lr0=9.688e-05, train_time=0.636
[islpc50:0/3] 2022-06-13 01:26:51,045 (trainer:678) INFO: 5epoch:train:16333-20415batch: iter_time=0.001, forward_time=0.147, loss_ctc=21.059, loss=21.059, backward_time=0.053, optim_step_time=0.004, optim0_lr0=9.630e-05, train_time=0.636
[islpc50:0/3] 2022-06-13 01:48:26,787 (trainer:678) INFO: 5epoch:train:20416-24498batch: iter_time=6.588e-04, forward_time=0.147, loss_ctc=20.941, loss=20.941, backward_time=0.053, optim_step_time=0.004, optim0_lr0=9.574e-05, train_time=0.634
[islpc50:0/3] 2022-06-13 02:10:11,368 (trainer:678) INFO: 5epoch:train:24499-28581batch: iter_time=6.304e-04, forward_time=0.148, loss_ctc=20.976, loss=20.976, backward_time=0.053, optim_step_time=0.004, optim0_lr0=9.518e-05, train_time=0.639
[islpc50:0/3] 2022-06-13 02:31:53,345 (trainer:678) INFO: 5epoch:train:28582-32664batch: iter_time=0.001, forward_time=0.147, loss_ctc=21.029, loss=21.029, backward_time=0.053, optim_step_time=0.004, optim0_lr0=9.464e-05, train_time=0.638
[islpc50:0/3] 2022-06-13 02:53:32,468 (trainer:678) INFO: 5epoch:train:32665-36747batch: iter_time=0.003, forward_time=0.146, loss_ctc=20.707, loss=20.707, backward_time=0.053, optim_step_time=0.004, optim0_lr0=9.410e-05, train_time=0.636
[islpc50:0/3] 2022-06-13 03:15:29,416 (trainer:678) INFO: 5epoch:train:36748-40830batch: iter_time=0.005, forward_time=0.147, loss_ctc=20.843, loss=20.843, backward_time=0.053, optim_step_time=0.004, optim0_lr0=9.357e-05, train_time=0.645
[islpc50:0/3] 2022-06-13 03:37:08,963 (trainer:678) INFO: 5epoch:train:40831-44913batch: iter_time=0.001, forward_time=0.147, loss_ctc=20.818, loss=20.818, backward_time=0.054, optim_step_time=0.003, optim0_lr0=9.306e-05, train_time=0.636
[islpc50:0/3] 2022-06-13 03:58:52,524 (trainer:678) INFO: 5epoch:train:44914-48996batch: iter_time=0.002, forward_time=0.147, loss_ctc=20.532, loss=20.532, backward_time=0.053, optim_step_time=0.003, optim0_lr0=9.255e-05, train_time=0.638
[islpc50:0/3] 2022-06-13 04:20:35,554 (trainer:678) INFO: 5epoch:train:48997-53079batch: iter_time=0.003, forward_time=0.146, loss_ctc=20.393, loss=20.393, backward_time=0.053, optim_step_time=0.003, optim0_lr0=9.205e-05, train_time=0.638
[islpc50:0/3] 2022-06-13 04:42:39,084 (trainer:678) INFO: 5epoch:train:53080-57162batch: iter_time=0.006, forward_time=0.147, loss_ctc=20.528, loss=20.528, backward_time=0.053, optim_step_time=0.003, optim0_lr0=9.155e-05, train_time=0.648
[islpc50:0/3] 2022-06-13 05:04:28,533 (trainer:678) INFO: 5epoch:train:57163-61245batch: iter_time=0.003, forward_time=0.147, loss_ctc=20.478, loss=20.478, backward_time=0.053, optim_step_time=0.003, optim0_lr0=9.107e-05, train_time=0.641
[islpc50:0/3] 2022-06-13 05:26:16,331 (trainer:678) INFO: 5epoch:train:61246-65328batch: iter_time=0.003, forward_time=0.147, loss_ctc=20.189, loss=20.189, backward_time=0.053, optim_step_time=0.003, optim0_lr0=9.059e-05, train_time=0.640
[islpc50:0/3] 2022-06-13 05:48:31,928 (trainer:678) INFO: 5epoch:train:65329-69411batch: iter_time=0.010, forward_time=0.148, loss_ctc=20.556, loss=20.556, backward_time=0.053, optim_step_time=0.003, optim0_lr0=9.012e-05, train_time=0.654
[islpc50:0/3] 2022-06-13 06:10:37,344 (trainer:678) INFO: 5epoch:train:69412-73494batch: iter_time=0.008, forward_time=0.147, loss_ctc=20.453, loss=20.453, backward_time=0.054, optim_step_time=0.003, optim0_lr0=8.965e-05, train_time=0.649
[islpc50:0/3] 2022-06-13 06:32:33,516 (trainer:678) INFO: 5epoch:train:73495-77577batch: iter_time=0.007, forward_time=0.146, loss_ctc=20.150, loss=20.150, backward_time=0.053, optim_step_time=0.003, optim0_lr0=8.920e-05, train_time=0.645
[islpc50:0/3] 2022-06-13 06:54:15,864 (trainer:678) INFO: 5epoch:train:77578-81660batch: iter_time=0.002, forward_time=0.147, loss_ctc=20.112, loss=20.112, backward_time=0.053, optim_step_time=0.003, optim0_lr0=8.875e-05, train_time=0.638
[islpc50:0/3] 2022-06-13 06:58:55,028 (trainer:334) INFO: 5epoch results: [train] iter_time=0.003, forward_time=0.147, loss_ctc=20.730, loss=20.730, backward_time=0.053, optim_step_time=0.004, optim0_lr0=9.346e-05, train_time=0.640, time=7 hours, 15 minutes and 48.49 seconds, total_count=408330, gpu_max_cached_mem_GB=9.197, [valid] loss_ctc=29.017, cer_ctc=0.136, loss_att=nan, acc=nan, cer=nan, wer=nan, loss=29.017, time=4 minutes and 25.77 seconds, total_count=6270, gpu_max_cached_mem_GB=9.197, [att_plot] time=8.43 seconds, total_count=0, gpu_max_cached_mem_GB=9.197
[islpc50:0/3] 2022-06-13 06:59:40,484 (trainer:382) INFO: The best model has been updated: train.loss, valid.loss
[islpc50:0/3] 2022-06-13 06:59:40,532 (trainer:268) INFO: 6/30epoch started.
Estimated time to finish: 1 week, 1 day and 15 hours
[islpc50:0/3] 2022-06-13 07:21:31,739 (trainer:678) INFO: 6epoch:train:1-4083batch: iter_time=2.200e-04, forward_time=0.147, loss_ctc=18.802, loss=18.802, backward_time=0.054, optim_step_time=0.003, optim0_lr0=8.831e-05, train_time=0.642
[islpc50:0/3] 2022-06-13 07:43:17,702 (trainer:678) INFO: 6epoch:train:4084-8166batch: iter_time=1.542e-04, forward_time=0.148, loss_ctc=18.792, loss=18.792, backward_time=0.054, optim_step_time=0.003, optim0_lr0=8.787e-05, train_time=0.639
[islpc50:0/3] 2022-06-13 08:05:02,425 (trainer:678) INFO: 6epoch:train:8167-12249batch: iter_time=2.066e-04, forward_time=0.148, loss_ctc=18.955, loss=18.955, backward_time=0.054, optim_step_time=0.003, optim0_lr0=8.744e-05, train_time=0.639
[islpc50:0/3] 2022-06-13 08:26:39,705 (trainer:678) INFO: 6epoch:train:12250-16332batch: iter_time=2.035e-04, forward_time=0.147, loss_ctc=18.796, loss=18.796, backward_time=0.053, optim_step_time=0.003, optim0_lr0=8.702e-05, train_time=0.635
[islpc50:0/3] 2022-06-13 08:48:23,135 (trainer:678) INFO: 6epoch:train:16333-20415batch: iter_time=3.192e-04, forward_time=0.148, loss_ctc=19.051, loss=19.051, backward_time=0.054, optim_step_time=0.003, optim0_lr0=8.660e-05, train_time=0.638
[islpc50:0/3] 2022-06-13 09:10:06,576 (trainer:678) INFO: 6epoch:train:20416-24498batch: iter_time=1.452e-04, forward_time=0.148, loss_ctc=18.818, loss=18.818, backward_time=0.053, optim_step_time=0.003, optim0_lr0=8.619e-05, train_time=0.638
[islpc50:0/3] 2022-06-13 09:31:47,792 (trainer:678) INFO: 6epoch:train:24499-28581batch: iter_time=3.846e-04, forward_time=0.147, loss_ctc=18.996, loss=18.996, backward_time=0.053, optim_step_time=0.003, optim0_lr0=8.578e-05, train_time=0.637
[islpc50:0/3] 2022-06-13 09:53:28,053 (trainer:678) INFO: 6epoch:train:28582-32664batch: iter_time=0.001, forward_time=0.147, loss_ctc=18.819, loss=18.819, backward_time=0.054, optim_step_time=0.003, optim0_lr0=8.538e-05, train_time=0.637
[islpc50:0/3] 2022-06-13 10:15:04,855 (trainer:678) INFO: 6epoch:train:32665-36747batch: iter_time=4.431e-04, forward_time=0.147, loss_ctc=18.874, loss=18.874, backward_time=0.053, optim_step_time=0.003, optim0_lr0=8.499e-05, train_time=0.635
[islpc50:0/3] 2022-06-13 10:36:52,034 (trainer:678) INFO: 6epoch:train:36748-40830batch: iter_time=0.003, forward_time=0.147, loss_ctc=18.530, loss=18.530, backward_time=0.053, optim_step_time=0.003, optim0_lr0=8.460e-05, train_time=0.640
[islpc50:0/3] 2022-06-13 10:58:42,318 (trainer:678) INFO: 6epoch:train:40831-44913batch: iter_time=5.869e-04, forward_time=0.148, loss_ctc=18.828, loss=18.828, backward_time=0.054, optim_step_time=0.003, optim0_lr0=8.422e-05, train_time=0.642
[islpc50:0/3] 2022-06-13 11:20:27,536 (trainer:678) INFO: 6epoch:train:44914-48996batch: iter_time=0.002, forward_time=0.147, loss_ctc=18.568, loss=18.568, backward_time=0.054, optim_step_time=0.003, optim0_lr0=8.384e-05, train_time=0.639
[islpc50:0/3] 2022-06-13 11:42:13,645 (trainer:678) INFO: 6epoch:train:48997-53079batch: iter_time=3.780e-04, forward_time=0.148, loss_ctc=18.765, loss=18.765, backward_time=0.054, optim_step_time=0.003, optim0_lr0=8.346e-05, train_time=0.640
[islpc50:0/3] 2022-06-13 12:04:00,465 (trainer:678) INFO: 6epoch:train:53080-57162batch: iter_time=3.692e-04, forward_time=0.148, loss_ctc=18.794, loss=18.794, backward_time=0.054, optim_step_time=0.003, optim0_lr0=8.309e-05, train_time=0.640
[islpc50:0/3] 2022-06-13 12:32:58,310 (trainer:678) INFO: 6epoch:train:57163-61245batch: iter_time=0.067, forward_time=0.147, loss_ctc=18.633, loss=18.633, backward_time=0.053, optim_step_time=0.003, optim0_lr0=8.273e-05, train_time=0.851
[islpc50:0/3] 2022-06-13 13:39:28,738 (trainer:678) INFO: 6epoch:train:61246-65328batch: iter_time=0.459, forward_time=0.146, loss_ctc=18.400, loss=18.400, backward_time=0.053, optim_step_time=0.004, optim0_lr0=8.237e-05, train_time=1.954
[islpc50:0/3] 2022-06-13 14:36:10,843 (trainer:678) INFO: 6epoch:train:65329-69411batch: iter_time=0.371, forward_time=0.146, loss_ctc=18.651, loss=18.651, backward_time=0.053, optim_step_time=0.004, optim0_lr0=8.202e-05, train_time=1.666
[islpc50:0/3] 2022-06-13 15:04:26,972 (trainer:678) INFO: 6epoch:train:69412-73494batch: iter_time=0.099, forward_time=0.147, loss_ctc=18.430, loss=18.430, backward_time=0.053, optim_step_time=0.004, optim0_lr0=8.167e-05, train_time=0.831
[islpc50:0/3] 2022-06-13 15:28:43,124 (trainer:678) INFO: 6epoch:train:73495-77577batch: iter_time=0.038, forward_time=0.147, loss_ctc=18.606, loss=18.606, backward_time=0.053, optim_step_time=0.004, optim0_lr0=8.132e-05, train_time=0.713
[islpc50:0/3] 2022-06-13 15:53:47,649 (trainer:678) INFO: 6epoch:train:77578-81660batch: iter_time=0.041, forward_time=0.147, loss_ctc=18.483, loss=18.483, backward_time=0.053, optim_step_time=0.004, optim0_lr0=8.098e-05, train_time=0.737
[islpc50:0/3] 2022-06-13 15:58:47,898 (trainer:334) INFO: 6epoch results: [train] iter_time=0.054, forward_time=0.147, loss_ctc=18.729, loss=18.729, backward_time=0.053, optim_step_time=0.003, optim0_lr0=8.449e-05, train_time=0.785, time=8 hours, 54 minutes and 12.09 seconds, total_count=489996, gpu_max_cached_mem_GB=9.197, [valid] loss_ctc=27.973, cer_ctc=0.130, loss_att=nan, acc=nan, cer=nan, wer=nan, loss=27.973, time=4 minutes and 29.6 seconds, total_count=7524, gpu_max_cached_mem_GB=9.197, [att_plot] time=25.68 seconds, total_count=0, gpu_max_cached_mem_GB=9.197
[islpc50:0/3] 2022-06-13 15:59:30,927 (trainer:382) INFO: The best model has been updated: train.loss, valid.loss
[islpc50:0/3] 2022-06-13 15:59:30,934 (trainer:268) INFO: 7/30epoch started.
Estimated time to finish: 1 week, 1 day and 9 hours
[islpc50:0/3] 2022-06-13 16:46:03,579 (trainer:678) INFO: 7epoch:train:1-4083batch: iter_time=0.222, forward_time=0.146, loss_ctc=17.228, loss=17.228, backward_time=0.053, optim_step_time=0.004, optim0_lr0=8.064e-05, train_time=1.368
[islpc50:0/3] 2022-06-13 17:33:28,496 (trainer:678) INFO: 7epoch:train:4084-8166batch: iter_time=0.255, forward_time=0.146, loss_ctc=17.391, loss=17.391, backward_time=0.053, optim_step_time=0.004, optim0_lr0=8.031e-05, train_time=1.393
[islpc50:0/3] 2022-06-13 17:59:49,859 (trainer:678) INFO: 7epoch:train:8167-12249batch: iter_time=0.043, forward_time=0.147, loss_ctc=17.159, loss=17.159, backward_time=0.054, optim_step_time=0.004, optim0_lr0=7.998e-05, train_time=0.774
[islpc50:0/3] 2022-06-13 18:21:46,567 (trainer:678) INFO: 7epoch:train:12250-16332batch: iter_time=0.005, forward_time=0.147, loss_ctc=16.942, loss=16.942, backward_time=0.054, optim_step_time=0.003, optim0_lr0=7.966e-05, train_time=0.645
[islpc50:0/3] 2022-06-13 18:43:45,118 (trainer:678) INFO: 7epoch:train:16333-20415batch: iter_time=0.005, forward_time=0.147, loss_ctc=17.464, loss=17.464, backward_time=0.054, optim_step_time=0.003, optim0_lr0=7.934e-05, train_time=0.646
[islpc50:0/3] 2022-06-13 19:05:28,462 (trainer:678) INFO: 7epoch:train:20416-24498batch: iter_time=8.441e-04, forward_time=0.147, loss_ctc=17.506, loss=17.506, backward_time=0.054, optim_step_time=0.003, optim0_lr0=7.902e-05, train_time=0.638
[islpc50:0/3] 2022-06-13 19:27:13,835 (trainer:678) INFO: 7epoch:train:24499-28581batch: iter_time=0.002, forward_time=0.147, loss_ctc=17.093, loss=17.093, backward_time=0.054, optim_step_time=0.003, optim0_lr0=7.871e-05, train_time=0.639
[islpc50:0/3] 2022-06-13 19:48:58,918 (trainer:678) INFO: 7epoch:train:28582-32664batch: iter_time=0.001, forward_time=0.147, loss_ctc=17.265, loss=17.265, backward_time=0.053, optim_step_time=0.003, optim0_lr0=7.840e-05, train_time=0.639
[islpc50:0/3] 2022-06-13 20:10:41,875 (trainer:678) INFO: 7epoch:train:32665-36747batch: iter_time=0.001, forward_time=0.147, loss_ctc=17.339, loss=17.339, backward_time=0.054, optim_step_time=0.003, optim0_lr0=7.809e-05, train_time=0.638
[islpc50:0/3] 2022-06-13 20:32:54,950 (trainer:678) INFO: 7epoch:train:36748-40830batch: iter_time=0.008, forward_time=0.147, loss_ctc=16.973, loss=16.973, backward_time=0.054, optim_step_time=0.003, optim0_lr0=7.779e-05, train_time=0.653
[islpc50:0/3] 2022-06-13 20:55:21,048 (trainer:678) INFO: 7epoch:train:40831-44913batch: iter_time=0.012, forward_time=0.148, loss_ctc=17.292, loss=17.292, backward_time=0.054, optim_step_time=0.004, optim0_lr0=7.749e-05, train_time=0.659
[islpc50:0/3] 2022-06-13 21:17:30,717 (trainer:678) INFO: 7epoch:train:44914-48996batch: iter_time=0.008, forward_time=0.147, loss_ctc=17.270, loss=17.270, backward_time=0.053, optim_step_time=0.004, optim0_lr0=7.720e-05, train_time=0.651
[islpc50:0/3] 2022-06-13 21:39:55,332 (trainer:678) INFO: 7epoch:train:48997-53079batch: iter_time=0.013, forward_time=0.146, loss_ctc=17.085, loss=17.085, backward_time=0.054, optim_step_time=0.004, optim0_lr0=7.691e-05, train_time=0.658
[islpc50:0/3] 2022-06-13 22:03:09,327 (trainer:678) INFO: 7epoch:train:53080-57162batch: iter_time=0.026, forward_time=0.148, loss_ctc=17.349, loss=17.349, backward_time=0.054, optim_step_time=0.004, optim0_lr0=7.662e-05, train_time=0.683
[islpc50:0/3] 2022-06-13 22:25:07,099 (trainer:678) INFO: 7epoch:train:57163-61245batch: iter_time=0.007, forward_time=0.146, loss_ctc=17.011, loss=17.011, backward_time=0.054, optim_step_time=0.003, optim0_lr0=7.633e-05, train_time=0.645
[islpc50:0/3] 2022-06-13 22:46:53,616 (trainer:678) INFO: 7epoch:train:61246-65328batch: iter_time=0.002, forward_time=0.147, loss_ctc=17.282, loss=17.282, backward_time=0.054, optim_step_time=0.003, optim0_lr0=7.605e-05, train_time=0.640
[islpc50:0/3] 2022-06-13 23:26:18,867 (trainer:678) INFO: 7epoch:train:65329-69411batch: iter_time=0.174, forward_time=0.147, loss_ctc=17.112, loss=17.112, backward_time=0.053, optim_step_time=0.004, optim0_lr0=7.577e-05, train_time=1.159
[islpc50:0/3] 2022-06-13 23:48:38,261 (trainer:678) INFO: 7epoch:train:69412-73494batch: iter_time=0.010, forward_time=0.147, loss_ctc=16.771, loss=16.771, backward_time=0.050, optim_step_time=0.004, optim0_lr0=7.550e-05, train_time=0.656
[islpc50:0/3] 2022-06-14 00:10:19,084 (trainer:678) INFO: 7epoch:train:73495-77577batch: iter_time=8.710e-04, forward_time=0.147, loss_ctc=16.903, loss=16.903, backward_time=0.052, optim_step_time=0.003, optim0_lr0=7.522e-05, train_time=0.637
[islpc50:0/3] 2022-06-14 00:32:42,166 (trainer:678) INFO: 7epoch:train:77578-81660batch: iter_time=0.014, forward_time=0.146, loss_ctc=16.971, loss=16.971, backward_time=0.053, optim_step_time=0.003, optim0_lr0=7.495e-05, train_time=0.658
[islpc50:0/3] 2022-06-14 00:37:38,212 (trainer:334) INFO: 7epoch results: [train] iter_time=0.040, forward_time=0.147, loss_ctc=17.169, loss=17.169, backward_time=0.053, optim_step_time=0.004, optim0_lr0=7.770e-05, train_time=0.754, time=8 hours, 33 minutes and 16.7 seconds, total_count=571662, gpu_max_cached_mem_GB=9.197, [valid] loss_ctc=27.443, cer_ctc=0.125, loss_att=nan, acc=nan, cer=nan, wer=nan, loss=27.443, time=4 minutes and 41.74 seconds, total_count=8778, gpu_max_cached_mem_GB=9.197, [att_plot] time=8.84 seconds, total_count=0, gpu_max_cached_mem_GB=9.197
[islpc50:0/3] 2022-06-14 00:38:13,151 (trainer:382) INFO: The best model has been updated: train.loss, valid.loss
[islpc50:0/3] 2022-06-14 00:38:13,152 (trainer:268) INFO: 8/30epoch started.
Estimated time to finish: 1 week, 1 day and 2 hours
[islpc50:0/3] 2022-06-14 01:00:24,114 (trainer:678) INFO: 8epoch:train:1-4083batch: iter_time=0.007, forward_time=0.147, loss_ctc=15.648, loss=15.648, backward_time=0.053, optim_step_time=0.003, optim0_lr0=7.468e-05, train_time=0.652
[islpc50:0/3] 2022-06-14 01:22:13,264 (trainer:678) INFO: 8epoch:train:4084-8166batch: iter_time=0.002, forward_time=0.148, loss_ctc=15.931, loss=15.931, backward_time=0.053, optim_step_time=0.003, optim0_lr0=7.442e-05, train_time=0.641
[islpc50:0/3] 2022-06-14 01:43:58,428 (trainer:678) INFO: 8epoch:train:8167-12249batch: iter_time=0.003, forward_time=0.147, loss_ctc=16.128, loss=16.128, backward_time=0.053, optim_step_time=0.003, optim0_lr0=7.416e-05, train_time=0.639
[islpc50:0/3] 2022-06-14 02:06:21,283 (trainer:678) INFO: 8epoch:train:12250-16332batch: iter_time=0.011, forward_time=0.147, loss_ctc=16.089, loss=16.089, backward_time=0.054, optim_step_time=0.003, optim0_lr0=7.390e-05, train_time=0.658
[islpc50:0/3] 2022-06-14 02:29:43,855 (trainer:678) INFO: 8epoch:train:16333-20415batch: iter_time=0.033, forward_time=0.146, loss_ctc=16.036, loss=16.036, backward_time=0.051, optim_step_time=0.004, optim0_lr0=7.364e-05, train_time=0.687
[islpc50:0/3] 2022-06-14 02:52:13,625 (trainer:678) INFO: 8epoch:train:20416-24498batch: iter_time=0.017, forward_time=0.147, loss_ctc=15.965, loss=15.965, backward_time=0.050, optim_step_time=0.004, optim0_lr0=7.339e-05, train_time=0.661
[islpc50:0/3] 2022-06-14 03:14:01,306 (trainer:678) INFO: 8epoch:train:24499-28581batch: iter_time=0.008, forward_time=0.146, loss_ctc=15.946, loss=15.946, backward_time=0.049, optim_step_time=0.003, optim0_lr0=7.314e-05, train_time=0.640
[islpc50:0/3] 2022-06-14 03:36:17,005 (trainer:678) INFO: 8epoch:train:28582-32664batch: iter_time=0.013, forward_time=0.147, loss_ctc=15.891, loss=15.891, backward_time=0.050, optim_step_time=0.004, optim0_lr0=7.289e-05, train_time=0.654
[islpc50:0/3] 2022-06-14 03:58:04,800 (trainer:678) INFO: 8epoch:train:32665-36747batch: iter_time=0.006, forward_time=0.146, loss_ctc=15.847, loss=15.847, backward_time=0.049, optim_step_time=0.003, optim0_lr0=7.265e-05, train_time=0.640
[islpc50:0/3] 2022-06-14 04:19:44,874 (trainer:678) INFO: 8epoch:train:36748-40830batch: iter_time=0.003, forward_time=0.147, loss_ctc=15.922, loss=15.922, backward_time=0.049, optim_step_time=0.003, optim0_lr0=7.240e-05, train_time=0.637
[islpc50:0/3] 2022-06-14 04:41:31,918 (trainer:678) INFO: 8epoch:train:40831-44913batch: iter_time=0.002, forward_time=0.147, loss_ctc=15.822, loss=15.822, backward_time=0.053, optim_step_time=0.003, optim0_lr0=7.216e-05, train_time=0.640
[islpc50:0/3] 2022-06-14 05:03:16,326 (trainer:678) INFO: 8epoch:train:44914-48996batch: iter_time=0.002, forward_time=0.147, loss_ctc=15.966, loss=15.966, backward_time=0.054, optim_step_time=0.003, optim0_lr0=7.192e-05, train_time=0.639
[islpc50:0/3] 2022-06-14 05:25:37,286 (trainer:678) INFO: 8epoch:train:48997-53079batch: iter_time=0.013, forward_time=0.146, loss_ctc=15.875, loss=15.875, backward_time=0.053, optim_step_time=0.003, optim0_lr0=7.169e-05, train_time=0.657
[islpc50:0/3] 2022-06-14 05:47:20,860 (trainer:678) INFO: 8epoch:train:53080-57162batch: iter_time=0.002, forward_time=0.147, loss_ctc=15.793, loss=15.793, backward_time=0.053, optim_step_time=0.003, optim0_lr0=7.145e-05, train_time=0.638
[islpc50:0/3] 2022-06-14 06:09:55,612 (trainer:678) INFO: 8epoch:train:57163-61245batch: iter_time=0.016, forward_time=0.147, loss_ctc=15.799, loss=15.799, backward_time=0.053, optim_step_time=0.003, optim0_lr0=7.122e-05, train_time=0.663
[islpc50:0/3] 2022-06-14 06:31:35,268 (trainer:678) INFO: 8epoch:train:61246-65328batch: iter_time=5.277e-04, forward_time=0.147, loss_ctc=15.932, loss=15.932, backward_time=0.054, optim_step_time=0.003, optim0_lr0=7.099e-05, train_time=0.636
[islpc50:0/3] 2022-06-14 06:53:19,122 (trainer:678) INFO: 8epoch:train:65329-69411batch: iter_time=6.935e-04, forward_time=0.147, loss_ctc=15.941, loss=15.941, backward_time=0.054, optim_step_time=0.003, optim0_lr0=7.076e-05, train_time=0.639
[islpc50:0/3] 2022-06-14 07:15:02,369 (trainer:678) INFO: 8epoch:train:69412-73494batch: iter_time=0.001, forward_time=0.147, loss_ctc=15.916, loss=15.916, backward_time=0.053, optim_step_time=0.003, optim0_lr0=7.054e-05, train_time=0.638
[islpc50:0/3] 2022-06-14 07:36:39,943 (trainer:678) INFO: 8epoch:train:73495-77577batch: iter_time=5.321e-04, forward_time=0.147, loss_ctc=15.725, loss=15.725, backward_time=0.053, optim_step_time=0.003, optim0_lr0=7.032e-05, train_time=0.635
[islpc50:0/3] 2022-06-14 07:58:34,882 (trainer:678) INFO: 8epoch:train:77578-81660batch: iter_time=0.004, forward_time=0.147, loss_ctc=15.744, loss=15.744, backward_time=0.053, optim_step_time=0.003, optim0_lr0=7.010e-05, train_time=0.644
[islpc50:0/3] 2022-06-14 08:03:14,027 (trainer:334) INFO: 8epoch results: [train] iter_time=0.007, forward_time=0.147, loss_ctc=15.895, loss=15.895, backward_time=0.052, optim_step_time=0.003, optim0_lr0=7.232e-05, train_time=0.647, time=7 hours, 20 minutes and 27.64 seconds, total_count=653328, gpu_max_cached_mem_GB=9.197, [valid] loss_ctc=27.068, cer_ctc=0.123, loss_att=nan, acc=nan, cer=nan, wer=nan, loss=27.068, time=4 minutes and 24.7 seconds, total_count=10032, gpu_max_cached_mem_GB=9.197, [att_plot] time=8.53 seconds, total_count=0, gpu_max_cached_mem_GB=9.197
[islpc50:0/3] 2022-06-14 08:04:05,339 (trainer:382) INFO: The best model has been updated: train.loss, valid.loss
[islpc50:0/3] 2022-06-14 08:04:05,342 (trainer:268) INFO: 9/30epoch started.
Estimated time to finish: 1 week, 15 hours and 10.66 seconds
[islpc50:0/3] 2022-06-14 08:25:54,471 (trainer:678) INFO: 9epoch:train:1-4083batch: iter_time=2.817e-04, forward_time=0.146, loss_ctc=14.555, loss=14.555, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.988e-05, train_time=0.641
[islpc50:0/3] 2022-06-14 08:47:39,744 (trainer:678) INFO: 9epoch:train:4084-8166batch: iter_time=2.877e-04, forward_time=0.148, loss_ctc=14.727, loss=14.727, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.966e-05, train_time=0.639
[islpc50:0/3] 2022-06-14 09:09:22,313 (trainer:678) INFO: 9epoch:train:8167-12249batch: iter_time=7.865e-04, forward_time=0.147, loss_ctc=14.788, loss=14.788, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.945e-05, train_time=0.638
[islpc50:0/3] 2022-06-14 09:30:54,957 (trainer:678) INFO: 9epoch:train:12250-16332batch: iter_time=2.681e-04, forward_time=0.146, loss_ctc=14.676, loss=14.676, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.923e-05, train_time=0.633
[islpc50:0/3] 2022-06-14 09:52:36,058 (trainer:678) INFO: 9epoch:train:16333-20415batch: iter_time=2.285e-04, forward_time=0.147, loss_ctc=14.718, loss=14.718, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.902e-05, train_time=0.637
[islpc50:0/3] 2022-06-14 10:14:18,274 (trainer:678) INFO: 9epoch:train:20416-24498batch: iter_time=3.074e-04, forward_time=0.147, loss_ctc=15.019, loss=15.019, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.881e-05, train_time=0.638
[islpc50:0/3] 2022-06-14 10:37:22,309 (trainer:678) INFO: 9epoch:train:24499-28581batch: iter_time=0.018, forward_time=0.147, loss_ctc=14.901, loss=14.901, backward_time=0.053, optim_step_time=0.004, optim0_lr0=6.861e-05, train_time=0.678
[islpc50:0/3] 2022-06-14 11:13:18,212 (trainer:678) INFO: 9epoch:train:28582-32664batch: iter_time=0.159, forward_time=0.146, loss_ctc=14.625, loss=14.625, backward_time=0.053, optim_step_time=0.004, optim0_lr0=6.840e-05, train_time=1.056
[islpc50:0/3] 2022-06-14 11:44:19,500 (trainer:678) INFO: 9epoch:train:32665-36747batch: iter_time=0.108, forward_time=0.147, loss_ctc=14.948, loss=14.948, backward_time=0.054, optim_step_time=0.004, optim0_lr0=6.820e-05, train_time=0.912
[islpc50:0/3] 2022-06-14 12:06:35,915 (trainer:678) INFO: 9epoch:train:36748-40830batch: iter_time=0.004, forward_time=0.148, loss_ctc=14.866, loss=14.866, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.800e-05, train_time=0.654
[islpc50:0/3] 2022-06-14 12:36:56,733 (trainer:678) INFO: 9epoch:train:40831-44913batch: iter_time=0.085, forward_time=0.146, loss_ctc=15.018, loss=15.018, backward_time=0.053, optim_step_time=0.004, optim0_lr0=6.780e-05, train_time=0.892
[islpc50:0/3] 2022-06-14 12:58:55,748 (trainer:678) INFO: 9epoch:train:44914-48996batch: iter_time=0.008, forward_time=0.146, loss_ctc=15.008, loss=15.008, backward_time=0.053, optim_step_time=0.004, optim0_lr0=6.760e-05, train_time=0.646
[islpc50:0/3] 2022-06-14 13:21:46,560 (trainer:678) INFO: 9epoch:train:48997-53079batch: iter_time=0.021, forward_time=0.146, loss_ctc=14.721, loss=14.721, backward_time=0.053, optim_step_time=0.004, optim0_lr0=6.740e-05, train_time=0.671
[islpc50:0/3] 2022-06-14 13:43:26,340 (trainer:678) INFO: 9epoch:train:53080-57162batch: iter_time=4.675e-04, forward_time=0.147, loss_ctc=14.851, loss=14.851, backward_time=0.053, optim_step_time=0.004, optim0_lr0=6.721e-05, train_time=0.636
[islpc50:0/3] 2022-06-14 14:06:18,861 (trainer:678) INFO: 9epoch:train:57163-61245batch: iter_time=0.014, forward_time=0.147, loss_ctc=14.969, loss=14.969, backward_time=0.053, optim_step_time=0.004, optim0_lr0=6.702e-05, train_time=0.672
[islpc50:0/3] 2022-06-14 14:56:39,545 (trainer:678) INFO: 9epoch:train:61246-65328batch: iter_time=0.264, forward_time=0.146, loss_ctc=15.072, loss=15.072, backward_time=0.053, optim_step_time=0.004, optim0_lr0=6.682e-05, train_time=1.479
[islpc50:0/3] 2022-06-14 15:33:32,975 (trainer:678) INFO: 9epoch:train:65329-69411batch: iter_time=0.152, forward_time=0.145, loss_ctc=14.842, loss=14.842, backward_time=0.053, optim_step_time=0.004, optim0_lr0=6.663e-05, train_time=1.084
[islpc50:0/3] 2022-06-14 16:01:02,837 (trainer:678) INFO: 9epoch:train:69412-73494batch: iter_time=0.056, forward_time=0.147, loss_ctc=15.093, loss=15.093, backward_time=0.054, optim_step_time=0.004, optim0_lr0=6.645e-05, train_time=0.808
[islpc50:0/3] 2022-06-14 16:24:02,306 (trainer:678) INFO: 9epoch:train:73495-77577batch: iter_time=0.015, forward_time=0.147, loss_ctc=14.977, loss=14.977, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.626e-05, train_time=0.676
[islpc50:0/3] 2022-06-14 16:45:49,101 (trainer:678) INFO: 9epoch:train:77578-81660batch: iter_time=9.266e-04, forward_time=0.148, loss_ctc=14.675, loss=14.675, backward_time=0.053, optim_step_time=0.004, optim0_lr0=6.608e-05, train_time=0.640
[islpc50:0/3] 2022-06-14 16:51:00,648 (trainer:334) INFO: 9epoch results: [train] iter_time=0.045, forward_time=0.147, loss_ctc=14.852, loss=14.852, backward_time=0.053, optim_step_time=0.004, optim0_lr0=6.793e-05, train_time=0.766, time=8 hours, 41 minutes and 49.27 seconds, total_count=734994, gpu_max_cached_mem_GB=9.197, [valid] loss_ctc=26.731, cer_ctc=0.120, loss_att=nan, acc=nan, cer=nan, wer=nan, loss=26.731, time=4 minutes and 43.43 seconds, total_count=11286, gpu_max_cached_mem_GB=9.197, [att_plot] time=22.6 seconds, total_count=0, gpu_max_cached_mem_GB=9.197
[islpc50:0/3] 2022-06-14 16:51:41,574 (trainer:382) INFO: The best model has been updated: train.loss, valid.loss
[islpc50:0/3] 2022-06-14 16:51:41,585 (trainer:268) INFO: 10/30epoch started.
Estimated time to finish: 1 week, 7 hours and 47 minutes
[islpc50:0/3] 2022-06-14 17:19:52,231 (trainer:678) INFO: 10epoch:train:1-4083batch: iter_time=0.048, forward_time=0.146, loss_ctc=13.460, loss=13.460, backward_time=0.053, optim_step_time=0.004, optim0_lr0=6.589e-05, train_time=0.828
[islpc50:0/3] 2022-06-14 17:42:39,269 (trainer:678) INFO: 10epoch:train:4084-8166batch: iter_time=0.012, forward_time=0.146, loss_ctc=13.767, loss=13.767, backward_time=0.053, optim_step_time=0.004, optim0_lr0=6.571e-05, train_time=0.669
[islpc50:0/3] 2022-06-14 18:08:43,303 (trainer:678) INFO: 10epoch:train:8167-12249batch: iter_time=0.044, forward_time=0.147, loss_ctc=13.920, loss=13.920, backward_time=0.053, optim_step_time=0.004, optim0_lr0=6.553e-05, train_time=0.766
[islpc50:0/3] 2022-06-14 18:30:45,618 (trainer:678) INFO: 10epoch:train:12250-16332batch: iter_time=0.002, forward_time=0.147, loss_ctc=14.174, loss=14.174, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.535e-05, train_time=0.648
[islpc50:0/3] 2022-06-14 18:52:24,123 (trainer:678) INFO: 10epoch:train:16333-20415batch: iter_time=1.410e-04, forward_time=0.147, loss_ctc=13.985, loss=13.985, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.517e-05, train_time=0.636
[islpc50:0/3] 2022-06-14 19:14:32,465 (trainer:678) INFO: 10epoch:train:20416-24498batch: iter_time=0.008, forward_time=0.147, loss_ctc=13.841, loss=13.841, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.500e-05, train_time=0.650
[islpc50:0/3] 2022-06-14 19:36:26,489 (trainer:678) INFO: 10epoch:train:24499-28581batch: iter_time=0.003, forward_time=0.148, loss_ctc=13.953, loss=13.953, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.482e-05, train_time=0.643
[islpc50:0/3] 2022-06-14 19:58:14,237 (trainer:678) INFO: 10epoch:train:28582-32664batch: iter_time=0.002, forward_time=0.147, loss_ctc=14.003, loss=14.003, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.465e-05, train_time=0.640
[islpc50:0/3] 2022-06-14 20:20:03,934 (trainer:678) INFO: 10epoch:train:32665-36747batch: iter_time=0.003, forward_time=0.147, loss_ctc=14.001, loss=14.001, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.448e-05, train_time=0.641
[islpc50:0/3] 2022-06-14 20:41:53,758 (trainer:678) INFO: 10epoch:train:36748-40830batch: iter_time=0.001, forward_time=0.147, loss_ctc=13.938, loss=13.938, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.431e-05, train_time=0.641
[islpc50:0/3] 2022-06-14 21:04:11,872 (trainer:678) INFO: 10epoch:train:40831-44913batch: iter_time=0.011, forward_time=0.147, loss_ctc=13.957, loss=13.957, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.414e-05, train_time=0.655
[islpc50:0/3] 2022-06-14 21:26:16,365 (trainer:678) INFO: 10epoch:train:44914-48996batch: iter_time=0.009, forward_time=0.146, loss_ctc=13.794, loss=13.794, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.397e-05, train_time=0.649
[islpc50:0/3] 2022-06-14 21:48:01,800 (trainer:678) INFO: 10epoch:train:48997-53079batch: iter_time=0.002, forward_time=0.147, loss_ctc=14.245, loss=14.245, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.380e-05, train_time=0.639
[islpc50:0/3] 2022-06-14 22:09:52,798 (trainer:678) INFO: 10epoch:train:53080-57162batch: iter_time=0.002, forward_time=0.148, loss_ctc=14.095, loss=14.095, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.364e-05, train_time=0.642
[islpc50:0/3] 2022-06-14 22:31:37,315 (trainer:678) INFO: 10epoch:train:57163-61245batch: iter_time=3.079e-04, forward_time=0.147, loss_ctc=14.024, loss=14.024, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.348e-05, train_time=0.639
[islpc50:0/3] 2022-06-14 22:59:27,013 (trainer:678) INFO: 10epoch:train:61246-65328batch: iter_time=0.041, forward_time=0.147, loss_ctc=14.020, loss=14.020, backward_time=0.053, optim_step_time=0.004, optim0_lr0=6.331e-05, train_time=0.818
[islpc50:0/3] 2022-06-14 23:26:01,556 (trainer:678) INFO: 10epoch:train:65329-69411batch: iter_time=0.036, forward_time=0.148, loss_ctc=13.950, loss=13.950, backward_time=0.054, optim_step_time=0.004, optim0_lr0=6.315e-05, train_time=0.781
[islpc50:0/3] 2022-06-14 23:48:53,586 (trainer:678) INFO: 10epoch:train:69412-73494batch: iter_time=0.008, forward_time=0.147, loss_ctc=14.009, loss=14.009, backward_time=0.054, optim_step_time=0.004, optim0_lr0=6.299e-05, train_time=0.672
[islpc50:0/3] 2022-06-15 00:11:10,387 (trainer:678) INFO: 10epoch:train:73495-77577batch: iter_time=0.007, forward_time=0.147, loss_ctc=13.922, loss=13.922, backward_time=0.054, optim_step_time=0.004, optim0_lr0=6.283e-05, train_time=0.655
[islpc50:0/3] 2022-06-15 00:34:53,551 (trainer:678) INFO: 10epoch:train:77578-81660batch: iter_time=0.036, forward_time=0.147, loss_ctc=13.824, loss=13.824, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.268e-05, train_time=0.697
[islpc50:0/3] 2022-06-15 00:39:30,121 (trainer:334) INFO: 10epoch results: [train] iter_time=0.014, forward_time=0.147, loss_ctc=13.944, loss=13.944, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.425e-05, train_time=0.680, time=7 hours, 43 minutes and 16.95 seconds, total_count=816660, gpu_max_cached_mem_GB=9.197, [valid] loss_ctc=26.584, cer_ctc=0.120, loss_att=nan, acc=nan, cer=nan, wer=nan, loss=26.584, time=4 minutes and 22.96 seconds, total_count=12540, gpu_max_cached_mem_GB=9.197, [att_plot] time=8.62 seconds, total_count=0, gpu_max_cached_mem_GB=9.197
[islpc50:0/3] 2022-06-15 00:40:09,135 (trainer:382) INFO: The best model has been updated: train.loss, valid.loss
[islpc50:0/3] 2022-06-15 00:40:09,138 (trainer:268) INFO: 11/30epoch started.
Estimated time to finish: 6 days, 22 hours and 17 minutes
[islpc50:0/3] 2022-06-15 01:02:03,273 (trainer:678) INFO: 11epoch:train:1-4083batch: iter_time=4.734e-04, forward_time=0.148, loss_ctc=13.081, loss=13.081, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.252e-05, train_time=0.643
[islpc50:0/3] 2022-06-15 01:23:48,224 (trainer:678) INFO: 11epoch:train:4084-8166batch: iter_time=3.319e-04, forward_time=0.148, loss_ctc=13.119, loss=13.119, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.236e-05, train_time=0.639
[islpc50:0/3] 2022-06-15 01:45:28,644 (trainer:678) INFO: 11epoch:train:8167-12249batch: iter_time=3.340e-04, forward_time=0.147, loss_ctc=13.089, loss=13.089, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.221e-05, train_time=0.637
[islpc50:0/3] 2022-06-15 02:07:06,785 (trainer:678) INFO: 11epoch:train:12250-16332batch: iter_time=3.519e-04, forward_time=0.147, loss_ctc=13.173, loss=13.173, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.206e-05, train_time=0.636
[islpc50:0/3] 2022-06-15 02:28:42,597 (trainer:678) INFO: 11epoch:train:16333-20415batch: iter_time=1.419e-04, forward_time=0.146, loss_ctc=12.881, loss=12.881, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.190e-05, train_time=0.635
[islpc50:0/3] 2022-06-15 02:50:29,895 (trainer:678) INFO: 11epoch:train:20416-24498batch: iter_time=9.968e-04, forward_time=0.148, loss_ctc=13.215, loss=13.215, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.175e-05, train_time=0.640
[islpc50:0/3] 2022-06-15 03:12:16,517 (trainer:678) INFO: 11epoch:train:24499-28581batch: iter_time=3.952e-04, forward_time=0.148, loss_ctc=13.313, loss=13.313, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.160e-05, train_time=0.640
[islpc50:0/3] 2022-06-15 03:33:56,167 (trainer:678) INFO: 11epoch:train:28582-32664batch: iter_time=2.950e-04, forward_time=0.147, loss_ctc=13.252, loss=13.252, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.146e-05, train_time=0.636
[islpc50:0/3] 2022-06-15 03:55:36,679 (trainer:678) INFO: 11epoch:train:32665-36747batch: iter_time=1.386e-04, forward_time=0.147, loss_ctc=13.212, loss=13.212, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.131e-05, train_time=0.637
[islpc50:0/3] 2022-06-15 04:17:25,613 (trainer:678) INFO: 11epoch:train:36748-40830batch: iter_time=0.001, forward_time=0.148, loss_ctc=13.044, loss=13.044, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.116e-05, train_time=0.641
[islpc50:0/3] 2022-06-15 04:39:10,142 (trainer:678) INFO: 11epoch:train:40831-44913batch: iter_time=1.377e-04, forward_time=0.148, loss_ctc=13.263, loss=13.263, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.102e-05, train_time=0.639
[islpc50:0/3] 2022-06-15 05:00:50,973 (trainer:678) INFO: 11epoch:train:44914-48996batch: iter_time=1.781e-04, forward_time=0.147, loss_ctc=13.174, loss=13.174, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.087e-05, train_time=0.637
[islpc50:0/3] 2022-06-15 05:22:31,097 (trainer:678) INFO: 11epoch:train:48997-53079batch: iter_time=5.914e-04, forward_time=0.147, loss_ctc=13.263, loss=13.263, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.073e-05, train_time=0.637
[islpc50:0/3] 2022-06-15 05:44:14,471 (trainer:678) INFO: 11epoch:train:53080-57162batch: iter_time=2.362e-04, forward_time=0.147, loss_ctc=13.174, loss=13.174, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.059e-05, train_time=0.638
[islpc50:0/3] 2022-06-15 06:05:58,319 (trainer:678) INFO: 11epoch:train:57163-61245batch: iter_time=5.242e-04, forward_time=0.147, loss_ctc=12.927, loss=12.927, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.044e-05, train_time=0.639
[islpc50:0/3] 2022-06-15 06:27:48,743 (trainer:678) INFO: 11epoch:train:61246-65328batch: iter_time=0.002, forward_time=0.147, loss_ctc=13.410, loss=13.410, backward_time=0.054, optim_step_time=0.003, optim0_lr0=6.030e-05, train_time=0.642
[islpc50:0/3] 2022-06-15 06:49:31,615 (trainer:678) INFO: 11epoch:train:65329-69411batch: iter_time=0.002, forward_time=0.146, loss_ctc=13.038, loss=13.038, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.016e-05, train_time=0.638
[islpc50:0/3] 2022-06-15 07:11:33,014 (trainer:678) INFO: 11epoch:train:69412-73494batch: iter_time=0.005, forward_time=0.147, loss_ctc=13.247, loss=13.247, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.003e-05, train_time=0.647
[islpc50:0/3] 2022-06-15 07:33:12,255 (trainer:678) INFO: 11epoch:train:73495-77577batch: iter_time=3.519e-04, forward_time=0.147, loss_ctc=13.096, loss=13.096, backward_time=0.053, optim_step_time=0.003, optim0_lr0=5.989e-05, train_time=0.636
[islpc50:0/3] 2022-06-15 07:54:56,945 (trainer:678) INFO: 11epoch:train:77578-81660batch: iter_time=2.236e-04, forward_time=0.148, loss_ctc=13.275, loss=13.275, backward_time=0.054, optim_step_time=0.003, optim0_lr0=5.975e-05, train_time=0.639
[islpc50:0/3] 2022-06-15 07:59:42,084 (trainer:334) INFO: 11epoch results: [train] iter_time=8.300e-04, forward_time=0.147, loss_ctc=13.162, loss=13.162, backward_time=0.053, optim_step_time=0.003, optim0_lr0=6.111e-05, train_time=0.639, time=7 hours, 14 minutes and 54.02 seconds, total_count=898326, gpu_max_cached_mem_GB=9.197, [valid] loss_ctc=27.026, cer_ctc=0.117, loss_att=nan, acc=nan, cer=nan, wer=nan, loss=27.026, time=4 minutes and 30.25 seconds, total_count=13794, gpu_max_cached_mem_GB=9.197, [att_plot] time=8.68 seconds, total_count=0, gpu_max_cached_mem_GB=9.197
[islpc50:0/3] 2022-06-15 08:00:23,107 (trainer:382) INFO: The best model has been updated: train.loss
[islpc50:0/3] 2022-06-15 08:00:23,110 (trainer:268) INFO: 12/30epoch started.
Estimated time to finish: 6 days, 12 hours and 17 minutes [islpc50:0/3] 2022-06-15 08:22:13,828 (trainer:678) INFO: 12epoch:train:1-4083batch: iter_time=2.927e-04, forward_time=0.147, loss_ctc=12.440, loss=12.440, backward_time=0.053, optim_step_time=0.003, optim0_lr0=5.962e-05, train_time=0.642 [islpc50:0/3] 2022-06-15 08:43:54,839 (trainer:678) INFO: 12epoch:train:4084-8166batch: iter_time=1.735e-04, forward_time=0.147, loss_ctc=12.386, loss=12.386, backward_time=0.053, optim_step_time=0.003, optim0_lr0=5.948e-05, train_time=0.637 [islpc50:0/3] 2022-06-15 09:05:34,366 (trainer:678) INFO: 12epoch:train:8167-12249batch: iter_time=1.755e-04, forward_time=0.147, loss_ctc=12.314, loss=12.314, backward_time=0.053, optim_step_time=0.003, optim0_lr0=5.935e-05, train_time=0.636 [islpc50:0/3] 2022-06-15 09:27:12,425 (trainer:678) INFO: 12epoch:train:12250-16332batch: iter_time=1.829e-04, forward_time=0.147, loss_ctc=12.240, loss=12.240, backward_time=0.053, optim_step_time=0.003, optim0_lr0=5.921e-05, train_time=0.636 [islpc50:0/3] 2022-06-15 09:48:49,838 (trainer:678) INFO: 12epoch:train:16333-20415batch: iter_time=1.353e-04, forward_time=0.147, loss_ctc=12.260, loss=12.260, backward_time=0.054, optim_step_time=0.003, optim0_lr0=5.908e-05, train_time=0.635 [islpc50:0/3] 2022-06-15 10:10:31,180 (trainer:678) INFO: 12epoch:train:20416-24498batch: iter_time=1.995e-04, forward_time=0.147, loss_ctc=12.512, loss=12.512, backward_time=0.054, optim_step_time=0.003, optim0_lr0=5.895e-05, train_time=0.637 [islpc50:0/3] 2022-06-15 10:32:08,126 (trainer:678) INFO: 12epoch:train:24499-28581batch: iter_time=2.033e-04, forward_time=0.147, loss_ctc=12.403, loss=12.403, backward_time=0.053, optim_step_time=0.003, optim0_lr0=5.882e-05, train_time=0.635 [islpc50:0/3] 2022-06-15 10:53:56,703 (trainer:678) INFO: 12epoch:train:28582-32664batch: iter_time=1.467e-04, forward_time=0.148, loss_ctc=12.682, loss=12.682, backward_time=0.054, optim_step_time=0.003, optim0_lr0=5.869e-05, 
train_time=0.641 [islpc50:0/3] 2022-06-15 11:15:40,262 (trainer:678) INFO: 12epoch:train:32665-36747batch: iter_time=1.644e-04, forward_time=0.147, loss_ctc=12.348, loss=12.348, backward_time=0.053, optim_step_time=0.003, optim0_lr0=5.856e-05, train_time=0.638 [islpc50:0/3] 2022-06-15 11:37:24,488 (trainer:678) INFO: 12epoch:train:36748-40830batch: iter_time=3.316e-04, forward_time=0.147, loss_ctc=12.603, loss=12.603, backward_time=0.054, optim_step_time=0.003, optim0_lr0=5.844e-05, train_time=0.639 [islpc50:0/3] 2022-06-15 11:59:12,585 (trainer:678) INFO: 12epoch:train:40831-44913batch: iter_time=3.447e-04, forward_time=0.148, loss_ctc=12.357, loss=12.357, backward_time=0.054, optim_step_time=0.003, optim0_lr0=5.831e-05, train_time=0.641 [islpc50:0/3] 2022-06-15 12:20:54,859 (trainer:678) INFO: 12epoch:train:44914-48996batch: iter_time=1.954e-04, forward_time=0.147, loss_ctc=12.485, loss=12.485, backward_time=0.053, optim_step_time=0.003, optim0_lr0=5.818e-05, train_time=0.638 [islpc50:0/3] 2022-06-15 12:55:12,640 (trainer:678) INFO: 12epoch:train:48997-53079batch: iter_time=0.129, forward_time=0.146, loss_ctc=12.461, loss=12.461, backward_time=0.053, optim_step_time=0.004, optim0_lr0=5.806e-05, train_time=1.008 [islpc50:0/3] 2022-06-15 13:17:54,747 (trainer:678) INFO: 12epoch:train:53080-57162batch: iter_time=0.016, forward_time=0.147, loss_ctc=12.447, loss=12.447, backward_time=0.053, optim_step_time=0.004, optim0_lr0=5.793e-05, train_time=0.667 [islpc50:0/3] 2022-06-15 14:10:53,985 (trainer:678) INFO: 12epoch:train:57163-61245batch: iter_time=0.338, forward_time=0.146, loss_ctc=12.576, loss=12.576, backward_time=0.053, optim_step_time=0.004, optim0_lr0=5.781e-05, train_time=1.557 [islpc50:0/3] 2022-06-15 15:05:03,428 (trainer:678) INFO: 12epoch:train:61246-65328batch: iter_time=0.363, forward_time=0.146, loss_ctc=12.479, loss=12.479, backward_time=0.053, optim_step_time=0.004, optim0_lr0=5.769e-05, train_time=1.591 [islpc50:0/3] 2022-06-15 15:47:48,216 
(trainer:678) INFO: 12epoch:train:65329-69411batch: iter_time=0.211, forward_time=0.146, loss_ctc=12.514, loss=12.514, backward_time=0.054, optim_step_time=0.004, optim0_lr0=5.756e-05, train_time=1.256 [islpc50:0/3] 2022-06-15 17:03:30,425 (trainer:678) INFO: 12epoch:train:69412-73494batch: iter_time=0.441, forward_time=0.147, loss_ctc=12.412, loss=12.412, backward_time=0.054, optim_step_time=0.004, optim0_lr0=5.744e-05, train_time=2.224 [islpc50:0/3] 2022-06-15 17:54:07,103 (trainer:678) INFO: 12epoch:train:73495-77577batch: iter_time=0.234, forward_time=0.146, loss_ctc=12.532, loss=12.532, backward_time=0.054, optim_step_time=0.004, optim0_lr0=5.732e-05, train_time=1.488 [islpc50:0/3] 2022-06-15 18:20:17,084 (trainer:678) INFO: 12epoch:train:77578-81660batch: iter_time=0.032, forward_time=0.147, loss_ctc=12.570, loss=12.570, backward_time=0.054, optim_step_time=0.004, optim0_lr0=5.720e-05, train_time=0.769 [islpc50:0/3] 2022-06-15 18:25:26,411 (trainer:334) INFO: 12epoch results: [train] iter_time=0.088, forward_time=0.147, loss_ctc=12.451, loss=12.451, backward_time=0.054, optim_step_time=0.004, optim0_lr0=5.839e-05, train_time=0.911, time=10 hours, 20 minutes and 1.3 seconds, total_count=979992, gpu_max_cached_mem_GB=9.197, [valid] loss_ctc=27.013, cer_ctc=0.115, loss_att=nan, acc=nan, cer=nan, wer=nan, loss=27.013, time=4 minutes and 41.08 seconds, total_count=15048, gpu_max_cached_mem_GB=9.197, [att_plot] time=20.93 seconds, total_count=0, gpu_max_cached_mem_GB=9.197 [islpc50:0/3] 2022-06-15 18:26:04,579 (trainer:382) INFO: The best model has been updated: train.loss [islpc50:0/3] 2022-06-15 18:26:04,591 (reporter:421) INFO: [Early stopping] valid.loss has not been improved 2 epochs continuously. 
The training was stopped at 12epoch [islpc50:0/3] 2022-06-15 18:26:04,635 (average_nbest_models:72) INFO: Averaging 10best models: criterion="train.loss": exp/asr_oxford_French_config_raw_fr_bpe350_sp/train.loss.ave_10best.pth [islpc50:0/3] 2022-06-15 18:28:55,599 (average_nbest_models:72) INFO: Averaging 10best models: criterion="valid.loss": exp/asr_oxford_French_config_raw_fr_bpe350_sp/valid.loss.ave_10best.pth [islpc50:0/3] 2022-06-15 18:29:23,683 (average_nbest_models:72) INFO: Averaging 10best models: criterion="valid.acc": exp/asr_oxford_French_config_raw_fr_bpe350_sp/valid.acc.ave_10best.pth /usr/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 84 leaked semaphore objects to clean up at shutdown warnings.warn('resource_tracker: There appear to be %d ' # Accounting: time=363646 threads=1 # Ended (code 0) at Wed Jun 15 18:31:20 EDT 2022, elapsed time 363646 seconds
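# The per-batch "(trainer:678)" progress lines above are regular enough to read
# back programmatically (e.g. to plot loss_ctc or spot the train_time spikes in
# epoch 12). A minimal sketch — the regex and the helper name parse_trainer_line
# are my own illustration, not part of ESPnet:

```python
import re

# Matches the metrics portion of one ESPnet trainer progress line, e.g.
# "11epoch:train:32665-36747batch: iter_time=1.386e-04, ..., train_time=0.637"
LINE_RE = re.compile(
    r"(?P<epoch>\d+)epoch:train:(?P<start>\d+)-(?P<end>\d+)batch: (?P<fields>.*)"
)

def parse_trainer_line(line):
    """Return a dict of metrics for one progress line, or None if it
    is not a per-batch trainer line (hypothetical helper)."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    record = {
        "epoch": int(m.group("epoch")),
        "batch_start": int(m.group("start")),
        "batch_end": int(m.group("end")),
    }
    # Remaining fields are comma-separated key=value pairs of floats.
    for pair in m.group("fields").split(", "):
        key, _, value = pair.partition("=")
        record[key] = float(value)
    return record

example = (
    "[islpc50:0/3] 2022-06-15 03:55:36,679 (trainer:678) INFO: "
    "11epoch:train:32665-36747batch: iter_time=1.386e-04, forward_time=0.147, "
    "loss_ctc=13.212, loss=13.212, backward_time=0.053, optim_step_time=0.003, "
    "optim0_lr0=6.131e-05, train_time=0.637"
)
rec = parse_trainer_line(example)
print(rec["epoch"], rec["loss_ctc"], rec["train_time"])
```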
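# The "(reporter:421) [Early stopping]" message reflects patience-based
# stopping: the run halts once the monitored criterion (valid.loss here) goes
# a configured number of consecutive epochs (2 in this run) without beating
# its best value so far. A minimal sketch of that patience logic — class and
# method names are illustrative, not ESPnet's Reporter API:

```python
class EarlyStopping:
    """Stop when the monitored value fails to improve on its best-so-far
    for `patience` consecutive epochs (lower is better)."""

    def __init__(self, patience=2):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, value):
        """Record one epoch's value; return True when training should stop."""
        if value < self.best:
            self.best = value
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopping(patience=2)
for epoch, valid_loss in enumerate([28.1, 27.4, 27.6, 27.5], start=1):
    if stopper.step(valid_loss):
        print(f"The training was stopped at {epoch}epoch")
        break
```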
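# The final "(average_nbest_models:72)" lines show the *.ave_10best.pth files
# being produced by averaging the parameters of the 10 best checkpoints under
# each criterion. A minimal sketch of that averaging idea, with plain floats
# standing in for torch tensors and an illustrative function name:

```python
def average_checkpoints(state_dicts):
    """Elementwise-average parameter values across checkpoint state dicts,
    as in ESPnet's n-best model averaging step (floats stand in for
    torch tensors; all dicts are assumed to share the same keys)."""
    n = len(state_dicts)
    return {
        key: sum(sd[key] for sd in state_dicts) / n
        for key in state_dicts[0]
    }

# Toy example: two "checkpoints" with two parameters each.
ckpt_a = {"encoder.weight": 1.0, "ctc.bias": 0.0}
ckpt_b = {"encoder.weight": 3.0, "ctc.bias": 2.0}
print(average_checkpoints([ckpt_a, ckpt_b]))
```

With real checkpoints the same loop runs over `torch.load(...)` state dicts and the result is saved with `torch.save`, but the averaging itself is just this elementwise mean.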