INFO:nncf:Ignored adding weight sparsifier for operation: OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/NNCFEmbedding[embed_tokens]/embedding_0
INFO:nncf:Ignored adding weight sparsifier for operation: OPTForCausalLM/NNCFLinear[lm_head]/linear_0
INFO:nncf:Not adding activation input quantizer for operation: 3 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/NNCFEmbedding[embed_tokens]/embedding_0
INFO:nncf:Not adding activation input quantizer for operation: 6 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/OPTLearnedPositionalEmbedding[embed_positions]/long_0
INFO:nncf:Not adding activation input quantizer for operation: 7 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/OPTLearnedPositionalEmbedding[embed_positions]/cumsum_0
INFO:nncf:Not adding activation input quantizer for operation: 8 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/OPTLearnedPositionalEmbedding[embed_positions]/type_as_0
INFO:nncf:Not adding activation input quantizer for operation: 9 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/OPTLearnedPositionalEmbedding[embed_positions]/__mul___0
INFO:nncf:Not adding activation input quantizer for operation: 10 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/OPTLearnedPositionalEmbedding[embed_positions]/long_1
INFO:nncf:Not adding activation input quantizer for operation: 11 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/OPTLearnedPositionalEmbedding[embed_positions]/__sub___0
INFO:nncf:Not adding activation input quantizer for operation: 12 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/OPTLearnedPositionalEmbedding[embed_positions]/__getitem___0
INFO:nncf:Not adding activation input quantizer for operation: 13 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/OPTLearnedPositionalEmbedding[embed_positions]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 14 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/OPTLearnedPositionalEmbedding[embed_positions]/embedding_0
INFO:nncf:Not adding activation input quantizer for operation: 16 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 36 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[0]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 47 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[0]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 48 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[0]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 54 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[0]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 56 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[0]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 76 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[1]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 87 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[1]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 88 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[1]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 94 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[1]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 96 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[1]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 116 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[2]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 127 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[2]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 128 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[2]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 134 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[2]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 136 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[2]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 156 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[3]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 167 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[3]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 168 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[3]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 174 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[3]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 176 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[3]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 196 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[4]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 207 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[4]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 208 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[4]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 214 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[4]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 216 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[4]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 236 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[5]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 247 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[5]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 248 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[5]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 254 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[5]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 256 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[5]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 276 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[6]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 287 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[6]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 288 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[6]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 294 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[6]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 296 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[6]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 316 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[7]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 327 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[7]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 328 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[7]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 334 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[7]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 336 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[7]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 356 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[8]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 367 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[8]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 368 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[8]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 374 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[8]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 376 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[8]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 396 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[9]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 407 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[9]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 408 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[9]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 414 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[9]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 416 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[9]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 436 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[10]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 447 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[10]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 448 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[10]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 454 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[10]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 456 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[10]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 476 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[11]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 487 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[11]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 488 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[11]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 494 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[11]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 496 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[11]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 516 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[12]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 527 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[12]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 528 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[12]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 534 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[12]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 536 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[12]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 556 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[13]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 567 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[13]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 568 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[13]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 574 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[13]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 576 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[13]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 596 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[14]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 607 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[14]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 608 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[14]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 614 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[14]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 616 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[14]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 636 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[15]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 647 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[15]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 648 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[15]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 654 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[15]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 656 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[15]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 676 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[16]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 687 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[16]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 688 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[16]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 694 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[16]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 696 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[16]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 716 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[17]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 727 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[17]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 728 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[17]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 734 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[17]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 736 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[17]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 756 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[18]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 767 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[18]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 768 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[18]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 774 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[18]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 776 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[18]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 796 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[19]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 807 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[19]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 808 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[19]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 814 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[19]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 816 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[19]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 836 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[20]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 847 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[20]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 848 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[20]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 854 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[20]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 856 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[20]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 876 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[21]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 887 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[21]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 888 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[21]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 894 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[21]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 896 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[21]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 916 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[22]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 927 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[22]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 928 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[22]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 934 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[22]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 936 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[22]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 956 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[23]/OPTAttention[self_attn]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 967 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[23]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 968 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[23]/NNCFLayerNorm[self_attn_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 974 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[23]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 976 OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[23]/NNCFLayerNorm[final_layer_norm]/layer_norm_0
INFO:nncf:Collecting tensor statistics |████████████████| 1 / 1
INFO:nncf:Compiling and loading torch extension: quantized_functions_cpu...
INFO:nncf:Finished loading torch extension: quantized_functions_cpu
INFO:nncf:Statistics of the sparsified model:
Epoch 0 |+-----------------------------------------+-------+
Epoch 0 || Statistic's name                        | Value |
Epoch 0 |+=========================================+=======+
Epoch 0 || Sparsity level of the whole model       | 0.722 |
Epoch 0 |+-----------------------------------------+-------+
Epoch 0 || Sparsity level of all sparsified layers | 0.850 |
Epoch 0 |+-----------------------------------------+-------+
Epoch 0 |
Epoch 0 |Statistics by sparsified layers:
Epoch 0 |+------------------------------------------------------------------------------------------------------------------------------------------------+----------------+----------------+---------------------+
Epoch 0 || Layer's name                                                                                                                                   | Weight's shape | Sparsity level | Weight's percentage |
Epoch 0 |+================================================================================================================================================+================+================+=====================+
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[2]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0   | [1024, 1024]   | 0.669          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[2]/OPTAttention[self_attn]/NNCFLinear[out_proj]/linear_0 | [1024, 1024]   | 0.680          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[2]/NNCFLinear[fc1]/linear_0                              | [4096, 1024]   | 0.942          | 1.384               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[2]/NNCFLinear[fc2]/linear_0                              | [1024, 4096]   | 0.945          | 1.384               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[3]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0   | [1024, 1024]   | 0.658          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[3]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0   | [1024, 1024]   | 0.658          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[3]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0   | [1024, 1024]   | 0.664          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/NNCFLinear[project_in]/linear_0                                                             | [1024, 512]    | 0.465          | 0.173               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[3]/OPTAttention[self_attn]/NNCFLinear[out_proj]/linear_0 | [1024, 1024]   | 0.676          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[0]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0   | [1024, 1024]   | 0.601          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[3]/NNCFLinear[fc1]/linear_0                              | [4096, 1024]   | 0.940          | 1.384               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[3]/NNCFLinear[fc2]/linear_0                              | [1024, 4096]   | 0.946          | 1.384               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[4]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0   | [1024, 1024]   | 0.659          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[4]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0   | [1024, 1024]   | 0.660          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[4]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0   | [1024, 1024]   | 0.670          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[0]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0   | [1024, 1024]   | 0.631          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[4]/OPTAttention[self_attn]/NNCFLinear[out_proj]/linear_0 | [1024, 1024]   | 0.680          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[4]/NNCFLinear[fc1]/linear_0                              | [4096, 1024]   | 0.939          | 1.384               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[4]/NNCFLinear[fc2]/linear_0                              | [1024, 4096]   | 0.946          | 1.384               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[5]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0   | [1024, 1024]   | 0.675          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[5]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0   | [1024, 1024]   | 0.673          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[5]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0   | [1024, 1024]   | 0.671          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[0]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0   | [1024, 1024]   | 0.670          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[5]/OPTAttention[self_attn]/NNCFLinear[out_proj]/linear_0 | [1024, 1024]   | 0.683          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[5]/NNCFLinear[fc1]/linear_0                              | [4096, 1024]   | 0.939          | 1.384               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[5]/NNCFLinear[fc2]/linear_0                              | [1024, 4096]   | 0.944          | 1.384               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[6]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0   | [1024, 1024]   | 0.680          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[6]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0   | [1024, 1024]   | 0.687          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[6]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0   | [1024, 1024]   | 0.678          | 0.346               |
Epoch 0 || OPTForCausalLM/OPTMo | 
[1024, 1024] | 0.685 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[6]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[out_proj]/linear | | | | Epoch 0 || _0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.939 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[6]/NNCFLinear[ | | | | Epoch 0 || fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.943 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[6]/NNCFLinear[ | | | | Epoch 0 || fc2]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.676 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[7]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[q_proj]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.681 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[7]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[k_proj]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.678 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || 
r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[7]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[v_proj]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.685 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[7]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[out_proj]/linear | | | | Epoch 0 || _0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.942 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[7]/NNCFLinear[ | | | | Epoch 0 || fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.941 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[7]/NNCFLinear[ | | | | Epoch 0 || fc2]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.671 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[8]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[q_proj]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.678 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || 
Layer[8]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[k_proj]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.679 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[8]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[v_proj]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.683 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[8]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[out_proj]/linear | | | | Epoch 0 || _0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.938 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[8]/NNCFLinear[ | | | | Epoch 0 || fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.943 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[8]/NNCFLinear[ | | | | Epoch 0 || fc2]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.681 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[9]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || 
ear[q_proj]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.681 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[9]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[k_proj]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.679 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[9]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[v_proj]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.684 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[9]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[out_proj]/linear | | | | Epoch 0 || _0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.940 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[9]/NNCFLinear[ | | | | Epoch 0 || fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.944 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[9]/NNCFLinear[ | | | | Epoch 0 || fc2]/linear_0 | | | | Epoch 0 
|+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.666 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[10]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[q_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.667 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[10]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[k_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.671 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[10]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[v_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.678 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[10]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[out_proj]/linea | | | | Epoch 0 || r_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.719 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[0]/OPTAttentio | | | | Epoch 0 || 
n[self_attn]/NNCFLin | | | | Epoch 0 || ear[out_proj]/linear | | | | Epoch 0 || _0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.940 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[10]/NNCFLinear | | | | Epoch 0 || [fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.944 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[10]/NNCFLinear | | | | Epoch 0 || [fc2]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.660 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[11]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[q_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.664 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[11]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[k_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.665 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[11]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || 
near[v_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.672 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[11]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[out_proj]/linea | | | | Epoch 0 || r_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.940 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[11]/NNCFLinear | | | | Epoch 0 || [fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.944 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[11]/NNCFLinear | | | | Epoch 0 || [fc2]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.664 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[12]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[q_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.665 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[12]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[k_proj]/linear_ | | | | Epoch 0 || 0 | | | | 
Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.943 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[0]/NNCFLinear[ | | | | Epoch 0 || fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.660 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[12]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[v_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.943 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[0]/NNCFLinear[ | | | | Epoch 0 || fc2]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.668 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[12]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[out_proj]/linea | | | | Epoch 0 || r_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.940 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[12]/NNCFLinear | | | | Epoch 0 || [fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || 
OPTForCausalLM/OPTMo | [1024, 4096] | 0.945 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[12]/NNCFLinear | | | | Epoch 0 || [fc2]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.662 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[13]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[q_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.664 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[13]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[k_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.662 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[13]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[v_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.670 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[13]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[out_proj]/linea | | | | Epoch 0 || r_0 | | | | Epoch 0 
|+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.663 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[1]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[q_proj]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.940 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[13]/NNCFLinear | | | | Epoch 0 || [fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.945 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[13]/NNCFLinear | | | | Epoch 0 || [fc2]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.659 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[14]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[q_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.660 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[14]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[k_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 
|+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.656 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[14]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[v_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.669 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[1]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[k_proj]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.666 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[14]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[out_proj]/linea | | | | Epoch 0 || r_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.940 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[14]/NNCFLinear | | | | Epoch 0 || [fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.946 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[14]/NNCFLinear | | | | Epoch 0 || [fc2]/linear_0 | | | | Epoch 0 
|+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.660 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[15]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[q_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.661 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[15]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[k_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.659 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[15]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[v_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.675 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[1]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[v_proj]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.675 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[15]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 
|| near[out_proj]/linea | | | | Epoch 0 || r_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.940 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[15]/NNCFLinear | | | | Epoch 0 || [fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.946 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[15]/NNCFLinear | | | | Epoch 0 || [fc2]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.660 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[16]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[q_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.660 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[16]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[k_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.655 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[16]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[v_proj]/linear_ | | | | Epoch 0 || 0 | | | 
| Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.671 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[16]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[out_proj]/linea | | | | Epoch 0 || r_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.940 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[16]/NNCFLinear | | | | Epoch 0 || [fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.946 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[16]/NNCFLinear | | | | Epoch 0 || [fc2]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.661 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[17]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[q_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.661 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[17]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[k_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 
|+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.658 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[17]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[v_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.673 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[17]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[out_proj]/linea | | | | Epoch 0 || r_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.940 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[17]/NNCFLinear | | | | Epoch 0 || [fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.946 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[17]/NNCFLinear | | | | Epoch 0 || [fc2]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.664 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[18]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[q_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 
|+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.664 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[18]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[k_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.652 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[18]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[v_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.667 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[18]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[out_proj]/linea | | | | Epoch 0 || r_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.941 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[18]/NNCFLinear | | | | Epoch 0 || [fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.946 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[18]/NNCFLinear | | | | Epoch 0 || [fc2]/linear_0 | | | | Epoch 0 
|+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.669 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[19]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[q_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.668 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[19]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[k_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.666 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[19]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[v_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.678 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[19]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[out_proj]/linea | | | | Epoch 0 || r_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.941 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[19]/NNCFLinear | | | | Epoch 0 || [fc1]/linear_0 
| | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.945 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[19]/NNCFLinear | | | | Epoch 0 || [fc2]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.677 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[20]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[q_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.679 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[20]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[k_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.671 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[20]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[v_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.683 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[20]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[out_proj]/linea | | | | 
Epoch 0 || r_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.683 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[1]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[out_proj]/linear | | | | Epoch 0 || _0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.941 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[20]/NNCFLinear | | | | Epoch 0 || [fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.944 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[20]/NNCFLinear | | | | Epoch 0 || [fc2]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.677 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[21]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[q_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.677 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[21]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[k_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 
|+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.673 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[21]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[v_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.681 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[21]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[out_proj]/linea | | | | Epoch 0 || r_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.940 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[21]/NNCFLinear | | | | Epoch 0 || [fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.944 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[21]/NNCFLinear | | | | Epoch 0 || [fc2]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.670 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[22]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[q_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 
|+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.689 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[22]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[k_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.943 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[1]/NNCFLinear[ | | | | Epoch 0 || fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.670 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[22]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[v_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.945 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[1]/NNCFLinear[ | | | | Epoch 0 || fc2]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.687 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[22]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[out_proj]/linea | | | | Epoch 0 || r_0 | | | | Epoch 0 
|+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.943 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[22]/NNCFLinear | | | | Epoch 0 || [fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.945 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[22]/NNCFLinear | | | | Epoch 0 || [fc2]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.668 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[23]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[q_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.672 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[23]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[k_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.664 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[23]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[v_proj]/linear_ | | | | Epoch 0 || 0 | | | | Epoch 0 
|+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.684 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[23]/OPTAttenti | | | | Epoch 0 || on[self_attn]/NNCFLi | | | | Epoch 0 || near[out_proj]/linea | | | | Epoch 0 || r_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.655 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[2]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[q_proj]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [4096, 1024] | 0.944 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[23]/NNCFLinear | | | | Epoch 0 || [fc1]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 4096] | 0.945 | 1.384 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[23]/NNCFLinear | | | | Epoch 0 || [fc2]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [512, 1024] | 0.316 | 0.173 | Epoch 0 || del[model]/OPTDecode | | | | Epoch 0 || r[decoder]/NNCFLinea | | | | Epoch 0 || r[project_out]/linea | | | | Epoch 0 || r_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 || OPTForCausalLM/OPTMo | [1024, 1024] | 0.658 | 0.346 | Epoch 0 || del[model]/OPTDecode | | | | 
Epoch 0 || r[decoder]/ModuleLis | | | | Epoch 0 || t[layers]/OPTDecoder | | | | Epoch 0 || Layer[2]/OPTAttentio | | | | Epoch 0 || n[self_attn]/NNCFLin | | | | Epoch 0 || ear[k_proj]/linear_0 | | | | Epoch 0 |+----------------------+----------------+----------------+---------------------+ Epoch 0 | Epoch 0 |Statistics of the magnitude sparsity algorithm: Epoch 0 |+----------------------------------------------------------------------+-------+ Epoch 0 || Statistic's name | Value | Epoch 0 |+======================================================================+=======+ Epoch 0 || A target level of the sparsity for the algorithm for the current | None | Epoch 0 || epoch | | Epoch 0 |+----------------------------------------------------------------------+-------+ Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || Layer's name | Sparsity threshold | Epoch 0 |+=========================================================+====================+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[2]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[v_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[2]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[out_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[2]/NNCFLinear[fc1]/linea | | Epoch 0 || r_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[2]/NNCFLinear[fc2]/linea | | Epoch 0 || r_0 | | Epoch 0 
|+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[3]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[q_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[3]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[k_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[3]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[v_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/NNCF | 0.001 | Epoch 0 || Linear[project_in]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[3]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[out_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[0]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[q_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[3]/NNCFLinear[fc1]/linea | | Epoch 0 || r_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || 
OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[3]/NNCFLinear[fc2]/linea | | Epoch 0 || r_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[4]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[q_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[4]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[k_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[4]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[v_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[0]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[k_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[4]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[out_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[4]/NNCFLinear[fc1]/linea | | Epoch 0 || r_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 
| Epoch 0 || leList[layers]/OPTDecoderLayer[4]/NNCFLinear[fc2]/linea | | Epoch 0 || r_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[5]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[q_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[5]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[k_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[5]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[v_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[0]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[v_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[5]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[out_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[5]/NNCFLinear[fc1]/linea | | Epoch 0 || r_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || 
leList[layers]/OPTDecoderLayer[5]/NNCFLinear[fc2]/linea | | Epoch 0 || r_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[6]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[q_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[6]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[k_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[6]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[v_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[6]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[out_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[6]/NNCFLinear[fc1]/linea | | Epoch 0 || r_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[6]/NNCFLinear[fc2]/linea | | Epoch 0 || r_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[7]/OPTAttention[self_att | | Epoch 0 || 
n]/NNCFLinear[q_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[7]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[k_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[7]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[v_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[7]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[out_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[7]/NNCFLinear[fc1]/linea | | Epoch 0 || r_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[7]/NNCFLinear[fc2]/linea | | Epoch 0 || r_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[8]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[q_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[8]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[k_proj]/linear_0 | | Epoch 0 
|+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[8]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[v_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[8]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[out_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[8]/NNCFLinear[fc1]/linea | | Epoch 0 || r_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[8]/NNCFLinear[fc2]/linea | | Epoch 0 || r_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[9]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[q_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[9]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[k_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[9]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[v_proj]/linear_0 | | Epoch 0 
|+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[9]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[out_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[9]/NNCFLinear[fc1]/linea | | Epoch 0 || r_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[9]/NNCFLinear[fc2]/linea | | Epoch 0 || r_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[10]/OPTAttention[self_at | | Epoch 0 || tn]/NNCFLinear[q_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[10]/OPTAttention[self_at | | Epoch 0 || tn]/NNCFLinear[k_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[10]/OPTAttention[self_at | | Epoch 0 || tn]/NNCFLinear[v_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[10]/OPTAttention[self_at | | Epoch 0 || tn]/NNCFLinear[out_proj]/linear_0 | | Epoch 0 
|+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[0]/OPTAttention[self_att | | Epoch 0 || n]/NNCFLinear[out_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[10]/NNCFLinear[fc1]/line | | Epoch 0 || ar_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[10]/NNCFLinear[fc2]/line | | Epoch 0 || ar_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[11]/OPTAttention[self_at | | Epoch 0 || tn]/NNCFLinear[q_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[11]/OPTAttention[self_at | | Epoch 0 || tn]/NNCFLinear[k_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[11]/OPTAttention[self_at | | Epoch 0 || tn]/NNCFLinear[v_proj]/linear_0 | | Epoch 0 |+---------------------------------------------------------+--------------------+ Epoch 0 || OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/Modu | 0.001 | Epoch 0 || leList[layers]/OPTDecoderLayer[11]/OPTAttention[self_at | | Epoch 0 || tn]/NNCFLinear[out_proj]/linear_0 | | Epoch 0 
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[11]/NNCFLinear[fc1]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[11]/NNCFLinear[fc2]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[12]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[12]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[0]/NNCFLinear[fc1]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[12]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[0]/NNCFLinear[fc2]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[12]/OPTAttention[self_attn]/NNCFLinear[out_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[12]/NNCFLinear[fc1]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[12]/NNCFLinear[fc2]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[13]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[13]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[13]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[13]/OPTAttention[self_attn]/NNCFLinear[out_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[1]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[13]/NNCFLinear[fc1]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[13]/NNCFLinear[fc2]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[14]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[14]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[14]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[1]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[14]/OPTAttention[self_attn]/NNCFLinear[out_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[14]/NNCFLinear[fc1]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[14]/NNCFLinear[fc2]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[15]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[15]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[15]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[1]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[15]/OPTAttention[self_attn]/NNCFLinear[out_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[15]/NNCFLinear[fc1]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[15]/NNCFLinear[fc2]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[16]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[16]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[16]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[16]/OPTAttention[self_attn]/NNCFLinear[out_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[16]/NNCFLinear[fc1]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[16]/NNCFLinear[fc2]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[17]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[17]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[17]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[17]/OPTAttention[self_attn]/NNCFLinear[out_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[17]/NNCFLinear[fc1]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[17]/NNCFLinear[fc2]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[18]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[18]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[18]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[18]/OPTAttention[self_attn]/NNCFLinear[out_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[18]/NNCFLinear[fc1]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[18]/NNCFLinear[fc2]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[19]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[19]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[19]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[19]/OPTAttention[self_attn]/NNCFLinear[out_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[19]/NNCFLinear[fc1]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[19]/NNCFLinear[fc2]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[20]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[20]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[20]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[20]/OPTAttention[self_attn]/NNCFLinear[out_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[1]/OPTAttention[self_attn]/NNCFLinear[out_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[20]/NNCFLinear[fc1]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[20]/NNCFLinear[fc2]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[21]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[21]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[21]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[21]/OPTAttention[self_attn]/NNCFLinear[out_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[21]/NNCFLinear[fc1]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[21]/NNCFLinear[fc2]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[22]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[22]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[1]/NNCFLinear[fc1]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[22]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[1]/NNCFLinear[fc2]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[22]/OPTAttention[self_attn]/NNCFLinear[out_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[22]/NNCFLinear[fc1]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[22]/NNCFLinear[fc2]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[23]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[23]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[23]/OPTAttention[self_attn]/NNCFLinear[v_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[23]/OPTAttention[self_attn]/NNCFLinear[out_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[2]/OPTAttention[self_attn]/NNCFLinear[q_proj]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[23]/NNCFLinear[fc1]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[23]/NNCFLinear[fc2]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/NNCFLinear[project_out]/linear_0 | 0.001 |
| OPTForCausalLM/OPTModel[model]/OPTDecoder[decoder]/ModuleList[layers]/OPTDecoderLayer[2]/OPTAttention[self_attn]/NNCFLinear[k_proj]/linear_0 | 0.001 |

Statistics of the quantization algorithm:
+--------------------------------+-------+
| Statistic's name               | Value |
+================================+=======+
| Ratio of enabled quantizations | 100   |
+--------------------------------+-------+

Statistics of the quantization share:
+----------------------------------+----------------------+
| Statistic's name                 | Value                |
+==================================+======================+
| Symmetric WQs / All placed WQs   | 100.00 % (147 / 147) |
+----------------------------------+----------------------+
| Asymmetric WQs / All placed WQs  | 0.00 % (0 / 147)     |
+----------------------------------+----------------------+
| Signed WQs / All placed WQs      | 100.00 % (147 / 147) |
+----------------------------------+----------------------+
| Unsigned WQs / All placed WQs    | 0.00 % (0 / 147)     |
+----------------------------------+----------------------+
| Per-tensor WQs / All placed WQs  | 0.00 % (0 / 147)     |
+----------------------------------+----------------------+
| Per-channel WQs / All placed WQs | 100.00 % (147 / 147) |
+----------------------------------+----------------------+
| Placed WQs / Potential WQs       | 75.00 % (147 / 196)  |
+----------------------------------+----------------------+
| Symmetric AQs / All placed AQs   | 100.00 % (243 / 243) |
+----------------------------------+----------------------+
| Asymmetric AQs / All placed AQs  | 0.00 % (0 / 243)     |
+----------------------------------+----------------------+
| Signed AQs / All placed AQs      | 80.25 % (195 / 243)  |
+----------------------------------+----------------------+
| Unsigned AQs / All placed AQs    | 19.75 % (48 / 243)   |
+----------------------------------+----------------------+
| Per-tensor AQs / All placed AQs  | 100.00 % (243 / 243) |
+----------------------------------+----------------------+
| Per-channel AQs / All placed AQs | 0.00 % (0 / 243)     |
+----------------------------------+----------------------+

Statistics of the bitwidth distribution:
+--------------+---------------------+--------------------+--------------------+
| Num bits (N) | N-bits WQs / Placed | N-bits AQs /       | N-bits Qs / Placed |
|              | WQs                 | Placed AQs         | Qs                 |
+==============+=====================+====================+====================+
| 8            | 100.00 % (147 /     | 100.00 % (243 /    | 100.00 % (390 /    |
|              | 147)                | 243)               | 390)               |
+--------------+---------------------+--------------------+--------------------+

WARNING:nncf:You are setting `forward` on an NNCF-processed model object.
NNCF relies on custom-wrapping the `forward` call in order to function properly.
Arbitrary adjustments to the forward function on an NNCFNetwork object have undefined behaviour.
If you need to replace the underlying forward function of the original model so that NNCF should be using that instead of the original forward function that NNCF saved during the compressed model creation, you can do this by calling:
model.nncf.set_original_unbound_forward(fn)
if `fn` has an unbound 0-th `self` argument, or
with model.nncf.temporary_bound_original_forward(fn): ...
if `fn` already had 0-th `self` argument bound or never had it in the first place.
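
The warning above distinguishes a function whose 0-th `self` argument is still unbound from one where `self` is already bound. The following is a minimal plain-Python sketch of that distinction only. `TinyModel`, `unbound_forward`, and `describe` are hypothetical names introduced for illustration; the sketch does not call NNCF and says nothing about how NNCF handles either case internally.

```python
import inspect
import types


class TinyModel:
    """Hypothetical stand-in for a model; not an NNCF or PyTorch class."""

    def __init__(self, scale):
        self.scale = scale


def unbound_forward(self, x):
    # Plain function: `self` is still an open 0-th parameter.
    return x * self.scale + 1


model = TinyModel(scale=2)

# Binding fixes `self` to `model`, so callers pass only `x`.
bound_forward = types.MethodType(unbound_forward, model)


def describe(fn):
    """Report whether `fn` is a bound method or a plain function."""
    return "bound" if inspect.ismethod(fn) else "unbound or no self"


print(describe(unbound_forward))
print(describe(bound_forward))
print(unbound_forward(model, 3), bound_forward(3))
```

Under this sketch, a plain function such as `unbound_forward` corresponds to the first case named in the warning, while an object such as `bound_forward` corresponds to the second.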