Loading model from path bert-base-uncased
Task: ner
Model path: bert-base-uncased
Data path: ./data/ud/
Tokenizer: bert-base-uncased
Batch size: 32
Epochs: 10
Learning rate: 2e-05
LR Decay: 0.3
LR Decay End Epoch: 5
Sequence length: 96
Training: True
Num Threads: 24
Num Sentences: 0
Max Norm: 0.0
Use GNN: False
Use label weights: False
PID: 3523179, PGID: 3523174
ATen/Parallel:
    at::get_num_threads() : 24
    at::get_num_interop_threads() : 36
OpenMP 201511 (a.k.a. OpenMP 4.5)
    omp_get_max_threads() : 24
Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
    mkl_get_max_threads() : 24
Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
std::thread::hardware_concurrency() : 72
Environment variables:
    OMP_NUM_THREADS : 24
    MKL_NUM_THREADS : 24
ATen parallel backend: OpenMP
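The block above is PyTorch's standard parallelism report. A minimal sketch of how the thread counts are typically pinned and how that report is printed; the exact call sites in this project are an assumption:

```python
import os
import torch

# OMP_NUM_THREADS / MKL_NUM_THREADS are normally exported before the process starts;
# setting them here only has an effect if no OpenMP/MKL work has run yet.
os.environ.setdefault("OMP_NUM_THREADS", "24")
os.environ.setdefault("MKL_NUM_THREADS", "24")

torch.set_num_threads(24)            # intra-op pool -> "at::get_num_threads() : 24"
# torch.set_num_interop_threads(36)  # inter-op pool; must be called before that pool is first used

# Prints the "ATen/Parallel" report seen in the log
# (OpenMP/MKL versions, thread counts, parallel backend).
print(torch.__config__.parallel_info())
```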
Training model
Loading Training Data
Loading NER labels from ./data/ud/**/*-train-orig.ner
en_atis-ud-train-orig.ner num sentences: 4274
en_cesl-ud-train-orig.ner num sentences: 4124
en_ewt-ud-train-orig.ner num sentences: 11649
en_gum-ud-train-orig.ner num sentences: 5344
en_lines-ud-train-orig.ner num sentences: 3010
en_partut-ud-train-orig.ner num sentences: 1739
Example of NER labels: [[['what', 'O'], ['is', 'O'], ['the', 'O'], ['cost', 'O'], ['of', 'O'], ['a', 'O'], ['round', 'O'], ['trip', 'O'], ['flight', 'O'], ['from', 'O'], ['pittsburgh', 'S-GPE'], ['to', 'O'], ['atlanta', 'S-GPE'], ['beginning', 'O'], ['on', 'O'], ['april', 'B-DATE'], ['twenty', 'I-DATE'], ['fifth', 'E-DATE'], ['and', 'O'], ['returning', 'O'], ['on', 'O'], ['may', 'B-DATE'], ['sixth', 'E-DATE']], [['now', 'O'], ['i', 'O'], ['need', 'O'], ['a', 'O'], ['flight', 'O'], ['leaving', 'O'], ['fort', 'B-GPE'], ['worth', 'E-GPE'], ['and', 'O'], ['arriving', 'O'], ['in', 'O'], ['denver', 'S-GPE'], ['no', 'O'], ['later', 'O'], ['than', 'O'], ['2', 'B-TIME'], ['pm', 'E-TIME'], ['next', 'B-DATE'], ['monday', 'E-DATE']]]
30140 sentences, 942 batches of size 32
Control example of InputFeatures
Input Ids: [101, 2085, 1045, 2342, 1037, 3462, 2975, 3481, 4276, 1998, 7194, 1999, 7573, 2053, 2101, 2084, 1016, 7610, 2279, 6928, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Input Mask: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Label Ids: [77, 1, 1, 1, 1, 1, 1, 31, 32, 1, 1, 1, 16, 1, 1, 1, 28, 30, 17, 19, 78, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Valid Ids: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Label Mask: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Segment Ids: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Loading Validation Data
Loading NER labels from ./data/ud/**/*-dev-orig.ner
en_atis-ud-dev-orig.ner num sentences: 572
en_cesl-ud-dev-orig.ner num sentences: 500
en_ewt-ud-dev-orig.ner num sentences: 1875
en_gum-ud-dev-orig.ner num sentences: 788
en_lines-ud-dev-orig.ner num sentences: 986
en_partut-ud-dev-orig.ner num sentences: 149
Example of NER labels: [[['i', 'O'], ['would', 'O'], ['like', 'O'], ['the', 'O'], ['cheapest', 'O'], ['flight', 'O'], ['from', 'O'], ['pittsburgh', 'S-GPE'], ['to', 'O'], ['atlanta', 'S-GPE'], ['leaving', 'O'], ['april', 'B-DATE'], ['twenty', 'I-DATE'], ['fifth', 'E-DATE'], ['and', 'O'], ['returning', 'O'], ['may', 'B-DATE'], ['sixth', 'E-DATE']], [['i', 'O'], ['want', 'O'], ['a', 'O'], ['flight', 'O'], ['from', 'O'], ['memphis', 'S-LOC'], ['to', 'O'], ['seattle', 'S-FAC'], ['that', 'O'], ['arrives', 'O'], ['no', 'O'], ['later', 'O'], ['than', 'O'], ['3', 'B-TIME'], ['pm', 'E-TIME']]]
4870 sentences, 153 batches of size 32
Control example of InputFeatures
Input Ids: [101, 1045, 2215, 1037, 3462, 2013, 9774, 2000, 5862, 2008, 8480, 2053, 2101, 2084, 1017, 7610, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Input Mask: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Label Ids: [77, 1, 1, 1, 1, 1, 59, 1, 60, 1, 1, 1, 1, 1, 28, 30, 78, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Valid Ids: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Label Mask: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Segment Ids: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
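The control examples above are padded to the configured sequence length of 96, with label id 0 reserved for padding, 77/78 for [CLS]/[SEP], and 1 for O. A minimal sketch of how InputFeatures like these are commonly built in BERT-NER pipelines; convert_example and the label map handling are illustrative, not this project's actual code:

```python
from dataclasses import dataclass
from typing import List
from transformers import BertTokenizer

@dataclass
class InputFeatures:
    input_ids: List[int]
    input_mask: List[int]
    label_ids: List[int]
    valid_ids: List[int]
    label_mask: List[int]
    segment_ids: List[int]

def convert_example(words, labels, label2id, tokenizer, max_len=96):
    # [CLS] and [SEP] carry their own label ids (77/78 in this run's label map).
    tokens, valid, label_ids = ["[CLS]"], [1], [label2id["[CLS]"]]
    for word, label in zip(words, labels):
        pieces = tokenizer.tokenize(word) or ["[UNK]"]
        tokens.extend(pieces)
        # Only the first word-piece of each word is "valid" and carries the word's label.
        valid.extend([1] + [0] * (len(pieces) - 1))
        label_ids.append(label2id[label])
    tokens = tokens[: max_len - 1] + ["[SEP]"]
    valid = valid[: max_len - 1] + [1]
    label_ids = label_ids[: max_len - 1] + [label2id["[SEP]"]]

    input_ids = tokenizer.convert_tokens_to_ids(tokens)
    pad = max_len - len(input_ids)
    lpad = max_len - len(label_ids)
    return InputFeatures(
        input_ids=input_ids + [0] * pad,
        input_mask=[1] * len(input_ids) + [0] * pad,
        label_ids=label_ids + [0] * lpad,
        valid_ids=valid + [1] * pad,       # padding marked valid=1, matching the dumps above
        label_mask=[1] * len(label_ids) + [0] * lpad,
        segment_ids=[0] * max_len,         # single-segment input, hence all zeros
    )

# tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# feats = convert_example(["fort", "worth"], ["B-GPE", "E-GPE"], label2id, tokenizer)
```

Because every word in the control sentences maps to a single word-piece, the Valid Ids rows come out as all ones, exactly as in the dumps above.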
Test Data
Loading NER labels from ./data/ud/**/*-test-orig.ner
en_atis-ud-test-orig.ner num sentences: 586
en_cesl-ud-test-orig.ner num sentences: 500
en_ewt-ud-test-orig.ner num sentences: 1955
en_gum-ud-test-orig.ner num sentences: 851
en_lines-ud-test-orig.ner num sentences: 988
en_pud-ud-test-orig.ner num sentences: 973
en_partut-ud-test-orig.ner num sentences: 149
en_pronouns-ud-test-orig.ner num sentences: 265
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForNer: ['cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.weight', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight']
- This IS expected if you are initializing BertForNer from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForNer from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForNer were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Example of NER labels: [[['what', 'O'], ['are', 'O'], ['the', 'O'], ['coach', 'O'], ['flights', 'O'], ['between', 'O'], ['dallas', 'S-GPE'], ['and', 'O'], ['baltimore', 'S-GPE'], ['leaving', 'O'], ['august', 'B-DATE'], ['tenth', 'E-DATE'], ['and', 'O'], ['returning', 'O'], ['august', 'B-DATE'], ['twelve', 'E-DATE']], [['i', 'O'], ['want', 'O'], ['a', 'O'], ['flight', 'O'], ['from', 'O'], ['nashville', 'S-GPE'], ['to', 'O'], ['seattle', 'S-GPE'], ['that', 'O'], ['arrives', 'O'], ['no', 'O'], ['later', 'O'], ['than', 'O'], ['3', 'B-TIME'], ['pm', 'E-TIME']]]
6267 sentences, 196 batches of size 32
Control example of InputFeatures
Input Ids: [101, 1045, 2215, 1037, 3462, 2013, 8423, 2000, 5862, 2008, 8480, 2053, 2101, 2084, 1017, 7610, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Input Mask: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Label Ids: [77, 1, 1, 1, 1, 1, 16, 1, 16, 1, 1, 1, 1, 1, 28, 30, 78, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Valid Ids: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Label Mask: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Segment Ids: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
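The transformers warning interleaved above is expected: bert-base-uncased ships only the masked-LM/NSP heads, so the token-classification head of BertForNer ('classifier.weight', 'classifier.bias') is freshly initialised and has to be trained. The project's own BertForNer is not shown in this log; a minimal sketch of the usual pattern implied by the warning:

```python
import torch.nn as nn
from transformers import BertModel, BertPreTrainedModel

class BertForNer(BertPreTrainedModel):
    """Pre-trained BERT encoder plus a new linear head; the newly-initialised
    'classifier.weight'/'classifier.bias' in the warning refer to that head."""

    def __init__(self, config):
        super().__init__(config)
        self.bert = BertModel(config, add_pooling_layer=False)
        self.dropout = nn.Dropout(config.hidden_dropout_prob)
        self.classifier = nn.Linear(config.hidden_size, config.num_labels)
        self.init_weights()  # post_init() in newer transformers releases

    def forward(self, input_ids, attention_mask=None, token_type_ids=None):
        outputs = self.bert(input_ids, attention_mask=attention_mask,
                            token_type_ids=token_type_ids)
        sequence_output = self.dropout(outputs[0])  # (batch, seq_len, hidden)
        return self.classifier(sequence_output)     # per-token label logits

# model = BertForNer.from_pretrained("bert-base-uncased", num_labels=num_labels)
```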
Adjusting learning rate of group 0 to 2.0000e-05.
0%| | 0/942 [00:00
seems not to be NE tag.
  warnings.warn('{} seems not to be NE tag.'.format(chunk))
/home/9_QuAnTuM_6/cdaniel/venv_syntrans/lib/python3.8/site-packages/seqeval/metrics/sequence_labeling.py:171: UserWarning: seems not to be NE tag.
  warnings.warn('{} seems not to be NE tag.'.format(chunk))
/home/9_QuAnTuM_6/cdaniel/venv_syntrans/lib/python3.8/site-packages/seqeval/metrics/sequence_labeling.py:171: UserWarning: X seems not to be NE tag.
  warnings.warn('{} seems not to be NE tag.'.format(chunk))
/home/9_QuAnTuM_6/cdaniel/venv_syntrans/lib/python3.8/site-packages/seqeval/metrics/sequence_labeling.py:171: UserWarning: [SEP] seems not to be NE tag.
  warnings.warn('{} seems not to be NE tag.'.format(chunk))
/home/9_QuAnTuM_6/cdaniel/venv_syntrans/lib/python3.8/site-packages/seqeval/metrics/sequence_labeling.py:171: UserWarning: seems not to be NE tag.
  warnings.warn('{} seems not to be NE tag.'.format(chunk))
O Token Predictions: 471383, NER token predictions: 32921
loss: 0.4831624427086608 w prec: 0.48333733331137896 w recall: 0.32263332619404583 w f1: 0.3743943894083346
0%| | 0/153 [00:00
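The seqeval warnings are emitted because placeholder tags (e.g. [SEP], the sub-token marker X, and padding) reach the metric alongside the real IOBES labels, and the 'w prec' / 'w recall' / 'w f1' figures are presumably weighted averages. A minimal sketch, assuming seqeval and average="weighted" (the log does not state this explicitly), of computing such scores after mapping the placeholders to O:

```python
from seqeval.metrics import precision_score, recall_score, f1_score

# Placeholder tags that trigger "... seems not to be NE tag." when passed to seqeval.
NON_LABELS = {"[CLS]", "[SEP]", "X", ""}

def clean(sequences):
    # Map bookkeeping tags to 'O' so seqeval only sees valid IOBES/O labels.
    return [[tag if tag not in NON_LABELS else "O" for tag in seq] for seq in sequences]

def weighted_scores(y_true, y_pred):
    y_true, y_pred = clean(y_true), clean(y_pred)
    return (precision_score(y_true, y_pred, average="weighted"),
            recall_score(y_true, y_pred, average="weighted"),
            f1_score(y_true, y_pred, average="weighted"))

# Tiny example with the tag set used in this run:
p, r, f1 = weighted_scores(
    [["O", "S-GPE", "O", "B-DATE", "E-DATE", "[SEP]"]],
    [["O", "S-GPE", "O", "B-DATE", "O", "[SEP]"]],
)
```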