INFO:transformers.configuration_utils:loading configuration file ../../Multilingual-MiniLM-L12-H384/config.json
INFO:transformers.configuration_utils:Model config BertConfig { "attention_probs_dropout_prob": 0.1, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 384, "initializer_range": 0.02, "intermediate_size": 1536, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "type_vocab_size": 2, "vocab_size": 250037 }
INFO:transformers.modeling_utils:loading weights file ../../Multilingual-MiniLM-L12-H384/pytorch_model.bin
INFO:transformers.modeling_utils:Weights of BertModel not initialized from pretrained model: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
INFO:transformers.configuration_utils:loading configuration file ../../Multilingual-MiniLM-L12-H384/config.json
INFO:transformers.configuration_utils:Model config BertConfig { "attention_probs_dropout_prob": 0.1, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 384, "initializer_range": 0.02, "intermediate_size": 1536, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "type_vocab_size": 2, "vocab_size": 250037 }
INFO:transformers.modeling_utils:loading weights file ../../Multilingual-MiniLM-L12-H384/pytorch_model.bin
INFO:transformers.configuration_utils:loading configuration file ../../Multilingual-MiniLM-L12-H384/config.json
INFO:transformers.configuration_utils:Model config BertConfig { "attention_probs_dropout_prob": 0.1, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 384, "initializer_range": 0.02, "intermediate_size": 1536, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "type_vocab_size": 2, "vocab_size": 250037 }
INFO:transformers.modeling_utils:loading weights file ../../Multilingual-MiniLM-L12-H384/pytorch_model.bin
INFO:transformers.modeling_utils:Weights of BertModel not initialized from pretrained model: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
INFO:transformers.configuration_utils:loading configuration file ../../Multilingual-MiniLM-L12-H384/config.json
INFO:transformers.configuration_utils:Model config BertConfig { "attention_probs_dropout_prob": 0.1, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 384, "initializer_range": 0.02, "intermediate_size": 1536, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "type_vocab_size": 2, "vocab_size": 250037 }
INFO:transformers.modeling_tf_utils:loading weights file ../../Multilingual-MiniLM-L12-H384/pytorch_model.bin
INFO:transformers.modeling_tf_pytorch_utils:Loading PyTorch weights from /home/patrick/hugging_face/models/Multilingual-MiniLM-L12-H384/pytorch_model.bin
INFO:transformers.modeling_tf_pytorch_utils:PyTorch checkpoint contains 117,904,565 parameters
INFO:transformers.modeling_tf_pytorch_utils:Loaded 117,505,920 parameters in the TF 2.0 model.
INFO:transformers.modeling_tf_pytorch_utils:Weights or buffers not loaded from PyTorch model: {'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.bias', 'cls.predictions.transform.dense.bias'}
INFO:transformers.configuration_utils:Configuration saved in ./config.json
INFO:transformers.modeling_utils:Model weights saved in ./pytorch_model.bin
INFO:transformers.configuration_utils:Configuration saved in ./config.json
INFO:transformers.modeling_tf_utils:Model weights saved in ./tf_model.h5
INFO:transformers.tokenization_utils_base:Model name '../../MiniLM-L12-H384-uncased/' not found in model shortcut name list (xlm-roberta-base, xlm-roberta-large, xlm-roberta-large-finetuned-conll02-dutch, xlm-roberta-large-finetuned-conll02-spanish, xlm-roberta-large-finetuned-conll03-english, xlm-roberta-large-finetuned-conll03-german). Assuming '../../MiniLM-L12-H384-uncased/' is a path, a model identifier, or url to a directory containing tokenizer files.
INFO:transformers.tokenization_utils_base:Didn't find file ../../MiniLM-L12-H384-uncased/sentencepiece.bpe.model. We won't load it.
INFO:transformers.tokenization_utils_base:Didn't find file ../../MiniLM-L12-H384-uncased/added_tokens.json. We won't load it.
INFO:transformers.tokenization_utils_base:Didn't find file ../../MiniLM-L12-H384-uncased/special_tokens_map.json. We won't load it.
INFO:transformers.tokenization_utils_base:Didn't find file ../../MiniLM-L12-H384-uncased/tokenizer_config.json. We won't load it.
INFO:transformers.configuration_utils:loading configuration file ../../Multilingual-MiniLM-L12-H384/config.json
INFO:transformers.configuration_utils:Model config BertConfig { "attention_probs_dropout_prob": 0.1, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 384, "initializer_range": 0.02, "intermediate_size": 1536, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "type_vocab_size": 2, "vocab_size": 250037 }
INFO:transformers.modeling_utils:loading weights file ../../Multilingual-MiniLM-L12-H384/pytorch_model.bin
INFO:transformers.modeling_utils:Weights of BertModel not initialized from pretrained model: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
INFO:transformers.configuration_utils:loading configuration file ../../Multilingual-MiniLM-L12-H384/config.json
INFO:transformers.configuration_utils:Model config BertConfig { "attention_probs_dropout_prob": 0.1, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 384, "initializer_range": 0.02, "intermediate_size": 1536, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "type_vocab_size": 2, "vocab_size": 250037 }
INFO:transformers.modeling_tf_utils:loading weights file ../../Multilingual-MiniLM-L12-H384/pytorch_model.bin
INFO:transformers.modeling_tf_pytorch_utils:Loading PyTorch weights from /home/patrick/hugging_face/models/Multilingual-MiniLM-L12-H384/pytorch_model.bin
INFO:transformers.modeling_tf_pytorch_utils:PyTorch checkpoint contains 117,904,565 parameters
INFO:transformers.modeling_tf_pytorch_utils:Loaded 117,505,920 parameters in the TF 2.0 model.
INFO:transformers.modeling_tf_pytorch_utils:Weights or buffers not loaded from PyTorch model: {'cls.predictions.transform.dense.weight', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.bias'}
INFO:transformers.configuration_utils:Configuration saved in ./config.json
INFO:transformers.modeling_utils:Model weights saved in ./pytorch_model.bin
INFO:transformers.configuration_utils:Configuration saved in ./config.json
INFO:transformers.modeling_tf_utils:Model weights saved in ./tf_model.h5
INFO:transformers.tokenization_utils_base:Model name '../../Multilingual-MiniLM-L12-H384/sentencepiece.bpe.model' not found in model shortcut name list (xlm-roberta-base, xlm-roberta-large, xlm-roberta-large-finetuned-conll02-dutch, xlm-roberta-large-finetuned-conll02-spanish, xlm-roberta-large-finetuned-conll03-english, xlm-roberta-large-finetuned-conll03-german). Assuming '../../Multilingual-MiniLM-L12-H384/sentencepiece.bpe.model' is a path, a model identifier, or url to a directory containing tokenizer files.
WARNING:transformers.tokenization_utils_base:Calling XLMRobertaTokenizer.from_pretrained() with the path to a single file or url is deprecated
INFO:transformers.tokenization_utils_base:loading file ../../Multilingual-MiniLM-L12-H384/sentencepiece.bpe.model