Fine-tuning model

#2
by zppcst - opened

Hi, I found this model very interesting and would like to use it for my task, fine-tuning it on my own data.

So I used this link as a training guide: https://spacy.io/usage/training#basics

Also, I used your it_nerIta_trf/config.cfg file to do the fine-tuning.

Starting the training shows this:

=========================== Initializing pipeline ===========================
[2022-12-05 15:22:00,040] [INFO] Set up nlp object from config
[2022-12-05 15:22:00,046] [INFO] Pipeline: ['transformer', 'ner']
[2022-12-05 15:22:00,048] [INFO] Created vocabulary
[2022-12-05 15:22:00,049] [INFO] Finished initializing nlp object
Some weights of the model checkpoint at bullmount/hseBert-it-cased were not used when initializing BertModel: ['cls.predictions.transform.LayerNorm.weight', 'cls.predictions.decoder.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.predictions.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertModel were not initialized from the model checkpoint at bullmount/hseBert-it-cased and are newly initialized: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[2022-12-05 15:22:06,822] [INFO] Initialized pipeline components: ['transformer', 'ner']

Training seems to proceed well, albeit slowly. However, I can't figure out which of the two cases below applies to me, or whether this could cause problems during training:

  • This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).

  • This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

Incidentally, this also happens when I use a standard configuration file from the spaCy site mentioned above.
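To make the warning more concrete, here is a minimal sketch that groups the parameter names from the log by their dotted prefix. The two lists are copied verbatim from the warning; the helper name group_by_prefix is only illustrative and is not part of spaCy or transformers. The sketch only inspects the names and does not load the model.

```python
from collections import defaultdict

# Parameter names copied from the "were not used" part of the warning.
UNUSED = [
    "cls.predictions.transform.LayerNorm.weight",
    "cls.predictions.decoder.bias",
    "cls.predictions.transform.LayerNorm.bias",
    "cls.predictions.transform.dense.weight",
    "cls.predictions.transform.dense.bias",
    "cls.predictions.decoder.weight",
    "cls.predictions.bias",
]

# Parameter names copied from the "newly initialized" part of the warning.
NEWLY_INITIALIZED = [
    "bert.pooler.dense.weight",
    "bert.pooler.dense.bias",
]


def group_by_prefix(names, depth=2):
    """Group dotted parameter names by their first `depth` components.

    Hypothetical helper, written only for this illustration.
    """
    groups = defaultdict(list)
    for name in names:
        prefix = ".".join(name.split(".")[:depth])
        groups[prefix].append(name)
    return dict(groups)


unused_groups = group_by_prefix(UNUSED)
new_groups = group_by_prefix(NEWLY_INITIALIZED)

for label, groups in (("unused", unused_groups), ("newly initialized", new_groups)):
    for prefix, members in groups.items():
        print(f"{label}: {prefix} ({len(members)} parameters)")
```

If I read the names correctly, all the unused weights sit under cls.predictions (the masked-language-modelling head of the pretrained checkpoint) and the only newly initialized ones sit under bert.pooler, which would seem to match the first case, but I would appreciate confirmation.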

Can you help me?

Thanks in advance.
