Training A Model From Hugging Face Hub
Using AutoNLP, you can also fine-tune a model that is hosted on the Hugging Face Hub. You can choose from any of the 10K+ models hosted here: http://hf.co/models. The model must have its own tokenizer!
To train a model of your choice from the Hub, all you need to do is specify the --hub_model parameter while creating a project.
Let’s assume our data is in CSV format and looks something like the following:
| i love autonlp | 0.1 |
| i dont like this movie | 0.5 |
| this is the best tutorial ever | -1.5 |
Here, we see only three samples but you can have as many samples as you like: 5000, 10000, 100000 or even a million or more!
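As a rough illustration, a file in this shape could be produced with Python's standard csv module. This is only a sketch: the column names `text` and `target` are assumptions made for the example, since the table above does not name its columns, and the helper `rows_to_csv` is hypothetical.

```python
import csv
import io

# Hypothetical column names, chosen for this sketch only.
FIELDNAMES = ["text", "target"]


def rows_to_csv(rows):
    """Serialize (text, target) pairs to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(FIELDNAMES)
    for text, target in rows:
        # float() fails early if a target is not numeric,
        # which a regression task requires.
        writer.writerow([text, float(target)])
    return buffer.getvalue()


# The three samples from the table above.
samples = [
    ("i love autonlp", 0.1),
    ("i dont like this movie", 0.5),
    ("this is the best tutorial ever", -1.5),
]

csv_text = rows_to_csv(samples)
print(csv_text)
```

The resulting text can be written to a file of your choice. The csv module quotes any sentence that contains a comma, so the two-column layout is preserved.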
Once you have the data in the format specified above, you are ready to train models using AutoNLP. Yes, it’s that easy.
The first step is to log in to AutoNLP:
$ autonlp login --api-key YOUR_HUGGING_FACE_API_TOKEN
If you do not know your Hugging Face API token, create an account on huggingface.co and you will find your API token in your settings. Please do not share your API token with anyone!
Once you have logged in, you can create a new project:
$ autonlp create_project --name hub_model_training --task single_column_regression --hub_model abhishek/my_awesome_model --max_models 25
The Hub model, “abhishek/my_awesome_model”, must include a tokenizer and must be compatible with Hugging Face’s transformers. You can also specify the “--max_models” parameter to train different variations of the same model. When you specify “--hub_model”, the language parameter is ignored and AutoNLP does everything but model search.
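Before creating a project, it can help to sanity-check that a model repository appears to ship a tokenizer. The sketch below is a heuristic written for illustration, not how AutoNLP validates models. It takes a plain list of file names (for example, copied from the model's "Files" tab) and looks for vocabulary files that transformers tokenizers commonly use. The file-name set is an assumption and is not exhaustive.

```python
# File names commonly shipped with transformers tokenizers.
# Assumption for illustration only; other tokenizers may use different files.
TOKENIZER_FILES = {
    "tokenizer.json",
    "vocab.txt",
    "vocab.json",
    "spiece.model",
    "sentencepiece.bpe.model",
}


def looks_like_it_has_tokenizer(repo_files):
    """Return True if any known tokenizer file appears in the listing."""
    names = {name.strip() for name in repo_files}
    return bool(names & TOKENIZER_FILES)


# Hypothetical file listings for two model repositories.
files_with_tokenizer = [
    "config.json",
    "pytorch_model.bin",
    "vocab.txt",
    "tokenizer_config.json",
]
files_without_tokenizer = ["config.json", "pytorch_model.bin"]

print(looks_like_it_has_tokenizer(files_with_tokenizer))
print(looks_like_it_has_tokenizer(files_without_tokenizer))
```

A positive result only suggests that a tokenizer is present. Actually loading the tokenizer with transformers is the more reliable check.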
Everything else remains the same as any other task!
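For scripted runs, the create_project call shown earlier could be assembled in Python. The sketch below uses only the flags that appear in this tutorial. The helper name `build_create_project_command` is hypothetical, and the check that max_models is at least 1 is an assumption rather than a documented rule.

```python
import shlex


def build_create_project_command(name, task, hub_model, max_models):
    """Return the argument list for `autonlp create_project`."""
    # Assumption: at least one model variation must be requested.
    if max_models < 1:
        raise ValueError("max_models must be a positive integer")
    return [
        "autonlp", "create_project",
        "--name", name,
        "--task", task,
        "--hub_model", hub_model,
        "--max_models", str(max_models),
    ]


argv = build_create_project_command(
    name="hub_model_training",
    task="single_column_regression",
    hub_model="abhishek/my_awesome_model",
    max_models=25,
)

# shlex.join (Python 3.8+) renders the list as a shell-safe command line.
command_line = shlex.join(argv)
print(command_line)
```

Once you are logged in, the same list could be passed to `subprocess.run(argv, check=True)`. Keeping the arguments as a list avoids shell-quoting problems in project names.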