🤗 + neuraly - Italian BERT Sentiment model
This model performs sentiment analysis on Italian sentences. It was initialized from an instance of bert-base-italian-cased and fine-tuned on an Italian dataset of tweets, reaching 82% accuracy on that dataset's held-out test split.
Intended uses & limitations
How to use
import torch
from torch import nn
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("neuraly/bert-base-italian-cased-sentiment")
# Load the model, use .cuda() to load it on the GPU
model = AutoModelForSequenceClassification.from_pretrained("neuraly/bert-base-italian-cased-sentiment")

sentence = 'Huggingface è un team fantastico!'
input_ids = tokenizer.encode(sentence, add_special_tokens=True)

# Create tensor, use .cuda() to transfer the tensor to GPU
tensor = torch.tensor(input_ids).long()
# Fake batch dimension
tensor = tensor.unsqueeze(0)

# Call the model and get the logits.
# Indexing with [0] works whether the model returns a tuple or a ModelOutput.
logits = model(tensor)[0]
# Remove the fake batch dimension
logits = logits.squeeze(0)

# The model was trained with a Log Likelihood + Softmax combined loss, hence to
# extract probabilities we need a softmax on top of the logits tensor
proba = nn.functional.softmax(logits, dim=0)

# Unpack the tensor to obtain negative, neutral and positive probabilities
negative, neutral, positive = proba
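The snippet above ends with three probabilities in the order negative, neutral, positive. As a minimal sketch of how those values could be turned into a single predicted label, the following standard-library example applies a softmax to three logits and picks the most likely class. The helper names `softmax` and `predict_label` and the example logits are illustrative assumptions, not part of the model's API or real model output.

```python
import math

# Label order as described in the usage example: negative, neutral, positive
LABELS = ("negative", "neutral", "positive")


def softmax(logits):
    """Numerically stable softmax over a plain list of floats."""
    highest = max(logits)
    exps = [math.exp(value - highest) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def predict_label(logits):
    """Return the most likely label and its probability (hypothetical helper)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Made-up logits for illustration only
label, confidence = predict_label([-1.2, 0.3, 2.5])
print(label, round(confidence, 3))
```

With the real model, the same idea applies to `logits.tolist()` from the usage example.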
Limitations and bias
A possible drawback (or bias) of this model is that it was trained on a tweet dataset, with all the limitations that come with it. The training domain is strongly tied to football players and teams, but the model works surprisingly well on other topics too.
We trained the model by combining the two tweet datasets taken from Sentipolc EVALITA 2016. Overall the dataset consists of 45K pre-processed tweets.
The model weights come from a pre-trained instance of bert-base-italian-cased. A huge "thank you" goes to that team, brilliant work!
We tried to preserve as much information as possible, since BERT captures the semantics of complex text sequences extremely well. Overall we removed only @mentions, URLs and emails from every tweet and kept pretty much everything else.
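The cleaning step described above can be sketched as follows. This is only an approximation: the exact rules used for the original dataset are not given here, so the regular expressions and the `clean_tweet` helper are assumptions chosen for illustration. Hashtags, emoji and punctuation are deliberately left untouched.

```python
import re

# Assumed patterns, not the ones used to build the original dataset
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
MENTION_RE = re.compile(r"(?<!\w)@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Strip emails, URLs and @mentions; keep everything else (hypothetical helper)."""
    # Emails go first so the mention pattern never sees their "@domain" part
    text = EMAIL_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    # Collapse the gaps left behind by the removals
    return WHITESPACE_RE.sub(" ", text).strip()


print(clean_tweet("@mario Forza Juve! https://t.co/abc scrivi a info@example.com"))
```

For consistent results, the same cleaning would presumably be applied to any text before passing it to the tokenizer.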
- GPU: Nvidia GTX1080ti
- CPU: AMD Ryzen7 3700x 8c/16t
- RAM: 64GB DDR4
- Optimizer: AdamW with learning rate of 2e-5, epsilon of 1e-8
- Max epochs: 5
- Batch size: 32
- Early Stopping: enabled with patience = 1
Early stopping was triggered after 3 epochs.
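To make the interaction between the maximum number of epochs and the patience value concrete, here is a minimal sketch of an early-stopping rule. It assumes the monitored metric is a validation loss that should decrease, which the settings above do not specify. The function name `epochs_run` and the loss values are made up for illustration and are not the actual training curve.

```python
def epochs_run(val_losses, max_epochs=5, patience=1):
    """Return how many epochs run before early stopping (hypothetical helper).

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs, or when `max_epochs` is reached.
    """
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            # Improvement: remember it and reset the patience counter
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return min(len(val_losses), max_epochs)


# Invented losses: improvement in epoch 2, none in epoch 3, so training stops there
print(epochs_run([0.60, 0.48, 0.51, 0.55, 0.50]))
```

With patience = 1, a single epoch without improvement is enough to stop, so a run ending after 3 epochs would correspond to the best checkpoint being the one from epoch 2 under this assumed rule.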
The model achieves an overall accuracy of 82% on the test set. The test set is a 20% split of the whole dataset.
Neuraly is a young and dynamic startup committed to designing AI-driven solutions and services through the most advanced Machine Learning and Data Science technologies. You can find out more about who we are and what we do on our website.
Thanks to the generous support from the Hugging Face team, it is possible to download the model from their S3 storage and live test it from their inference API 🤗.