VerificadoProfesional
/

SaBERT-Spanish-Sentiment-Analysis


VerificadoProfesional committed on Apr 24

Commit

b175f15

1 Parent(s): 90ecfdc

Update README.md

Files changed (1)

README.md +88 -0

@@ -1,3 +1,91 @@
 ---
 license: apache-2.0
+language:
+- es
+metrics:
+- accuracy
+pipeline_tag: text-classification
 ---
+# Spanish Sentiment Analysis Classifier
+## Overview
+This BERT-based text classifier was developed as a thesis project for the Computer Engineering degree at Universidad de Buenos Aires (UBA).
+The model detects sentiment in Spanish text. It was fine-tuned from *dccuchile/bert-base-spanish-wwm-uncased* with the hyperparameters listed under Model Details.
+It was trained on a dataset of 11,500 Spanish tweets, each labeled positive or negative, collected from various regions.
+## Team Members
+- **[Azul Fuentes](https://github.com/azu26)**
+- **[Dante Reinaudo](https://github.com/DanteReinaudo)**
+- **[Lucía Pardo](https://github.com/luciaPardo)**
+- **[Roberto Iskandarani](https://github.com/Robert-Iskandarani)**
+## Model Details
+* **Base Model**: dccuchile/bert-base-spanish-wwm-uncased
+* **Hyperparameters**:
+  * **dropout_rate = 0.1**
+  * **num_classes = 2**
+  * **max_length = 128**
+  * **batch_size = 16**
+  * **num_epochs = 10**
+  * **learning_rate = 3e-5**
+* **Dataset**: 11,500 Spanish tweets (Positive and Negative)
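To get a feel for the training budget these hyperparameters imply, the sketch below collects them in a small config object and counts optimizer steps. It assumes, purely for illustration, that all 11,500 tweets were used for training. The README does not state the train/validation split, so the real figure is probably lower. `TrainingConfig` and `steps_per_epoch` are hypothetical names, not part of the released code.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    # Values copied from the hyperparameter list above
    dropout_rate: float = 0.1
    num_classes: int = 2
    max_length: int = 128
    batch_size: int = 16
    num_epochs: int = 10
    learning_rate: float = 3e-5


def steps_per_epoch(num_examples: int, config: TrainingConfig) -> int:
    # The last batch may be smaller than batch_size, so round up
    return math.ceil(num_examples / config.batch_size)


config = TrainingConfig()
# ASSUMPTION: the full dataset is used for training (the split is not documented)
num_examples = 11_500
per_epoch = steps_per_epoch(num_examples, config)
total_steps = per_epoch * config.num_epochs
print(f"steps per epoch: {per_epoch}, total steps: {total_steps}")
```

Under that assumption the run would take at most 719 steps per epoch, or 7,190 steps over 10 epochs.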
+## Metrics
+The model's performance was evaluated using the following metrics:
+  * **Accuracy = _85.50%_**
+  * **F1-Score = _85.49%_**
+  * **Precision = _85.50%_**
+  * **Recall = _85.49%_**
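For readers unfamiliar with these metrics, here is a minimal sketch of how they are derived from a binary confusion matrix. The counts are made up for illustration and are not the model's evaluation results. The README does not say which averaging scheme (binary, macro, or weighted) produced the reported figures, so this plain binary version is an assumption, and `binary_metrics` is a hypothetical helper.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative counts only (NOT taken from the model's evaluation)
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

When errors are spread evenly across both classes, as in this toy example, all four metrics land on nearly the same value. That is consistent with the closely matching figures reported above.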
+## Usage
+### Installation
+You can install the required dependencies using pip:
+```bash
+pip install transformers torch
+```
+### Loading the Model
+```python
+from transformers import BertForSequenceClassification, BertTokenizer
+model = BertForSequenceClassification.from_pretrained("VerificadoProfesional/SaBERT-Spanish-Sentiment-Analysis")
+tokenizer = BertTokenizer.from_pretrained("VerificadoProfesional/SaBERT-Spanish-Sentiment-Analysis")
+```
+### Predict Function
+```python
+import torch
+
+def predict(model, tokenizer, text, threshold=0.5):
+    # Tokenize the input, truncating to the 128-token limit used in training
+    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128)
+    # Inference only, so skip gradient tracking
+    with torch.no_grad():
+        outputs = model(**inputs)
+    logits = outputs.logits
+    probabilities = torch.softmax(logits, dim=1).squeeze().tolist()
+    predicted_class = torch.argmax(logits, dim=1).item()
+    # Fall back to class 0 when class 1 wins but does not exceed the threshold
+    if probabilities[predicted_class] <= threshold and predicted_class == 1:
+        predicted_class = 0
+    return bool(predicted_class), probabilities
+```
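The threshold rule in `predict` can be tried in isolation, without loading the model. The sketch below reimplements only the decision step on a plain list of probabilities. `apply_threshold` is a hypothetical helper written for this illustration, and the probability values are invented.

```python
def apply_threshold(probabilities, threshold=0.5):
    """Mirror the decision step of predict() on a list of class probabilities."""
    # Index of the largest probability (the first one wins on ties)
    predicted_class = max(range(len(probabilities)), key=probabilities.__getitem__)
    # Class 1 is kept only if its probability is strictly above the threshold
    if predicted_class == 1 and probabilities[predicted_class] <= threshold:
        predicted_class = 0
    return bool(predicted_class)


# Invented probabilities in [class 0, class 1] order
print(apply_threshold([0.4, 0.6]))        # class 1 wins and clears 0.5
print(apply_threshold([0.4, 0.6], 0.7))   # class 1 wins but does not clear 0.7
print(apply_threshold([0.8, 0.2], 0.7))   # class 0 wins, threshold not consulted
```

With two classes, class 1 can only win the argmax when its probability is above 0.5, so the default threshold leaves the argmax result unchanged. Raising the threshold makes the function more conservative about returning `True`.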
+### Making Predictions
+```python
+text = "Your Spanish text here"
+predicted_label,probabilities = predict(model,tokenizer,text)
+print(f"Text: {text}")
+print(f"Predicted Class: {predicted_label}")
+print(f"Probabilities: {probabilities}")
+```
+## License
+Apache License 2.0
+## Acknowledgments
+Special thanks to DCC UChile for the base Spanish BERT model and to all contributors to the dataset used for training.