VerificadoProfesional/SaBERT-Spanish-Fake-News


VerificadoProfesional committed on Apr 24

Commit f2ee3ba (1 parent: 596fa66)

Update README.md

Files changed (1)

README.md +79 -0

	@@ -10,3 +10,82 @@ pipeline_tag: text-classification
 # Spanish Fake News Classifier
+## Overview
+This BERT-based text classifier was developed as a thesis project for the Computer Engineering degree at Universidad de Buenos Aires (UBA).
+The model detects fake news in Spanish. It was fine-tuned from *dccuchile/bert-base-spanish-wwm-uncased* with the hyperparameters listed under Model Details.
+The training dataset contains 125,000 Spanish news articles, both true and false, collected from various regions.
+## Model Details
+* **Base Model**: dccuchile/bert-base-spanish-wwm-uncased
+* **Hyperparameters**:
+  * **dropout_rate = 0.1**
+  * **num_classes = 2**
+  * **max_length = 128**
+  * **batch_size = 16**
+  * **num_epochs = 5**
+  * **learning_rate = 3e-5**
+* **Dataset**: 125,000 Spanish news articles (True and False)
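
The hyperparameters above can be gathered into a single configuration object. The following is a minimal sketch, not the authors' training code: `FineTuneConfig` and `total_optimizer_steps` are hypothetical names, and the step count assumes that all 125,000 articles are used for training with one optimizer step per batch, which the card does not state.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as listed in the model card (the class itself is hypothetical)."""
    base_model: str = "dccuchile/bert-base-spanish-wwm-uncased"
    dropout_rate: float = 0.1
    num_classes: int = 2
    max_length: int = 128
    batch_size: int = 16
    num_epochs: int = 5
    learning_rate: float = 3e-5


def total_optimizer_steps(config: FineTuneConfig, num_train_examples: int) -> int:
    """Optimizer steps over the whole run, assuming one step per batch."""
    steps_per_epoch = math.ceil(num_train_examples / config.batch_size)
    return steps_per_epoch * config.num_epochs


config = FineTuneConfig()
# ASSUMPTION: the full dataset is the training set; the card gives no train/test split.
print(total_optimizer_steps(config, num_train_examples=125_000))
```

Under these assumptions, one epoch is 7,813 batches of 16, so five epochs come to 39,065 steps.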
+## Metrics
+The model's performance was evaluated using the following metrics:
+  * **Accuracy = _83.17%_**
+  * **F1-Score = _81.94%_**
+  * **Precision = _85.62%_**
+  * **Recall = _81.10%_**
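
For reference, the four metrics can be computed from true and predicted labels as sketched below. This is an illustration only: `binary_metrics` is a hypothetical helper, the labels are toy data, and treating class `1` as the positive class is an assumption, since the card does not say which class is positive or how the scores were averaged.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy labels, not taken from the model's evaluation set.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

The harmonic mean of the reported precision and recall is roughly 83.3%, not the reported 81.94%, so the published figures may have been averaged in some other way (for example, across classes); the card does not specify.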
+## Usage
+### Installation
+You can install the required dependencies using pip:
+```bash
+pip install transformers torch
+```
+### Loading the Model
+```python
+from transformers import BertForSequenceClassification, BertTokenizer
+model = BertForSequenceClassification.from_pretrained("VerificadoProfesional/SaBERT-Spanish-Fake-News")
+tokenizer = BertTokenizer.from_pretrained("VerificadoProfesional/SaBERT-Spanish-Fake-News")
+```
+### Predict Function
+```python
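+import torch
+
+def predict(model, tokenizer, text, threshold=0.5):
+    # Tokenize the input. The model was fine-tuned with max_length=128 (see Model Details).
+    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)
+    # Run inference without tracking gradients.
+    with torch.no_grad():
+        outputs = model(**inputs)
+    logits = outputs.logits
+    # Convert the logits to class probabilities for the single input text.
+    probabilities = torch.softmax(logits, dim=1).squeeze().tolist()
+    predicted_class = torch.argmax(logits, dim=1).item()
+    # Fall back to class 0 when class 1 wins but its probability does not exceed the threshold.
+    if probabilities[predicted_class] <= threshold and predicted_class == 1:
+        predicted_class = 0
+    return bool(predicted_class), probabilities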
+```
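
The decision rule inside `predict` can be examined without loading the model. The sketch below reimplements only the softmax and threshold steps in plain Python; `softmax` and `apply_threshold` are hypothetical helpers, and the logits are made-up values rather than model output.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def apply_threshold(logits, threshold=0.5):
    """Mirror the rule in predict: class 1 is kept only if its probability exceeds the threshold."""
    probabilities = softmax(logits)
    predicted_class = max(range(len(probabilities)), key=probabilities.__getitem__)
    if predicted_class == 1 and probabilities[1] <= threshold:
        predicted_class = 0
    return bool(predicted_class), probabilities


# Hypothetical logits for a single text: class 1 has a probability of about 0.69.
example_logits = [0.2, 1.0]
print(apply_threshold(example_logits))                 # class 1 is kept
print(apply_threshold(example_logits, threshold=0.9))  # class 1 is demoted to class 0
```

With two classes, class 1 is only chosen when its probability is at least 0.5, so the default threshold has essentially no effect. Raising it makes the function more conservative about returning `True`.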
+### Making Predictions
+```python
+text = "Your Spanish news text here"
+predicted_label, probabilities = predict(model, tokenizer, text)
+print(f"Text: {text}")
+print(f"Predicted Class: {predicted_label}")
+print(f"Probabilities: {probabilities}")
+```
+## License
+Apache License 2.0
+## Acknowledgments
+Special thanks to DCC UChile for the base Spanish BERT model and to all contributors to the dataset used for training.