---
language:
- en
tags:
- Text Classification
co2_eq_emissions: 0.319355 Kg
widget:
- text: "Nevertheless, Trump and other Republicans have tarred the protests as havens for terrorists intent on destroying property."
  example_title: "Biased example 1"
- text: "Billie Eilish issues apology for mouthing an anti-Asian derogatory term in a resurfaced video."
  example_title: "Biased example 2"
- text: "Christians should make clear that the perpetuation of objectionable vaccines and the lack of alternatives is a kind of coercion."
  example_title: "Biased example 3"
- text: "There have been a protest by a group of people"
  example_title: "Non-Biased example 1"
- text: "While emphasizing he’s not singling out either party, Cohen warned about the danger of normalizing white supremacist ideology."
  example_title: "Non-Biased example 2"
---

## About the Model
An English sequence classification model trained on the MBAD dataset to detect bias in sentences from news articles, classifying each sentence as biased or non-biased. It is built on top of the distilbert-base-uncased model and was fine-tuned for 30 epochs with a batch size of 16, a learning rate of 5e-5, and a maximum sequence length of 512.

- Dataset: MBAD Data
- Carbon emissions: 0.319355 kg

| Train Accuracy (%) | Validation Accuracy (%) | Train loss | Test loss |
|-------------------:|------------------------:|-----------:|----------:|
|          76.97 |               62.00 |       0.45 |      0.96 |

## Usage
The easiest way to try the model is through the hosted Inference API on Hugging Face. Alternatively, load it locally with the `pipeline` object from the transformers library:
```python
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification, pipeline

# Load the tokenizer and the TensorFlow model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("d4data/bias-detection-model")
model = TFAutoModelForSequenceClassification.from_pretrained("d4data/bias-detection-model")

# Optionally pass device=0 (or another GPU index) to run on a GPU, depending on availability
classifier = pipeline('text-classification', model=model, tokenizer=tokenizer)
result = classifier("The irony, of course, is that the exhibit that invites people to throw trash at vacuuming Ivanka Trump lookalike reflects every stereotype feminists claim to stand against, oversexualizing Ivanka’s body and ignoring her hard work.")
print(result)  # a list of dicts, each with "label" and "score" keys
```
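
For the hosted Inference API route, a minimal sketch using only the Python standard library might look like the following. The endpoint URL pattern, the `{"inputs": ...}` payload, and the shape of the JSON response are assumptions based on the commonly documented Inference API conventions and should be checked against the current Hugging Face documentation. The helper names `build_request`, `top_label`, and `classify` are hypothetical and not part of any library.

```python
import json
import urllib.request

MODEL_ID = "d4data/bias-detection-model"
# Assumption: classic hosted Inference API URL pattern; verify against current docs.
API_URL = f"https://api-inference.huggingface.co/models/{MODEL_ID}"


def build_request(text, token):
    """Build (without sending) a POST request for the hosted endpoint."""
    body = json.dumps({"inputs": text}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def top_label(predictions):
    """Return the highest-scoring {"label", "score"} dict.

    Accepts either a flat list of dicts or a list nested one level deep,
    since the response shape is an assumption here.
    """
    if predictions and isinstance(predictions[0], list):
        predictions = predictions[0]
    return max(predictions, key=lambda p: p["score"])


def classify(text, token):
    """Send the request and return the top prediction (requires network access)."""
    with urllib.request.urlopen(build_request(text, token)) as response:
        return top_label(json.loads(response.read().decode("utf-8")))
```

Calling `classify("some sentence", token)` with a valid access token would return the top-scoring prediction. `top_label` can also be applied to the list returned by the local `pipeline` call.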

## Author
This model is part of the research topic "Bias and Fairness in AI" conducted by Deepak John Reji and Shaina Raza. If you use this work (code, model, or dataset), please cite it as:
> Bias & Fairness in AI, (2022), GitHub repository, <https://github.com/dreji18/Fairness-in-AI/tree/dev>