---
license: afl-3.0
datasets:
- jigsaw_toxicity_pred
language:
- en
metrics:
- accuracy
library_name: transformers
pipeline_tag: text-classification
---

## Model description
This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/transformers/model_doc/bert.html) for classifying toxic comments.

## How to use

You can use the model with the following code.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, TextClassificationPipeline

model_path = "JungleLee/bert-toxic-comment-classification"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path)

# Wrap the model and tokenizer in a text-classification pipeline.
pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer)
print(pipeline("You're a fucking nerd."))
```
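
The pipeline returns a list of `{'label', 'score'}` dicts for the top class; the label strings come from this checkpoint's `id2label` config. To see scores for both classes, recent `transformers` versions accept a `top_k` argument (older versions use `return_all_scores=True`). A minimal sketch:

```python
# Ask the pipeline for all class scores rather than just the top one.
# (On older transformers versions, pass return_all_scores=True instead.)
print(pipeline("You're a fucking nerd.", top_k=None))
# -> [{'label': ..., 'score': ...}, {'label': ..., 'score': ...}]
```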
30
+
31
+ ## Training data
32
+ The training data comes this [Kaggle competition](https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/data). We use 90% of the `train.csv` data to train the model.
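
The card doesn't spell out the split procedure. Below is a minimal sketch of a 90/10 split, assuming the competition's `train.csv` layout (a `comment_text` column and a float `target` toxicity score, binarized here at 0.5; the column names and threshold are assumptions, not confirmed by the card):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Assumed layout of the competition's train.csv: a `comment_text` column and a
# float `target` toxicity score; the 0.5 binarization threshold is an assumption.
df = pd.read_csv("train.csv")
df["label"] = (df["target"] >= 0.5).astype(int)

# 90% for fine-tuning, 10% held out, stratified on the binary label.
train_df, test_df = train_test_split(
    df[["comment_text", "label"]],
    test_size=0.1,
    stratify=df["label"],
    random_state=42,
)
```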

## Evaluation results

The model achieves an AUC of 0.95 on a held-out test set of 1,500 rows.
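
A minimal sketch of how an AUC like this could be reproduced with scikit-learn, assuming the `test_df` held-out frame from the split sketch above and that class index 1 is the toxic class (an assumption about this checkpoint's label order):

```python
import torch
from sklearn.metrics import roc_auc_score

# Score the held-out comments in small batches; assumes `test_df`, `model`,
# and `tokenizer` from the snippets above, and that class index 1 is "toxic".
model.eval()
probs = []
with torch.no_grad():
    for i in range(0, len(test_df), 32):
        batch = test_df["comment_text"].iloc[i : i + 32].tolist()
        enc = tokenizer(batch, padding=True, truncation=True, return_tensors="pt")
        logits = model(**enc).logits
        probs.extend(torch.softmax(logits, dim=-1)[:, 1].tolist())

print("AUC:", roc_auc_score(test_df["label"], probs))
```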