vluz committed on
Commit
1e6869c
1 Parent(s): 61398dc

Update README.md

Files changed (1)
  1. README.md +71 -0
README.md CHANGED
@@ -1,3 +1,74 @@
  ---
  license: cc0-1.0
  ---
+
+ **Note:** Due to the nature of toxic comments, the data and code contain explicit language.
+
+ The data comes from Kaggle, the *Toxic Comment Classification Challenge*:
+ <br>
+ https://www.kaggle.com/competitions/jigsaw-toxic-comment-classification-challenge/data?select=train.csv.zip
+
+ Dataset used for training: https://huggingface.co/datasets/vluz/Tox
+
+ Trained for 30 epochs on RunPod.
+
+ ### 🤗 Running demo here:
+ https://huggingface.co/spaces/vluz/Tox
+
+ <hr>
+
+ Code requires pandas, tensorflow, and streamlit. All can be installed via `pip`.
+
+ ```python
+ import os
+ import pickle
+ import streamlit as st
+ import tensorflow as tf
+ from tensorflow.keras.layers import TextVectorization
+
+
+ @st.cache_resource
+ def load_model():
+     # Load the trained Keras model once; Streamlit caches it across reruns.
+     model = tf.keras.models.load_model(os.path.join("model", "toxmodel.keras"))
+     return model
+
+
+ @st.cache_resource
+ def load_vectorizer():
+     # The pickle stores the TextVectorization layer's config and weights.
+     with open(os.path.join("model", "vectorizer.pkl"), "rb") as f:
+         from_disk = pickle.load(f)
+     new_v = TextVectorization.from_config(from_disk['config'])
+     new_v.adapt(tf.data.Dataset.from_tensor_slices(["xyz"]))  # Keras bug
+     new_v.set_weights(from_disk['weights'])
+     return new_v
+
+
+ st.title("Toxic Comment Test")
+ st.divider()
+ model = load_model()
+ vectorizer = load_vectorizer()
+ input_text = st.text_area("Comment:", "I love you man, but fuck you!", height=150)
+ if st.button("Test"):
+     with st.spinner("Testing..."):
+         inputv = vectorizer([input_text])
+         output = model.predict(inputv)
+         # One boolean per label: True where the score exceeds 0.5.
+         res = (output > 0.5)
+     st.write(["toxic","severe toxic","obscene","threat","insult","identity hate"], res)
+     st.write(output)
+ ```
+
+
+ Put `toxmodel.keras` and `vectorizer.pkl` into the `model` directory.
+
+ Then run:
+ ```
+ streamlit run toxtest.py
+ ```
+
+ Expected results for the default prompt are positive for labels 0 and 2 (`toxic` and `obscene`).
+
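The indices in the expected result refer to positions in the label list that the script passes to `st.write`. As a minimal sketch, not taken from the repository, the thresholding step can be reproduced without TensorFlow. The helper name `flag_labels` and the example scores are assumptions made up for illustration; only the label order and the `0.5` cutoff come from the script.

```python
# Label order copied from the st.write call in the README script.
LABELS = ["toxic", "severe toxic", "obscene", "threat", "insult", "identity hate"]


def flag_labels(scores, threshold=0.5):
    """Return the names of labels whose score is strictly above the threshold.

    `scores` is one row of model output: six floats, one per label.
    This helper is hypothetical and only mirrors `output > 0.5` from the script.
    """
    if len(scores) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} scores, got {len(scores)}")
    return [name for name, score in zip(LABELS, scores) if score > threshold]


# Illustrative scores only, not real model output.
example_scores = [0.97, 0.12, 0.91, 0.01, 0.40, 0.02]
print(flag_labels(example_scores))  # ['toxic', 'obscene']
```

With scores shaped like these, positions 0 and 2 are flagged, which matches the result described for the default prompt.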
+ <hr>
+
+ Full code can be found here:
+ <br>
+ https://github.com/vluz/ToxTest/
+ <br>