FredZhang7/malphish-eater-v1

Text Classification

FredZhang7 committed on Jul 20, 2023

Commit 2aa9ee7 • 1 Parent(s): d724999

Update README.md

Files changed (1)

README.md +30 -3

README.md CHANGED

@@ -1,6 +1,33 @@
 ---
-license: cc-by-nd-4.0
 wget:
-- text: "https://chat.openai.com/"
-- text: "https://huggingface.co/FredZhang7/aivance-safesearch-v3"
 ---

 ---
+license: cc-by-nc-4.0
+dataset:
+- FredZhang7/malicious-website-features-2.4M
 wget:
+- text: https://chat.openai.com/
+- text: https://huggingface.co/FredZhang7/aivance-safesearch-v3
 ---
+The classification task is split into two stages:
+1. URL features model
+    - 96.5%+ accuracy on training and validation data
+    - 2,436,727 rows of labelled URLs
+2. Website features model
+    - 98.2% accuracy on training data, 98.7% on validation data
+    - 911,180 rows of 11 features
+## URL Features
+```python
+from transformers import AutoModelForSequenceClassification, AutoTokenizer
+tokenizer = AutoTokenizer.from_pretrained("FredZhang7/malware-phisher")
+model = AutoModelForSequenceClassification.from_pretrained("FredZhang7/malware-phisher")
+```
+## Website Features
+```bash
+pip install lightgbm
+```
+```python
+import lightgbm as lgb
+# Load the trained booster from the model file in this repository
+booster = lgb.Booster(model_file="malicious_features_combined.txt")
+```
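
Reviewer note: the README snippets in this commit only load the two models and stop before any prediction is made. The following is a minimal sketch of how each stage might be queried once loaded. It is not the author's code. The helper names `softmax`, `score_url`, and `score_features` are hypothetical, and the order of the 11 website features is an assumption, since the README does not list them. How the two stages are combined is not stated in the commit, so the sketch keeps them separate.

```python
import math
from typing import Dict, List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score_url(url: str, tokenizer, model) -> Dict[str, float]:
    """Stage 1 (hypothetical helper): class probabilities for a URL string.

    `tokenizer` and `model` are the objects returned by the two
    `from_pretrained` calls shown in the README.
    """
    import torch  # imported lazily so `softmax` has no heavy dependencies

    inputs = tokenizer(url, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    probs = softmax(logits)
    # Label names come from the model config; they are not listed in the README.
    return {model.config.id2label[i]: p for i, p in enumerate(probs)}


def score_features(booster, features: Sequence[float]):
    """Stage 2 (hypothetical helper): LightGBM prediction for one website.

    Assumes `features` holds the 11 values in the same order used during
    training; that order is not documented in the README.
    """
    import numpy as np  # lazy import, same reason as above

    row = np.asarray([list(features)], dtype=float)  # shape (1, n_features)
    return booster.predict(row)[0]
```

With the objects loaded as in the README, a call such as `score_url("https://chat.openai.com/", tokenizer, model)` would return a mapping from label name to probability, and `score_features` would take the `lgb.Booster` instance together with one row of feature values.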