MUHAMMADSAADAMIN committed on
Commit 6d14ec4 · verified · 1 Parent(s): 9b5a75b

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +52 -0
README.md ADDED
@@ -0,0 +1,52 @@
+ ---
+ language: en
+ tags:
+ - code
+ - security
+ - vulnerability-detection
+ - codebert
+ - classification
+ license: mit
+ ---
+
+ # PolyGuard — Code Vulnerability Scanner
+
+ A fine-tuned [CodeBERT](https://huggingface.co/microsoft/codebert-base) model
+ for detecting security vulnerabilities in source code.
+
+ ## Supported Languages
+ Python, JavaScript, SQL, PHP, Java, C, C++, Go, Ruby, Rust
+
+ ## Performance
+ - **F1 Score**: 0.6698
+ - **Training samples**: 16,681
+ - **Base model**: microsoft/codebert-base
+ - **Trained on**: 2026-04-29
+
+ ## Usage
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+
+ model_id = "MUHAMMADSAADAMIN/PolyGuard"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForSequenceClassification.from_pretrained(model_id)
+ model.eval()  # switch off dropout for inference
+
+ code = "eval(input())"
+ # Inputs longer than 512 tokens are truncated, so only the start is scored.
+ inputs = tokenizer(code, return_tensors="pt", truncation=True, max_length=512)
+ with torch.no_grad():
+     logits = model(**inputs).logits
+
+ # Index 0 is the "clean" class, index 1 the "vulnerable" class.
+ probs = torch.softmax(logits, dim=1).squeeze().tolist()
+ print(f"Clean: {probs[0]*100:.1f}% Vulnerable: {probs[1]*100:.1f}%")
+ ```
+
+ ## Labels
+ - 0 = Clean / Safe
+ - 1 = Vulnerable
+
+ ## Training Data
+ Fine-tuned on the CrossVul dataset (~9,300 real-world CVE pairs) with
+ curated augmentation examples covering common CWEs.
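
The README prints raw class probabilities but does not say how to turn them into a decision. With a reported F1 of 0.6698, some misclassifications are to be expected, so a caller may want an explicit cutoff. Below is a minimal sketch of that step, assuming a two-element logit pair ordered `[clean, vulnerable]` as in the Labels section. The `classify` helper and its `threshold` parameter are hypothetical names used for illustration and are not part of the model or of `transformers`. The sketch uses only the standard library so it runs without downloading the model.

```python
import math

# Label mapping taken from the README's "Labels" section.
LABELS = {0: "clean", 1: "vulnerable"}


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, threshold=0.5):
    """Map a [clean, vulnerable] logit pair to (label, P(vulnerable)).

    `threshold` is an assumed tuning knob, not something the model card
    specifies: raise it to reduce false positives, lower it to flag more.
    """
    p_vulnerable = softmax(logits)[1]
    label = LABELS[1] if p_vulnerable >= threshold else LABELS[0]
    return label, p_vulnerable


# Made-up logits for illustration; with the real model they would come
# from something like model(**inputs).logits[0].tolist().
label, p_vulnerable = classify([-1.2, 0.8])
print(f"{label} (P(vulnerable) = {p_vulnerable:.2f})")
```

A sensible value for the cutoff would need to be picked on held-out data; 0.5 here is only a default that matches taking the larger of the two probabilities.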