---
license: mit
tags:
- Text Classification
- Transformers
- PyTorch
- JAX
- MSR
- English
- roberta
- Inference Endpoints
metrics:
- accuracy
pipeline_tag: text-classification
---

I fine-tuned a RobertaForSequenceClassification model, initialized from CodeBERT [https://huggingface.co/microsoft/codebert-base], to judge whether a piece of code is vulnerable.
I selected balanced samples from the MSR dataset [https://github.com/ZeoVan/MSR_20_Code_vulnerability_CSV_Dataset] for training, validation, and testing.
The "func_before" column is used as the input for code classification. All the data is in the file "msr.csv".
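
For readers who want to build a comparable balanced set, here is a minimal sketch of one way to do it. It is not the exact procedure used for this model: the label column name `vul`, the 80/10/10 split, and the seed are all assumptions, and the helper `balance_and_split` is hypothetical.

```python
import random
from collections import defaultdict


def balance_and_split(rows, label_key="vul", seed=0, ratios=(0.8, 0.1, 0.1)):
    """Downsample every class to the size of the smallest class, then split.

    `rows` is a list of dicts (for example, rows read with csv.DictReader).
    `label_key` is an assumed column name; adjust it to match your file.
    `ratios` gives the assumed train/validation/test proportions.
    """
    # Group rows by their label value.
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    rng = random.Random(seed)
    smallest = min(len(group) for group in by_label.values())

    # Draw the same number of rows from every class, without replacement.
    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], smallest))
    rng.shuffle(balanced)

    # Cut the shuffled pool into three consecutive slices.
    n_train = int(len(balanced) * ratios[0])
    n_val = int(len(balanced) * ratios[1])
    train = balanced[:n_train]
    val = balanced[n_train:n_train + n_val]
    test = balanced[n_train + n_val:]
    return train, val, test
```

This balances the pool as a whole; the individual splits are only approximately balanced after shuffling. The rows can be loaded from "msr.csv" with `csv.DictReader`, and the "func_before" value of each row would then be the text passed to the tokenizer.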

Test Results:
acc 0.7022935779816514, f1 0.6482384823848238, precision 0.7920529801324503, recall 0.5486238532110091
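
As a sanity check on how these numbers relate, F1 is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported precision and recall; the function name `f1_from_precision_recall` is only illustrative.

```python
def f1_from_precision_recall(precision, recall):
    """Return the harmonic mean of precision and recall (0.0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the test results.
reported_precision = 0.7920529801324503
reported_recall = 0.5486238532110091

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(round(f1, 4))  # 0.6482
```

The gap between precision (about 0.79) and recall (about 0.55) means the model is fairly reliable when it flags a function as vulnerable, but misses a large share of the vulnerable functions.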