---
license: unknown
---

# Overview

ScamLLM is designed to identify malicious prompts that could be used to generate phishing websites and emails with popular commercial LLMs such as ChatGPT, Bard, and Claude.

The model was obtained by fine-tuning a pre-trained RoBERTa model on a dataset comprising multiple sets of malicious prompts.

You can try ScamLLM through the Inference API. The model assigns "Label 1" to prompts it identifies as phishing attempts and "Label 0" to prompts it considers safe and non-malicious.

## Dataset Details

The dataset used to train this model was created from malicious prompts generated by GPT-4.

Because these prompts relate to active vulnerabilities that are still under review, the dataset is currently available only upon request. A public release is planned for May 2024.

## Training Details

The model was loaded with `RobertaForSequenceClassification.from_pretrained`.

Both the model and the tokenizer were initialized from RoBERTa-base, and the model was fine-tuned for 10 epochs with the AdamW optimizer at a learning rate of 2e-5.

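As a rough illustration of the setup described above, here is a minimal sketch of such a fine-tuning loop. It is not the authors' training script. The `roberta-base` checkpoint, the 10 epochs, the 2e-5 learning rate, and the AdamW optimizer come from this card. The `FineTuneConfig` and `fine_tune` names, the batch size, the maximum sequence length, and the `texts`/`labels` inputs are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    base_model: str = "roberta-base"  # from the model card
    num_labels: int = 2               # Label 0 (safe) and Label 1 (phishing)
    epochs: int = 10                  # from the model card
    learning_rate: float = 2e-5       # from the model card
    batch_size: int = 16              # assumption: not stated in the card
    max_length: int = 256             # assumption: not stated in the card


def fine_tune(texts, labels, config=FineTuneConfig()):
    """Fine-tune RoBERTa on (text, label) pairs; labels are 0 or 1."""
    # Heavy imports stay local so the config can be inspected without torch.
    import torch
    from torch.optim import AdamW
    from torch.utils.data import DataLoader
    from transformers import RobertaForSequenceClassification, RobertaTokenizerFast

    tokenizer = RobertaTokenizerFast.from_pretrained(config.base_model)
    model = RobertaForSequenceClassification.from_pretrained(
        config.base_model, num_labels=config.num_labels
    )
    optimizer = AdamW(model.parameters(), lr=config.learning_rate)

    def collate(batch):
        # Tokenize one batch of prompts and pair it with its label tensor.
        batch_texts, batch_labels = zip(*batch)
        encoded = tokenizer(
            list(batch_texts),
            padding=True,
            truncation=True,
            max_length=config.max_length,
            return_tensors="pt",
        )
        return encoded, torch.tensor(batch_labels)

    loader = DataLoader(
        list(zip(texts, labels)),
        batch_size=config.batch_size,
        shuffle=True,
        collate_fn=collate,
    )

    model.train()
    for _ in range(config.epochs):
        for encoded, label_tensor in loader:
            optimizer.zero_grad()
            # Passing labels makes the model return a cross-entropy loss.
            loss = model(**encoded, labels=label_tensor).loss
            loss.backward()
            optimizer.step()
    return model, tokenizer
```

Calling `fine_tune(texts, labels)` with a list of prompt strings and a matching list of 0/1 labels would run the loop on CPU. Device placement, evaluation, and checkpointing are omitted to keep the sketch short.
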
## Inference

There are several ways to test this model. The simplest is the Inference API. Alternatively, use the `text-classification` pipeline as shown below:

```python
from transformers import pipeline

# top_k=None returns the score for every label, not only the top one.
classifier = pipeline(task="text-classification", model="phishbot/ScamLLM", top_k=None)

prompt = ["Your Sample Sentence or Prompt...."]

# One result per input prompt; each result holds one score per label.
model_outputs = classifier(prompt)
print(model_outputs[0])
```

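To turn the per-label scores into a single decision, a small helper along the following lines may be useful. This is a sketch under two assumptions: the pipeline reports label names ending in the class index (for example `LABEL_0` and `LABEL_1`), and a threshold of 0.5 is acceptable. Neither is stated in this card, and the `is_phishing` helper and the example scores are hypothetical.

```python
def is_phishing(label_scores, threshold=0.5):
    """Return True if the score for label 1 meets the threshold."""
    for entry in label_scores:
        # Assumption: label names end in the class index, e.g. "LABEL_1".
        if entry["label"].strip().endswith("1"):
            return entry["score"] >= threshold
    raise ValueError("no entry for label 1 found")


# Hypothetical scores for one prompt, in the pipeline's list-of-dicts format.
example = [
    {"label": "LABEL_1", "score": 0.97},
    {"label": "LABEL_0", "score": 0.03},
]
print(is_phishing(example))  # prints True
```

With the pipeline output above, this would be called as `is_phishing(model_outputs[0])`. Check the label names the model actually returns before relying on the suffix match.
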