|
--- |
|
license: apache-2.0 |
|
language: |
|
- en |
|
pipeline_tag: text-classification |
|
--- |
|
# Model Card for cssupport/mobilebert-sql-injection-detect
|
|
|
|
|
|
Based on [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) (MobileBERT is a thin version of BERT_LARGE, equipped with bottleneck structures and a carefully designed balance between self-attention and feed-forward networks). This model detects SQL injection attacks in an input string (see How to Get Started with the Model below). It is a very lightweight model (~100 MB) and is well suited to edge computing use cases. It was fine-tuned on the [SQL Injection dataset](https://www.kaggle.com/datasets/sajid576/sql-injection-dataset) from [Kaggle](https://www.kaggle.com).
|
**Please test the model before deploying it into any environment.**
|
Contact us for more info: support@cloudsummary.com |
|
### Code Repo |
|
Code repository: [cssupport23/AI-Model---SQL-Injection-Attack-Detector](https://github.com/cssupport23/AI-Model---SQL-Injection-Attack-Detector)
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
<!-- Provide a longer summary of what this model is. --> |
|
|
|
|
|
|
|
- **Developed by:** cssupport (support@cloudsummary.com) |
|
- **Model type:** MobileBERT-based sequence classification (text classification) model
|
- **Language(s) (NLP):** English |
|
- **License:** Apache 2.0 |
|
- **Finetuned from model:** [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased)
|
|
|
### Model Sources |
|
|
|
<!-- Provide the basic links for the model. --> |
|
|
|
Please refer to [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) for model sources.
|
|
|
## How to Get Started with the Model |
|
|
|
Use the code below to get started with the model. |
|
|
|
```python |
|
import torch
from transformers import MobileBertTokenizer, MobileBertForSequenceClassification

# Load the tokenizer (from the base checkpoint) and the fine-tuned classification model
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
tokenizer = MobileBertTokenizer.from_pretrained('google/mobilebert-uncased')
model = MobileBertForSequenceClassification.from_pretrained('cssupport/mobilebert-sql-injection-detect')
model.to(device)
model.eval()

def predict(text):
    # Tokenize the input string and move the tensors to the selected device
    inputs = tokenizer(text, padding=False, truncation=True, return_tensors='pt', max_length=512)
    input_ids = inputs['input_ids'].to(device)
    attention_mask = inputs['attention_mask'].to(device)

    with torch.no_grad():
        outputs = model(input_ids=input_ids, attention_mask=attention_mask)

    # Convert logits to probabilities and pick the most likely class
    logits = outputs.logits
    probabilities = torch.softmax(logits, dim=1)
    predicted_class = torch.argmax(probabilities, dim=1).item()
    return predicted_class, probabilities[0][predicted_class].item()


#text = "SELECT * FROM users WHERE username = 'admin' AND password = 'password';"
#text = "select * from users where username = 'admin' and password = 'password';"
#text = "SELECT * from USERS where id = '1' or @ @1 = 1 union select 1,version ( ) -- 1'"
#text = "select * from data where id = '1' or @"
text = "select * from users where id = 1 or 1#\"? = 1 or 1 = 1 -- 1"
predicted_class, confidence = predict(text)

# Class 1 corresponds to "SQL injection detected", class 0 to "no injection"
if predicted_class == 1:
    print("Prediction: SQL Injection Detected")
else:
    print("Prediction: No SQL Injection Detected")

print(f"Confidence: {confidence:.2f}")
# OUTPUT
# Prediction: SQL Injection Detected
# Confidence: 1.00
|
``` |
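
Alternatively, the model can be loaded through the `transformers` pipeline API. This is a minimal sketch, not from the original card: the label strings returned depend on the checkpoint's config (they may be generic `LABEL_0`/`LABEL_1`), so map them to benign/injection accordingly.

```python
from transformers import pipeline

# Build a text-classification pipeline around the fine-tuned model.
# The tokenizer is taken from the base MobileBERT checkpoint, as in the example above.
clf = pipeline(
    "text-classification",
    model="cssupport/mobilebert-sql-injection-detect",
    tokenizer="google/mobilebert-uncased",
)

result = clf("select * from users where id = 1 or 1 = 1 -- 1")[0]
# result looks like {'label': ..., 'score': ...}; LABEL_1 is assumed to mean "injection"
print(result)
```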
|
|
|
|
|
## Uses |
|
|
|
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. --> |
|
|
|
[More Information Needed] |
|
|
|
### Direct Use |
|
|
|
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. --> |
|
Could be used as a lightweight guard that screens user-supplied strings for SQL injection patterns before they reach a database, for example in web backends, API gateways, or edge devices where a full web application firewall is impractical. A sketch of this pattern follows.
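
As an illustration only (not part of the original card), a request handler could call the `predict` function from the quick-start example above and reject suspicious input before it reaches the database. The `predict` function, the 0/1 label convention, and the 0.7 threshold are assumptions carried over from that example.

```python
# Hypothetical guard built on top of the predict() function from the quick-start example.
INJECTION_CLASS = 1          # assumed label: 1 = SQL injection, 0 = benign
CONFIDENCE_THRESHOLD = 0.7   # tune on your own traffic before relying on it

def is_suspicious(user_input: str) -> bool:
    predicted_class, confidence = predict(user_input)
    return predicted_class == INJECTION_CLASS and confidence >= CONFIDENCE_THRESHOLD

def lookup_user(cursor, username: str):
    if is_suspicious(username):
        raise ValueError("Rejected: input flagged as a possible SQL injection attempt")
    # Parameterized queries should still be used; the classifier is an extra layer, not a replacement.
    cursor.execute("SELECT * FROM users WHERE username = %s", (username,))
    return cursor.fetchall()
```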
|
|
|
|
|
|
|
### Out-of-Scope Use |
|
|
|
<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. --> |
|
|
|
[More Information Needed] |
|
|
|
## Bias, Risks, and Limitations |
|
|
|
<!-- This section is meant to convey both technical and sociotechnical limitations. --> |
|
|
|
[More Information Needed] |
|
|
|
### Recommendations |
|
|
|
<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. --> |
|
|
|
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. |
|
|
|
|
|
|
|
## Technical Specifications |
|
|
|
### Model Architecture and Objective |
|
|
|
Fine-tuned from [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) with a sequence classification head (`MobileBertForSequenceClassification`), trained as a binary classifier (benign vs. SQL injection).
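
For reference, a quick way to confirm the head configuration and model footprint locally (a sketch; exact numbers depend on the published checkpoint):

```python
from transformers import MobileBertForSequenceClassification

model = MobileBertForSequenceClassification.from_pretrained("cssupport/mobilebert-sql-injection-detect")

# Number of output classes of the classification head (expected: 2)
print(model.config.num_labels)

# Total parameter count, to get a feel for the model's size
print(sum(p.numel() for p in model.parameters()))
```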
|
|
|
### Compute Infrastructure |
|
|
|
|
|
|
|
#### Hardware |
|
|
|
One NVIDIA P6000 GPU
|
|
|
#### Software |
|
|
|
PyTorch and Hugging Face Transformers