---
license: apache-2.0
language:
- en
pipeline_tag: text-classification
---

# Model Card for Model ID

This model is fine-tuned from [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased). MobileBERT is a thin version of BERT_LARGE, equipped with bottleneck structures and a carefully designed balance between self-attention and feed-forward networks.

The model detects SQL injection attacks in an input string (see "How to Get Started with the Model" below). It is very lightweight (about 100 MB) and can be used for edge computing use cases. It was trained on the [SQL_Injection](https://www.kaggle.com/datasets/sajid576/sql-injection-dataset) dataset from [Kaggle](https://www.kaggle.com).

**Please test the model before deploying it into any environment.**

Contact us for more info: support@cloudsummary.com

### Code Repo

The code repo is at https://github.com/cssupport23/AI-Model---SQL-Injection-Attack-Detector

## Model Details

### Model Description

- **Developed by:** cssupport (support@cloudsummary.com)
- **Model type:** Language model
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased)

### Model Sources

Please refer to [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) for model sources.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
import torch
from transformers import MobileBertTokenizer, MobileBertForSequenceClassification

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# The tokenizer comes from the base model; the classifier weights are the fine-tuned ones.
tokenizer = MobileBertTokenizer.from_pretrained('google/mobilebert-uncased')
model = MobileBertForSequenceClassification.from_pretrained('cssupport/mobilebert-sql-injection-detect')
model.to(device)
model.eval()


def predict(text):
    """Return (predicted_class, confidence) for a single input string."""
    inputs = tokenizer(text, padding=False, truncation=True, return_tensors='pt', max_length=512)
    input_ids = inputs['input_ids'].to(device)
    attention_mask = inputs['attention_mask'].to(device)

    with torch.no_grad():
        outputs = model(input_ids=input_ids, attention_mask=attention_mask)

    logits = outputs.logits
    probabilities = torch.softmax(logits, dim=1)
    predicted_class = torch.argmax(probabilities, dim=1).item()
    return predicted_class, probabilities[0][predicted_class].item()


#text = "SELECT * FROM users WHERE username = 'admin' AND password = 'password';"
#text = "select * from users where username = 'admin' and password = 'password';"
#text = "SELECT * from USERS where id = '1' or @ @1 = 1 union select 1,version ( ) -- 1'"
#text = "select * from data where id = '1' or @"
text = "select * from users where id = 1 or 1#\"? = 1 or 1 = 1 -- 1"

predicted_class, confidence = predict(text)

# predicted_class is an integer class index from argmax, not a probability;
# class 1 is the SQL injection class.
if predicted_class == 1:
    print("Prediction: SQL Injection Detected")
else:
    print("Prediction: No SQL Injection Detected")

print(f"Confidence: {confidence:.2f}")
# OUTPUT
# Prediction: SQL Injection Detected
# Confidence: 1.00
```

## Uses

[More Information Needed]

### Direct Use

Could be used in applications where natural language is converted into SQL queries.

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information is needed for further recommendations.

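
One way to act on the advice to test the model before deploying it is to measure false positives and false negatives on a labelled sample of your own queries. The sketch below is a minimal, hypothetical harness rather than part of the released code: `evaluate`, `stub_predict`, and `SAMPLES` are illustrative names, and the stub is a crude keyword check that only stands in for the model so the example runs without downloading weights. It assumes a predictor with the same `(predicted_class, confidence)` return shape as `predict` in the getting-started code, with class 1 meaning SQL injection.

```python
from typing import Callable, Iterable, Tuple

# Same shape as the card's predict(text) -> (predicted_class, confidence).
PredictFn = Callable[[str], Tuple[int, float]]


def evaluate(predict_fn: PredictFn, samples: Iterable[Tuple[str, int]]) -> dict:
    """Count confusion-matrix cells, treating label 1 as 'SQL injection'."""
    tp = fp = tn = fn = 0
    for text, label in samples:
        predicted_class, _ = predict_fn(text)
        if predicted_class == 1 and label == 1:
            tp += 1
        elif predicted_class == 1 and label == 0:
            fp += 1
        elif predicted_class == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    total = tp + fp + tn + fn
    return {
        "tp": tp,
        "fp": fp,
        "tn": tn,
        "fn": fn,
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }


def stub_predict(text: str) -> Tuple[int, float]:
    """Hypothetical stand-in for the model: flags text containing ' or 1 = 1'."""
    flagged = " or 1 = 1" in text.lower()
    return (1 if flagged else 0), 1.0


# Hypothetical hand-labelled samples (1 = injection, 0 = benign).
SAMPLES = [
    ("select * from users where id = 1 or 1 = 1 -- 1", 1),
    ("select * from users where username = 'admin'", 0),
    ("select name from products where price < 10", 0),
    ("select * from data where id = '1' union select 1,version ( ) -- 1'", 1),
]

report = evaluate(stub_predict, SAMPLES)
print(report)
```

To evaluate the real model, pass the `predict` function from the getting-started code in place of `stub_predict` and replace `SAMPLES` with labelled queries that are representative of your environment.
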
## Technical Specifications

### Model Architecture and Objective

[google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased)

### Compute Infrastructure

#### Hardware

One P6000 GPU

#### Software

PyTorch and Hugging Face Transformers