Model Card for SQL Injection Classifier
This model is a classifier that detects SQL injection attacks in SQL queries. It is based on the google/gemma-2b-it model and uses the peft library for training and evaluation. The model was trained on a dataset of SQL queries with and without SQL injection attacks.
Model Details
Model Description
This SQL injection classifier is a fine-tuned version of the google/gemma-2b-it model, optimized to detect potential SQL injection vulnerabilities in SQL queries. It was trained with the PEFT (Parameter-Efficient Fine-Tuning) library, which adapts the base model by training a small number of additional parameters instead of updating all of its weights.
On the held-out test set of 11,125 queries, the model classifies SQL queries as either secure or vulnerable with the following metrics:
Accuracy: 0.9984
Precision: 0.9974
Recall: 0.9993
F1-score: 0.9984
Classification Report:
              precision    recall  f1-score   support
      Secure       1.00      1.00      1.00      5658
  Vulnerable       1.00      1.00      1.00      5467
    accuracy                           1.00     11125
   macro avg       1.00      1.00      1.00     11125
weighted avg       1.00      1.00      1.00     11125
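As a quick sanity check on the reported numbers, the F1-score should be the harmonic mean of precision and recall. The sketch below recomputes it from the four-decimal values quoted in this card; because those inputs are already rounded, the result is only expected to match the reported 0.9984 to within about 1e-4.

```python
# Values copied from the metrics reported in this model card.
precision = 0.9974
recall = 0.9993
reported_f1 = 0.9984

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# The inputs are rounded to four decimals, so allow a small tolerance.
consistent = abs(f1 - reported_f1) < 1e-4
print(f"recomputed F1 = {f1:.5f}, consistent with report: {consistent}")
```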
- Developed by: Mahesh Jamdade
- Model type: Text Classification
- Language(s) (NLP): SQL, English
- License: [More Information Needed]
- Finetuned from model: google/gemma-2b-it
Model Sources
Uses
Direct Use
This model can be directly used to classify SQL queries as either secure or vulnerable to SQL injection attacks. It can be integrated into security tools, database management systems, or web application firewalls to provide an additional layer of protection against SQL injection vulnerabilities.
Downstream Use
The model can be further fine-tuned or integrated into larger security ecosystems. It could be used as a component in:
- Code review tools
- Automated security testing suites
- Real-time query analysis systems in database applications
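One way such an integration might look is a thin gate that asks a classifier for a label before a query reaches the database. This is a minimal sketch, not part of the published model: `guarded_execute`, `QueryBlocked`, and `toy_classify` are hypothetical names, and `toy_classify` is a trivial stand-in so the example runs without downloading the model. In practice the `classify` argument would be a function that calls the model and returns "Secure" or "Vulnerable", the labels used elsewhere in this card.

```python
from typing import Callable


class QueryBlocked(Exception):
    """Raised when the classifier flags a query (hypothetical name)."""


def guarded_execute(
    query: str,
    classify: Callable[[str], str],
    execute: Callable[[str], object],
):
    """Run `execute(query)` only if `classify(query)` does not flag it."""
    if classify(query) == "Vulnerable":
        raise QueryBlocked(f"query rejected by classifier: {query!r}")
    return execute(query)


def toy_classify(query: str) -> str:
    """Stand-in for the real model: flags one obvious tautology pattern only."""
    return "Vulnerable" if "' OR '" in query.upper() else "Secure"


# Example usage with a list standing in for a database connection.
executed = []
guarded_execute("SELECT id FROM users WHERE id = 1", toy_classify, executed.append)

try:
    guarded_execute(
        "SELECT * FROM users WHERE username = 'admin' OR '1'='1'",
        toy_classify,
        executed.append,
    )
except QueryBlocked as err:
    print(err)
```

Passing the classifier in as a callable keeps the gate independent of how the model is loaded, which makes it easy to test with a stub.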
Out-of-Scope Use
This model is specifically trained for SQL injection detection and should not be used for:
- Detecting other types of security vulnerabilities
- Generating or correcting SQL queries
- Analyzing queries in languages other than SQL
Bias, Risks, and Limitations
- The model's performance may vary on SQL dialects or patterns not well-represented in the training data.
- False positives or negatives, while rare given the high accuracy, could still occur and should be considered in critical applications.
- The model may not catch highly sophisticated or novel SQL injection techniques.
Recommendations
- Always use this model as part of a comprehensive security strategy, not as the sole defense against SQL injection.
- Regularly update and retrain the model with new, real-world SQL injection patterns.
- Implement additional security measures such as parameterized queries and input sanitization.
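To illustrate the parameterized-query recommendation, here is a small sketch using Python's standard `sqlite3` module. The `users` table and the `find_user` helper are made up for this example. The `?` placeholder makes the driver bind the input as data rather than splice it into the SQL text, so an injection string is treated as an ordinary (non-matching) username.

```python
import sqlite3


def find_user(conn, username):
    """Look up a user by name with a bound parameter (no string concatenation)."""
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?",
        (username,),
    )
    return cur.fetchall()


# In-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)", [("admin",), ("alice",)])

print(find_user(conn, "admin"))             # [(1, 'admin')]
print(find_user(conn, "admin' OR '1'='1"))  # [] : the payload is compared as a literal string
```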
How to Get Started with the Model
Use the following code to get started with the model:
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_path = "maheshj01/sql-injection-classifier"
model = AutoModelForSequenceClassification.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model.eval()  # inference mode: disables dropout

# Function to classify a SQL query
def classify_query(query):
    inputs = tokenizer(query, return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():  # gradients are not needed for inference
        outputs = model(**inputs)
    # Label index 1 is treated as "Vulnerable", anything else as "Secure"
    prediction = outputs.logits.argmax(-1).item()
    return "Vulnerable" if prediction == 1 else "Secure"

# Example usage
query = "SELECT * FROM users WHERE username = 'admin' OR '1'='1'"
result = classify_query(query)
print(f"The query is classified as: {result}")
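The snippet above returns only a hard label. Where false positives or false negatives are costly, it may help to turn the two logits into a probability and choose the decision threshold explicitly. The following is a minimal pure-Python sketch of that idea. `vulnerable_probability` and `decide` are hypothetical helpers, the label order (index 1 is "Vulnerable") follows the snippet above, and the 0.5 default threshold is an assumption rather than a value documented for this model.

```python
import math


def vulnerable_probability(logits):
    """Softmax over two logits; returns the probability of index 1 ("Vulnerable")."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[1] / sum(exps)


def decide(logits, threshold=0.5):
    """Label a query "Vulnerable" when its probability reaches the threshold."""
    return "Vulnerable" if vulnerable_probability(logits) >= threshold else "Secure"


# Example with made-up logits, e.g. taken from outputs.logits[0].tolist()
print(decide([2.0, -1.0]))
print(decide([-1.0, 2.0]))
print(decide([0.3, 0.0], threshold=0.4))
```

Lowering the threshold trades more false positives for fewer missed injections, which may suit a firewall-style deployment.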
Training Details
Training Data
The model was trained on a dataset of SQL queries, including both secure queries and queries containing SQL injection vulnerabilities. [More specific information about the dataset is needed]
Training Procedure
The model was fine-tuned using the PEFT library, which allows for efficient adaptation of the pre-trained Gemma 2B model to the SQL injection classification task.
Training Hyperparameters
- Training regime: [More Information Needed]
Evaluation
The model was evaluated on a held-out test set of SQL queries, achieving high performance across all metrics as shown in the classification report above.
Environmental Impact
[More Information Needed]
Technical Specifications
Model Architecture and Objective
The model is based on the google/gemma-2b-it architecture, fine-tuned for binary classification of SQL queries.
Compute Infrastructure
Software
- PEFT 0.8.2
- Transformers [version needed]
- PyTorch [version needed]
Model Card Contact
For questions or concerns about this model, please contact Mahesh Jamdade through the Hugging Face repository.