metadata
license: apache-2.0
datasets:
- lambdasec/cve-single-line-fixes
- lambdasec/gh-top-1000-projects-vulns
language:
- code
tags:
- code
programming_language:
- Java
- JavaScript
- Python
inference: false
model-index:
- name: SantaFixer
  results:
  - task:
      type: text-generation
    dataset:
      type: openai/human-eval-infilling
      name: HumanEval
    metrics:
    - name: single-line infilling pass@1
      type: pass@1
      value: 0.28
      verified: false
    - name: single-line infilling pass@10
      type: pass@10
      value: 0.28
      verified: false
  - task:
      type: text-generation
    dataset:
      type: lambdasec/gh-top-1000-projects-vulns
      name: GH Top 1000 Projects Vulnerabilities
    metrics:
    - name: pass@10 (Java)
      type: pass@10
      value: 0.1
      verified: false
    - name: pass@10 (Python)
      type: pass@10
      value: 0.2
      verified: false
    - name: pass@10 (JavaScript)
      type: pass@10
      value: 0.3
      verified: false
Model Card for SantaFixer
This is an LLM for code that focuses on generating bug fixes using infilling.
Model Details
Model Description
- Developed by: codelion
- Model type: GPT-2
- Finetuned from model: bigcode/santacoder
Uses
Direct Use
[More Information Needed]
How to Get Started with the Model
Use the code below to get started with the model.
[More Information Needed]
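As a minimal sketch of infilling-based repair, the snippet below cuts a suspected buggy line out of a source file and asks the model to fill the gap. It assumes SantaFixer keeps the fill-in-the-middle sentinel tokens of bigcode/santacoder (`<fim-prefix>`, `<fim-suffix>`, `<fim-middle>`) and that the checkpoint is available under the repository id `lambdasec/santafixer`; neither is stated in this card. The helper names `build_infill_prompt` and `suggest_fix` are hypothetical.

```python
# FIM sentinel tokens of bigcode/santacoder; assumed unchanged in SantaFixer.
FIM_PREFIX = "<fim-prefix>"
FIM_SUFFIX = "<fim-suffix>"
FIM_MIDDLE = "<fim-middle>"


def build_infill_prompt(source: str, buggy_line: int) -> str:
    """Replace one line (0-based index) of `source` with a hole to be filled."""
    lines = source.splitlines(keepends=True)
    if not 0 <= buggy_line < len(lines):
        raise IndexError("buggy_line is outside the source")
    prefix = "".join(lines[:buggy_line])
    suffix = "".join(lines[buggy_line + 1:])
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def suggest_fix(source: str, buggy_line: int,
                model_id: str = "lambdasec/santafixer",  # assumed repository id
                max_new_tokens: int = 64) -> str:
    """Generate a candidate replacement for the removed line."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    input_ids = tokenizer.encode(build_infill_prompt(source, buggy_line),
                                 return_tensors="pt")
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Keep only the tokens generated after the prompt.
    return tokenizer.decode(output_ids[0][input_ids.shape[1]:],
                            skip_special_tokens=True)
```

Calling `suggest_fix(code, 1)` would return the model's proposal for the second line of `code`; the proposal should be reviewed and tested before it is applied.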
Training Details
- GPU: Tesla P100
- Time: ~5 hrs
Training Data
The model was fine-tuned on the CVE single-line fixes dataset (lambdasec/cve-single-line-fixes).
Training Procedure
The model was trained with supervised fine-tuning (SFT).
Training Hyperparameters
- optim: adafactor
- gradient_accumulation_steps: 4
- gradient_checkpointing: true
- fp16: false
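The hyperparameters listed above map directly onto keyword arguments of `transformers.TrainingArguments`. The sketch below shows one way to express them; it assumes training used the Hugging Face `Trainer`, which the card does not confirm. The output directory and the function names are hypothetical, and any setting not listed in the card (learning rate, batch size, epochs) is left at the library default rather than guessed.

```python
def sft_training_kwargs(output_dir: str = "santafixer-sft") -> dict:
    """Collect the hyperparameters reported in this card as Trainer kwargs."""
    return {
        "output_dir": output_dir,          # hypothetical path, not from the card
        "optim": "adafactor",
        "gradient_accumulation_steps": 4,
        "gradient_checkpointing": True,
        "fp16": False,
    }


def make_training_arguments(output_dir: str = "santafixer-sft"):
    """Build TrainingArguments; requires the transformers package."""
    from transformers import TrainingArguments

    return TrainingArguments(**sft_training_kwargs(output_dir))
```

With `gradient_accumulation_steps` set to 4, each optimizer step accumulates gradients over four forward and backward passes, so the effective batch size is four times the per-device batch size.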
Evaluation
Testing Data, Factors & Metrics
Testing Data
[More Information Needed]
Results
[More Information Needed]
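The model-index metadata reports pass@1 and pass@10 scores. The card does not say how they were computed; the sketch below shows the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n samples are generated per problem and c of them pass the tests. Treat it as an illustration of the metric, not as the authors' evaluation script.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples passes.

    n: samples generated for the problem, c: samples that passed, k: budget.
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # comb(n - c, k) is 0 when fewer than k samples failed, giving pass@k = 1.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

The per-problem values are averaged over the benchmark to obtain the reported score.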
Summary
[More Information Needed]