---
language:
  - code
thumbnail: https://to-be-updated
tags:
  - code generation
  - code translation
  - bug fixing
license: mit
datasets:
  - CodeSearchNet
  - CodeXGLUE
metrics:
  - EM
  - CodeBLEU
---

This is the pretrained model for NatGen: Generative Pre-training by “Naturalizing” Source Code ([paper], [code], [slide]).

To load the model:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("saikatc/NatGen")
model = AutoModelForSeq2SeqLM.from_pretrained("saikatc/NatGen")
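
Once loaded, the model can be driven through the standard `generate` API that `transformers` exposes for sequence-to-sequence models. The following is a minimal sketch, not the authors' evaluation pipeline: the helper name `generate_with_natgen` and the defaults for `max_input_length`, `max_new_tokens`, and `num_beams` are assumptions chosen for illustration. Note also that this is a pretrained checkpoint, so output for a specific task (code generation, translation, bug fixing) will likely only be useful after fine-tuning on that task.

```python
def generate_with_natgen(model, tokenizer, source,
                         max_input_length=512,
                         max_new_tokens=64,
                         num_beams=5):
    """Run one input string through a seq2seq model and return decoded text.

    `model` and `tokenizer` are the objects returned by
    AutoModelForSeq2SeqLM.from_pretrained / AutoTokenizer.from_pretrained.
    All default values here are illustrative assumptions, not settings
    taken from the NatGen paper.
    """
    # Tokenize the input, truncating anything beyond the assumed length limit.
    inputs = tokenizer(
        source,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_length,
    )
    # Beam-search decoding; returns a batch of token-id sequences.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        num_beams=num_beams,
    )
    # Decode the first (and only) sequence in the batch, dropping special tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With the `tokenizer` and `model` loaded as above, a call would look like `generate_with_natgen(model, tokenizer, "def add(a, b): return a + b")`. Keeping the model and tokenizer as parameters makes it easy to swap in a fine-tuned checkpoint later.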

To cite this work:

@inproceedings{chakraborty2022natgen,
    author = {Chakraborty, Saikat and Ahmed, Toufique and Ding, Yangruibo and Devanbu, Premkumar T. and Ray, Baishakhi},
    title = {NatGen: Generative Pre-Training by “Naturalizing” Source Code},
    year = {2022},
    isbn = {9781450394130},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3540250.3549162},
    doi = {10.1145/3540250.3549162},
    booktitle = {Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
    pages = {18–30},
    numpages = {13},
    keywords = {Neural Network, Semantic Preserving Transformation, Source Code Transformer, Source Code Pre-training},
    location = {Singapore, Singapore},
    series = {ESEC/FSE 2022}
}