---
language:
- "code"
thumbnail: "https://to-be-updated"
tags:
- code generation
- code translation
- bug fixing
license: "mit"
datasets:
- CodeSearchNet
- CodeXGLUE
metrics:
- EM
- CodeBLEU
---
Pretrained model for NatGen: Generative Pre-training by “Naturalizing” Source Code [[`paper`]](https://dl.acm.org/doi/abs/10.1145/3540250.3549162), [[`code`]](https://github.com/saikat107/NatGen), [[`slides`]](https://docs.google.com/presentation/d/1T6kjiohAAR1YvcNvTASR94HptA3xHGCl/edit?usp=sharing&ouid=111755026725574085503&rtpof=true&sd=true).
To load the model:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("saikatc/NatGen")
model = AutoModelForSeq2SeqLM.from_pretrained("saikatc/NatGen")
```
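Once loaded, the model can be driven through the standard `generate` API of `transformers`. The following is a minimal sketch, not a documented recipe from the authors: the helper names `generate_code` and `demo`, the input snippet, the 512-token truncation limit, and the beam-search settings are all illustrative assumptions. This checkpoint is the pretrained model, so useful output on tasks such as translation or bug fixing will most likely require fine-tuning first.

```python
def generate_code(model, tokenizer, source, max_new_tokens=64, num_beams=5):
    """Encode `source`, run beam search, and decode the top hypothesis.

    `max_new_tokens` and `num_beams` are illustrative defaults,
    not values recommended by the NatGen authors.
    """
    # Tokenize the input; truncating at 512 tokens is an assumption.
    inputs = tokenizer(source, return_tensors="pt", truncation=True, max_length=512)
    # `generate` returns a batch of token-id sequences; keep the first one.
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, num_beams=num_beams
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def demo():
    """Download the checkpoint and run one hypothetical example."""
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained("saikatc/NatGen")
    model = AutoModelForSeq2SeqLM.from_pretrained("saikatc/NatGen")
    model.eval()  # disable dropout for inference

    snippet = "int add(int a, int b) { return a + b; }"  # made-up input
    print(generate_code(model, tokenizer, snippet))
```

Calling `demo()` downloads the weights on first use. Keeping the generation logic in `generate_code` separate from loading makes it easy to reuse the same helper with a fine-tuned checkpoint.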
To cite this work:
```bibtex
@inproceedings{chakraborty2022natgen,
author = {Chakraborty, Saikat and Ahmed, Toufique and Ding, Yangruibo and Devanbu, Premkumar T. and Ray, Baishakhi},
title = {NatGen: Generative Pre-Training by “Naturalizing” Source Code},
year = {2022},
isbn = {9781450394130},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3540250.3549162},
doi = {10.1145/3540250.3549162},
booktitle = {Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
pages = {18--30},
numpages = {13},
keywords = {Neural Network, Semantic Preserving Transformation, Source Code Transformer, Source Code Pre-training},
location = {Singapore, Singapore},
series = {ESEC/FSE 2022}
}
```