--- language: - "code" thumbnail: "https://to-be-updated" tags: - code generation - code translation - bug fixing license: "mit" datasets: - CodeSearchNet - CodeXGLUE metrics: - EM - CodeBLEU --- Pretrained model for NatGen: Generative Pre-training by “Naturalizing” Source Code [[`paper`]](https://dl.acm.org/doi/abs/10.1145/3540250.3549162),[[`code`]](https://github.com/saikat107/NatGen),[[`slide`]](https://docs.google.com/presentation/d/1T6kjiohAAR1YvcNvTASR94HptA3xHGCl/edit?usp=sharing&ouid=111755026725574085503&rtpof=true&sd=true). To load the model, ``` from transformers import AutoTokenizer, AutoModelForSeq2SeqLM tokenizer = AutoTokenizer.from_pretrained("saikatc/NatGen") model = AutoModelForSeq2SeqLM.from_pretrained("saikatc/NatGen") ``` For citation, ``` @inproceedings{chakraborty2022natgen, author = {Chakraborty, Saikat and Ahmed, Toufique and Ding, Yangruibo and Devanbu, Premkumar T. and Ray, Baishakhi}, title = {NatGen: Generative Pre-Training by “Naturalizing” Source Code}, year = {2022}, isbn = {9781450394130}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://doi.org/10.1145/3540250.3549162}, doi = {10.1145/3540250.3549162}, booktitle = {Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering}, pages = {18–30}, numpages = {13}, keywords = {Neural Network, Semantic Preserving Transformation, Source Code Transformer, Source Code Pre-training}, location = {Singapore, Singapore}, series = {ESEC/FSE 2022} } ```