---
language: code
datasets:
- code_search_net
---

# CoTexT (1-CC)

## Introduction

Paper: [CoTexT: Multi-task Learning with Code-Text Transformer](https://arxiv.org/abs/2105.08645)

Authors: _Long Phan, Hieu Tran, Daniel Le, Hieu Nguyen, James Anibal, Alec Peltekian, Yanfang Ye_

## How to use

Supported languages:

```shell
"go"
"java"
"javascript"
"php"
"python"
"ruby"
```

For more details, check out [our GitHub repo](https://github.com/justinphan3110/CoTexT).

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("razent/cotext-1-cc")
model = AutoModelForSeq2SeqLM.from_pretrained("razent/cotext-1-cc")

# Keep the model and its inputs on the same device.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Prefix the input with the language of the code snippet.
sentence = "def add(a, b): return a + b"
text = "python: " + sentence + " "

encoding = tokenizer(text, return_tensors="pt")
input_ids = encoding["input_ids"].to(device)
attention_masks = encoding["attention_mask"].to(device)

outputs = model.generate(
    input_ids=input_ids,
    attention_mask=attention_masks,
    max_length=256,
    early_stopping=True
)

for output in outputs:
    line = tokenizer.decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
    print(line)
```

## Citation

```
@misc{https://doi.org/10.48550/arxiv.2105.08645,
  doi = {10.48550/ARXIV.2105.08645},
  url = {https://arxiv.org/abs/2105.08645},
  author = {Phan, Long and Tran, Hieu and Le, Daniel and Nguyen, Hieu and Anibal, James and Peltekian, Alec and Ye, Yanfang},
  title = {CoTexT: Multi-task Learning with Code-Text Transformer},
  publisher = {arXiv},
  year = {2021},
  copyright = {Creative Commons Attribution 4.0 International}
}
```
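
The example in **How to use** shows the `python: ` prefix; the same `"<language>: "` scheme should carry over to the other supported languages, though the canonical task prefixes are documented in the GitHub repo. A minimal sketch under that assumption (the Java snippet is made up for illustration):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("razent/cotext-1-cc")
model = AutoModelForSeq2SeqLM.from_pretrained("razent/cotext-1-cc").to(device)

# Made-up Java input; "java: " mirrors the "python: " prefix used above.
snippet = "public static int add(int a, int b) { return a + b; }"
encoding = tokenizer("java: " + snippet + " ", return_tensors="pt")

outputs = model.generate(
    input_ids=encoding["input_ids"].to(device),
    attention_mask=encoding["attention_mask"].to(device),
    max_length=256,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```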