	---
	language: code
	datasets:
	- code_search_net

	---

	# CoText (1-CC)

	## Introduction
	Paper: [CoTexT: Multi-task Learning with Code-Text Transformer](https://arxiv.org/abs/2105.08645)

	Authors: _Long Phan, Hieu Tran, Daniel Le, Hieu Nguyen, James Anibal, Alec Peltekian, Yanfang Ye_

	## How to use

	Supported languages:

	```shell
	"go"
	"java"
	"javascript"
	"php"
	"python"
	"ruby"
	```

	For more details, see [our GitHub repository](https://github.com/justinphan3110/CoTexT).
	```python
	import torch
	from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

	# Use a GPU when one is available; otherwise fall back to the CPU.
	device = "cuda" if torch.cuda.is_available() else "cpu"

	tokenizer = AutoTokenizer.from_pretrained("razent/cotext-1-cc")
	model = AutoModelForSeq2SeqLM.from_pretrained("razent/cotext-1-cc").to(device)

	# Prefix the snippet with its language and close it with the end-of-sequence token.
	sentence = "def add(a, b): return a + b"
	text = "python: " + sentence + " </s>"

	# Tokenize, then move the tensors to the same device as the model.
	encoding = tokenizer(text, padding=True, return_tensors="pt")
	input_ids = encoding["input_ids"].to(device)
	attention_masks = encoding["attention_mask"].to(device)

	outputs = model.generate(
	    input_ids=input_ids,
	    attention_mask=attention_masks,
	    max_length=256,
	    early_stopping=True,
	)

	# Decode each generated sequence back into text.
	for output in outputs:
	    line = tokenizer.decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
	    print(line)
	```
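
The usage example hard-codes the `python: ` prefix. As a minimal sketch, a small helper can build the model input for any of the supported languages. The name `build_input` is hypothetical, and the helper assumes the other five languages use the same `<language>: ` prefix pattern as the Python example, which this card does not state explicitly.

```python
# Languages listed under "Supported languages" in this card.
SUPPORTED_LANGUAGES = ("go", "java", "javascript", "php", "python", "ruby")


def build_input(language: str, code: str) -> str:
    """Return the prefixed input string for one code snippet.

    Assumption: every supported language uses the same "<language>: "
    prefix that the card shows for Python.
    """
    language = language.strip().lower()
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(
            f"unsupported language {language!r}; expected one of {SUPPORTED_LANGUAGES}"
        )
    # Mirror the card's format: prefix, snippet, explicit end-of-sequence token.
    return f"{language}: {code.strip()} </s>"


print(build_input("python", "def add(a, b): return a + b"))
```

The returned string can be passed to the tokenizer in place of `text`. Rejecting unknown languages early avoids silently feeding the model a prefix it may never have seen.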

	## Citation
	```
	@inproceedings{phan-etal-2021-cotext,
	    title = "{C}o{T}ex{T}: Multi-task Learning with Code-Text Transformer",
	    author = "Phan, Long and Tran, Hieu and Le, Daniel and Nguyen, Hieu and Annibal, James and Peltekian, Alec and Ye, Yanfang",
	    booktitle = "Proceedings of the 1st Workshop on Natural Language Processing for Programming (NLP4Prog 2021)",
	    year = "2021",
	    publisher = "Association for Computational Linguistics",
	    url = "https://aclanthology.org/2021.nlp4prog-1.5",
	    doi = "10.18653/v1/2021.nlp4prog-1.5",
	    pages = "40--47"
	}
	```