

---
language: "en"
tags:
- counterfactual generation
widget:
- text: "It is great for kids. <perturb> [negation] It [BLANK] great for kids. [SEP]"
---

# Polyjuice

## Model description

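This is a ported version of [Polyjuice](https://homes.cs.washington.edu/~wtshuang/static/papers/2021-arxiv-polyjuice.pdf), a general-purpose counterfactual generator, packaged for use with the `transformers` library. Given a sentence, a control code such as `[negation]` or `[restructure]`, and a copy of the sentence with `[BLANK]` placeholders, it generates perturbed versions of the sentence.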

#### How to use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("uw-hai/polyjuice")
# AutoModelWithLMHead is deprecated; AutoModelForCausalLM replaces it for causal LMs.
model = AutoModelForCausalLM.from_pretrained("uw-hai/polyjuice")

prompt_text = "A dog is embraced by the woman. <perturb> [negation] A dog is [BLANK] the woman."
# or try: "A dog is embraced by the woman. <perturb> [restructure] A dog is [BLANK] the woman."
perturb_tok, end_tok = "<|perturb|>", "<|endoftext|>"
stop_token = "\n"
repetition_penalty = 1.0

encoded_prompt = tokenizer.encode(prompt_text, add_special_tokens=False, return_tensors="pt")
input_ids = encoded_prompt

# Beam search returning the 3 best of 10 beams.
# Note: temperature only takes effect when sampling (do_sample=True).
output_sequences = model.generate(
    input_ids=input_ids,
    max_length=100 + len(encoded_prompt[0]),
    temperature=0.1,
    num_beams=10,
    num_return_sequences=3,
    repetition_penalty=repetition_penalty,
)

# Drop any extra singleton dimension so each row is one generated sequence.
if len(output_sequences.shape) > 2:
    output_sequences.squeeze_()

for generated_sequence in output_sequences:
    # Decode token ids back to text
    text = tokenizer.decode(generated_sequence.tolist(), clean_up_tokenization_spaces=True)
    # Remove all text after the stop token and after the end-of-text token
    for stop in (stop_token, end_tok):
        cut = text.find(stop)
        if cut > -1:
            text = text[:cut]
    print(text)
```
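
The prompts used in this card share a fixed layout: the original sentence, the `<perturb>` separator, a control code in square brackets, and a copy of the sentence with `[BLANK]` marking the span to rewrite; the widget example also appends `[SEP]`. The helper below is a minimal sketch that assembles such a string. The name `build_prompt` and its parameters are hypothetical and not part of Polyjuice or `transformers`, and the layout is inferred only from the examples shown here.

```python
# Separator as it appears in this card's example prompts.
PERTURB_SEP = "<perturb>"


def build_prompt(sentence: str, control_code: str, template: str = "",
                 add_sep: bool = False) -> str:
    """Assemble '<sentence> <perturb> [<control_code>] <template>'.

    Hypothetical helper: it only mirrors the prompt strings shown in this card.
    """
    parts = [sentence.strip(), PERTURB_SEP, f"[{control_code}]"]
    if template:
        # Copy of the sentence with [BLANK] where new text should be generated.
        parts.append(template.strip())
    if add_sep:
        # The widget example ends the prompt with [SEP].
        parts.append("[SEP]")
    return " ".join(parts)


prompt_text = build_prompt(
    "A dog is embraced by the woman.",
    "negation",
    "A dog is [BLANK] the woman.",
)
print(prompt_text)
```

The resulting `prompt_text` can be passed to `tokenizer.encode` exactly as in the example above; swapping `"negation"` for `"restructure"` reproduces the alternative prompt mentioned there.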

### BibTeX entry and citation info

```bibtex
@article{wu2021polyjuice,
  title={Polyjuice: Automated, General-purpose Counterfactual Generation},
  author={Wu, Tongshuang and Ribeiro, Marco Tulio and Heer, Jeffrey and Weld, Daniel S.},
  journal={arXiv preprint},
  year={2021}
}
```