tongshuangwu committed
Commit 32cde59
1 Parent(s): d6cdf3a

add readme

Files changed (1)
  1. README.md +14 -32
README.md CHANGED
@@ -11,50 +11,32 @@ widget:
## Model description

This is a ported version of [Polyjuice](https://homes.cs.washington.edu/~wtshuang/static/papers/2021-arxiv-polyjuice.pdf), the general-purpose counterfactual generator.
+ For more code release, please refer to [this github page](https://github.com/tongshuangwu/polyjuice).

#### How to use

```python
from transformers import AutoTokenizer, AutoModelWithLMHead
+ from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM

- tokenizer = AutoTokenizer.from_pretrained("uw-hai/polyjuice")
- model = AutoModelWithLMHead.from_pretrained("uw-hai/polyjuice")
-
+ model_path = "uw-hai/polyjuice"
+ generator = pipeline("text-generation",
+     model=AutoModelForCausalLM.from_pretrained(model_path),
+     tokenizer=AutoTokenizer.from_pretrained(model_path),
+     framework="pt", device=0 if is_cuda else -1)

prompt_text = "A dog is embraced by the woman. <|perturb|> [negation] A dog is [BLANK] the woman."
- # or try: "A dog is embraced by the woman. <|perturb|> [restructure] A dog is [BLANK] the woman."
- perturb_tok, end_tok = "<|perturb|>", "<|endoftext|>"
- encoded_prompt = tokenizer.encode(prompt_text, add_special_tokens=False, return_tensors="pt")
- input_ids = encoded_prompt
- stop_token= '\n'
- repetition_penalty=1
- output_sequences = model.generate(
-     input_ids=input_ids,
-     max_length=100 + len(encoded_prompt[0]),
-     temperature=0.1,
-     num_beams=10,
-     num_return_sequences=3)
-
- if len(output_sequences.shape) > 2:
-     output_sequences.squeeze_()
-
- for generated_sequence_idx, generated_sequence in enumerate(output_sequences):
-     generated_sequence = generated_sequence.tolist()
-     # Decode text
-     text = tokenizer.decode(generated_sequence, clean_up_tokenization_spaces=True)
-     # Remove all text after the stop token
-     text = text[: text.find(stop_token) if stop_token and text.find(stop_token) > -1 else None]
-     text = text[: text.find(end_tok) if end_tok and text.find(end_tok) > -1 else None]
-     print(text)
+ generator(prompt_text, num_beams=3, num_return_sequences=3)
```

### BibTeX entry and citation info

```bibtex
- @article{wu2021polyjuice,
-   title={Polyjuice: Automated, General-purpose Counterfactual Generation},
-   author = {Wu, Tongshuang and Ribeiro, Marco Tulio and Heer, Jeffrey and Weld Daniel S.},
-   journal={arXiv preprint},
-   year={2021}
+ @inproceedings{polyjuice:acl21,
+   title = "{P}olyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models",
+   author = "Tongshuang Wu and Marco Tulio Ribeiro and Jeffrey Heer and Daniel S. Weld",
+   booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics",
+   year = "2021",
+   publisher = "Association for Computational Linguistics"
}
```
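Note that the pipeline snippet added in this commit references an `is_cuda` flag that it never defines, and it drops the output post-processing that the removed example had. The following is a minimal sketch of the new usage, not part of the committed README: `is_cuda` is assumed to come from `torch.cuda.is_available()`, `max_new_tokens` is added so the infill is not cut off by the pipeline's default length limit, and the truncation loop mirrors the removed snippet's stop-token handling.

```python
# A minimal sketch of the new README usage; not part of the committed file.
import torch
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM

# Assumption: define the is_cuda flag the committed snippet leaves undefined.
is_cuda = torch.cuda.is_available()
model_path = "uw-hai/polyjuice"
generator = pipeline(
    "text-generation",
    model=AutoModelForCausalLM.from_pretrained(model_path),
    tokenizer=AutoTokenizer.from_pretrained(model_path),
    framework="pt",
    device=0 if is_cuda else -1,
)

prompt_text = "A dog is embraced by the woman. <|perturb|> [negation] A dog is [BLANK] the woman."
# max_new_tokens is an addition here; the committed call passes only beam settings.
outputs = generator(prompt_text, num_beams=3, num_return_sequences=3, max_new_tokens=100)

for out in outputs:
    # Each item holds the prompt plus the generated infill under "generated_text".
    text = out["generated_text"]
    # Keep only the first generated line, as the removed example did with its
    # stop token and <|endoftext|> handling.
    for stop in ("\n", "<|endoftext|>"):
        cut = text.find(stop)
        if cut > -1:
            text = text[:cut]
    print(text)
```

As the comment in the removed example noted, swapping `[negation]` for another control code such as `[restructure]` in `prompt_text` changes the kind of counterfactual that is generated.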