iansotnek committed
Commit 4489d83
1 Parent(s): 56e63d3

updated to use instruct_pipeline

Files changed (1)
  1. README.md +25 -66
README.md CHANGED
@@ -43,85 +43,44 @@ Just as with any other LLM, we advise users of this technology to exercise good
 
  ## Usage
 
- The code below shows how to use `chopt-1_3b` in the way which it was trained. While the model can be used "out of the box" using the
- `transformers` library, using the function defined below to create a response from the model will achieve better results.
-
- ### Load Model and Tokenizer from this Repository Using the `transformers` Package
 
  ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
- import numpy as np
- import re
-
- model_id = 'aisquared/chopt-1_3b'
-
- tokenizer = AutoTokenizer.from_pretrained(model_id, padding_side = 'left')
- model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code = True, device_map = 'auto')
  ```
 
-
- ### Create the Prompt Format and Other Variables
 
  ```python
- PROMPT = """Below is an instruction that describes a task. Write a response that appropriately completes the request.
 
- ### Instruction:
- {instruction}
 
- ### Response:
- """
 
- END_KEY = '### End'
- RESPONSE_KEY = '### Response:\n'
  ```
 
-
- ### Create a Function to Retrieve a Response
 
  ```python
- def create_response(
-     instruction,
-     model,
-     tokenizer,
-     do_sample = True,
-     max_new_tokens = 256,
-     top_p = 0.92,
-     top_k = 0,
-     **kwargs
- ):
-     """
-     Create a response from the model by using a formatted prompt
-     """
-     input_ids = tokenizer(
-         PROMPT.format(instruction=instruction), return_tensors="pt"
-     ).input_ids
-
-     gen_tokens = model.generate(
-         input_ids,
-         pad_token_id=tokenizer.pad_token_id,
-         do_sample=do_sample,
-         max_new_tokens=max_new_tokens,
-         top_p=top_p,
-         top_k=top_k,
-         **kwargs,
-     )
-     decoded = tokenizer.batch_decode(gen_tokens)[0]
-
-     # The response appears after "### Response:". The model has been trained to append "### End" at the end.
-     m = re.search(r"#+\s*Response:\s*(.+?)#+\s*End", decoded, flags=re.DOTALL)
-
-     response = None
-     if m:
-         response = m.group(1).strip()
-     else:
-         # The model might not generate the "### End" sequence before reaching the max tokens. In this case, return
-         # everything after "### Response:".
-         m = re.search(r"#+\s*Response:\s*(.+)", decoded, flags=re.DOTALL)
-         if m:
-             response = m.group(1).strip()
-         else:
-             pass
-     return response
  ```
 
  ### Model Performance Metrics
 
 
  ## Usage
 
+ To use the model with the `transformers` library on a machine with GPUs, first make sure you have the `transformers` and `accelerate` libraries installed.
+ From your terminal, run:
 
  ```bash
+ pip install "accelerate>=0.16.0,<1" "transformers[torch]>=4.28.1,<5" "torch>=1.13.1,<2"
  ```
 
+ The instruction-following pipeline can be loaded using the `pipeline` function as shown below. This loads a custom `InstructionTextGenerationPipeline`
+ found in the model repo [here](https://huggingface.co/aisquared/chopt-1_3b/blob/main/instruct_pipeline.py), which is why `trust_remote_code=True` is required.
+ Including `torch_dtype=torch.bfloat16` is generally recommended if this dtype is supported, since it reduces memory usage and does not appear to affect output quality.
+ It is fine to omit it if there is sufficient memory.
 
  ```python
+ from transformers import pipeline
+ import torch
 
+ generate_text = pipeline(model="aisquared/chopt-1_3b", torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto")
+ ```
 
+ You can then use the pipeline to answer instructions:
 
+ ```python
+ res = generate_text("Who was George Washington?")
+ print(res[0]["generated_text"])
  ```
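 
+ Generation parameters can also be passed through the pipeline call. The keyword arguments shown below (`do_sample`, `max_new_tokens`, `top_p`, `top_k`)
+ mirror the defaults from the `create_response` helper that this commit removes; whether the custom pipeline forwards them is an assumption,
+ so treat this as a sketch rather than a guaranteed interface:
 
+ ```python
+ # Assumes the custom InstructionTextGenerationPipeline forwards these keyword
+ # arguments to model.generate, as the removed create_response helper did.
+ res = generate_text(
+     "Who was George Washington?",
+     do_sample=True,
+     max_new_tokens=256,
+     top_p=0.92,
+     top_k=0,
+ )
+ print(res[0]["generated_text"])
+ ```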
 
+ Alternatively, if you prefer not to use `trust_remote_code=True`, you can download [instruct_pipeline.py](https://huggingface.co/aisquared/chopt-1_3b/blob/main/instruct_pipeline.py),
+ store it alongside your notebook, and construct the pipeline yourself from the loaded model and tokenizer:
 
  ```python
+ from instruct_pipeline import InstructionTextGenerationPipeline
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ tokenizer = AutoTokenizer.from_pretrained("aisquared/chopt-1_3b", padding_side="left")
+ model = AutoModelForCausalLM.from_pretrained("aisquared/chopt-1_3b", device_map="auto", torch_dtype=torch.bfloat16)
+
+ generate_text = InstructionTextGenerationPipeline(model=model, tokenizer=tokenizer)
  ```
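 
+ The manually constructed pipeline can then be used in the same way as the `trust_remote_code` version above:
 
+ ```python
+ res = generate_text("Who was George Washington?")
+ print(res[0]["generated_text"])
+ ```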
 
  ### Model Performance Metrics