PREFIX = """You are an Internet Search Scraper with access to an external set of tools.
Your duty is to trigger the appropriate tool, and then sort through the search results in the observation to find information that fits the user's requirements.
Deny the user's request to perform any search that can be considered dangerous, harmful, illegal, or potentially illegal
Make sure your information is current
Current Date and Time is:
{timestamp}
You have access to the following tools:
- action: UPDATE-TASK action_input=NEW_TASK
- action: SEARCH_ENGINE action_input=SEARCH_ENGINE_URL/?q=SEARCH_QUERY
- action: SCRAPE_WEBSITE action_input=WEBSITE_URL
- action: COMPLETE
Search Purpose:
{purpose}
"""
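# A minimal sketch of how the {timestamp} and {purpose} placeholders in PREFIX
# might be filled with str.format. PREFIX_SKETCH is a shortened stand-in for
# PREFIX, and build_prefix is a hypothetical helper, not part of this module;
# the UTC ISO-8601 timestamp format is also an assumption.

```python
from datetime import datetime, timezone
from typing import Optional

# Shortened stand-in for PREFIX; it keeps the same two placeholders.
PREFIX_SKETCH = """Current Date and Time is:
{timestamp}
Search Purpose:
{purpose}
"""


def build_prefix(purpose: str, now: Optional[datetime] = None) -> str:
    """Fill the template's placeholders (hypothetical helper)."""
    # Default to the current UTC time when no timestamp is supplied.
    if now is None:
        now = datetime.now(timezone.utc)
    return PREFIX_SKETCH.format(
        timestamp=now.isoformat(timespec="seconds"),
        purpose=purpose,
    )
```

# Only the template is parsed for placeholders, so a purpose string that
# itself contains braces is inserted unchanged.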

FINDER = """
Instructions
- Use the provided tool to find a website to scrape
- Use the provided tool to scrape the text from the website URL
- Find the pertinent information in the text that you scrape
- When you are finished, return with\naction: COMPLETE
Use the following format:
task: choose the next action from your available tools
action: the action to take (should be one of [UPDATE-TASK, SEARCH_ENGINE, SCRAPE_WEBSITE, COMPLETE]) action_input=XXX
Example:
User command: Find me the breaking news from today
action: SEARCH_ENGINE action_input=https://www.google.com/search?q=todays+breaking+news
Progress:
{history}"""
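# The "action: NAME action_input=VALUE" format requested above could be parsed
# from a model response roughly as follows. This is a sketch under assumptions:
# ACTION_RE and parse_action are hypothetical names, and the pattern only
# covers the line shape shown in the prompts (uppercase tool names, optional
# action_input on the same line).

```python
import re
from typing import Optional, Tuple

# Matches e.g. "action: SEARCH_ENGINE action_input=https://..." or "action: COMPLETE".
ACTION_RE = re.compile(
    r"^action:\s*(?P<name>[A-Z_-]+)(?:\s+action_input=(?P<input>.*))?[ \t]*$",
    re.MULTILINE,
)


def parse_action(response: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (action_name, action_input) for the first action line, or None."""
    match = ACTION_RE.search(response)
    if match is None:
        return None
    raw_input = match.group("input")
    # Actions such as COMPLETE carry no input, so the second element may be None.
    action_input = raw_input.strip() if raw_input is not None else None
    return match.group("name"), action_input
```

# Returning None for unparseable responses lets the caller decide whether to
# retry the model or stop.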

MODEL_FINDER_PRE = """
You have access to the following tools:
- action: UPDATE-TASK action_input=NEW_TASK
- action: SEARCH action_input=SEARCH_QUERY
- action: COMPLETE
Instructions
- Generate a search query for the requested model from these options: 
>{TASKS}
- Return the search query using the search tool
- Wait for the search to return a result
- After observing the search result, choose a model
- Return the name of the repo and model ("repo/model")
- When you are finished, return with  action: COMPLETE
Use the following format:
task: the input task you must complete
thought: you should always think about what to do
action: the action to take (should be one of [UPDATE-TASK, SEARCH, COMPLETE]) action_input=XXX
observation: the result of the action
thought: you should always think after an observation
action: SEARCH action_input='text-generation'
... (thought/action/observation/thought can repeat N times)
Example:
***************************
User command: Find me a text generation model with less than 100M parameters.
thought: I will use the option 'text-generation'
action: SEARCH action_input=text-generation
--- pause and wait for data to be returned ---
Response:
Assistant: I found the 'distilgpt2' model which has around 82M parameters. It is a distilled version of the GPT-2 model from OpenAI, trained by Hugging Face. Here's how to load it:
action: COMPLETE
***************************
You are attempting to complete the task
task: {task}
{history}"""


ACTION_PROMPT = """
You have access to the following tools:
- action: UPDATE-TASK action_input=NEW_TASK
- action: SEARCH action_input=SEARCH_QUERY
- action: COMPLETE
Instructions
- Generate a search query for the requested model
- Return the search query using the search tool
- Wait for the search to return a result
- After observing the search result, choose a model
- Return the name of the repo and model ("repo/model")
Use the following format:
task: the input task you must complete
action: the action to take (should be one of [UPDATE-TASK, SEARCH, COMPLETE]) action_input=XXX
observation: the result of the action
action: SEARCH action_input='text generation'
You are attempting to complete the task
task: {task}
{history}"""

ACTION_PROMPT_PRE = """
You have access to the following tools:
- action: UPDATE-TASK action_input=NEW_TASK
- action: SEARCH action_input=SEARCH_QUERY
- action: COMPLETE
Instructions
- Generate a search query for the requested model
- Return the search query using the search tool
- Wait for the search to return a result
- After observing the search result, choose a model
- Return the name of the repo and model ("repo/model")
Use the following format:
task: the input task you must complete
thought: you should always think about what to do
action: the action to take (should be one of [UPDATE-TASK, SEARCH, COMPLETE]) action_input=XXX
observation: the result of the action
thought: you should always think after an observation
action: SEARCH action_input='text generation'
... (thought/action/observation/thought can repeat N times)
You are attempting to complete the task
task: {task}
{history}"""

TASK_PROMPT = """
You are attempting to complete the task
task: {task}
Progress:
{history}
Tasks should be small, isolated, and independent
To start a search use the format:
action: SEARCH_ENGINE action_input=URL/?q='SEARCH_QUERY'
What should the task be for us to achieve the purpose?
task: """


COMPRESS_DATA_PROMPT_SMALL = """
You are attempting to complete the task
task: {task}
Current data:
{knowledge}
New data:
{history}
Compress the data above into a concise data presentation of relevant data
Include datapoints that will provide greater accuracy in completing the task
Return the data in JSON format to save space
"""




COMPRESS_DATA_PROMPT = """
You are attempting to complete the task
task: {task}
Current data:
{knowledge}
New data:
{history}
Compress the data above into a concise data presentation of relevant data
Include datapoints and source URLs that will provide greater accuracy in completing the task
"""

COMPRESS_HISTORY_PROMPT = """
You are attempting to complete the task
task: {task}
Progress:
{history}
Compress the timeline of progress above into a single summary (as a paragraph)
Include all important milestones, the current challenges, and implementation details necessary to proceed
"""

LOG_PROMPT = """
PROMPT
**************************************
{}
**************************************
"""

LOG_RESPONSE = """
RESPONSE
**************************************
{}
**************************************
"""
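# LOG_PROMPT and LOG_RESPONSE use a positional "{}" placeholder, so they can be
# filled with a single str.format argument. A small sketch, assuming a
# hypothetical format_log helper and a local copy of the template:

```python
# Local copy of the LOG_PROMPT layout so the example is self-contained.
LOG_PROMPT_SKETCH = """
PROMPT
**************************************
{}
**************************************
"""


def format_log(template: str, text: str) -> str:
    """Insert the logged text into the template's single positional slot."""
    # Only the template is parsed by str.format; braces inside `text`
    # (for example JSON) are inserted verbatim.
    return template.format(text)


if __name__ == "__main__":
    print(format_log(LOG_PROMPT_SKETCH, '{"task": "example"}'))
```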


FINDER1 = """
Example Response 1:
User command: Find me a text generation model with less than 100M parameters.
Query: text generation
--- pause and wait for data to be returned ---
Assistant: I found the 'distilgpt2' model which has around 82M parameters. It is a distilled version of the GPT-2 model from OpenAI, trained by Hugging Face. Here's how to load it:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2")
```
Example Response 2:
User command: Help me locate a multilingual Named Entity Recognition model.
Query: named entity recognition
--- pause and wait for data to be returned ---
Assistant: I discovered the 'dbmdz/bert-base-multilingual-cased' model, which supports named entity recognition across multiple languages. Here's how to load it:
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-multilingual-cased")
model = AutoModelForTokenClassification.from_pretrained("dbmdz/bert-base-multilingual-cased")
```
Example Response 3:
User command: Search for a question-answering model fine-tuned on the SQuAD v2 dataset with more than 90% accuracy.
Query: question answering
--- pause and wait for data to be returned ---
Assistant: I found the 'pranavkv/roberta-base-squad2' model, which was fine-tuned on the SQuAD v2 dataset and achieves approximately 91% accuracy. Below is the way to load it:
```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
tokenizer = AutoTokenizer.from_pretrained("pranavkv/roberta-base-squad2")
model = AutoModelForQuestionAnswering.from_pretrained("pranavkv/roberta-base-squad2")
```
"""