Cuiunbo committed
Commit ce7e343
Parent(s): 2b0f823

Update README.md

Files changed (1): README.md (+43, -9)

README.md CHANGED
@@ -20,13 +20,37 @@ Introducing the GUIDance, Model that trained on [GUICourse](https://arxiv.org/pd
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/63f706dfe94ed998c463ed66/5d4rJFWjKn-c-iOXJKYXF.png)

 # News
-- 2024-07-09: 🚀 We released MiniCPM-GUIDance on huggingface.
+- 2024-07-09: 🚀 We released MiniCPM-GUIDance on [Hugging Face](https://huggingface.co/RhapsodyAI/minicpm-guidance).
 - 2024-03-09: 📦 We open-sourced GUICourse: [GUIAct](https://huggingface.co/datasets/yiye2023/GUIAct), [GUIChat](https://huggingface.co/datasets/yiye2023/GUIChat), [GUIEnv](https://huggingface.co/datasets/yiye2023/GUIEnv)

 # ToDo
-[ ] Update detailed task type prompt
 [ ] Batch inference

+# CookBook
+- Prompt for Actions
+```
+Your Task
+{Task}
+Generate next actions to do this task.
+```
+```
+Actions History
+{hover, select_text, click, scroll}
+Information
+{Information about the web}
+Your Task
+{TASK}
+Generate next actions to do this task.
+```
+- Prompt for Chat with or without Grounding
+```
+{Query}
+
+OR
+
+{Query} Grounding all objects in the image.
+```
+
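The CookBook templates above are plain strings. As a minimal, hypothetical sketch (not part of this commit), here is how the first action template could be filled in and wrapped in the per-turn message list that the Example section below feeds to the processor; the helper name and the task text are illustrative assumptions:

```python
from PIL import Image

def build_action_prompt(task: str) -> str:
    # Fills the first "Prompt for Actions" template above, verbatim.
    return f"Your Task\n{task}\nGenerate next actions to do this task."

# Same message structure as the Example section below: a screenshot turn
# followed by a text turn. The task string here is made up for illustration.
example_messages = [
    [
        {"role": "user", "content": Image.open("./case.png").convert("RGB")},
        {"role": "user", "content": build_action_prompt("Open the Files tab")},
    ]
]
```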
 # Example
 Pip install all dependencies:
 ```
@@ -49,6 +73,7 @@ or
 huggingface-cli download RhapsodyAI/minicpm-guidance

 ```
+Example case image: ![case](https://cdn-uploads.huggingface.co/production/uploads/63f706dfe94ed998c463ed66/KJFeGDBj3SOgQqGAU7lU5.png)
 ```python
 from transformers import AutoProcessor, AutoTokenizer, AutoModel
 from PIL import Image
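The diff elides the README lines that actually instantiate the tokenizer, processor, and model between these imports and the message list. A plausible loading sketch for the repo id shown above, assuming the repository ships custom modeling code (hence trust_remote_code=True), might look like the following; the dtype and device placement are illustrative assumptions, not taken from this README:

```python
import torch
from transformers import AutoProcessor, AutoTokenizer, AutoModel

# Repo id used with huggingface-cli above.
MODEL_PATH = "RhapsodyAI/minicpm-guidance"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(MODEL_PATH, trust_remote_code=True)
# bfloat16 and .cuda() are assumptions; adjust to your hardware.
model = AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True,
                                  torch_dtype=torch.bfloat16)
model = model.cuda().eval()
```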
@@ -68,11 +93,11 @@ example_messages = [
   [
     {
       "role": "user",
-      "content": Image.open("/home/jeeves/cuiunbo/minicpmv/examples/test.png").convert('RGB')
+      "content": Image.open("./case.png").convert('RGB')
     },
     {
       "role": "user",
-      "content": "What this is?"
+      "content": "How could I use this model from this web page? Grounding all objects in the image."
     }
   ]
 ]
@@ -80,21 +105,30 @@ example_messages = [
 input = processor(example_messages, padding_side="right")

 for key in input:
-    if isinstance(a[key], list):
-        for i in range(len(a[key])):
-            if isinstance(a[key][i], torch.Tensor):
-                input[key][i] = a[key][i].cuda()
+    if isinstance(input[key], list):
+        for i in range(len(input[key])):
+            if isinstance(input[key][i], torch.Tensor):
+                input[key][i] = input[key][i].cuda()
     if isinstance(input[key], torch.Tensor):
         input[key] = input[key].cuda()

 with torch.no_grad():
-    outputs = model.generate(input, max_new_tokens=64, do_sample=False, num_beams=3)
+    outputs = model.generate(input, max_new_tokens=1024, do_sample=False, num_beams=3)
 text = tokenizer.batch_decode(outputs.cpu().tolist())

 for i in text:
     print('-'*20)
     print(i)
+
+'''
+To use the model from this webpage, you would typically follow these steps:
+1. **Access the Model**: Navigate to the section of the webpage where the model is described. In this case, it's under the heading "Use this model"<box> 864 238 964 256</box>.
+2. **Download the Model**: There should be a link or button that allows you to download the model. Look for a button or link that says "Download" or something similar.
+3. **Install the Model**: Once you've downloaded the model, you'll need to install it on your system. This typically involves extracting the downloaded file and placing it in a directory where the model can be found.
+4. **Use the Model**: After installation, you can use the model in your application or project. This might involve importing the model into your programming environment and using it to perform specific tasks.
+The exact steps would depend on the specifics of the model and the environment in which you're using it, but these are the general steps you would follow to use the model from this webpage.</s>
+'''
 ```
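The decoded text interleaves grounding results as `<box> x1 y1 x2 y2</box>` spans, as in the sample output above. Here is a small sketch (not from the repo) for pulling those spans out; whether the four integers are raw pixels or coordinates on a normalized grid is an assumption to verify against the model card:

```python
import re

# Matches "<box> 864 238 964 256</box>"-style spans in the decoded text.
BOX_RE = re.compile(r"<box>\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*</box>")

def parse_boxes(text):
    """Return all (x1, y1, x2, y2) grounding boxes found in `text`."""
    return [tuple(int(g) for g in m.groups()) for m in BOX_RE.finditer(text)]

print(parse_boxes('heading "Use this model"<box> 864 238 964 256</box>'))
# -> [(864, 238, 964, 256)]
```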

 # Citation