Files changed (1) hide show
  1. README.md +14 -1
README.md CHANGED
@@ -85,7 +85,10 @@ processor = AutoProcessor.from_pretrained("microsoft/Florence-2-base-ft", trust_
85
  url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg?download=true"
86
  image = Image.open(requests.get(url, stream=True).raw)
87
 
88
- def run_example(prompt):
 
 
 
89
 
90
  inputs = processor(text=prompt, images=image, return_tensors="pt")
91
  generated_ids = model.generate(
@@ -169,6 +172,16 @@ prompt = <REGION_PROPOSAL>
169
  run_example(prompt)
170
  ```
171
 
 
 
 
 
 
 
 
 
 
 
172
  For more detailed examples, please refer to the [notebook](https://huggingface.co/microsoft/Florence-2-large/blob/main/sample_inference.ipynb)
173
  </details>
174
 
 
85
  url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg?download=true"
86
  image = Image.open(requests.get(url, stream=True).raw)
87
 
88
+ def run_example(prompt, text_input=None):
89
+
90
+ if text_input is not None:
91
+ prompt = prompt + text_input
92
 
93
  inputs = processor(text=prompt, images=image, return_tensors="pt")
94
  generated_ids = model.generate(
 
172
  run_example(prompt)
173
  ```
174
 
175
+ ### Caption to Phrase Grounding
176
+ The caption to phrase grounding task requires additional text input, i.e., a caption.
177
+
178
+ Caption to phrase grounding results format:
179
+ {'\<CAPTION_TO_PHRASE_GROUNDING>': {'bboxes': [[x1, y1, x2, y2], ...], 'labels': ['', '', ...]}}
180
+ ```python
181
+ task_prompt = '<CAPTION_TO_PHRASE_GROUNDING>'
182
+ results = run_example(task_prompt, text_input="A green car parked in front of a yellow building.")
183
+ ```
184
+
185
  For more detailed examples, please refer to the [notebook](https://huggingface.co/microsoft/Florence-2-large/blob/main/sample_inference.ipynb)
186
  </details>
187