|
--- |
|
tags: |
|
- image-to-text |
|
- image-captioning |
|
- endpoints-template |
|
license: bsd-3-clause |
|
library_name: generic |
|
duplicated_from: florentgbelidji/blip_captioning |
|
--- |
|
|
|
# Fork of [salesforce/BLIP](https://github.com/salesforce/BLIP) for a `image-captioning` task on 🤗Inference endpoint. |
|
|
|
This repository implements a `custom` task for `image-captioning` for 🤗 Inference Endpoints. |
|
To use deploy this model a an Inference Endpoint you have to select `Custom` as task to use the `pipeline.py` file. -> _double check if it is selected_ |
|
### expected Request payload |
|
```json |
|
{ |
|
"image": "/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAMCAgICAgMC....", // base64 image as bytes |
|
} |
|
``` |
|
below is an example on how to run a request using Python and `requests`. |
|
## Run Request |
|
1. prepare an image. |
|
```bash |
|
!wget https://huggingface.co/datasets/mishig/sample_images/resolve/main/palace.jpg |
|
``` |
|
2.run request |
|
|
|
```python |
|
import json |
|
from typing import List |
|
import requests as r |
|
import base64 |
|
|
|
ENDPOINT_URL = "" |
|
HF_TOKEN = "" |
|
|
|
def predict(path_to_image: str = None): |
|
with open(path_to_image, "rb") as i: |
|
image = i.read() |
|
payload = { |
|
"inputs": [image], |
|
"parameters": { |
|
"do_sample": True, |
|
"top_p":0.9, |
|
"min_length":5, |
|
"max_length":20 |
|
} |
|
} |
|
response = r.post( |
|
ENDPOINT_URL, headers={"Authorization": f"Bearer {HF_TOKEN}"}, json=payload |
|
) |
|
return response.json() |
|
prediction = predict( |
|
path_to_image="palace.jpg" |
|
) |
|
|
|
``` |
|
Example parameters depending on the decoding strategy: |
|
|
|
1. Beam search |
|
|
|
``` |
|
"parameters": { |
|
"num_beams":5, |
|
"max_length":20 |
|
} |
|
``` |
|
|
|
2. Nucleus sampling |
|
|
|
``` |
|
"parameters": { |
|
"num_beams":1, |
|
"max_length":20, |
|
"do_sample": True, |
|
"top_k":50, |
|
"top_p":0.95 |
|
} |
|
``` |
|
|
|
3. Contrastive search |
|
|
|
``` |
|
"parameters": { |
|
"penalty_alpha":0.6, |
|
"top_k":4 |
|
"max_length":512 |
|
} |
|
``` |
|
|
|
See [generate()](https://huggingface.co/docs/transformers/v4.25.1/en/main_classes/text_generation#transformers.GenerationMixin.generate) doc for additional detail |
|
|
|
|
|
expected output |
|
```python |
|
['buckingham palace with flower beds and red flowers'] |
|
``` |
|
|