File size: 2,515 Bytes
5318d99 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 |
---
tags:
- image-to-text
- image-captioning
- endpoints-template
license: bsd-3-clause
library_name: generic
duplicated_from: florentgbelidji/blip_captioning
---
# Fork of [salesforce/BLIP](https://github.com/salesforce/BLIP) for a `image-captioning` task on 🤗Inference endpoint.
This repository implements a `custom` task for `image-captioning` for 🤗 Inference Endpoints. The code for the customized pipeline is in the [pipeline.py](https://huggingface.co/florentgbelidji/blip_captioning/blob/main/pipeline.py).
To use deploy this model a an Inference Endpoint you have to select `Custom` as task to use the `pipeline.py` file. -> _double check if it is selected_
### expected Request payload
```json
{
"image": "/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAMCAgICAgMC....", // base64 image as bytes
}
```
below is an example on how to run a request using Python and `requests`.
## Run Request
1. prepare an image.
```bash
!wget https://huggingface.co/datasets/mishig/sample_images/resolve/main/palace.jpg
```
2.run request
```python
import json
from typing import List
import requests as r
import base64
ENDPOINT_URL = ""
HF_TOKEN = ""
def predict(path_to_image: str = None):
with open(path_to_image, "rb") as i:
image = i.read()
payload = {
"inputs": [image],
"parameters": {
"do_sample": True,
"top_p":0.9,
"min_length":5,
"max_length":20
}
}
response = r.post(
ENDPOINT_URL, headers={"Authorization": f"Bearer {HF_TOKEN}"}, json=payload
)
return response.json()
prediction = predict(
path_to_image="palace.jpg"
)
```
Example parameters depending on the decoding strategy:
1. Beam search
```
"parameters": {
"num_beams":5,
"max_length":20
}
```
2. Nucleus sampling
```
"parameters": {
"num_beams":1,
"max_length":20,
"do_sample": True,
"top_k":50,
"top_p":0.95
}
```
3. Contrastive search
```
"parameters": {
"penalty_alpha":0.6,
"top_k":4
"max_length":512
}
```
See [generate()](https://huggingface.co/docs/transformers/v4.25.1/en/main_classes/text_generation#transformers.GenerationMixin.generate) doc for additional detail
expected output
```python
['buckingham palace with flower beds and red flowers']
```
|