|
--- |
|
tags: |
|
- image-to-text |
|
- image-captioning |
|
- endpoints-template |
|
license: bsd-3-clause |
|
library_name: generic |
|
--- |
|
|
|
# Fork of [Salesforce/blip-image-captioning-large](https://huggingface.co/Salesforce/blip-image-captioning-large) for an `image-captioning` task on 🤗 Inference Endpoints.
|
|
|
This repository implements a `custom` task for `image-captioning` for 🤗 Inference Endpoints. The code for the customized pipeline is in [pipeline.py](https://huggingface.co/florentgbelidji/blip_captioning/blob/main/pipeline.py).

To deploy this model as an Inference Endpoint, select `Custom` as the task so that the `pipeline.py` file is used. -> _double check that it is selected_
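
For reference, a custom pipeline in this generic template is a Python class that loads the model once and then handles each request. The snippet below is only a rough sketch of that structure; the class name, request parsing, and output key are assumptions, and the authoritative implementation is [pipeline.py](https://huggingface.co/florentgbelidji/blip_captioning/blob/main/pipeline.py) itself.

```python
import base64
from io import BytesIO

from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration


class PreTrainedPipeline:
    def __init__(self, path=""):
        # Load the BLIP captioning model and processor from the repository files
        self.processor = BlipProcessor.from_pretrained(path)
        self.model = BlipForConditionalGeneration.from_pretrained(path)

    def __call__(self, data):
        # Decode the base64-encoded image; "text" and "parameters" are optional
        image = Image.open(BytesIO(base64.b64decode(data["image"]))).convert("RGB")
        text = data.get("text")
        parameters = data.get("parameters", {})

        inputs = self.processor(image, text, return_tensors="pt")
        output_ids = self.model.generate(**inputs, **parameters)
        caption = self.processor.decode(output_ids[0], skip_special_tokens=True)
        return {"captions": [caption]}
```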
|
### Expected request payload
|
```json |
|
{
  "image": "/9j/4AAQSkZJRgA.....",
  "text": "a photography of a"
}
|
``` |
|
`image` is the base64-encoded image and `text` is an optional prompt used to condition the caption. Below is an example of how to run a request using Python and `requests`.
|
## Run Request |
|
1. Download an image (any online image works):
|
```bash |
|
wget https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg
|
``` |
|
2. Run the request:
|
|
|
```python |
|
import base64

import requests

# Encode the downloaded image as a base64 string
with open("demo.jpg", "rb") as image_file:
    encoded_string = base64.b64encode(image_file.read()).decode()

ENDPOINT_URL = ""  # URL of your Inference Endpoint
HF_TOKEN = ""      # your Hugging Face access token

headers = {
    "Authorization": f"Bearer {HF_TOKEN}",
    "Content-Type": "application/json",
}


def query(payload):
    response = requests.post(ENDPOINT_URL, headers=headers, json=payload)
    return response.json()


output = query({
    "image": encoded_string,       # base64-encoded image
    "text": "a photography of a",  # optional prompt to condition the caption
})
print(output)
|
``` |
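
If the endpoint is still starting up or the payload shape does not match what `pipeline.py` expects, `response.json()` will not contain captions. A small defensive variant of the helper above (purely illustrative, reusing `ENDPOINT_URL` and `headers`) surfaces HTTP errors directly:

```python
def query_safe(payload):
    # Reuses requests, ENDPOINT_URL and headers from the snippet above
    response = requests.post(ENDPOINT_URL, headers=headers, json=payload)
    # Raise on 4xx/5xx, e.g. a malformed payload or an endpoint that is still scaling up
    response.raise_for_status()
    return response.json()
```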
|
|
|
Example parameters depending on the decoding strategy: |
|
|
|
1. Beam search |
|
|
|
``` |
|
"parameters": { |
|
"num_beams":5, |
|
"max_length":20 |
|
} |
|
``` |
|
|
|
2. Nucleus sampling |
|
|
|
``` |
|
"parameters": { |
|
"num_beams":1, |
|
"max_length":20, |
|
"do_sample": True, |
|
"top_k":50, |
|
"top_p":0.95 |
|
} |
|
``` |
|
|
|
3. Contrastive search |
|
|
|
``` |
|
"parameters": { |
|
"penalty_alpha":0.6, |
|
"top_k":4 |
|
"max_length":512 |
|
} |
|
``` |
|
|
|
See the [generate()](https://huggingface.co/docs/transformers/v4.25.1/en/main_classes/text_generation#transformers.GenerationMixin.generate) documentation for additional details.
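
Whichever decoding strategy you choose, the `parameters` object is sent in the same request body as the image. Below is a minimal sketch reusing the `query` helper and `encoded_string` from above, assuming the pipeline reads generation options from a top-level `parameters` key (check `pipeline.py` if the nesting differs):

```python
# Beam-search decoding; "image"/"text" follow the expected request payload above
output = query({
    "image": encoded_string,
    "text": "a photography of a",
    "parameters": {
        "num_beams": 5,
        "max_length": 20,
    },
})
print(output)
```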
|
|
|
|
|
Expected output:
|
```python |
|
{'captions': ['a photography of a woman and her dog on the beach']} |
|
``` |
|
|