	---
	library_name: transformers.js
	tags:
	- vision
	- image-segmentation
	---

	[CIDAS/clipseg-rd16](https://huggingface.co/CIDAS/clipseg-rd16) with ONNX weights, for compatibility with Transformers.js.

	## Usage (Transformers.js)

	If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@xenova/transformers) using:
	```bash
	npm i @xenova/transformers
	```

	Example: Perform zero-shot image segmentation with a `CLIPSegForImageSegmentation` model.

	```js
	import { AutoTokenizer, AutoProcessor, CLIPSegForImageSegmentation, RawImage } from '@xenova/transformers';

	// Load tokenizer, processor, and model
	const tokenizer = await AutoTokenizer.from_pretrained('Xenova/clipseg-rd16');
	const processor = await AutoProcessor.from_pretrained('Xenova/clipseg-rd16');
	const model = await CLIPSegForImageSegmentation.from_pretrained('Xenova/clipseg-rd16');

	// Run tokenization
	const texts = ['a glass', 'something to fill', 'wood', 'a jar'];
	const text_inputs = tokenizer(texts, { padding: true, truncation: true });

	// Read image and run processor
	const image = await RawImage.read('https://github.com/timojl/clipseg/blob/master/example_image.jpg?raw=true');
	const image_inputs = await processor(image);

	// Run model with both text and pixel inputs
	const { logits } = await model({ ...text_inputs, ...image_inputs });
	// logits: Tensor {
	//   dims: [4, 352, 352],
	//   type: 'float32',
	//   data: Float32Array(495616)[ ... ],
	//   size: 495616
	// }
	```
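
The `logits` tensor above holds one 352×352 map per text prompt, stored in a single flat `Float32Array` (4 × 352 × 352 = 495616 values). The sketch below shows one way to index into such a buffer with plain JavaScript, assuming a row-major `[prompt, row, column]` layout. The helpers `promptSlice` and `bestPromptPerPixel` are hypothetical names for illustration and are not part of Transformers.js; a tiny 2×2 stand-in replaces the real `logits.data` and `logits.dims`.

```js
// Hypothetical helper: view of the logits for a single prompt (no copy).
// Assumes a row-major layout [numPrompts, height, width].
function promptSlice(data, dims, promptIndex) {
  const [numPrompts, height, width] = dims;
  if (promptIndex < 0 || promptIndex >= numPrompts) {
    throw new RangeError(`promptIndex ${promptIndex} is out of range`);
  }
  const size = height * width;
  return data.subarray(promptIndex * size, (promptIndex + 1) * size);
}

// Hypothetical helper: for every pixel, the index of the prompt with the highest logit.
function bestPromptPerPixel(data, dims) {
  const [numPrompts, height, width] = dims;
  const size = height * width;
  const best = new Uint8Array(size); // starts at prompt 0; enough for up to 256 prompts
  for (let p = 1; p < numPrompts; ++p) {
    for (let i = 0; i < size; ++i) {
      if (data[p * size + i] > data[best[i] * size + i]) best[i] = p;
    }
  }
  return best;
}

// Stand-in for logits.dims and logits.data: 2 prompts, 2x2 pixels each
const demoDims = [2, 2, 2];
const demoData = new Float32Array([
  0.1, 2.0, -1.0, 0.5, // prompt 0
  0.3, 1.0, -2.0, 0.9, // prompt 1
]);

const secondPrompt = promptSlice(demoData, demoDims, 1);
const bestPrompt = bestPromptPerPixel(demoData, demoDims);
console.log(secondPrompt.length, Array.from(bestPrompt));
```

With the real model output, the same functions would be called as `promptSlice(logits.data, logits.dims, i)`, where `i` follows the order of the `texts` array.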

	You can visualize the predictions as follows:
	```js
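	// Convert logits to 8-bit grayscale images: add a channel dimension,
	// apply a sigmoid, then scale to the range [0, 255]
	const preds = logits
	    .unsqueeze_(1)
	    .sigmoid_()
	    .mul_(255)
	    .round_()
	    .to('uint8');

	// Save one image per text prompt
	for (let i = 0; i < preds.dims[0]; ++i) {
	    const img = RawImage.fromTensor(preds[i]);
	    await img.save(`prediction_${i}.png`);
	}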
	```

	| Original | `"a glass"` | `"something to fill"` | `"wood"` | `"a jar"` |
	|--------|--------|--------|--------|--------|
	| ![image](https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/B4wAIseP3SokRd7Flu1Y9.png) | ![prediction_0](https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/bM2k70sh6ZKFCXXaYTb5Z.png) | ![prediction_1](https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/vOIMMt2scOwz1BuM39pnH.png) | ![prediction_2](https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/jIxiYl2QWrhYZf45Vruja.png) | ![prediction_3](https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/zXXs42jekMdwZ-Mjfbgtv.png) |
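
The grayscale predictions above are sigmoid probabilities scaled to 0–255. If a hard mask is needed instead, one option is to threshold the probabilities. The sketch below does this in plain JavaScript; the helper name `logitsToBinaryMask` and the default threshold of 0.5 are assumptions for illustration, not something the model card prescribes, and a short stand-in array replaces the real logits.

```js
// Hypothetical helper: turn raw logits into a binary mask (0 or 255 per pixel).
// The 0.5 default threshold is an assumption; tune it for your use case.
function logitsToBinaryMask(logitValues, threshold = 0.5) {
  const mask = new Uint8Array(logitValues.length);
  for (let i = 0; i < logitValues.length; ++i) {
    const prob = 1 / (1 + Math.exp(-logitValues[i])); // sigmoid
    mask[i] = prob > threshold ? 255 : 0;
  }
  return mask;
}

// Stand-in for the logits of one prompt (four pixels)
const exampleLogits = new Float32Array([-3.0, -0.1, 0.2, 4.0]);
const binaryMask = logitsToBinaryMask(exampleLogits);

// Fraction of pixels that belong to the mask
const coverage = binaryMask.reduce((n, v) => n + (v ? 1 : 0), 0) / binaryMask.length;
console.log(Array.from(binaryMask), coverage);
```

Note that the visualization code modifies `logits` in place (`sigmoid_()`, `mul_(255)`), so a helper like this should be applied to the raw values before that step runs.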

	---

	Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).