---
library_name: transformers.js
tags:
- vision
- image-segmentation
---

This repository provides https://huggingface.co/CIDAS/clipseg-rd16 with ONNX weights, making it compatible with Transformers.js.

## Usage (Transformers.js)

If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@xenova/transformers) using:
```bash
npm i @xenova/transformers
```

**Example:** Perform zero-shot image segmentation with a `CLIPSegForImageSegmentation` model.

```js
import { AutoTokenizer, AutoProcessor, CLIPSegForImageSegmentation, RawImage } from '@xenova/transformers';

// Load tokenizer, processor, and model
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/clipseg-rd16');
const processor = await AutoProcessor.from_pretrained('Xenova/clipseg-rd16');
const model = await CLIPSegForImageSegmentation.from_pretrained('Xenova/clipseg-rd16');

// Run tokenization
const texts = ['a glass', 'something to fill', 'wood', 'a jar'];
const text_inputs = tokenizer(texts, { padding: true, truncation: true });

// Read image and run processor
const image = await RawImage.read('https://github.com/timojl/clipseg/blob/master/example_image.jpg?raw=true');
const image_inputs = await processor(image);

// Run model with both text and pixel inputs
const { logits } = await model({ ...text_inputs, ...image_inputs });
// logits: Tensor {
//   dims: [4, 352, 352],
//   type: 'float32',
//   data: Float32Array(495616)[ ... ],
//   size: 495616
// }
```
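
The `logits` tensor above holds one 352×352 map per text prompt, stored row-major in a flat `Float32Array`. As a minimal sketch of how these maps could be combined into a single label map, the hypothetical helper `labelMap` below (not part of the Transformers.js API) picks the prompt with the highest logit for every pixel and marks pixels whose best sigmoid score falls below an assumed threshold of 0.5 as background (`-1`). It works on plain arrays, so it could be called as `labelMap(logits.data, logits.dims)`; the toy input is only for illustration.

```js
// Hypothetical helper (not part of Transformers.js).
// data: flat array of logits, row-major with dims [numPrompts, height, width].
// Returns an Int32Array with the winning prompt index per pixel, or -1 for background.
function labelMap(data, dims, threshold = 0.5) {
  const [numPrompts, height, width] = dims;
  const pixels = height * width;
  const labels = new Int32Array(pixels).fill(-1);

  // sigmoid(x) >= threshold  <=>  x >= log(threshold / (1 - threshold))
  const minLogit = Math.log(threshold / (1 - threshold));

  for (let p = 0; p < pixels; ++p) {
    let best = -1;
    let bestLogit = -Infinity;
    for (let k = 0; k < numPrompts; ++k) {
      const value = data[k * pixels + p];
      if (value > bestLogit) {
        bestLogit = value;
        best = k;
      }
    }
    if (bestLogit >= minLogit) labels[p] = best;
  }
  return labels;
}

// Toy input: 2 prompts on a 1x3 image.
const toyLogits = new Float32Array([
  2.0, -3.0, 0.5, // prompt 0
  1.0, -1.0, 4.0, // prompt 1
]);
console.log(Array.from(labelMap(toyLogits, [2, 1, 3]))); // [ 0, -1, 1 ]
```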

You can visualize the predictions as follows:
```js
// Convert logits to 8-bit grayscale images (one per prompt)
const preds = logits
  .unsqueeze_(1)
  .sigmoid_()
  .mul_(255)
  .round_()
  .to('uint8');

// Save each prediction; `save` is asynchronous, so await it
for (let i = 0; i < preds.dims[0]; ++i) {
  const img = RawImage.fromTensor(preds[i]);
  await img.save(`prediction_${i}.png`);
}
```
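
To make the in-place tensor chain above more concrete, here is a minimal plain-JavaScript sketch of the same element-wise steps: sigmoid, scale to the 0 to 255 range, round, and store as unsigned 8-bit values. `logitsToGrayscale` is a hypothetical helper, not part of Transformers.js; the reshaping done by `unsqueeze_(1)` is omitted, and rounding at exact ties may differ from the library's implementation.

```js
// Hypothetical helper (not part of Transformers.js): converts raw logits to
// 8-bit grayscale intensities, mirroring sigmoid -> *255 -> round -> uint8.
function logitsToGrayscale(data) {
  const out = new Uint8Array(data.length);
  for (let i = 0; i < data.length; ++i) {
    const probability = 1 / (1 + Math.exp(-data[i])); // sigmoid
    out[i] = Math.round(probability * 255);           // scale and round
  }
  return out;
}

// Strongly negative logits map to black, strongly positive ones to white.
const grayscale = logitsToGrayscale(new Float32Array([-100, 0, 100]));
console.log(Array.from(grayscale)); // [ 0, 128, 255 ]
```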

| Original | `"a glass"` | `"something to fill"` | `"wood"` | `"a jar"` |
|--------|--------|--------|--------|--------|
| ![image](https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/B4wAIseP3SokRd7Flu1Y9.png) | ![prediction_0](https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/bM2k70sh6ZKFCXXaYTb5Z.png) | ![prediction_1](https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/vOIMMt2scOwz1BuM39pnH.png) | ![prediction_2](https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/jIxiYl2QWrhYZf45Vruja.png) | ![prediction_3](https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/zXXs42jekMdwZ-Mjfbgtv.png) |

---

Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).