Hugging Face + Google Visual Blocks

Community Article Published May 16, 2024


For Google I/O 2024, we've collaborated with the Google Visual Blocks team to release custom Hugging Face nodes. Visual Blocks for ML is a browser-based tool for building machine learning pipelines through a visual interface. The custom nodes can run fully in the browser with Transformers.js, or on the server side through our Hugging Face Serverless API for larger models, with Text Generation Inference serving selected LLMs.

You can learn more about Visual Blocks and how to use it here, and check out the source code of the Hugging Face custom nodes here.

We're looking for feedback on this integration, as well as contributions of new nodes and improvements. Please open an issue in the Visual Blocks repository or submit a PR with your changes.

How to use the custom components

To start playing with our custom components, you need to add a custom node to your Visual Blocks project. First, start a new project at https://visualblocks.withgoogle.com/#/edit/new, then click the "+" button in the bottom left corner to add a new node.

Then input the pre-bundled code from our npm package. You can do this by pasting the following link into the input field and clicking "Submit":

https://cdn.jsdelivr.net/npm/huggingface-visualblocks-nodes@latest

You will then see three Hugging Face collections: Client, Server, and Common.
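
The link above uses the `@latest` tag, so it always resolves to the newest published release. jsDelivr also accepts an explicit version after the `@`, which can help if you want a project to keep loading the same bundle. Here is a small sketch of building both forms; the helper name `jsDelivrNpmUrl` is ours, and the version `1.0.0` is only a placeholder, not a known release of the package.

```typescript
// Build a jsDelivr URL for an npm package, optionally pinned to a version.
// jsDelivrNpmUrl is a hypothetical helper written for this example.
function jsDelivrNpmUrl(packageName: string, version: string = "latest"): string {
  return "https://cdn.jsdelivr.net/npm/" + packageName + "@" + version;
}

// Same URL as in the article: tracks the newest published release.
const latestUrl = jsDelivrNpmUrl("huggingface-visualblocks-nodes");

// Pinned URL. "1.0.0" is a placeholder; check npm for the real version numbers.
const pinnedUrl = jsDelivrNpmUrl("huggingface-visualblocks-nodes", "1.0.0");

console.log(latestUrl);
console.log(pinnedUrl);
```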

Client-Side Nodes

Using only client-side nodes, you can combine image processing nodes, webcam input, and Transformers.js image segmentation models, all running in the browser.
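
To give a feel for what combining a segmentation model with an image processing step involves, here is a minimal sketch that tints the pixels covered by one segment. It is not the code of the Visual Blocks nodes. It assumes the image is a flat RGBA buffer and the mask holds one value per pixel, non-zero inside the segment; both the layout and the function name `tintMaskedPixels` are assumptions made for this example.

```typescript
// Tint every pixel covered by a segmentation mask (illustrative sketch only).
// rgba: flat RGBA buffer, 4 values per pixel.
// mask: one value per pixel, non-zero means "inside the segment" (assumed layout).
function tintMaskedPixels(
  rgba: Uint8ClampedArray,
  mask: Uint8ClampedArray,
  color: [number, number, number],
  opacity: number
): Uint8ClampedArray {
  if (rgba.length !== mask.length * 4) {
    throw new Error("mask must have exactly one value per RGBA pixel");
  }
  // Work on a copy so the input frame is left untouched.
  const out = new Uint8ClampedArray(rgba);
  for (let p = 0; p < mask.length; p++) {
    if (mask[p] === 0) continue; // pixel is outside the segment
    const i = p * 4;
    for (let c = 0; c < 3; c++) {
      // Blend the original channel with the tint color; alpha is unchanged.
      out[i + c] = (1 - opacity) * rgba[i + c] + opacity * color[c];
    }
  }
  return out;
}

// Two pixels: the first is inside the segment, the second is not.
const frame = new Uint8ClampedArray([100, 100, 100, 255, 10, 20, 30, 255]);
const segmentMask = new Uint8ClampedArray([255, 0]);
const tinted = tintMaskedPixels(frame, segmentMask, [200, 0, 0], 0.5);
console.log(tinted.join(","));
```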

Image Segmentation

Depth Estimation

Server-Side Nodes

With the Hugging Face Server Nodes, you can access thousands of state-of-the-art models directly from the Hub.
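
For readers curious what a server-side call looks like outside the editor, here is a rough sketch of a request to the Serverless Inference API. It only builds the request and does not send it. The helper name, the exact model ID, the prompt, and the `hf_xxx` token are placeholders chosen for this example, and we are not claiming the nodes build their requests exactly this way.

```typescript
// Shape of the request we are going to build (plain data, nothing is sent).
interface InferenceRequest {
  url: string;
  method: string;
  headers: { [name: string]: string };
  body: string;
}

// buildInferenceRequest is a hypothetical helper written for this example.
function buildInferenceRequest(modelId: string, inputs: string, token: string): InferenceRequest {
  return {
    // Serverless Inference API endpoint: one URL per model ID on the Hub.
    url: "https://api-inference.huggingface.co/models/" + modelId,
    method: "POST",
    headers: {
      // Your personal Hugging Face access token.
      Authorization: "Bearer " + token,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ inputs: inputs }),
  };
}

// The model ID and token below are placeholders for illustration.
const exampleRequest = buildInferenceRequest(
  "mistralai/Mistral-7B-Instruct-v0.2",
  "Describe a cozy cabin in the woods.",
  "hf_xxx"
);
console.log(exampleRequest.url);

// To actually send it (browser or Node 18+):
// fetch(exampleRequest.url, {
//   method: exampleRequest.method,
//   headers: exampleRequest.headers,
//   body: exampleRequest.body,
// });
```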

Server + Client-Side Example

Here is an example that uses the Hugging Face Hub Login node to get your personal token, passes the text generated by the Mistral-7B LLM to Stable Diffusion XL Text to Image to generate an image, and then pipes that image to Transformers.js Depth Estimation running on the client side.
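
The last stage of this example is depth estimation. Depth models typically produce relative depth values rather than pixels, so a display step has to rescale them into an image. Here is a minimal sketch of that rescaling using min-max normalization to the 0 to 255 range. The function name and the plain `number[]` input are assumptions for this example; the actual node may process its output differently.

```typescript
// Rescale raw depth values to 0..255 grayscale with min-max normalization.
// depthToGrayscale is a hypothetical helper, not part of Transformers.js.
function depthToGrayscale(depth: number[]): Uint8ClampedArray {
  let min = Infinity;
  let max = -Infinity;
  for (const d of depth) {
    if (d < min) min = d;
    if (d > max) max = d;
  }
  const range = max - min;
  const out = new Uint8ClampedArray(depth.length);
  for (let i = 0; i < depth.length; i++) {
    // A completely flat depth map has no contrast, so map it to 0.
    out[i] = range === 0 ? 0 : Math.round(((depth[i] - min) / range) * 255);
  }
  return out;
}

// The smallest value maps to 0 and the largest to 255.
const grayscale = depthToGrayscale([0, 1, 5]);
console.log(grayscale.join(","));
```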

Another cool example uses Stable Diffusion XL Text-to-Image to generate a background image and Transformers.js to remove the background of the webcam input using either briaai/RMBG-1.4 or MODNet.
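
Conceptually, replacing a background comes down to alpha compositing: the background removal model supplies a per-pixel matte, and each output pixel is a blend of the webcam frame and the generated image. Here is a minimal sketch of that blend. It assumes a single-channel matte with values from 0 (background) to 255 (foreground) and equally sized RGBA buffers; the function name and data layout are assumptions for this example, not the nodes' actual code.

```typescript
// Composite a foreground over a new background using an alpha matte.
// foreground, background: flat RGBA buffers of the same size.
// matte: one value per pixel, 0 = background, 255 = foreground (assumed layout).
function compositeOverBackground(
  foreground: Uint8ClampedArray,
  matte: Uint8ClampedArray,
  background: Uint8ClampedArray
): Uint8ClampedArray {
  if (foreground.length !== background.length || foreground.length !== matte.length * 4) {
    throw new Error("foreground, background, and matte sizes do not match");
  }
  const out = new Uint8ClampedArray(foreground.length);
  for (let p = 0; p < matte.length; p++) {
    const alpha = matte[p] / 255;
    const i = p * 4;
    for (let c = 0; c < 3; c++) {
      // Standard "over" blend for each color channel.
      out[i + c] = alpha * foreground[i + c] + (1 - alpha) * background[i + c];
    }
    out[i + 3] = 255; // the result is fully opaque
  }
  return out;
}

// Three pixels: fully foreground, fully background, and a soft edge (alpha 0.2).
const webcamFrame = new Uint8ClampedArray([10, 20, 30, 255, 40, 50, 60, 255, 100, 100, 100, 255]);
const generatedBackground = new Uint8ClampedArray([200, 210, 220, 255, 100, 110, 120, 255, 200, 200, 200, 255]);
const alphaMatte = new Uint8ClampedArray([255, 0, 51]);
const composited = compositeOverBackground(webcamFrame, alphaMatte, generatedBackground);
console.log(composited.join(","));
```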

More Examples

Here is a list of examples showcasing the new nodes. Just click on an example to load it in the editor.

Client Nodes

Translation Node Example
Token Classification Node Example
Text Classification Node Example
Object Detection Node Example
Image Segmentation Node Example
Image Classification Node Example
Depth Estimation Node Example
Background Removal Node Example

Server Nodes

Chat Template Text Generation Node Example
Chat Completion Node Example
Fill Mask Node Example
Image Classification Node Example
Summarization Node Example
Text Classification Node Example
Text Generation Node Example
Text to Image Node Example
Token Classification Node Example

Extra Examples

Background Removal Text to Image
Chat Completion Text to Image Depth
Image Segmentation Webcam Client

Acknowledgements

Thanks to @Xenova for building Transformers.js and for kickstarting the custom nodes project, @Jason Mayes, and the Visual Blocks team.