how to run it in server mode?

by usermma - opened 3 days ago

Discussion

usermma

3 days ago

How can I run it locally in server mode? I tried to run it but failed. This is the code:

https://huggingface.co/spaces/usermma/supergemma4-e4b-abliterated-multimodal-gguf-4bit/blob/main/Dockerfile

InsecureErasure

Owner 3 days ago

The code you linked seems to be a way to run llama.cpp from Python, right? This is a UNet model, not a transformer-based LLM. Did you link to the wrong URL?

usermma

2 days ago

Okay, sorry. So there is no way to run it from the GGUF file with something like llama.cpp?

InsecureErasure

Owner 2 days ago

Unfortunately no, and it's entirely outside the scope of llama.cpp, whose purpose is to run inference on LLMs. This repo contains different quantizations of an image generation model based on SDXL. They're meant to be run with tools like ComfyUI.
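For anyone unsure what kind of model a given GGUF file holds, the file's metadata can be inspected before picking a runtime. The following is only a rough sketch, not an official tool: it assumes a little-endian GGUF file of version 2 or later (where counts are 64-bit) and reads the `general.architecture` key defined by the GGUF format. The function name `gguf_architecture` is made up for this example.

```python
import struct

# GGUF metadata value types with a fixed size, mapped to struct formats.
# Assumption: little-endian file, GGUF version 2 or later.
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _read(f, fmt):
    """Read one fixed-size value described by a struct format."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _read_string(f):
    """Read a GGUF string: uint64 length followed by UTF-8 bytes."""
    length = _read(f, "<Q")
    data = f.read(length)
    if len(data) != length:
        raise ValueError("unexpected end of file")
    return data.decode("utf-8")


def _read_value(f, vtype):
    """Read one metadata value of the given GGUF type id."""
    if vtype in _SCALARS:
        return _read(f, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        item_type = _read(f, "<I")
        count = _read(f, "<Q")
        return [_read_value(f, item_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")


def gguf_architecture(path):
    """Return the general.architecture string, or None if the key is absent."""
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        _version = _read(f, "<I")
        _tensor_count = _read(f, "<Q")
        kv_count = _read(f, "<Q")
        # Walk the key/value pairs until the architecture key turns up.
        for _ in range(kv_count):
            key = _read_string(f)
            value = _read_value(f, _read(f, "<I"))
            if key == "general.architecture":
                return value
    return None
```

Calling `gguf_architecture("model.gguf")` (the path is a placeholder) returns whatever architecture name the file declares. If that name is not an LLM architecture that llama.cpp supports, llama.cpp will not be able to load the file.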

InsecureErasure changed discussion status to closed 2 days ago

usermma

2 days ago

Okay, good. I will spend some time learning how to use it in ComfyUI. If it doesn't work, I will keep trying...
