leditsplusplus

community

Activity Feed Request to join this org

AI & ML interests

None defined yet.

Recent Activity

PSaiml authored a paper about 2 months ago

MSTS: A Multimodal Safety Test Suite for Vision-Language Models

felfri authored a paper about 2 months ago

MSTS: A Multimodal Safety Test Suite for Vision-Language Models

mbrack authored a paper 3 months ago

LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps

View all activity

leditsplusplus's activity

PSaiml

authored a paper about 2 months ago

MSTS: A Multimodal Safety Test Suite for Vision-Language Models

Paper • 2501.10057 • Published Jan 17 • 8

felfri

authored a paper about 2 months ago

MSTS: A Multimodal Safety Test Suite for Vision-Language Models

Paper • 2501.10057 • Published Jan 17 • 8

mbrack

authored a paper 3 months ago

LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps

Paper • 2412.15035 • Published Dec 19, 2024 • 4

felfri

authored a paper 3 months ago

LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps

Paper • 2412.15035 • Published Dec 19, 2024 • 4

PSaiml

authored a paper 3 months ago

LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps

Paper • 2412.15035 • Published Dec 19, 2024 • 4

multimodalart

posted an update 8 months ago

Post

26460

New feature 🔥
Image models and LoRAs now have little previews 🤏

If you don't know where to start to find them, I invite you to browse cool LoRAs in the profile of some amazing fine-tuners: @artificialguybr , @alvdansen , @DoctorDiffusion , @e-n-v-y , @KappaNeuro @ostris

3 replies

mbrack

authored a paper 9 months ago

T-FREE: Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings

Paper • 2406.19223 • Published Jun 27, 2024 • 11

felfri

updated a Space 9 months ago

LEDITS++ Project

📔

Edit images using text descriptions efficiently

felfri

posted an update 9 months ago

Post

2325

🚀 Excited to announce the release of our new research paper, "LLAVAGUARD: VLM-based Safeguards for Vision Dataset Curation and Safety Assessment"!
In this work, we introduce LLAVAGUARD, a family of cutting-edge Vision-Language Model (VLM) judges designed to enhance the safety and integrity of vision datasets and generative models. Our approach leverages flexible policies for assessing safety in diverse settings. This context awareness ensures robust data curation and model safeguarding alongside comprehensive safety assessments, setting a new standard for vision datasets and models. We provide three versions (7B, 13B, and 34B) and our data, see below. This achievement wouldn't have been possible without the incredible teamwork and dedication of my great colleagues @LukasHug , @PSaiml , @mbrack . 🙏 Together, we've pushed the boundaries of what’s possible at the intersection of large generative models and safety.
🔍 Dive into our paper to explore:
Innovative methodologies for dataset curation and model safeguarding.
State-of-the-art safety assessments.
Practical implications for AI development and deployment.
Find more at AIML-TUDA/llavaguard-665b42e89803408ee8ec1086 and https://ml-research.github.io/human-centered-genai/projects/llavaguard/index.html

2 replies

radames

posted an update 10 months ago

Post

6528

Thanks to @OzzyGT for pushing the new Anyline preprocessor to https://github.com/huggingface/controlnet_aux. Now you can use the TheMistoAI/MistoLine ControlNet with Diffusers completely.

Here's a demo for you: radames/MistoLine-ControlNet-demo
Super resolution version: radames/Enhance-This-HiDiffusion-SDXL

from controlnet_aux import AnylineDetector

anyline = AnylineDetector.from_pretrained(
    "TheMistoAI/MistoLine", filename="MTEED.pth", subfolder="Anyline"
).to("cuda")

source = Image.open("source.png")
result = anyline(source, detect_resolution=1280)

radames

posted an update 10 months ago

Post

6963

At Google I/O 2024, we're collaborating with the Google Visual Blocks team (https://visualblocks.withgoogle.com) to release custom Hugging Face nodes. Visual Blocks for ML is a browser-based tool that allows users to create machine learning pipelines using a visual interface. We're launching nodes with Transformers.js, running models on the browser, as well as server-side nodes running Transformers pipeline tasks and LLMs using our hosted inference. With @Xenova @JasonMayes

You can learn more about it here https://huggingface.co/blog/radames/hugging-face-google-visual-blocks

Source-code for the custom nodes:
https://github.com/huggingface/visual-blocks-custom-components

radames

posted an update 10 months ago

Post

2083

AI-town now runs on Hugging Face Spaces with our API for LLMs and embeddings, including the open-source Convex backend, all in one container. Easy to duplicate and config on your own

Demo: radames/ai-town
Instructions: https://github.com/radames/ai-town-huggingface

9 replies

multimodalart

posted an update 10 months ago

Post

28181

The first open Stable Diffusion 3-like architecture model is JUST out 💣 - but it is not SD3! 🤔

It is Tencent-Hunyuan/HunyuanDiT by Tencent, a 1.5B parameter DiT (diffusion transformer) text-to-image model 🖼️✨, trained with multi-lingual CLIP + multi-lingual T5 text-encoders for english 🤝 chinese understanding

Try it out by yourself here ▶️ https://huggingface.co/spaces/multimodalart/HunyuanDiT
(a bit too slow as the model is chunky and the research code isn't super optimized for inference speed yet)

In the paper they claim to be SOTA open source based on human preference evaluation!

radames

posted an update 10 months ago

Post

2588

HiDiffusion SDXL now supports Image-to-Image, so I've created an "Enhance This" version using the latest ControlNet Line Art model called MistoLine. It's faster than DemoFusion

Demo: radames/Enhance-This-HiDiffusion-SDXL

Older version based on DemoFusion radames/Enhance-This-DemoFusion-SDXL

New Controlnet SDXL Controls Every Line TheMistoAI/MistoLine

HiDiffusion is compatible with diffusers and support many SD models - https://github.com/megvii-research/HiDiffusion

1 reply

radames

posted an update 11 months ago

Post

2493

I've built a custom component that integrates Rerun web viewer with Gradio, making it easier to share your demos as Gradio apps.

Basic snippet

# pip install gradio_rerun gradio
import gradio as gr
from gradio_rerun import Rerun

gr.Interface(
    inputs=gr.File(file_count="multiple", type="filepath"),
    outputs=Rerun(height=900),
    fn=lambda file_path: file_path,
).launch()

More details here radames/gradio_rerun
Source https://github.com/radames/gradio-rerun-viewer

Follow Rerun here https://huggingface.co/rerun

radames

posted an update 11 months ago

Post

2469

ByteDance released new distillation technique Hyper-SD ( ByteDance/Hyper-SD) for efficient image generation.
Here a few demos:

Official:
Hyper-SDXL-1Step-T2I ByteDance/Hyper-SDXL-1Step-T2I

Hyper-SD15-Scribble ByteDance/Hyper-SD15-Scribble

Unofficial Demos: InstantStyle + Hyper SD1.5 (not great but super fast) radames/InstantStyle-Hyper-SD

InstantStyle + Hyper SDXL radames/InstantStyle-Hyper-SDXL

radames

posted an update 11 months ago

Post

2204

InstantStyle works with the 2-step SDXL-Lightning distilled model, reducing generation time from ~20s to ~9s!

In a big related update, as of today, Diffusers@main supports InstantStyle. I'm looking forward to playing with it!

https://github.com/huggingface/diffusers/pull/7668

radames/InstantStyle-SDXL-Lightning
ByteDance/SDXL-Lightning

PSaiml

authored a paper 11 months ago

Introducing v0.5 of the AI Safety Benchmark from MLCommons

Paper • 2404.12241 • Published Apr 18, 2024 • 11

radames

posted an update 11 months ago

Post

3860

Here's a utility component for integrating your Gradio app with Hugging Face. This custom component enables you to search for models, spaces, datasets, and users.

pip install gradio_huggingfacehub_search

You can see it in action here. arcee-ai/mergekit-config-generator

And learn how to use it here radames/gradio_huggingfacehub_search

radames

posted an update 12 months ago

Post

2775

Following up on @vikhyatk 's Moondream2 update and @santiagomed 's implementation on Candle, I quickly put togheter the WASM module so that you could try running the ~1.5GB quantized model in the browser. Perhaps the next step is to rewrite it using https://github.com/huggingface/ratchet and run it even faster with WebGPU, @FL33TW00D-HF .

radames/Candle-Moondream-2

ps: I have a collection of all Candle WASM demos here radames/candle-wasm-examples-650898dee13ff96230ce3e1f

AI & ML interests

Recent Activity

Team members 7

leditsplusplus's activity

LEDITS++ Project