Apolinário from multimodal AI art

multimodalart

https://multimodal.art

Articles

LoRA training scripts of the world, unite!

SDXL in 4 steps with Latent Consistency LoRAs

Running IF with 🧨 diffusers on a Free Tier Google Colab

Train your ControlNet with diffusers

multimodalart's activity

upvoted a paper about 10 hours ago

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Paper • 2405.08748 • Published 2 days ago • 13

upvoted a collection 6 days ago

Perturbed Attention Guidance pipelines

Pipelines for Perturbed Attention Guidance with 🧨 library • 6 items • Updated 6 days ago • 2

upvoted a paper 20 days ago

From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation

Paper • 2404.15267 • Published 23 days ago • 4

upvoted a collection 22 days ago

OpenELM Instruct Models

4 items • Updated Apr 12 • 96

upvoted 2 papers 22 days ago

OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework

Paper • 2404.14619 • Published 24 days ago • 120

HiDiffusion: Unlocking High-Resolution Creativity and Efficiency in Low-Resolution Trained Diffusion Models

Paper • 2311.17528 • Published Nov 29, 2023 • 4

upvoted a collection 22 days ago

Leaderboards and benchmarks ✨

Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 61 items • Updated 2 days ago • 59

upvoted an article 23 days ago

LoRA training scripts of the world, unite!

Article • Jan 2 • 13

upvoted a collection 23 days ago

Phi-3

Phi-3 family of models • 6 items • Updated 3 days ago • 196

upvoted 2 papers 23 days ago

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published 24 days ago • 230

Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis

Paper • 2404.13686 • Published 25 days ago • 25

upvoted 2 papers 29 days ago

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Paper • 2404.10667 • Published about 1 month ago • 12

HairFastGAN: Realistic and Robust Hair Transfer with a Fast Encoder-Based Approach

Paper • 2404.01094 • Published Apr 1 • 3

upvoted a paper about 1 month ago

Natural language guidance of high-fidelity text-to-speech with synthetic annotations

Paper • 2402.01912 • Published Feb 2 • 7

upvoted a collection about 1 month ago

HF-curated models available on Workers AI

A collection of models curated with Hugging Face that can be run on Cloudflare's Workers AI serverless inference platform. • 15 items • Updated Apr 2 • 48

upvoted 2 collections about 2 months ago

🎭 Avatars

The latest AI-powered technologies usher in a new era of realistic avatars! 🚀 • 33 items • Updated 2 days ago • 49

VILA: On Pre-training for Visual Language Models

10 items • Updated 10 days ago • 26

upvoted 2 papers about 2 months ago

SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion

Paper • 2403.12008 • Published Mar 18 • 18

Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation

Paper • 2403.12015 • Published Mar 18 • 60

upvoted 5 papers 2 months ago

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Paper • 2403.05135 • Published Mar 8 • 39

StableDrag: Stable Dragging for Point-based Image Editing

Paper • 2403.04437 • Published Mar 7 • 23

Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference

Paper • 2403.04132 • Published Mar 7 • 38

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Paper • 2403.04692 • Published Mar 7 • 35

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5 • 40

upvoted 2 papers 3 months ago

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29 • 123

Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Paper • 2402.19479 • Published Feb 29 • 30

upvoted 2 collections 3 months ago

Playground v2.5

2 items • Updated Feb 27 • 20

Gemma release

Groups the Gemma models released by the Google team. • 40 items • Updated 2 days ago • 302

upvoted a paper 3 months ago

Neural Network Diffusion

Paper • 2402.13144 • Published Feb 20 • 92

upvoted a collection 3 months ago

Text-to-Image Base Models

All open-source text-to-image base models, with their respective licenses • 28 items • Updated 6 days ago • 17

upvoted 4 papers 3 months ago

MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models

Paper • 2402.06178 • Published Feb 9 • 12

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5 • 61

Training-Free Consistent Text-to-Image Generation

Paper • 2402.03286 • Published Feb 5 • 62

AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning

Paper • 2402.00769 • Published Feb 1 • 17

upvoted 2 collections 4 months ago

OLMo Suite

Artifacts for the first set of OLMo models. • 12 items • Updated about 23 hours ago • 35

AIM

AIM: Autoregressive Image Models • 5 items • Updated Jan 29 • 43

upvoted 2 papers 4 months ago

PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models

Paper • 2401.05252 • Published Jan 10 • 43

aMUSEd: An Open MUSE Reproduction

Paper • 2401.01808 • Published Jan 3 • 26

upvoted 2 papers 5 months ago

DreamTuner: Single Image is Enough for Subject-Driven Generation

Paper • 2312.13691 • Published Dec 21, 2023 • 23

HAAR: Text-Conditioned Generative Model of 3D Strand-based Human Hairstyles

Paper • 2312.11666 • Published Dec 18, 2023 • 12

upvoted a collection 5 months ago

Playground v2

Collection of Playground v2 models • 4 items • Updated Dec 6, 2023 • 5

upvoted a collection 6 months ago

Seamless Communication

A significant step towards removing language barriers through expressive, fast and high-quality AI translation. • 16 items • Updated Jan 16 • 123

upvoted 6 papers 6 months ago

LEDITS++: Limitless Image Editing using Text-to-Image Models

Paper • 2311.16711 • Published Nov 28, 2023 • 14

The Chosen One: Consistent Characters in Text-to-Image Diffusion Models

Paper • 2311.10093 • Published Nov 16, 2023 • 54

Tied-Lora: Enhacing parameter efficiency of LoRA with weight tying

Paper • 2311.09578 • Published Nov 16, 2023 • 11

UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs

Paper • 2311.09257 • Published Nov 14, 2023 • 43

Music ControlNet: Multiple Time-varying Controls for Music Generation

Paper • 2311.07069 • Published Nov 13, 2023 • 43

LCM-LoRA: A Universal Stable-Diffusion Acceleration Module

Paper • 2311.05556 • Published Nov 9, 2023 • 73

upvoted 5 collections 6 months ago

Diffusion guided image and video editing

Spaces for image and video editing using diffusion-based techniques • 4 items • Updated Oct 25, 2023 • 2

Latent Consistency Models Weights

Full fine-tune weights for Latent Consistency Models • 3 items • Updated Nov 9, 2023 • 9

Latent Consistency Models LoRAs

Latent Consistency Models for Stable Diffusion - LoRAs and full fine-tuned weights • 4 items • Updated Nov 10, 2023 • 94

MusicGen Stereo

A collection of stereo music generation models as part of the v2 MusicGen release. • 4 items • Updated 22 days ago • 7

Diffusion model Spaces

253 items • Updated about 6 hours ago • 21

upvoted 3 papers 7 months ago

FlashDecoding++: Faster Large Language Model Inference on GPUs

Paper • 2311.01282 • Published Nov 2, 2023 • 30

De-Diffusion Makes Text a Strong Cross-Modal Interface

Paper • 2311.00618 • Published Nov 1, 2023 • 21

Controllable Music Production with Diffusion Models and Guidance Gradients

Paper • 2311.00613 • Published Nov 1, 2023 • 23

upvoted a collection 7 months ago

Image Generation and Manipulation Tools

12 items • Updated Oct 27, 2023 • 1

upvoted 3 papers 7 months ago

CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images

Paper • 2310.16825 • Published Oct 25, 2023 • 24

Matryoshka Diffusion Models

Paper • 2310.15111 • Published Oct 23, 2023 • 39

DPM-Solver-v3: Improved Diffusion ODE Solver with Empirical Model Statistics

Paper • 2310.13268 • Published Oct 20, 2023 • 15