Hexiang Hu

hexianghu

https://www.hexianghu.com/

AI & ML interests

Multimodal learning: Vision, Language, etc.

hexianghu's activity

authored a paper 14 days ago

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Paper • 2501.09732 • Published 14 days ago • 66

New activity in Spawning/PD12M about 2 months ago

Is it possible to get metadata of the images?

#4 opened about 2 months ago by hexianghu

liked a dataset 3 months ago

Spawning/PD12M

Viewer • Updated 21 days ago • 12.4M • 1.22k • 150

upvoted a paper 4 months ago

MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks

Paper • 2410.10563 • Published Oct 14, 2024 • 38

authored a paper 4 months ago

MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks

Paper • 2410.10563 • Published Oct 14, 2024 • 38

liked a model 6 months ago

black-forest-labs/FLUX.1-dev

Text-to-Image • Updated Aug 16, 2024 • 1.54M • 8.37k

authored 3 papers 6 months ago

upvoted a paper 6 months ago

Imagen 3

Paper • 2408.07009 • Published Aug 13, 2024 • 61

authored 5 papers 10 months ago

MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions

Paper • 2403.19651 • Published Mar 28, 2024 • 22

Re-Imagen: Retrieval-Augmented Text-to-Image Generator

Paper • 2209.14491 • Published Sep 29, 2022

UniIR: Training and Benchmarking Universal Multimodal Information Retrievers

Paper • 2311.17136 • Published Nov 28, 2023 • 7

From Pixels to UI Actions: Learning to Follow Instructions via Graphical User Interfaces

Paper • 2306.00245 • Published May 31, 2023

PreSTU: Pre-Training for Scene-Text Understanding

Paper • 2209.05534 • Published Sep 12, 2022

upvoted a paper 10 months ago

MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions

Paper • 2403.19651 • Published Mar 28, 2024 • 22

upvoted a paper 12 months ago

Instruct-Imagen: Image Generation with Multi-modal Instruction

Paper • 2401.01952 • Published Jan 3, 2024 • 31

authored a paper about 1 year ago

Instruct-Imagen: Image Generation with Multi-modal Instruction

Paper • 2401.01952 • Published Jan 3, 2024 • 31

liked a dataset about 1 year ago

tarungupta83/MidJourney_v5_Prompt_dataset

Viewer • Updated May 21, 2023 • 4.27M • 100 • 30

authored a paper about 1 year ago

Gemini: A Family of Highly Capable Multimodal Models

Paper • 2312.11805 • Published Dec 19, 2023 • 45