Xiangtai Li

LXT

https://lxtgh.github.io/

AI & ML interests

Computer Vision, Multi-Modal Understanding, Generative AI

Recent Activity

updated a dataset 3 days ago

General-Level/General-Bench-Openset

updated a dataset 3 days ago

General-Level/General-Bench-Closeset

published a model 4 days ago

General-Level/General-Bench-Openset

LXT's activity

commented 4 papers 7 days ago

The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer

Paper • 2504.10462 • Published 9 days ago • 14

Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding

Paper • 2504.10465 • Published 9 days ago • 28

New activity in ByteDance/Sa2VA-4B 3 months ago

Dependency conflicts

#4 opened 3 months ago by tbomez

commented 4 papers 4 months ago

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Paper • 2501.04001 • Published Jan 7 • 46

DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Paper • 2412.07589 • Published Dec 10, 2024 • 49

EMOv2: Pushing 5M Vision Model Frontier

Paper • 2412.06674 • Published Dec 9, 2024 • 13

commented a paper 6 months ago

Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis

Paper • 2410.08261 • Published Oct 10, 2024 • 52

commented 5 papers 10 months ago

OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding

Paper • 2406.19389 • Published Jun 27, 2024 • 55

Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language

Paper • 2406.20085 • Published Jun 28, 2024 • 13

New activity in Dense-World/OMG-LLaVA 10 months ago

Upload omg_llava_7b_xxl_pretrain_1024image_8gpus.pth

#1 opened 10 months ago by LXT

New activity in LXT/OMG_Seg over 1 year ago

Apply for community grant: Academic project (gpu)

#2 opened over 1 year ago by LXT

Update main.py

#4 opened over 1 year ago by HarborYuan

add spaces lib

#3 opened over 1 year ago by HarborYuan

Apply for community grant: Personal project (gpu)

#1 opened over 1 year ago by LXT