vincentyang

vincent88

AI & ML interests

None yet

Recent Activity

liked a model 28 days ago

starvector/starvector-8b-im2svg

Organizations

None yet

vincent88's activity

upvoted a collection 7 days ago

RLVR

Collection

Model and data for 'Expanding RL with Verifiable Rewards Across Diverse Domains' • 3 items • Updated 18 days ago • 11

upvoted a paper 7 days ago

Inference-Time Scaling for Generalist Reward Modeling

Paper • 2504.02495 • Published 15 days ago • 52

upvoted a paper 5 months ago

Video-Guided Foley Sound Generation with Multimodal Controls

Paper • 2411.17698 • Published Nov 26, 2024 • 10

upvoted a paper 6 months ago

Baichuan-Omni Technical Report

Paper • 2410.08565 • Published Oct 11, 2024 • 89

upvoted 2 papers 7 months ago

DITTO: Diffusion Inference-Time T-Optimization for Music Generation

Paper • 2401.12179 • Published Jan 22, 2024 • 22

Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency

Paper • 2409.02634 • Published Sep 4, 2024 • 98

upvoted a paper 8 months ago

Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

Paper • 2408.15998 • Published Aug 28, 2024 • 88

upvoted a paper 9 months ago

VILA^2: VILA Augmented VILA

Paper • 2407.17453 • Published Jul 24, 2024 • 42

upvoted a paper 11 months ago

An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27, 2024 • 90

upvoted a paper 12 months ago

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

Paper • 2404.16821 • Published Apr 25, 2024 • 60

upvoted a collection about 1 year ago

GPT-4 generated datasets

Collection

Collection of some GPT-4 generated datasets. It may be useful for those looking for the best-quality datasets to train competitive LLMs. • 18 items • Updated Apr 16, 2024 • 10

upvoted 2 papers about 1 year ago

ReFT: Representation Finetuning for Language Models

Paper • 2404.03592 • Published Apr 4, 2024 • 98

VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis

Paper • 2403.08764 • Published Mar 13, 2024 • 37

upvoted a collection over 1 year ago

Awesome SDXL LoRAs

Collection

A curated set of amazing Stable Diffusion XL LoRAs (they power the LoRA the Explorer Space) • 36 items • Updated Jun 24, 2024 • 20

upvoted 5 papers over 1 year ago

LLaMA Beyond English: An Empirical Study on Language Capability Transfer

Paper • 2401.01055 • Published Jan 2, 2024 • 56

MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance

Paper • 2312.11396 • Published Dec 18, 2023 • 11

FreeInit: Bridging Initialization Gap in Video Diffusion Models

Paper • 2312.07537 • Published Dec 12, 2023 • 27

Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models

Paper • 2312.06109 • Published Dec 11, 2023 • 21

Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models

Paper • 2312.02969 • Published Dec 5, 2023 • 15