Barry Li

Brilliant-B

AI & ML interests

None yet

Organizations

None yet

Brilliant-B's activity

upvoted 3 papers 4 days ago

AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence

Paper • 2502.13943 • Published 5 days ago • 7

Thinking Preference Optimization

Paper • 2502.13173 • Published 7 days ago • 14

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 5 days ago • 137

liked a model 4 days ago

Qwen/Qwen2-VL-7B-Instruct

Image-Text-to-Text • Updated 18 days ago • 1.49M • 1.13k

liked a dataset 5 days ago

Xiaodong/open-r1-video-4k

Viewer • Updated 6 days ago • 4.66k • 91 • 3

upvoted a paper 9 days ago

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

Paper • 2501.13106 • Published Jan 22 • 83

upvoted 2 papers 11 days ago

Next Block Prediction: Video Generation via Semi-Autoregressive Modeling

Paper • 2502.07737 • Published 13 days ago • 9

Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving

Paper • 2502.07640 • Published 13 days ago • 8

upvoted a paper 12 days ago

Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step

Paper • 2501.13926 • Published Jan 23 • 37

upvoted 5 papers about 1 month ago

Temporal Preference Optimization for Long-Form Video Understanding

Paper • 2501.13919 • Published Jan 23 • 22

Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding

Paper • 2501.07888 • Published Jan 14 • 15

VideoRAG: Retrieval-Augmented Generation over Video Corpus

Paper • 2501.05874 • Published Jan 10 • 67

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Paper • 2501.06186 • Published Jan 10 • 61

Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

Paper • 2501.08326 • Published Jan 14 • 32

upvoted 6 papers about 2 months ago

VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

Paper • 2412.21059 • Published Dec 30, 2024 • 18

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Paper • 2412.18619 • Published Dec 16, 2024 • 55

Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization

Paper • 2412.18525 • Published Dec 24, 2024 • 75

OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System

Paper • 2412.20005 • Published Dec 28, 2024 • 18

Slow Perception: Let's Perceive Geometric Figures Step-by-step

Paper • 2412.20631 • Published Dec 30, 2024 • 15

Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models

Paper • 2412.18609 • Published Dec 24, 2024 • 17