Judge Anything: MLLM as a Judge Across Any Modality Paper • 2503.17489 • Published 17 days ago • 19
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey Paper • 2503.12605 • Published 22 days ago • 32
Personalize Anything for Free with Diffusion Transformer Paper • 2503.12590 • Published 22 days ago • 43
Tree of Thoughts: Deliberate Problem Solving with Large Language Models Paper • 2305.10601 • Published May 17, 2023 • 12
Promptor: A Conversational and Autonomous Prompt Generation Agent for Intelligent Text Entry Techniques Paper • 2310.08101 • Published Oct 12, 2023 • 2
Ilya Sutskever's Top 30 Papers Collection • List of 30 articles/papers to help gain insight into the AI space • 1 item • Updated Sep 19, 2024 • 1
Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines Paper • 2410.21220 • Published Oct 28, 2024 • 10
LongReward: Improving Long-context Large Language Models with AI Feedback Paper • 2410.21252 • Published Oct 28, 2024 • 18
Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction Paper • 2410.21169 • Published Oct 28, 2024 • 30
Improve Vision Language Model Chain-of-thought Reasoning Paper • 2410.16198 • Published Oct 21, 2024 • 26
How to Design Translation Prompts for ChatGPT: An Empirical Study Paper • 2304.02182 • Published Apr 5, 2023 • 1