- UI Layout Generation with LLMs Guided by UI Grammar
  Paper • 2310.15455 • Published • 2
- You Only Look at Screens: Multimodal Chain-of-Action Agents
  Paper • 2309.11436 • Published • 1
- Never-ending Learning of User Interfaces
  Paper • 2308.08726 • Published • 1
- LMDX: Language Model-based Document Information Extraction and Localization
  Paper • 2309.10952 • Published • 61
Collections including paper arxiv:2401.00908
- Woodpecker: Hallucination Correction for Multimodal Large Language Models
  Paper • 2310.16045 • Published • 13
- HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
  Paper • 2310.14566 • Published • 23
- SILC: Improving Vision Language Pretraining with Self-Distillation
  Paper • 2310.13355 • Published • 5
- Conditional Diffusion Distillation
  Paper • 2310.01407 • Published • 19

- LMDX: Language Model-based Document Information Extraction and Localization
  Paper • 2309.10952 • Published • 61
- Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration
  Paper • 2309.01131 • Published • 1
- On the Hidden Mystery of OCR in Large Multimodal Models
  Paper • 2305.07895 • Published
- DocLLM: A layout-aware generative language model for multimodal document understanding
  Paper • 2401.00908 • Published • 174

- LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
  Paper • 2309.12307 • Published • 82
- LMDX: Language Model-based Document Information Extraction and Localization
  Paper • 2309.10952 • Published • 61
- Table-GPT: Table-tuned GPT for Diverse Table Tasks
  Paper • 2310.09263 • Published • 36
- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 94

- Clinical Text Summarization: Adapting Large Language Models Can Outperform Human Experts
  Paper • 2309.07430 • Published • 25
- MindAgent: Emergent Gaming Interaction
  Paper • 2309.09971 • Published • 11
- Cure the headache of Transformers via Collinear Constrained Attention
  Paper • 2309.08646 • Published • 12
- Contrastive Decoding Improves Reasoning in Large Language Models
  Paper • 2309.09117 • Published • 37