LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report • Paper • 2405.00732 • Published Apr 29 • 115
Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models • Paper • 2404.07973 • Published Apr 11 • 28
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions • Paper • 2403.19651 • Published Mar 28 • 22
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error • Paper • 2403.04746 • Published Mar 7 • 21
FlashTex: Fast Relightable Mesh Texturing with LightControlNet • Paper • 2402.13251 • Published Feb 20 • 13
FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models • Paper • 2402.10986 • Published Feb 16 • 73
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models • Paper • 2401.15947 • Published Jan 29 • 46
MM-LLMs: Recent Advances in MultiModal Large Language Models • Paper • 2401.13601 • Published Jan 24 • 41
Secrets of RLHF in Large Language Models Part II: Reward Modeling • Paper • 2401.06080 • Published Jan 11 • 23
LLM Augmented LLMs: Expanding Capabilities through Composition • Paper • 2401.02412 • Published Jan 4 • 35
GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation • Paper • 2401.04092 • Published Jan 8 • 18
DocGraphLM: Documental Graph Language Model for Information Extraction • Paper • 2401.02823 • Published Jan 5 • 32
Understanding LLMs: A Comprehensive Overview from Training to Inference • Paper • 2401.02038 • Published Jan 4 • 59
Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models • Paper • 2401.00788 • Published Jan 1 • 21
COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training • Paper • 2401.00849 • Published Jan 1 • 14
Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models • Paper • 2312.17661 • Published Dec 29, 2023 • 10
Adaptive Guidance: Training-free Acceleration of Conditional Diffusion Models • Paper • 2312.12487 • Published Dec 19, 2023 • 6
Zero-Shot Metric Depth with a Field-of-View Conditioned Diffusion Model • Paper • 2312.13252 • Published Dec 20, 2023 • 25
SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing • Paper • 2312.11392 • Published Dec 18, 2023 • 18
VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation • Paper • 2312.09251 • Published Dec 14, 2023 • 6