view article Article Docmatix - a huge dataset for Document Visual Question Answering 5 days ago • 43
Scaling Synthetic Data Creation with 1,000,000,000 Personas Paper • 2406.20094 • Published 24 days ago • 84
Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images Paper • 2407.06191 • Published 14 days ago • 9
RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models Paper • 2407.06938 • Published 13 days ago • 20
Adapting LLMs to Hebrew: Unveiling DictaLM 2.0 with Enhanced Vocabulary and Instruction Capabilities Paper • 2407.07080 • Published 13 days ago • 20
Towards Building Specialized Generalist AI with System 1 and System 2 Fusion Paper • 2407.08642 • Published 11 days ago • 9
MambaVision: A Hybrid Mamba-Transformer Vision Backbone Paper • 2407.08083 • Published 12 days ago • 19
StyleSplat: 3D Object Style Transfer with Gaussian Splatting Paper • 2407.09473 • Published 10 days ago • 10
GAVEL: Generating Games Via Evolution and Language Models Paper • 2407.09388 • Published 10 days ago • 12
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models Paper • 2407.09025 • Published 10 days ago • 106
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models Paper • 2407.12772 • Published 5 days ago • 26
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models Paper • 2407.12327 • Published 5 days ago • 62
NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? Paper • 2407.11963 • Published 6 days ago • 37
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control Paper • 2407.12781 • Published 5 days ago • 10
SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation Paper • 2406.19215 • Published 25 days ago • 28
Agentless: Demystifying LLM-based Software Engineering Agents Paper • 2407.01489 • Published 21 days ago • 41
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems Paper • 2407.01370 • Published 21 days ago • 81
AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented Generation Paper • 2406.19251 • Published 25 days ago • 8
view article Article BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks Jun 18 • 32
view article Article XLSCOUT Unveils ParaEmbed 2.0: a Powerful Embedding Model Tailored for Patents and IP with Expert Support from Hugging Face 28 days ago • 8
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs Paper • 2406.15319 • Published about 1 month ago • 57
nabla^2DFT: A Universal Quantum Chemistry Dataset of Drug-Like Molecules and a Benchmark for Neural Network Potentials Paper • 2406.14347 • Published Jun 20 • 99
MotionLLM: Understanding Human Behaviors from Human Motions and Videos Paper • 2405.20340 • Published May 30 • 19
SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization Paper • 2405.11582 • Published May 19 • 12
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache Paper • 2401.02669 • Published Jan 5 • 14
DRAGON Models Collection Production-grade RAG-optimized 6-7B parameter models - "Delivering RAG on ..." the leading foundation base models • 11 items • Updated Feb 3 • 42
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models Paper • 2402.19427 • Published Feb 29 • 50
Resonance RoPE: Improving Context Length Generalization of Large Language Models Paper • 2403.00071 • Published Feb 29 • 19
DeepSeek-VL: Towards Real-World Vision-Language Understanding Paper • 2403.05525 • Published Mar 8 • 39
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context Paper • 2403.05530 • Published Mar 8 • 52
VideoMamba: State Space Model for Efficient Video Understanding Paper • 2403.06977 • Published Mar 11 • 26
Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM Paper • 2403.07487 • Published Mar 12 • 12
MoAI: Mixture of All Intelligence for Large Language and Vision Models Paper • 2403.07508 • Published Mar 12 • 73
Gemma: Open Models Based on Gemini Research and Technology Paper • 2403.08295 • Published Mar 13 • 45
EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba Paper • 2403.09977 • Published Mar 15 • 8
PERL: Parameter Efficient Reinforcement Learning from Human Feedback Paper • 2403.10704 • Published Mar 15 • 56
Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction Paper • 2403.18795 • Published Mar 27 • 17
Garment3DGen: 3D Garment Stylization and Texture Generation Paper • 2403.18816 • Published Mar 27 • 19
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text Paper • 2403.18421 • Published Mar 27 • 21
Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs Paper • 2403.20041 • Published Mar 29 • 34
Gecko: Versatile Text Embeddings Distilled from Large Language Models Paper • 2403.20327 • Published Mar 29 • 47
FlexiDreamer: Single Image-to-3D Generation with FlexiCubes Paper • 2404.00987 • Published Apr 1 • 21
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models Paper • 2404.02258 • Published Apr 2 • 102
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation Paper • 2404.05674 • Published Apr 8 • 12
RULER: What's the Real Context Size of Your Long-Context Language Models? Paper • 2404.06654 • Published Apr 9 • 32
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models Paper • 2404.07839 • Published Apr 11 • 40
ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback Paper • 2404.07987 • Published Apr 11 • 46