Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models Paper • 2407.12327 • Published 3 days ago • 58
E5-V: Universal Embeddings with Multimodal Large Language Models Paper • 2407.12580 • Published 3 days ago • 31
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases Paper • 2407.12784 • Published 3 days ago • 42
WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens Paper • 2401.09985 • Published Jan 18 • 14
Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion Paper • 2407.10973 • Published 5 days ago • 8
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated Paper • 2407.10969 • Published 5 days ago • 16
The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism Paper • 2407.10457 • Published 5 days ago • 19
Learning to Refuse: Towards Mitigating Privacy Risks in LLMs Paper • 2407.10058 • Published 6 days ago • 28
GAVEL: Generating Games Via Evolution and Language Models Paper • 2407.09388 • Published 8 days ago • 12
TCAN: Animating Human Images with Temporally Consistent Pose Guidance using Diffusion Models Paper • 2407.09012 • Published 8 days ago • 8
StyleSplat: 3D Object Style Transfer with Gaussian Splatting Paper • 2407.09473 • Published 8 days ago • 10
Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis Paper • 2407.09732 • Published 7 days ago • 7
SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers Paper • 2407.09413 • Published 8 days ago • 9
MUSCLE: A Model Update Strategy for Compatible LLM Evolution Paper • 2407.09435 • Published 8 days ago • 17
Toto: Time Series Optimized Transformer for Observability Paper • 2407.07874 • Published 10 days ago • 27
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models Paper • 2407.09025 • Published 8 days ago • 102
TokenPacker: Efficient Visual Projector for Multimodal LLM Paper • 2407.02392 • Published 18 days ago • 20
No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models Paper • 2407.02687 • Published 18 days ago • 20
TabReD: A Benchmark of Tabular Machine Learning in-the-Wild Paper • 2406.19380 • Published 23 days ago • 46
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output Paper • 2407.03320 • Published 17 days ago • 87
Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages Paper • 2407.03321 • Published 17 days ago • 14
Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models Paper • 2407.01906 • Published 18 days ago • 33
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion Paper • 2407.01392 • Published 19 days ago • 39
The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective Paper • 2407.08583 • Published 9 days ago • 10
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception Paper • 2407.08303 • Published 9 days ago • 17
Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Knowledge Paper • 2407.03958 • Published 16 days ago • 15
Granular Privacy Control for Geolocation with Vision Language Models Paper • 2407.04952 • Published 14 days ago • 3
Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams Paper • 2406.08085 • Published Jun 12 • 11
CRiM-GS: Continuous Rigid Motion-Aware Gaussian Splatting from Motion Blur Images Paper • 2407.03923 • Published 16 days ago • 7
HEMM: Holistic Evaluation of Multimodal Foundation Models Paper • 2407.03418 • Published 17 days ago • 8
On scalable oversight with weak LLMs judging strong LLMs Paper • 2407.04622 • Published 15 days ago • 11
DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning Paper • 2407.04078 • Published 16 days ago • 14
LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs Paper • 2407.03963 • Published 16 days ago • 13
RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models Paper • 2407.05131 • Published 14 days ago • 19
Learning to (Learn at Test Time): RNNs with Expressive Hidden States Paper • 2407.04620 • Published 15 days ago • 22
ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild Paper • 2407.04172 • Published 16 days ago • 19
Evaluating Language Model Context Windows: A "Working Memory" Test and Inference-time Correction Paper • 2407.03651 • Published 16 days ago • 14
InverseCoder: Unleashing the Power of Instruction-Tuned Code LLMs with Inverse-Instruct Paper • 2407.05700 • Published 12 days ago • 8
PAS: Data-Efficient Plug-and-Play Prompt Augmentation System Paper • 2407.06027 • Published 12 days ago • 8
How do you know that? Teaching Generative Language Models to Reference Answers to Biomedical Questions Paper • 2407.05015 • Published 14 days ago • 4
LETS-C: Leveraging Language Embedding for Time Series Classification Paper • 2407.06533 • Published 11 days ago • 2
VIMI: Grounding Video Generation through Multi-modal Instruction Paper • 2407.06304 • Published 12 days ago • 8
From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty Paper • 2407.06071 • Published 12 days ago • 7
Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps Paper • 2407.07071 • Published 11 days ago • 10
Knowledge Composition using Task Vectors with Learned Anisotropic Scaling Paper • 2407.02880 • Published 17 days ago • 9
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts Paper • 2407.03203 • Published 17 days ago • 9
Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions Paper • 2407.06723 • Published 11 days ago • 9
MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions Paper • 2407.06358 • Published 12 days ago • 14
Adapting LLMs to Hebrew: Unveiling DictaLM 2.0 with Enhanced Vocabulary and Instruction Capabilities Paper • 2407.07080 • Published 11 days ago • 20
AgentInstruct: Toward Generative Teaching with Agentic Flows Paper • 2407.03502 • Published 17 days ago • 34
RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models Paper • 2407.06938 • Published 11 days ago • 20