VideoGameBunny: Towards vision assistants for video games Paper • 2407.15295 • Published 5 days ago • 16
Internal Consistency and Self-Feedback in Large Language Models: A Survey Paper • 2407.14507 • Published 7 days ago • 39
Foundational Autoraters: Taming Large Language Models for Better Automatic Evaluation Paper • 2407.10817 • Published 11 days ago • 11
The Vision of Autonomic Computing: Can LLMs Make It a Reality? Paper • 2407.14402 • Published 7 days ago • 10
Benchmarking Trustworthiness of Multimodal Large Language Models: A Comprehensive Study Paper • 2406.07057 • Published Jun 11 • 10
From Local to Global: A Graph RAG Approach to Query-Focused Summarization Paper • 2404.16130 • Published Apr 24 • 3
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs Paper • 2407.04051 • Published 22 days ago • 33
PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation Paper • 2407.02869 • Published 24 days ago • 16
Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages Paper • 2407.03321 • Published 23 days ago • 14
Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models Paper • 2407.01906 • Published 25 days ago • 33
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion Paper • 2407.01392 • Published 25 days ago • 39
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? Paper • 2407.01284 • Published 25 days ago • 72
MotionBooth: Motion-Aware Customized Text-to-Video Generation Paper • 2406.17758 • Published Jun 25 • 18
APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets Paper • 2406.18518 • Published about 1 month ago • 22
Repulsive Score Distillation for Diverse Sampling of Diffusion Models Paper • 2406.16683 • Published Jun 24 • 4
OlympicArena Medal Ranks: Who Is the Most Intelligent AI So Far? Paper • 2406.16772 • Published Jun 24 • 2
Found in the Middle: Calibrating Positional Attention Bias Improves Long Context Utilization Paper • 2406.16008 • Published Jun 23 • 6
ClotheDreamer: Text-Guided Garment Generation with 3D Gaussians Paper • 2406.16815 • Published Jun 24 • 7
How Many Parameters Does it Take to Change a Light Bulb? Evaluating Performance in Self-Play of Conversational Games as a Function of Model Characteristics Paper • 2406.14051 • Published Jun 20 • 9
AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models Paper • 2406.16714 • Published Jun 24 • 10
Preference Tuning For Toxicity Mitigation Generalizes Across Languages Paper • 2406.16235 • Published Jun 23 • 12
Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs Paper • 2406.15927 • Published Jun 22 • 13
Beyond the Turn-Based Game: Enabling Real-Time Conversations with Duplex Models Paper • 2406.15718 • Published Jun 22 • 14
Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers Paper • 2406.16747 • Published Jun 24 • 16
Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters Paper • 2406.16758 • Published Jun 24 • 18
WARP: On the Benefits of Weight Averaged Rewarded Policies Paper • 2406.16768 • Published Jun 24 • 21
Efficient Continual Pre-training by Mitigating the Stability Gap Paper • 2406.14833 • Published Jun 21 • 19
VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models Paper • 2406.16338 • Published Jun 24 • 23
Evaluating D-MERIT of Partial-annotation on Information Retrieval Paper • 2406.16048 • Published Jun 23 • 34
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs Paper • 2406.16860 • Published Jun 24 • 53
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions Paper • 2406.15877 • Published Jun 22 • 43
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation Paper • 2406.16855 • Published Jun 24 • 53
Reward Steering with Evolutionary Heuristics for Decoding-time Alignment Paper • 2406.15193 • Published Jun 21 • 12
Evaluating RAG-Fusion with RAGElo: an Automated Elo-based Framework Paper • 2406.14783 • Published Jun 20 • 15
EvTexture: Event-driven Texture Enhancement for Video Super-Resolution Paper • 2406.13457 • Published Jun 19 • 15
Complexity of Symbolic Representation in Working Memory of Transformer Correlates with the Complexity of a Task Paper • 2406.14213 • Published Jun 20 • 20
Stylebreeder: Exploring and Democratizing Artistic Styles through Text-to-Image Models Paper • 2406.14599 • Published Jun 20 • 16
Towards Retrieval Augmented Generation over Large Video Libraries Paper • 2406.14938 • Published Jun 21 • 18
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs Paper • 2406.15319 • Published Jun 21 • 57
Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges Paper • 2406.12624 • Published Jun 18 • 35
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models Paper • 2206.04615 • Published Jun 9, 2022 • 5
Can Large Language Models Be an Alternative to Human Evaluations? Paper • 2305.01937 • Published May 3, 2023 • 2
VoCo-LLaMA: Towards Vision Compression with Large Language Models Paper • 2406.12275 • Published Jun 18 • 29
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence Paper • 2406.11931 • Published Jun 17 • 54
TroL: Traversal of Layers for Large Language and Vision Models Paper • 2406.12246 • Published Jun 18 • 34
GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities Paper • 2406.11768 • Published Jun 17 • 20
Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models Paper • 2406.09416 • Published Jun 13 • 28
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling Paper • 2406.07522 • Published Jun 11 • 35