AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents Paper • 2407.04363 • Published 4 days ago • 21 • 2
DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning Paper • 2407.04078 • Published 5 days ago • 10 • 3
ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild Paper • 2407.04172 • Published 5 days ago • 13 • 5
TabReD: A Benchmark of Tabular Machine Learning in-the-Wild Paper • 2406.19380 • Published 12 days ago • 44 • 6
FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models Paper • 2406.16863 • Published 15 days ago • 10 • 4
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output Paper • 2407.03320 • Published 6 days ago • 84 • 5
To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models Paper • 2407.01920 • Published 7 days ago • 13 • 4
What Matters in Detecting AI-Generated Videos like Sora? Paper • 2406.19568 • Published 12 days ago • 12 • 4
OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation Paper • 2407.02371 • Published 7 days ago • 43 • 6
DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models Paper • 2407.01519 • Published 8 days ago • 22 • 4
RegMix: Data Mixture as Regression for Language Model Pre-training Paper • 2407.01492 • Published 8 days ago • 24 • 5
EHRCon: Dataset for Checking Consistency between Unstructured Notes and Structured Tables in Electronic Health Records Paper • 2406.16341 • Published 15 days ago • 11 • 7
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? Paper • 2407.01284 • Published 8 days ago • 69 • 6
Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs Paper • 2406.20086 • Published 11 days ago • 3 • 4
MIRAI: Evaluating LLM Agents for Event Forecasting Paper • 2407.01231 • Published 8 days ago • 14 • 3
OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents Paper • 2407.00114 • Published 12 days ago • 12 • 5
InstantStyle-Plus: Style Transfer with Content-Preserving in Text-to-Image Generation Paper • 2407.00788 • Published 9 days ago • 20 • 4
Accurate Prediction of Ligand-Protein Interaction Affinities with Fine-Tuned Small Language Models Paper • 2407.00111 • Published 12 days ago • 4 • 2
MatchTime: Towards Automatic Soccer Game Commentary Generation Paper • 2406.18530 • Published 13 days ago • 11 • 4
ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs Paper • 2406.18120 • Published 13 days ago • 5 • 5
OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding Paper • 2406.19389 • Published 12 days ago • 51 • 6
LiveBench: A Challenging, Contamination-Free LLM Benchmark Paper • 2406.19314 • Published 12 days ago • 12 • 2
Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding Paper • 2406.19263 • Published 12 days ago • 9 • 2
ResumeAtlas: Revisiting Resume Classification with Large-Scale Datasets and Large Language Models Paper • 2406.18125 • Published 13 days ago • 3 • 3
Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation Paper • 2406.18676 • Published 13 days ago • 5 • 5
T-FREE: Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings Paper • 2406.19223 • Published 12 days ago • 8 • 2
GaussianDreamerPro: Text to Manipulable 3D Gaussians with Highly Enhanced Quality Paper • 2406.18462 • Published 13 days ago • 10 • 3
EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model Paper • 2406.20076 • Published 11 days ago • 6 • 3