mini09999

AI & ML interests

None yet

Recent Activity

upvoted a paper about 13 hours ago

GENEB: Why Genomic Models Are Hard to Compare

upvoted a paper about 13 hours ago

Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Markov Games

upvoted a paper about 13 hours ago

Orchestra-o1: Omnimodal Agent Orchestration

Organizations

None yet

upvoted 20 papers about 13 hours ago

GENEB: Why Genomic Models Are Hard to Compare

Paper • 2606.04525 • Published 25 days ago • 49

Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Markov Games

Paper • 2606.19338 • Published 11 days ago • 48

Orchestra-o1: Omnimodal Agent Orchestration

Paper • 2606.13707 • Published 18 days ago • 48

HarnessX: A Composable, Adaptive, and Evolvable Agent Harness Foundry

Paper • 2606.14249 • Published 16 days ago • 49

ArcANE: Do Role-Playing Language Agents Stay in Character at the Right Time?

Paper • 2606.05553 • Published 24 days ago • 50

Playful Agentic Robot Learning

Paper • 2606.19419 • Published 11 days ago • 50

ACE-Ego-0: Unifying Egocentric Human and Robotic Data for VLA Pretraining

Paper • 2606.17200 • Published 13 days ago • 51

CoVEBench: Can Video Editing Models Handle Complex Instructions?

Paper • 2606.08415 • Published 21 days ago • 51

SCAIL-2: Unifying Controlled Character Animation with End-to-end In-Context Conditioning

Paper • 2606.10804 • Published 19 days ago • 51

MolmoMotion: Forecasting Point Trajectories in 3D with Language Instruction

Paper • 2606.18558 • Published 11 days ago • 53

Benchmarking Visual State Tracking in Multimodal Video Understanding

Paper • 2606.03920 • Published 26 days ago • 52

SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research

Paper • 2606.09730 • Published 20 days ago • 54

From Activation to Causality: Discovery of Causal Visual Representations in the Human Brain

Paper • 2605.23895 • Published May 22 • 54

TRL-Bench: Standardizing Cross-Paradigm Representation-Level Evaluation of Tabular Encoders

Paper • 2606.09323 • Published 20 days ago • 53

SoCRATES: Towards Reliable Automated Evaluation of Proactive LLM Mediation across Domains and Socio-cognitive Variations

Paper • 2606.05563 • Published 24 days ago • 55

LabVLA: Grounding Vision-Language-Action Models in Scientific Laboratories

Paper • 2606.13578 • Published 17 days ago • 56

Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories

Paper • 2606.02060 • Published 27 days ago • 57

Mellum2 Technical Report

Paper • 2605.31268 • Published 30 days ago • 57

GameCraft-Bench: Can Agents Build Playable Games End-to-End in a Real Game Engine?

Paper • 2606.17861 • Published 12 days ago • 58

Harness-1: Reinforcement Learning for Search Agents with State-Externalizing Harnesses

Paper • 2606.02373 • Published 27 days ago • 59