Adaptive Orchestration for Large-Scale Inference on Heterogeneous Accelerator Systems Balancing Cost, Performance, and Resilience Paper • 2503.20074 • Published 17 days ago • 5