Adaptive Orchestration for Large-Scale Inference on Heterogeneous Accelerator Systems Balancing Cost, Performance, and Resilience Paper • 2503.20074 • Published Mar 25