company
DoorDash
company · active · provisional
doordash-ffa6f2c9 · 1 event · first seen 5d ago
Aliases: DoorDash
Recent events (1)
DoorDash deploys multi-agent RL system for adaptive dispatch objective weights in food-delivery marketplace
Researchers at DoorDash present a deployed reinforcement learning system that adapts dispatch objective weights in a three-sided food-delivery marketplace using delayed operational feedback signals. Rather than replacing the combinatorial optimizer, a store-level policy selects discrete multipliers that shift the optimizer's tradeoff between delivery quality and batching efficiency. The policy is trained centrally and offline with Double Q-learning plus a conservative regularizer that counters value overestimation on out-of-distribution actions, and is then executed in a decentralized fashion, per store. A production switchback experiment shows increased batching and reduced courier time costs without degrading customer delivery quality.
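The training recipe in the summary (offline Double Q-learning with a conservative regularizer over a discrete set of multipliers) can be illustrated with a minimal tabular sketch. This is not DoorDash's implementation. The summary does not give the form of the regularizer, the state features, the multiplier values, or the reward, so everything named here is an assumption: the `MULTIPLIERS` values, the integer store state, the CQL-style log-sum-exp penalty with weight `alpha`, and the use of Q-tables in place of whatever function approximator the deployed system uses.

```python
import numpy as np

# Hypothetical discrete multipliers applied to the optimizer's objective weights.
MULTIPLIERS = [0.8, 1.0, 1.2]


def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()


def train_offline(transitions, n_states, n_actions,
                  gamma=0.9, lr=0.1, alpha=0.5, epochs=200, seed=0):
    """Tabular Double Q-learning on a fixed log of (s, a, r, s_next, done)
    tuples, with an assumed CQL-style conservative penalty."""
    rng = np.random.default_rng(seed)
    q_a = np.zeros((n_states, n_actions))
    q_b = np.zeros((n_states, n_actions))
    for _ in range(epochs):
        for idx in rng.permutation(len(transitions)):
            s, a, r, s_next, done = transitions[idx]
            # Double Q-learning: one table picks the next action,
            # the other table evaluates it.
            q_upd, q_eval = (q_a, q_b) if rng.random() < 0.5 else (q_b, q_a)
            best_next = int(np.argmax(q_upd[s_next]))
            target = r if done else r + gamma * q_eval[s_next, best_next]
            td_error = target - q_upd[s, a]
            # Gradient of alpha * (logsumexp_a Q[s, :] - Q[s, a_logged]):
            # lowers values of all actions, raises the logged action.
            penalty_grad = softmax(q_upd[s])
            penalty_grad[a] -= 1.0
            q_upd[s] -= lr * alpha * penalty_grad
            q_upd[s, a] += lr * td_error
    return (q_a + q_b) / 2.0


def select_multiplier(q, store_state):
    """Decentralized execution: each store looks up its own greedy action."""
    return MULTIPLIERS[int(np.argmax(q[store_state]))]


if __name__ == "__main__":
    # Toy log: one store state, only the first two multipliers were ever tried.
    log = [(0, 0, 1.0, 0, True), (0, 1, 2.0, 0, True)]
    q = train_offline(log, n_states=1, n_actions=len(MULTIPLIERS))
    print(q, select_multiplier(q, 0))
```

In this toy log the third multiplier never appears, so the penalty keeps pushing its value below those of the logged multipliers and the greedy policy stays within the data. That is the out-of-distribution behavior a conservative regularizer is meant to enforce.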