Zhiwei Qin, Xiaocheng Tang, Yan Jiao, DiDi Research America, Mountain View, CA
Fan Zhang, Zhe Xu, Hongtu Zhu, Jieping Ye, Didi Chuxing, Beijing, China
Order dispatching (or order matching) is instrumental to the marketplace engine of a large-scale ride-hailing platform like DiDi. Due to the dynamic nature of supply and demand, the ride-hailing order dispatching problem is very challenging to solve over a long horizon.
Adding to the complexity are system-performance considerations and multiple objectives. In this paper, we describe the evolution of our approach to this optimization problem, from a myopic combinatorial optimization approach to one that incorporates a semi-Markov decision process (semi-MDP) model and deep reinforcement learning for long-term optimization.