Convexity and Feedback in Approximate Dynamic Programming for Delivery Time Slot Pricing

Denis Lebedev, Kostas Margellos and Paul Goulart

IEEE Transactions on Control Systems Technology, pp. 1-8, 2021.
BibTeX  URL  Preprint 

@article{LMG:2021b,
  author = {Lebedev, Denis and Margellos, Kostas and Goulart, Paul},
  title = {Convexity and Feedback in Approximate Dynamic Programming for Delivery Time Slot Pricing},
  journal = {IEEE Transactions on Control Systems Technology},
  year = {2021},
  pages = {1-8},
  url = {https://ieeexplore.ieee.org/document/9486899},
  doi = {10.1109/TCST.2021.3093648}
}

We consider the revenue management problem of finding profit-maximizing prices for delivery time slots in the context of attended home delivery. This multistage optimal control problem admits a dynamic programming (DP) formulation that is intractable for realistic problem sizes due to the so-called ‘‘curse of dimensionality.’’ We therefore study three approximate DP algorithms both from a numerical and control-theoretical perspective. Our analysis is based on real-world data, from which we generate multiple scenarios to stress-test the robustness of the pricing policies to errors in model parameter estimates. Our theoretical analysis and numerical benchmark tests indicate that one of these algorithms, namely gradient-bounded DP, dominates the others with respect to computation time and profit-generation capabilities of the pricing policies that it generates.