Deep Reinforcement Learning for Dynamic Pricing of Perishable Products

Bibliographic Details
Published in	Optimization and Learning, Vol. 1443, pp. 132-143
Main Authors	Burman, Vibhati; Vashishtha, Rajesh Kumar; Kumar, Rajan; Ramanan, Sharadha
Format	Book Chapter
Language	English
Published	Switzerland: Springer International Publishing AG, 2021
Series	Communications in Computer and Information Science
Subjects	Deep Q-network; Deep reinforcement learning; Dynamic pricing; Fashion industry; Grocery; Perishable items; Retail; Revenue management
ISBN	3030856712; 9783030856717
ISSN	1865-0929; 1865-0937
DOI	10.1007/978-3-030-85672-4_10

Summary:	Dynamic pricing is a strategy for setting flexible prices for products based on existing market demand. In this paper, we address the problem of dynamic pricing of perishable products using a deep Q-network (DQN) value function approximator. A model-free reinforcement learning approach is used to maximize revenue for a perishable item with a fixed initial inventory and selling horizon. The demand is influenced by the price and freshness of the product. The conventional tabular Q-learning method stores the Q-value for each state-action pair in a lookup table, which is not suitable for control problems with large state spaces. Hence, we use a function approximation approach to address the limitations of tabular Q-learning. With a DQN function approximator, we generalize from seen states to unseen states, which reduces the space required to store the value function for each state-action combination. We show that the problem of pricing perishable products can be modeled using DQN. Our results demonstrate that the DQN-based dynamic pricing algorithm generates higher revenue than conventional one-step price optimization and a constant pricing strategy.
Bibliography:	Rajan was an employee of TCS when this work was done.
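
The summary describes tabular Q-learning as the baseline that DQN replaces. The following is a minimal sketch of that baseline on a toy perishable-pricing problem, intended only to illustrate the lookup-table idea and why its size grows with the state space. It is not the authors' environment or code: the price levels, horizon, inventory, customer count, and the demand model in `purchase_prob` are all hypothetical assumptions chosen for illustration.

```python
import random
from collections import defaultdict

# All constants below are illustrative assumptions, not values from the chapter.
PRICES = [4.0, 6.0, 8.0, 10.0]   # discrete price levels (the actions)
HORIZON = 5                      # selling periods before the item expires
INITIAL_INVENTORY = 10           # fixed starting stock
CUSTOMERS_PER_PERIOD = 6         # shoppers who may buy in each period


def purchase_prob(price, t):
    """Toy demand model: buying is less likely at higher price and lower freshness."""
    freshness = 1.0 - t / HORIZON        # 1.0 when new, decreasing each period
    price_appeal = 1.0 - price / 12.0    # assumed linear price sensitivity
    return max(0.0, min(1.0, price_appeal * (0.4 + 0.6 * freshness)))


def step(state, action, rng):
    """Advance one period; state is (period, inventory), action indexes PRICES."""
    t, inventory = state
    price = PRICES[action]
    p = purchase_prob(price, t)
    demand = sum(rng.random() < p for _ in range(CUSTOMERS_PER_PERIOD))
    sales = min(demand, inventory)       # cannot sell more than is in stock
    next_state = (t + 1, inventory - sales)
    done = next_state[0] == HORIZON or next_state[1] == 0
    return next_state, price * sales, done   # reward is the period's revenue


def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=1.0):
    """One tabular Q-learning update on the lookup table q."""
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def train(episodes=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning; returns the learned lookup table."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * len(PRICES))
    for _ in range(episodes):
        state, done = (0, INITIAL_INVENTORY), False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(len(PRICES))      # explore
            else:
                action = max(range(len(PRICES)), key=lambda a: q[state][a])
            next_state, reward, done = step(state, action, rng)
            q_update(q, state, action, reward, next_state, done)
            state = next_state
    return q


def table_entries(horizon, max_inventory, n_prices):
    """Number of Q-values a full lookup table would need."""
    return horizon * (max_inventory + 1) * n_prices
```

For the toy settings above, `table_entries(5, 10, 4)` gives 220 entries, while a hypothetical 30-period horizon with 1000 units and 20 price levels already needs 600600. A DQN would replace the table `q` with a neural network that maps a state to one Q-value per price, so that states never visited during training still receive estimates.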