Time: 11:00am to 12:30pm
Venue: Room 6-213, 6/F, Lau Ming Wai Academic Building
We consider a dynamic pricing problem in which a firm tries to maximize its profit from selling a product over T periods. We do not assume that the decision maker has prior knowledge of the demand. Traditionally, the cost is fixed and the problem can be formulated as a multi-armed bandit problem, which is known to have a lower bound of order log T on the expected regret. In this paper, we consider a setting where the cost may change over time, so the optimal price is a function of the cost. We develop an upper-confidence-bound-like (UCB-like) algorithm to solve the problem. We show that our algorithm is robust and efficient in terms of the upper bound on its expected regret.
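For readers unfamiliar with the bandit formulation, the following is a minimal sketch of the classical fixed-cost baseline mentioned above, not the algorithm presented in the talk. It applies the standard UCB1 index to a finite grid of candidate prices. The function name `ucb1_pricing`, the price grid, the Bernoulli demand model, and the `demand_prob` callback are all assumptions made for illustration; the sketch also assumes every candidate price is at least the unit cost, so that per-period profit lies in [0, max price - cost].

```python
import math
import random


def ucb1_pricing(prices, cost, demand_prob, horizon, seed=0):
    """Standard UCB1 over a fixed price grid with a fixed unit cost.

    prices      -- candidate prices (the bandit arms); assumed >= cost
    cost        -- fixed unit cost (the classical setting, not the talk's)
    demand_prob -- hypothetical map price -> purchase probability,
                   unknown to the learner and used only to simulate sales
    horizon     -- number of selling periods T
    Returns (number of times each price was posted, total realized profit).
    """
    rng = random.Random(seed)
    k = len(prices)
    scale = max(prices) - cost  # range of per-period profit
    counts = [0] * k            # times each price has been posted
    means = [0.0] * k           # running average profit per price
    total = 0.0

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # post every price once to initialize
        else:
            # UCB1 index: empirical mean plus a scaled confidence width
            arm = max(
                range(k),
                key=lambda i: means[i]
                + scale * math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        sold = rng.random() < demand_prob(prices[arm])  # one customer per period
        profit = (prices[arm] - cost) if sold else 0.0
        counts[arm] += 1
        means[arm] += (profit - means[arm]) / counts[arm]
        total += profit

    return counts, total


# Illustrative demand curve (made up for this example).
purchase_prob = {2.0: 0.9, 6.0: 0.8, 10.0: 0.05}
counts, total = ucb1_pricing(
    prices=[2.0, 6.0, 10.0],
    cost=1.0,
    demand_prob=lambda p: purchase_prob[p],
    horizon=5000,
)
print(counts, total)
```

In this baseline the cost is a constant, so a single best price exists and the learner only needs to identify it. When the cost varies from period to period, as in the setting of the talk, the best price changes with the cost, which is why a plain UCB1 index over prices is no longer sufficient on its own.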
Mr Ying Zhong is a third-year PhD student in the Department of Management Sciences at City University of Hong Kong. He holds a Master's degree in Applied Statistics from the University of Michigan, Ann Arbor and a Bachelor's degree in Computing Mathematics from City University of Hong Kong. His research interests lie primarily in the ranking-and-selection problem and the multi-armed bandit problem.