Abstract: | We consider a dynamic pricing problem in which a firm seeks to maximize its profit from selling a product over T periods. We do not assume that the decision maker knows the demand in advance. Traditionally, the cost is fixed and the problem can be formulated as a multi-armed bandit problem, which is known to have an O(log T) lower bound on the expected regret. In this paper, we consider a setting where the cost may change over time, so the optimal price is a function of the cost. We develop an upper confidence bound-like (UCB-like) algorithm to solve the problem. We show that our algorithm is robust and efficient in terms of the upper bound on its expected regret. |
Date: | Oct 13 (Fri), 2017 |
Time: | 11:00AM - 12:30PM |
Speaker: | Mr Ying Zhong, PhD Student, Department of Management Sciences, City University of Hong Kong |
Venue: | Room 6-213, 6/F, Lau Ming Wai Academic Building |