| Abstract: | We consider a dynamic pricing problem in which a firm seeks to maximize its profit from selling a product over T periods. We do not assume that the decision maker has prior knowledge of the demand. Traditionally, the cost is fixed and the problem can be formulated as a multi-armed bandit problem, which is known to have an O(log T) lower bound on the expected regret. In this paper, we consider a setting where the cost may change over time, so the optimal price is a function of the cost. We develop an upper-confidence-bound-like (UCB-like) algorithm to solve the problem. We show that our algorithm is robust and efficient in terms of the upper bound on its expected regret. | 
| Date: | Oct 13 (Fri), 2017 | 
| Time: | 11:00AM - 12:30PM | 
| Speaker: | Mr Ying Zhong, PhD Student, Department of Management Sciences, City University of Hong Kong | 
| Venue: | Room 6-213, 6/F, Lau Ming Wai Academic Building | 

