Learning to Price Against a Moving Target

Renato Paes Leme,u00a0Balasubramanian Sivan,u00a0Yifeng Teng,u00a0Pratik Worah

In the Learning to Price setting, a seller posts prices over time with the goal of maximizing revenue while learning the buyeru2019s valuation. This problem is very well understood when values are stationary (fixed or iid). Here we study the problem where the buyeru2019s value is a moving target, i.e., they change over time either by a stochastic process or adversarially with bounded variation. In either case, we provide matching upper and lower bounds on the optimal revenue loss. Since the target is moving, any information learned soon becomes out-dated, which forces the algorithms to keep switching between exploring and exploiting phases.