Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising
Xiaotian Hao,u00a0Zhaoqing Peng,u00a0Yi Ma,u00a0Guan Wang,u00a0Junqi Jin,u00a0Jianye Hao,u00a0Shan Chen,u00a0Rongquan Bai,u00a0Mingzhou Xie,u00a0Miao Xu,u00a0Zhenzhe Zheng,u00a0Chuan Yu,u00a0Han Li,u00a0Jian Xu,u00a0Kun Gai
In E-commerce, advertising is essential for merchants to reach their target users. The typical objective is to maximize the advertiseru2019s cumulative revenue over a period of time under a budget constraint. In real applications, an advertisement (ad) usually needs to be exposed to the same user multiple times until the user finally contributes revenue (e.g., places an order). However, existing advertising systems mainly focus on the immediate revenue with single ad exposures, ignoring the contribution of each exposure to the final conversion, thus usually falls into suboptimal solutions. In this paper, we formulate the sequential advertising strategy optimization as a dynamic knapsack problem. We propose a theoretically guaranteed bilevel optimization framework, which significantly reduces the solution space of the original optimization space while ensuring the solution quality. To improve the exploration efficiency of reinforcement learning, we also devise an effective action space reduction approach. Extensive offline and online experiments show the superior performance of our approaches over state-of-the-art baselines in terms of cumulative revenue.