Online Non-Convex Optimization with Imperfect Feedback

NIPS2020

Amu00e9lie Hu00e9liou,Matthieu Martin,Panayotis Mertikopoulos,Thibaud Rahier

We consider the problem of online learning with non-convex losses. In terms of feedback, we assume that the learner observes u2013 or otherwise constructs u2013 an inexact model for the loss function encountered at each stage, and we propose a mixed-strategy learning policy based on dual averaging. In this general context, we derive a series of tight regret minimization guarantees, both for the learneru2019s static (external) regret, as well as the regret incurred against the best dynamic policy in hindsight. Subsequently, we apply this general template to the case where the learner only has access to the actual loss incurred at each stage of the process. This is achieved by means of a kernel-based estimator which generates an inexact model for each roundu2019s loss function using only the learneru2019s realized losses as input.

Online Non-Convex Optimization with Imperfect Feedback

Amu00e9lie Hu00e9liou,Matthieu Martin,Panayotis Mertikopoulos,Thibaud Rahier

Discussion

Related Contents