The linear bandit problem has received considerable attention in the past decade due to its applications in modern recommendation systems and online ad placement, where the feedback is binary, such as thumbs up/down or click/no click. Linear bandits, however, assume the standard linear regression model and are thus not well-suited for binary feedback. While logistic linear bandits, the logistic regression counterpart of linear bandits, are more attractive for these applications, progress has been slow, and practitioners often end up using linear bandits for binary feedback -- the equivalent of using linear regression for classification tasks.

In this talk, I will present recent breakthroughs in logistic linear bandits that lead to tight performance guarantees and lower bounds. These developments are based on self-concordant analysis, improved fixed-design concentration inequalities, and novel methods for the design of experiments. I will also discuss open problems and conjectures on concentration inequalities. This talk is based on our recent paper accepted to ICML'21 (https://arxiv.org/abs/2011.11222).
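As a rough illustration (not from the paper itself), the contrast between the two feedback models can be sketched as follows; the function names and parameters here are hypothetical, chosen only to show that a linear bandit assumes a real-valued reward while a logistic bandit assumes a Bernoulli reward whose mean passes the linear score through a sigmoid:

```python
import math
import random

def linear_feedback(x, theta, noise_sd=0.1):
    # Linear bandit model: real-valued reward = <x, theta> + Gaussian noise.
    score = sum(xi * ti for xi, ti in zip(x, theta))
    return score + random.gauss(0.0, noise_sd)

def logistic_feedback(x, theta):
    # Logistic bandit model: binary reward drawn from Bernoulli(sigmoid(<x, theta>)).
    score = sum(xi * ti for xi, ti in zip(x, theta))
    p = 1.0 / (1.0 + math.exp(-score))  # sigmoid link maps score to a click probability
    return 1 if random.random() < p else 0
```

Fitting the linear model to 0/1 rewards from the second process is exactly the mismatch described above: linear regression applied to a classification-style signal.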