Zixin Zhong, Wang Chi Cheung, Vincent Y. F. Tan
Motivated by the pressing need for efficient optimization in online recommender systems, we revisit the cascading bandit model proposed by Kveton et al. (2015). While Thompson sampling (TS) algorithms have been shown to be empirically superior to Upper Confidence Bound (UCB) algorithms for cascading bandits, theoretical guarantees are only…
Journal of Machine Learning Research, Vol. 22, No. 218, Pages 1–66, 2021