Abstract We explore whether quantum advantages can be found for the zeroth-order online convex optimization problem, which is also known as bandit with multi-point feedback. In this setting, given access to oracles (that is, the loss function is accessed as a black box that returns the value for any queried input), a player attempts to minimize a sequence of adversarially generated functions. This procedure is described $...