On the Gittins index for multiarmed bandits
R.R. Weber,
Ann. Appl. Prob. 2 1024-33, 1992.
Abstract
This paper considers the multiarmed bandit problem and presents a new proof of
the optimality of the Gittins index policy. The proof is intuitive and does not
require an interchange argument. The insight it affords is used to give a
streamlined summary of previous research and to prove a new result: The optimal
value function is a submodular set function of the available projects.
back to list of papers