Abstract
This research studies the use of predetermined experimental plans in a live setting with a finite implementation horizon. In this context, we use a Bayesian framework to determine the optimal experimental budget in different environments. We derive theoretical results on the optimal allocation of resources to treatments, with the objective of minimizing cumulative regret, a metric commonly used in online statistical learning. Our base case considers two treatments, assuming Gaussian distributions for both the priors on the treatment means and the noise. We extend the analysis using analytical and semi-analytical techniques to cover worst-case bounds, unequal prior distributions, and the generalization to k treatments. We also determine theoretical limits for the experimental budget across all possible scenarios. The optimal level of experimentation recommended by this study varies widely, depending on the experimental environment and the number of available units. This highlights the importance of an approach that incorporates these factors when determining the budget.