By: Prof Alastair Young, Imperial College London
Title: "Optimal bootstrapping with dependent data"
Abstract: We consider nonparametric bootstrap techniques, used to estimate error quantities in problems of statistical inference. Given a data sample of n points, bootstrap quantities are generally constructed by generating, through resampling, a large number of "bootstrap samples", each also consisting of n data points. Exceptions are techniques such as subsampling and the m out of n bootstrap, which may be required to ensure consistency of the bootstrap estimator. In general, when a conventional n out of n bootstrap is asymptotically valid, resampling fewer than n data points incurs a loss of accuracy. In settings involving large datasets, which are increasingly prevalent in applications, computing bootstrap-based quantities can therefore be prohibitively demanding. In this talk, we establish that in key settings involving dependent data there is in fact a statistical advantage, in an optimal error rate sense, to basing inference on bootstrap samples of fewer than n data points, with significant implications for computational efficiency and scalability in such settings.
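As a minimal sketch of the m out of n bootstrap idea mentioned in the abstract (not the method of the talk), the following Python snippet estimates the standard error of the sample mean, comparing the conventional n out of n bootstrap with an m out of n version. The sqrt(m/n) rescaling, function names, and parameter choices are illustrative assumptions; the data here are i.i.d., and the block-resampling adjustments needed for dependent data are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def m_out_of_n_bootstrap_se(data, m, n_boot=2000, stat=np.mean, rng=rng):
    """Estimate the standard error of `stat` by resampling m of the n points.

    The m-sample replicates have variance of order 1/m, so we rescale by
    sqrt(m/n) to target the variability of the full n-point statistic.
    (Illustrative sketch; real dependent-data settings need block resampling.)
    """
    n = len(data)
    reps = np.empty(n_boot)
    for b in range(n_boot):
        # Resample m points with replacement from the observed n points
        sample = rng.choice(data, size=m, replace=True)
        reps[b] = stat(sample)
    return np.sqrt(m / n) * reps.std(ddof=1)

data = rng.normal(size=1000)
se_full = m_out_of_n_bootstrap_se(data, m=1000)  # conventional n out of n
se_sub = m_out_of_n_bootstrap_se(data, m=100)    # m out of n, m << n
```

Each bootstrap replicate here touches only m points rather than n, which is the source of the computational savings the abstract refers to when m is much smaller than n.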