Joël Akeret, Sebastian Seehars, Adam Amara, Alexandre Refregier, André Csillaghy
We study the benefits and limits of parallelised Markov chain Monte Carlo (MCMC) sampling in cosmology. MCMC methods are widely used for the estimation of cosmological parameters from a given set of observations and are typically based on the Metropolis-Hastings algorithm. Some of the required calculations, such as evaluating the likelihood, can however be computationally intensive, meaning that a single long chain can take several hours or days to calculate. In practice, this can be limiting, since the MCMC process needs to be performed many times to test the impact of possible systematics and to understand the robustness of the measurements being made. To achieve greater speed through parallelisation, algorithms need to have short auto-correlation times and minimal overheads caused by tuning and burn-in. In order to efficiently distribute the MCMC sampling over thousands of cores on modern cloud computing infrastructure, we developed a Python framework called CosmoHammer, which embeds emcee, an implementation by Foreman-Mackey et al. (2012) of the affine-invariant ensemble sampler by Goodman and Weare (2010). We test the performance of CosmoHammer for cosmological parameter estimation from cosmic microwave background data. While Metropolis-Hastings is constrained by overheads, CosmoHammer is able to accelerate the sampling process from a wall time of 30 hours on a single machine to 16 minutes by the efficient use of 2048 cores. Such short wall times for complex data sets open possibilities for extensive model testing and control of systematics.
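To make the sampler named in the abstract more concrete, here is a minimal sketch of the Goodman and Weare stretch move in plain Python. It is not the CosmoHammer or emcee code: the function names, the toy Gaussian target `log_prob`, and the defaults (20 walkers, stretch scale `a = 2`) are assumptions chosen for illustration only. Each walker is moved along the line through itself and another randomly chosen walker, and the proposal is accepted with probability min(1, z^(d-1) p(Y)/p(X)), where d is the number of parameters.

```python
import math
import random


def log_prob(theta):
    # Hypothetical target: an uncorrelated unit Gaussian (stand-in for a
    # real, expensive cosmological likelihood).
    return -0.5 * sum(x * x for x in theta)


def draw_stretch_factor(rng, a=2.0):
    # Draw z from g(z) proportional to 1/sqrt(z) on the interval [1/a, a].
    return ((a - 1.0) * rng.random() + 1.0) ** 2 / a


def stretch_move_sweep(walkers, log_probs, log_prob_fn, rng, a=2.0):
    """Update every walker once, in place; return the number of acceptances."""
    n_walkers = len(walkers)
    n_dim = len(walkers[0])
    accepted = 0
    for k in range(n_walkers):
        # Choose a partner walker j different from k.
        j = rng.randrange(n_walkers - 1)
        if j >= k:
            j += 1
        z = draw_stretch_factor(rng, a)
        # Proposal lies on the line through walker j and walker k.
        proposal = [xj + z * (xk - xj) for xk, xj in zip(walkers[k], walkers[j])]
        lp_new = log_prob_fn(proposal)
        log_accept = (n_dim - 1) * math.log(z) + lp_new - log_probs[k]
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_accept:
            walkers[k] = proposal
            log_probs[k] = lp_new
            accepted += 1
    return accepted


def run_ensemble(n_walkers=20, n_dim=2, n_steps=500, seed=0):
    """Run the toy ensemble; return all samples and the acceptance fraction."""
    rng = random.Random(seed)
    walkers = [[rng.gauss(0.0, 1.0) for _ in range(n_dim)] for _ in range(n_walkers)]
    log_probs = [log_prob(w) for w in walkers]
    samples = []
    n_accepted = 0
    for _ in range(n_steps):
        n_accepted += stretch_move_sweep(walkers, log_probs, log_prob, rng)
        samples.extend(list(w) for w in walkers)
    return samples, n_accepted / (n_walkers * n_steps)


if __name__ == "__main__":
    samples, acceptance = run_ensemble()
    mean_x = sum(s[0] for s in samples) / len(samples)
    print(f"samples: {len(samples)}, acceptance: {acceptance:.2f}, mean x0: {mean_x:.3f}")
```

This sweep is serial: each walker is updated in turn. The parallel variant described by Foreman-Mackey et al. splits the ensemble into two halves and updates all walkers in one half using only positions from the other half, so the likelihood evaluations within a half are independent and can be spread over many cores. The sketch above does not attempt that distribution step.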
View original:
http://arxiv.org/abs/1212.1721