A backward elimination discrete optimization algorithm for model selection in...

Yadav, V., K. L. Mueller, and A. M. Michalak (2013), A backward elimination discrete optimization algorithm for model selection in spatio-temporal regression models, Environmental Modelling & Software, 42, 88-98.
Abstract: 

Regression models are used in geosciences to extrapolate data and identify significant predictors of a response variable. Criterion approaches based on the residual sum of squares (RSS), such as the Akaike Information Criterion, Bayesian Information Criterion (BIC), Deviance Information Criterion, or Mallows’ Cp, can be used to compare non-nested models to identify an optimal subset of covariates. Computational limitations arise when the number of observations or candidate covariates is large in comparing all possible combinations of the available covariates, and in characterizing the covariance of the residuals for each examined model when the residuals are autocorrelated, as is often the case in spatial and temporal regression analysis. This paper presents computationally efficient algorithms for identifying the optimal model as defined using any RSS-based model selection criterion. The proposed dual criterion optimal branch and bound (DCO B&B) algorithm is guaranteed to identify the optimal model, while a single criterion heuristic (SCH) B&B algorithm provides further computational savings and approximates the optimal solution. These algorithms are applicable both to multiple linear regression (MLR) and to response variables with correlated residuals. We also propose an approach for iterative model selection, where a single set of covariance parameters is used in each iteration rather than a different set of parameters being used for each examined model. Simulation experiments are performed to evaluate the performance of the algorithms for regression models, using MLR and geostatistical regression as prototypical regression tools and BIC as a prototypical model selection approach. Results show massive computational savings using the DCO B&B algorithm relative to performing an exhaustive search. The SCH B&B is shown to provide a good approximation of the optimal model in most cases, while the DCO B&B with iterative covariance parameter optimization yields the closest approximation to the DCO B&B algorithm while also providing additional computational savings.
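
To make the general idea concrete, the following is a minimal sketch of a backward-elimination branch and bound search for the BIC-optimal covariate subset in ordinary multiple linear regression. It is not the paper's DCO B&B or SCH B&B algorithm. It is a generic illustration that assumes independent Gaussian residuals, the common form BIC = n ln(RSS/n) + k ln(n), and a simple bound built from the current RSS and the smallest number of covariates reachable in a subtree. All function names (`rss`, `bic`, `exhaustive`, `branch_and_bound`, `make_data`) and the synthetic data are hypothetical.

```python
import itertools
import math
import random


def rss(X, y, cols):
    """Residual sum of squares of the least-squares fit of y on columns `cols` of X."""
    n, k = len(y), len(cols)
    if k == 0:
        return sum(v * v for v in y)
    # Augmented normal equations [X'X | X'y], restricted to the chosen columns.
    A = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in cols]
         + [sum(X[i][p] * y[i] for i in range(n))] for p in cols]
    # Gaussian elimination with partial pivoting.
    for j in range(k):
        piv = max(range(j, k), key=lambda r: abs(A[r][j]))
        A[j], A[piv] = A[piv], A[j]
        for r in range(j + 1, k):
            f = A[r][j] / A[j][j]
            for c in range(j, k + 1):
                A[r][c] -= f * A[j][c]
    # Back substitution for the regression coefficients.
    beta = [0.0] * k
    for j in reversed(range(k)):
        s = A[j][k] - sum(A[j][c] * beta[c] for c in range(j + 1, k))
        beta[j] = s / A[j][j]
    return sum((y[i] - sum(b * X[i][p] for b, p in zip(beta, cols))) ** 2
               for i in range(n))


def bic(rss_value, n, k):
    """BIC for a Gaussian linear model with k covariates (assumed form)."""
    return n * math.log(rss_value / n) + k * math.log(n)


def exhaustive(X, y):
    """Score every subset of covariates; returns (best BIC, best subset)."""
    n, p = len(y), len(X[0])
    return min((bic(rss(X, y, list(c)), n, len(c)), c)
               for k in range(p + 1)
               for c in itertools.combinations(range(p), k))


def branch_and_bound(X, y):
    """Backward elimination with pruning; returns (best BIC, best subset, fits)."""
    n, p = len(y), len(X[0])
    best = [math.inf, ()]
    fits = 0

    def visit(kept, free):
        # kept: covariates in the current model; free: those still removable here.
        nonlocal fits
        r = rss(X, y, kept)
        fits += 1
        score = bic(r, n, len(kept))
        if score < best[0]:
            best[:] = [score, tuple(kept)]
        # Every descendant drops covariates from `kept`, so its RSS is >= r,
        # and it retains at least len(kept) - len(free) covariates.
        bound = bic(r, n, len(kept) - len(free))
        if bound >= best[0]:
            return  # no descendant can improve on the incumbent
        for idx, col in enumerate(free):
            visit([c for c in kept if c != col], free[idx + 1:])

    visit(list(range(p)), list(range(p)))
    return best[0], best[1], fits


def make_data(seed=0, n=60, p=8):
    """Synthetic example: only covariates 0 and 3 influence the response."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [2.0 * row[0] - 1.5 * row[3] + rng.gauss(0, 1) for row in X]
    return X, y
```

Calling `branch_and_bound(*make_data())` and `exhaustive(*make_data())` should select the same subset, with the returned `fits` count showing how many of the 2^p candidate models the pruned search actually had to fit. Handling autocorrelated residuals, as in the geostatistical regression case described above, would require a generalized least-squares RSS and is outside the scope of this sketch.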

Research Program: 
Interdisciplinary Science Program (IDS)
Modeling Analysis and Prediction Program (MAP)