Comparing Spatial and Multilevel Regression Models for Binary Outcomes in Neighborhood Studies

The standard multilevel regressions widely used in neighborhood research typically ignore between-neighborhood correlations arising from underlying spatial processes, and hence produce inappropriate inferences about neighborhood effects. In contrast, spatial models estimate and predict across areas by explicitly modeling the spatial correlations among observations in different locations. A better understanding of the strengths and limitations of spatial models relative to the standard multilevel model is needed to improve research on neighborhood and spatial effects. This study uses simulations to systematically compare estimates and predictions for binary outcomes between spatial models (distance- and lattice-based) and the standard multilevel model in the presence of both within- and between-neighborhood correlations. The simulation results show that the standard multilevel and spatial models produce similar estimates of fixed effects but different estimates of random-effects variances. When both nonspatial within-neighborhood and spatial between-neighborhood effects exist, both the standard multilevel and pure spatial models tend to overestimate the corresponding random-effects variances compared with hybrid models. Spatial models also outperform the standard multilevel model by a narrow margin in fully out-of-sample prediction. Distance-based spatial models provide additional spatial information and have stronger predictive power than lattice-based models under certain circumstances. These merits of spatial modeling are illustrated in an empirical analysis of child mortality data from Newark, New Jersey, in 1880.
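
To make the simulation setting concrete, the following is a minimal sketch of how binary outcomes with both a nonspatial within-neighborhood effect and a spatially correlated between-neighborhood effect might be generated. It is an illustration under stated assumptions, not the authors' simulation code: the function name, the parameter values, the random centroids on a unit square, and the exponential distance-decay covariance are all hypothetical choices made for this example.

```python
import numpy as np


def simulate_binary_outcomes(n_neighborhoods=50, n_per_neighborhood=20,
                             beta=(-1.0, 0.5), sigma_u=0.5, sigma_s=0.7,
                             spatial_range=0.3, seed=0):
    """Simulate binary outcomes with nonspatial and spatial neighborhood effects.

    All defaults are illustrative assumptions, not values from the study.
    """
    rng = np.random.default_rng(seed)

    # Assumed geography: neighborhood centroids drawn uniformly on a unit square.
    coords = rng.uniform(0.0, 1.0, size=(n_neighborhoods, 2))

    # Pairwise Euclidean distances between centroids (distance-based structure).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Assumed exponential covariance: correlation decays with distance.
    cov = sigma_s ** 2 * np.exp(-dist / spatial_range)

    # Draw the spatial effect via a Cholesky factor; the small jitter on the
    # diagonal guards against numerical issues.
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(n_neighborhoods))
    spatial_effect = chol @ rng.standard_normal(n_neighborhoods)

    # Nonspatial effect: independent across neighborhoods, shared within each.
    nonspatial_effect = rng.normal(0.0, sigma_u, size=n_neighborhoods)

    # Individual-level data: neighborhood index and one standard-normal covariate.
    neighborhood = np.repeat(np.arange(n_neighborhoods), n_per_neighborhood)
    x = rng.standard_normal(neighborhood.size)

    # Logit link: fixed effects plus both neighborhood-level random effects.
    eta = (beta[0] + beta[1] * x
           + nonspatial_effect[neighborhood]
           + spatial_effect[neighborhood])
    prob = 1.0 / (1.0 + np.exp(-eta))
    y = rng.binomial(1, prob)

    return {"y": y, "x": x, "neighborhood": neighborhood, "coords": coords,
            "nonspatial_effect": nonspatial_effect,
            "spatial_effect": spatial_effect}


sim = simulate_binary_outcomes()
print(sim["y"].mean())  # overall share of positive outcomes in this draw
```

Data generated this way could then be passed to a standard multilevel model, a pure spatial model, and a hybrid model to compare how each recovers the fixed effects and the two variance components. A lattice-based variant would replace the distance-based covariance with one built from a neighborhood adjacency matrix.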