OJPHI: Vol. 5
Journal Information
Journal ID (publisher-id): OJPHI
ISSN: 1947-2579
Publisher: University of Illinois at Chicago Library
Article Information
©2013 the author(s)
open-access: This is an Open Access article. Authors own copyright of their articles appearing in the Online Journal of Public Health Informatics. Readers may copy articles without permission of the copyright owner(s), as long as the author and OJPHI are acknowledged in the copy and the copy is used for educational, not-for-profit purposes.
Electronic publication date: Day: 4 Month: 4 Year: 2013
collection publication date: Year: 2013
Volume: 5
E-location ID: e18
Publisher Id: ojphi-05-18

Disease Mapping with Spatially Uncertain Data
Justin Manjourides*1
Ted Cohen2,3
Caroline Jeffery4
Marcello Pagano5
1Dept of Health Sciences, Northeastern University, Boston, MA, USA;
2Div of Global Health Equity, Brigham & Women’s Hospital, Boston, MA, USA;
3Dept of Epidemiology, Harvard School of Public Health, Boston, MA, USA;
4Intl Health Group, Liverpool School of Tropical Medicine, Liverpool, United Kingdom;
5Dept of Biostatistics, Harvard School of Public Health, Boston, MA, USA
*Justin Manjourides, E-mail: Justin.manjourides@gmail.com


Uncertainty regarding the location of disease acquisition, as well as selective identification of cases, may bias maps of risk. We propose an extension to a distance-based mapping method (DBM) that incorporates weighted locations to adjust for these biases. We demonstrate this method by mapping potential drug-resistant tuberculosis (DRTB) transmission hotspots using programmatic data collected in Lima, Peru.


Uncertainty introduced by the selective identification of cases must be recognized and corrected for if the distribution of risk is to be mapped accurately. Consider the problem of identifying geographic areas with increased risk of DRTB. Most countries with a high TB burden offer drug sensitivity testing (DST) only to those cases at highest risk for drug resistance. As a result, confirmed DRTB cases under-represent the actual number of drug-resistant cases, and their spatial distribution is incomplete [1]. In addition, using the locations of confirmed DRTB cases to identify regions of increased risk of drug resistance may bias results towards areas of increased testing. Since testing is done neither on all incident cases nor on a representative sample of cases, current mapping methods do not allow standard inference from programmatic data about potential locations of DRTB transmission.


We extend the DBM [2] to adjust for this uncertainty. To map the spatial variation of the risk of a disease, such as DRTB, in a setting where the available data consist of a non-random sample of cases and controls, we weight each address in our study by the probability that the individual at that address is a case (that is, would test positive for DRTB in this setting). Once all locations are assigned weights, a prespecified number of these locations, taken from previously published country-wide surveillance estimates, is sampled according to the weights; the sampled locations define our cases. We assign DRTB status to these sampled cases, calculate the DBM, repeat this random selection, and combine the resulting maps into a consensus map [3].
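The resampling procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the DBM itself is not specified in this abstract, so `placeholder_risk_map` substitutes a simple per-grid-cell case proportion, and the consensus map is assumed here to be the average of the replicate maps. All function names, the grid cell size, and the number of replicates are hypothetical.

```python
import random


def weighted_sample_without_replacement(weights, k, rng):
    """Draw k distinct indices, favouring larger weights.

    Uses Efraimidis-Spirakis keys (u ** (1 / w)); zero-weight
    locations are never selected.
    """
    keyed = [(rng.random() ** (1.0 / w), i)
             for i, w in enumerate(weights) if w > 0]
    keyed.sort(reverse=True)
    return [i for _, i in keyed[:k]]


def placeholder_risk_map(points, is_case, cell_size):
    """Stand-in for the DBM (assumption): proportion of cases per grid cell."""
    totals, cases = {}, {}
    for (x, y), flag in zip(points, is_case):
        cell = (int(x // cell_size), int(y // cell_size))
        totals[cell] = totals.get(cell, 0) + 1
        cases[cell] = cases.get(cell, 0) + (1 if flag else 0)
    return {cell: cases[cell] / totals[cell] for cell in totals}


def consensus_map(points, weights, n_cases, n_reps=200, cell_size=1.0, seed=0):
    """Repeat the weighted case selection and average the replicate maps."""
    rng = random.Random(seed)
    sums = {}
    for _ in range(n_reps):
        chosen = set(weighted_sample_without_replacement(weights, n_cases, rng))
        is_case = [i in chosen for i in range(len(points))]
        replicate = placeholder_risk_map(points, is_case, cell_size)
        for cell, risk in replicate.items():
            sums[cell] = sums.get(cell, 0.0) + risk
    # Locations are fixed, so every cell appears in every replicate.
    return {cell: total / n_reps for cell, total in sums.items()}
```

In this sketch `n_cases` plays the role of the prespecified number of cases taken from surveillance estimates, and `weights` holds the per-address case probabilities; any real application would replace the placeholder risk surface with the actual DBM.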


Following [2], we set the reassignment weight of each untested case to the inverse of the probability of receiving DST at that case's location. These weights preferentially reassign untested cases located in regions of reduced testing, reflecting an assumption that in areas where testing is common, the individuals most at risk are tested. Fig. 1 shows two risk maps created by this weighted DBM, one on the unadjusted data (Fig. 1, L) and one using the informative weights (Fig. 1, R). The figure shows the difference, and potentially the improvement, made when information about the missingness mechanism, which introduces spatial uncertainty, is incorporated into the analysis.
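As a rough illustration of inverse-probability reassignment weights, the sketch below estimates the local probability of testing as the proportion of cases tested within each region and gives every untested case a weight proportional to the inverse of that proportion. The abstract does not say how the testing probability is estimated, so the region-level proportion, the `floor` that guards against division by very small probabilities, and the function name are all assumptions made for illustration.

```python
from collections import Counter


def inverse_testing_weights(regions, tested, floor=0.05):
    """Normalized inverse-probability-of-testing weights for untested cases.

    regions: region label of each TB case (assumed unit of 'location')
    tested:  True if the case received DST
    floor:   assumed lower bound on the estimated testing probability
    """
    n_all = Counter(regions)
    n_tested = Counter(r for r, t in zip(regions, tested) if t)
    raw = []
    for region, was_tested in zip(regions, tested):
        if was_tested:
            # Tested cases keep their observed status and are not reassigned.
            raw.append(0.0)
        else:
            p_test = max(n_tested[region] / n_all[region], floor)
            raw.append(1.0 / p_test)
    total = sum(raw)
    return [w / total for w in raw] if total > 0 else raw
```

For example, with three of four cases tested in one region and one of four tested in another, each untested case in the second region receives three times the weight of the untested case in the first, which matches the stated preference for regions of reduced testing.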


The weighted DBM has the potential to analyze spatial data more accurately when there is uncertainty regarding the locations of cases. Using a weighted DBM in combination with programmatic data from a community with high TB incidence, we are able to use routine data, in which only a non-random sample of drug-resistant cases is detected, to estimate the true underlying burden of disease.

1. Lin H, et al. Assessing spatiotemporal patterns of multidrug-resistant and drug-sensitive tuberculosis in a South American setting. Epi Infect. 2010.
2. Jeffery C. Disease mapping and statistical issues in public health surveillance [PhD thesis]. Harvard University; 2010.
3. Manjourides J, et al. Identifying multidrug resistant tuberculosis transmission hotspots using routinely collected data. Tuberculosis. 2012;92(3).

Fig. 1. (L) Unweighted DBM of the risk of a new TB case that received DST being positive for DRTB, compared to all new TB cases that received DST. (R) Weighted DBM of the risk of a new TB case being positive for DRTB, based on lab-confirmed DRTB cases and IPW-selected non-DST TB cases, compared to all new TB cases.

Article Categories:
  • ISDS 2012 Conference Abstracts

Keywords: surveillance, multiple addresses, distance based.

Online Journal of Public Health Informatics * ISSN 1947-2579 * http://ojphi.org