Modeling the probability distribution of positional errors incurred by residential address geocoding

Jan 10 2007

By Zimmerman, Dale L ; Fang, Xiangming ; Mazumdar, Soumya ; ...

http://dx.doi.org/10.1186/1476-072X-6-1

Source: Int J Health Geogr. 2007; 6:1.

Details:

Alternative Title:

Int J Health Geogr

Personal Author:

Zimmerman, Dale L ; Fang, Xiangming ; Mazumdar, Soumya ; Rushton, Gerard

Description:

Background

The assignment of a point-level geocode to subjects' residences is an important data assimilation component of many geographic public health studies. Often, these assignments are made by a method known as automated geocoding, which attempts to match each subject's address to an address-ranged street segment georeferenced within a streetline database and then interpolate the position of the address along that segment. Unfortunately, this process results in positional errors. Our study sought to model the probability distribution of positional errors associated with automated geocoding and E911 geocoding.
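The interpolation step described above can be sketched as follows. This is a minimal illustration, assuming simple linear interpolation along a straight segment in projected coordinates (metres); the function name, parameters, and example values are hypothetical and are not taken from the study or from any particular geocoding software.

```python
def interpolate_address(house_number, range_low, range_high, start_xy, end_xy):
    """Place a house number along a street segment by linear interpolation.

    Assumes the segment is a straight line from start_xy to end_xy and that
    house numbers increase uniformly from range_low to range_high.
    """
    if range_high == range_low:
        frac = 0.5  # degenerate address range: fall back to the midpoint
    else:
        frac = (house_number - range_low) / (range_high - range_low)
    frac = min(1.0, max(0.0, frac))  # clamp so the point stays on the segment
    x = start_xy[0] + frac * (end_xy[0] - start_xy[0])
    y = start_xy[1] + frac * (end_xy[1] - start_xy[1])
    return (x, y)


# Hypothetical segment: addresses 100-200 along 1000 m of road.
geocode = interpolate_address(150, 100, 200, (0.0, 0.0), (1000.0, 0.0))
```

Real residences are rarely spaced uniformly along a segment, which is one reason interpolated positions of this kind can differ from true locations.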

Results

Positional errors were determined for 1423 rural addresses in Carroll County, Iowa, as the vector difference between each 100%-matched automated geocode and its true location as determined by orthophoto and parcel information. Errors were also determined for 1449 60%-matched geocodes and 2354 E911 geocodes. Huge (> 15 km) outliers occurred among the 60%-matched geocoding errors; outliers also occurred for the other two types of geocoding errors but were much smaller. E911 geocoding was more accurate (median error length = 44 m) than 100%-matched automated geocoding (median error length = 168 m). The empirical distributions of positional errors associated with 100%-matched automated geocoding and E911 geocoding exhibited a distinctive Greek-cross shape and had many other interesting features that could not be fitted adequately by a single bivariate normal or t distribution. However, mixtures of t distributions with two or three components fit the errors very well.
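As a rough illustration of the quantities reported here, the sketch below computes error vectors as geocode minus true location and summarizes them by the median error length. It assumes coordinates are already in a projected system measured in metres; the function names and sample points are hypothetical and are not data from the study.

```python
import math
from statistics import median


def positional_errors(geocodes, truths):
    """Return error vectors (dx, dy) = geocoded position minus true position."""
    return [(gx - tx, gy - ty) for (gx, gy), (tx, ty) in zip(geocodes, truths)]


def median_error_length(errors):
    """Median Euclidean length of the error vectors."""
    return median(math.hypot(dx, dy) for dx, dy in errors)


# Hypothetical coordinates in metres, for illustration only.
geocodes = [(3.0, 4.0), (0.0, 0.0), (6.0, 8.0)]
truths = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
errors = positional_errors(geocodes, truths)
summary = median_error_length(errors)
```

The median is a natural summary in this setting because it is not dominated by the occasional very large outlier.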

Conclusion

Mixtures of bivariate t distributions with few components appear to be flexible enough to fit many positional error datasets associated with geocoding, yet parsimonious enough to be feasible for nascent applications of measurement-error methodology to spatial epidemiology.
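To make the model class concrete, the following sketch evaluates the density of a finite mixture of bivariate t distributions. It uses the standard bivariate t density, which in two dimensions simplifies to (1 + d/nu)^(-(nu+2)/2) / (2 pi sqrt(det Sigma)), where d is the squared Mahalanobis distance. The component weights, scale matrices, and degrees of freedom are hypothetical values chosen only to give elongation along both axes; they are not the fitted values from the paper.

```python
import math


def bivariate_t_pdf(point, mu, sigma, nu):
    """Density of a bivariate t distribution.

    point, mu: (x, y) tuples; sigma: symmetric 2x2 scale matrix as nested
    tuples ((sxx, sxy), (sxy, syy)); nu: degrees of freedom (> 0).
    """
    (sxx, sxy), (_, syy) = sigma
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mu[0], point[1] - mu[1]
    # Squared Mahalanobis distance using the closed-form 2x2 inverse.
    d = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return (1.0 + d / nu) ** (-(nu + 2.0) / 2.0) / (2.0 * math.pi * math.sqrt(det))


def mixture_pdf(point, components):
    """Weighted sum of bivariate t densities; weights are assumed to sum to 1."""
    return sum(w * bivariate_t_pdf(point, mu, sigma, nu)
               for w, mu, sigma, nu in components)


# Hypothetical two-component mixture (weight, location, scale matrix, nu).
components = [
    (0.7, (0.0, 0.0), ((100.0, 0.0), (0.0, 2500.0)), 3.0),
    (0.3, (0.0, 0.0), ((2500.0, 0.0), (0.0, 100.0)), 3.0),
]
density_at_origin = mixture_pdf((0.0, 0.0), components)
```

Each component adds only a weight, a location, a scale matrix, and a degrees-of-freedom parameter, so a two- or three-component model stays small, in keeping with the parsimony noted above.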

Document Type:

Journal Article

Funding:

3 R01 EH000056-01S1/EH/NCEH CDC HHS/United States

Place as Subject:

Iowa

Collection(s):

CDC Public Access

Download URL:

https://stacks.cdc.gov/view/cdc/31254/cdc_31254_DS1.pdf

File Type:

	1476-072X-6-1-5.gif	gif
	1476-072X-6-1-5.jpg	jpeg
	1476-072X-6-1.nxml	txt
	license.txt	txt
	1476-072X-6-1-1.gif	gif
	1476-072X-6-1-1.jpg	jpeg
	1476-072X-6-1-2.gif	gif
	1476-072X-6-1-2.jpg	jpeg
	1476-072X-6-1-3.gif	gif
	1476-072X-6-1-3.jpg	jpeg
	1476-072X-6-1-4.gif	gif
	1476-072X-6-1-4.jpg	jpeg
