Space-Time Behavior in Contemporary Disease Patterns and Models

A WORK IN PROGRESS SITE

…………………….

Note: As of May 2013, I have a sister site for the national population health grid mapping project. Though not as detailed as these pages, it is a standalone site that reads a lot more easily and is easier to navigate. LINK

……………………..

Four Surface Modeling techniques I engaged in with GIS are described below. They are:

Basic Surface modeling (3D, DEM, rotating, Idrisi and others)
DEM/Pseudo-DEM/Contour (isoline) display techniques (ArcView, etc.)
Hexagonal Grid Mapping (ArcView, etc.; formulas released for student downloads)
3D Bar-Surface Modeling techniques (any and all, but with formulas not released)

Data sources are described below . . .

Introduction

When I first demonstrated my completed 3D modeling plan to a class of students around March of 1998, it barely resembled a final product. The purpose of the lab activity was to combine several new concepts in one final project. They were: 1) the development of an artificial surface mimicking the natural surface along which a river flowed, but with that surface adjusted so that it was perfectly flat and no longer tapered downhill, 2) the design of a math equation that could describe or duplicate that surface as closely as possible, 3) the development of an equation for comparing these two rasterized images to see how much error ensued, and where and why, and 4) the application of that new surface to demonstrate where the shoreline would be submerged if the waterline were raised by 50 or 100 feet.

The reason for this project was simple. Elevation on maps is only provided as a value above sea level. We want to see where areas of a certain elevation above the river edge lie. Since the river surface and edge drop to lower and lower elevations as the water flows downhill, this constantly changing elevation had to be zeroed. At every research point along the river surface, the true elevation above sea level was corrected to a zero value, and every land surface elevation point was then changed by the same correction factor, so that we can define points lying a certain level above (z-axis) the river. All shoreline elevations thus had to be corrected for the nearest river surface values. This way we could see how far up into the local terrain the water goes along the river edge, whether that stretch of river is 500 feet above sea level or just 5 feet above sea level. The final product of this project was a 3D image, displayed at various angles, of the river flooding its adjacent terrain, with that terrain lying perfectly flat, no longer angled in a downhill direction, and with flood levels defined and depicted on the map.
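The correction described above can be sketched in a few lines of Python. This is only a minimal illustration, not the original Idrisi workflow: the function names, the toy elevations, and the choice of straight-line distance for finding the "nearest" river cell are all assumptions made for the example.

```python
import math

def height_above_river(dem, river_cells):
    """Return a grid of land heights measured from the nearest river cell.

    dem         -- 2D list of elevations above sea level (feet)
    river_cells -- list of (row, col) positions lying on the river surface
    """
    relative = []
    for r, row in enumerate(dem):
        out_row = []
        for c, elevation in enumerate(row):
            # Nearest river cell by straight-line distance in grid units
            # (an assumption; a flow-path distance could be used instead).
            nr, nc = min(river_cells,
                         key=lambda rc: math.hypot(rc[0] - r, rc[1] - c))
            # Subtracting the river elevation zeroes the river surface.
            out_row.append(elevation - dem[nr][nc])
        relative.append(out_row)
    return relative

def flooded(relative, rise):
    """Mark cells whose corrected height is at or below the water rise."""
    return [[h <= rise for h in row] for row in relative]

# Hypothetical 3 x 4 DEM: the river runs down column 0, dropping 10 ft per row.
dem = [
    [500, 520, 560, 640],
    [490, 530, 545, 600],
    [480, 495, 585, 590],
]
river = [(0, 0), (1, 0), (2, 0)]

relative = height_above_river(dem, river)
flood_50 = flooded(relative, 50)   # cells under water after a 50 ft rise
```

Every river cell comes out at zero in `relative`, so a single threshold (50 or 100 feet) can be applied to the whole map regardless of how far downhill the river has run.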

This 3D map was a grid-like representation of pillars that could be rotated 360 degrees, depicting the surface and layout from any angle or tilt, for the entire area being researched. Thus my 3D video mapping technique was born. At the time (1998), 3D data was fairly accessible through government offices and research-supporting federal agencies. (However, usually you needed a professor to back you up and a federal remote sensing/GIS expert’s name and number to get your data.) During this time health data was available in aggregate forms for counties, zip codes, census tracts and regions, with more detailed datasets that could be requested in person. The state, regional or federal worker then pulled that data for you and provided you the product either by email, as an ftp download, or via a private ftp folder. Seven years later, when I began teaching my classes on GIS and remote sensing in another state, these ftps were still available and in greater numbers, with SAGE and SIERRA the most important for the time. The following are the links that still exist. These sites are free to peruse, but some links are not to be accessed unless you have the proper permissions, and these sites are monitored. The other top site out there right now is Geocommunications or Geocommun.com. Private companies provide zip code data as well, but at a considerable cost.

********************************************************

EXAMPLES of ftp sites (updated 11-12)

A. Google search for “USGS ftp sites”

B. Search in text form (all lines): http://www.google.com/search?q=USGS+ftp+sites&sourceid=ie7&rls=com.microsoft:en-us:IE-SearchBox&ie=&oe=

C. Current, popular http sites leading you to ftps:

This last site provides details about getting to the ftp sites. It is important to note that not all of these sites are accessible. Some may have open-door settings, with policies posted there for you to read before clicking in that direction; even though you can access such a site, you are still not supposed to be there, and some or many of these sites monitor everything you do when accessing them. Some of these ftps are specifically medical economics related, some belong to business-medical data providers (i.e. zips and health/claims), and some are privately secured, personal request sites targeting just one person (the researcher); these are for grant-sponsored research, professional academic inquiries personally made with special insiders, etc. There are dozens, if not between 100 and 200, of these sites and links. Examples of the more directly related GIS-medical-business ftps follow.

D. Anonymous FTP Sites In Domain GOV

E. The following is an example of subfolders for downloads

Final Note: With iCloud coming to be, all of the above should be more readily available and accessible, if you are allowed to access this data.

********************************************************

BASIC SURFACE MODELING

The duplication of a natural surface is a common game taken on by GIS/RS individuals as some sort of brain-teaser. In the Idrisi lab in which I was a participant, the goal was to produce an artificial surface of a flood plain using formulas, and then to modify the values on that surface for purposes of reassigning the elevations in such a way that height above a water surface could be calculated and displayed.

[Insert Watershed Image with Flood depicted]

The purpose of this was to see how a flood at a constant height, above an ever changing river surface elevation above sea level, would disperse across the floodplain’s land surface. This would allow you to identify where the houses most at risk of flooding were built. The equation I developed was a combined quadratic (2D) model and cuboid equation; the surface that best mimicked the original surface was more than 85% 2D in nature and 15% or less 3D/cuboidal in nature. This high dependency of water flow patterns on quadratic equations and surface planar form is fairly constant across all natural surfaces. The most important features affecting overall water flow are the highest and lowest points, with topography having only a 12-17% impact on where the water ends up on the edge of the quadrant being analyzed. This means that the highest risk areas for cholera are also based primarily upon only the linear-2D aspect of a river, with minimal impact by the overall 3D environment (only that part of the 3D right next to the river edge matters). To display the results of this study, the tilt and angle of the map were changed, presenting the flood plain from several different perspectives, in a semi-rotational, changing aspect pattern.
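The idea of describing a surface with an equation and then measuring how far the equation strays from the original raster can be illustrated with a least-squares trend surface. The sketch below is not the equation developed in the lab: it fits only a quadratic surface z = a + bx + cy + dx^2 + exy + fy^2, and the function names and the sample raster are assumptions chosen for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                     # back substitution
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def terms(x, y):
    """Basis terms of the quadratic trend surface at cell (x, y)."""
    return [1.0, x, y, x * x, x * y, y * y]

def fit_quadratic(grid):
    """Least-squares coefficients [a, b, c, d, e, f] for a raster."""
    n = 6
    ata = [[0.0] * n for _ in range(n)]                # normal equations
    atz = [0.0] * n
    for y, row in enumerate(grid):
        for x, z in enumerate(row):
            t = terms(x, y)
            for i in range(n):
                atz[i] += t[i] * z
                for j in range(n):
                    ata[i][j] += t[i] * t[j]
    return solve(ata, atz)

def rmse(grid, coeffs):
    """Root-mean-square difference between the raster and the fitted surface."""
    total, count = 0.0, 0
    for y, row in enumerate(grid):
        for x, z in enumerate(row):
            predicted = sum(c * t for c, t in zip(coeffs, terms(x, y)))
            total += (z - predicted) ** 2
            count += 1
    return (total / count) ** 0.5

# Hypothetical 5 x 5 raster generated from a known quadratic surface.
surface = [[100 - 2 * x + 3 * y + 0.5 * x * x for x in range(5)]
           for y in range(5)]
coeffs = fit_quadratic(surface)
error = rmse(surface, coeffs)
```

On a real DEM the residual would not vanish as it does for this made-up raster; whatever error remains after the quadratic fit is the part that higher-order terms would have to account for.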

DEM-OVERLAY, PSEUDO-DEM & ISOLINE (CONTOUR) MODELING OF MEDICAL DATA

In 2003, using a very finely laid state grid I developed (0.25 x 0.25 mile), I produced an isoline map of Leukemia and Lymphoma cases in relation to chemical exposure [see above]. A similar map was produced of the Multnomah watershed region (a superfund site), relying upon a DEM “skeletal” base to lay the data upon [DEM-overlay]. [Not Found for display.] A “Pseudo-DEM” or artificial surface was developed first using population density and then case data. (Same technique as that in the previous section, resulting in a 3D surface.) Similar attempts were then made using the amount of chemicals spilled per grid cell and the carcinogenicity of these spills per cell, both resulting in contour line maps. Each of these displays a very easy way to map out specific ICDs and risk factors for a given spatial setting.
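Before any isoline can be drawn, point records have to be tallied into the grid. A minimal sketch of that step follows, assuming case locations are already expressed in miles from the grid origin; the coordinates and function names are hypothetical, and only the 0.25 mile cell size comes from the text.

```python
import math
from collections import Counter

CELL_MILES = 0.25  # cell edge length, as in the 0.25 x 0.25 mile grid

def cell_of(x_miles, y_miles, size=CELL_MILES):
    """Return the (column, row) index of the grid cell containing a point."""
    return (math.floor(x_miles / size), math.floor(y_miles / size))

def count_by_cell(points, size=CELL_MILES):
    """Count how many points fall into each grid cell."""
    return Counter(cell_of(x, y, size) for x, y in points)

# Hypothetical case locations, in miles east and north of the grid origin.
cases = [(0.10, 0.05), (0.20, 0.24), (0.30, 0.10), (1.01, 0.76)]
counts = count_by_cell(cases)
```

The same tally works for any per-cell quantity (population, spill volume, a carcinogenicity score) by summing a value per point instead of counting, which gives the surface from which a Pseudo-DEM or contour map is then built.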

A Hex Grid map located and produced by two other separate researchers

HEXAGONAL GRID MAPPING

Hexagonal grid mapping is by far the most important Medical GIS mapping skill in terms of disease ecological and population health risk research. The trend now is to use traditional grid mapping techniques, or to replace grid mapping with moving window techniques. Non-spatial methods, which analyze disease statistically without concern for space, follow the many Monte Carlo and Bayes theorem methods now being promoted. These work well for correlating, but do little for detail-level work focused on small or narrow age groups and specific regions in the country. Hexagonal grid mapping is the go-between for moving circles/windows and square grid analysis techniques.

Above is displayed a hex grid map, two grids, displaying pollution-exposure data, for the Portland, OR area, produced during the 2000-2006 project years.

Back in the winter of 2003/2004, when I produced the math for this technique for an upcoming presentation in the spring in Fort Collins, Colorado, I had limited insight into the breadth of its applications to researching more than just pollutants or West Nile diffusion processes. Now I recommend it over the traditional square grid cell mapping techniques due to the reductions in errors I found this method to produce (see page on it for more). The following details the numbers of people learning this method, from visits to just the two main pages on this topic.

Visits to my site for hex grids:

Views (blue bars) and Downloads (red) of Hex Grid Excel sheet for students wishing to learn this method:

Essentially what this means is that about 25-50 students per month are downloading the hexgrid mapping tool for experimentation and use, which is equivalent to about 1-2 classrooms or GIS lab settings per month in a typical university setting. If one third of them bring this knowledge into the workplace following graduation, the stage is being set for hexagonal grid modeling of regions based on urban development research. This would be one of the better applications of hex grid modeling, since we can use these polygons to produce better isoline depictions applicable to future urban development plans and prediction modeling processes. With demographic modeling as the underlying reason for this application, we can also use such a method to model and predict disease behavior in urban-suburban settings.
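For readers who have not seen the downloadable sheet, the core operation of any hex grid method is deciding which hexagon a point falls in. The sketch below uses the widely published axial-coordinate approach with cube rounding for pointy-top hexagons; it is an assumption chosen for illustration and is not taken from the Excel formulas mentioned above. The sample points are hypothetical.

```python
import math
from collections import Counter

def hex_cell(x, y, size):
    """Return axial (q, r) of the pointy-top hexagon containing (x, y).

    size is the distance from a hexagon's center to any of its corners.
    """
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    # Round in cube coordinates (q + r + s = 0), then recompute the
    # component with the largest rounding error so the sum stays zero.
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return (rq, rr)

def hex_center(q, r, size):
    """Return the (x, y) center of the hexagon with axial index (q, r)."""
    return (size * math.sqrt(3) * (q + r / 2), size * 1.5 * r)

# Hypothetical exposure points, tallied per hexagon of size 1.
points = [(0.1, 0.2), (-0.2, 0.1), (1.7, 0.1)]
counts = Counter(hex_cell(x, y, 1.0) for x, y in points)
```

Because every hexagon has six neighbors whose centers are all the same distance away, values stored at `hex_center` positions give an evenly spaced set of points for isoline work, which is one reason a hex layout can behave better than a square one.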

In Lewis Beck’s 1832 review of Asiatic cholera behavior at the New York-Canada border (on another page), following its introduction by way of the St. Lawrence Seaway, we read about his recognition that hierarchical diffusion processes already dominated even during the first introduction of this epidemic to this country. This means that even first-arriving human-borne diseases are likely to behave according to population density patterns, and can be modeled with urban models, whereas environmental diseases (back in 1832, those truly induced by the ‘miasma’ and its equivalents from the local environment and ecology) will follow the radial and non-hierarchical diffusion patterns, and disperse spatially in any and all directions, impacted slightly by other variables such as topography, local weather, people's behaviors, and animal or vector spatial behaviors. With hexagonal grid mapping, and interpretations based on a better understanding of spatial disease analysis, both past and present, we have a very nice tool that can be used regardless of the end goals.

In my mixed modeling technique for applying sequent occupancy philosophy to this type of analysis, we can look at a country, a region, or a very mixed socioeconomic status setting (urban-suburban settings in less economically active parts of this country), break the area down into stages or levels of development, i.e.

“wilderness” or barely inhabited,
“pioneer” or very low status SES, poorly populated settings, like Appalachia
farming/agriculture, partially self-sustaining but economic
industrial factory or technological driven
highly industrial or urban and post-modern

and come up with a model that shows us where diseases are most likely to convert from linear radial spread patterns into hierarchical, regional patterns.

The spread of the first infectious diseases in history took place in a linear and/or radial fashion, mostly non-hierarchical, during the first two periods of development of the Native American-pioneer stage in American history (see my Cree and the spread of Small Pox and Measles page). Today, if transportation or its common use is lacking in a given area, we expect to see the same spatial behavior. Transportation-based spread patterns follow the traditional yellow fever diffusion and migration patterns, i.e. a possibly predominant feature for rural but economically active community settings (farming communities). Combined transportation, people and economics patterns best match the hierarchical diffusion model for spread, i.e. industrialized and post-modern urban settings.

***************************************************************

3D Surface Patterns depicting Regional Economic Modeling program, created by the G-Econ Spatial Analyst team at Yale University

Link to rotating video image

and smoothened image

Note: the Yale G-Econ team project and National Population Grid – GridEcon Matrix project are unrelated, and developed quite separately from each other during very similar time frames. Yale professor William Nordhaus and his colleagues deserve the primary credit for creating this technique and successfully applying it.

For a brief comparison of the two:

The Yale project (above) focused on the development of an economic grid cell calculator and algorithm; my projects and techniques (the details of which follow) focus on population density, health, disease, service availability, service needs, and cost calculated at the square and hexagonal grid cell level, with results depicted using suspended points, columns, and isolines. The Yale maps utilize a global basemap with state boundary lines depicted by a GIS software environment; my project is focused only on the continental US proper, and requires no base map or specific GIS program, thereby greatly shortening production time, but resulting in a reduction of quality in the presentation in terms of defining specific locations. Both of us developed rotating 3D imagery presentations, again quite separately.

3D BAR-SURFACE MODELING

Around 2001/2 the local immunization program for which I was performing the annual review demonstrated an unusually high frequency of parents refusing to participate in the childhood immunization program. A number of my friends were amongst the locals who behaved in this way, and when I asked them about their various reasons, I realized there was a topic here in need of further exploration. This set the stage for a several-year project I worked on about childhood immunization behaviors in the Pacific Northwest (detailed more in my review of Pacific Northwest public health issues). Standard immunization data was obtainable, but mostly from urban settings, and so data from several university settings with similar behaviors and stature in the “world of rebellious youths” so to speak (i.e. Madison) were compared to our local Seattle and Portland data. Zip code data was obtained and converted to grid layouts, along with some normalization processes. These results were assigned to cell centroids and the final results mapped. This National Population Grid [NPG] method is applicable to the United States, and can be produced in still image and video forms. These images very much resemble those of the G-Econ team, with the exception that my versions are depicted on a flat-surface map, not a spheroid (globe) surface object. Therefore, all math formulas developed and all manners of representation are based on projections above a flat-plane (not curved or global) projection surface, and evolved and developed separately from those of the Yale project. My formulas are specifically directed towards analyzing medical (electronic medical records/medical registries/databases) and consumer cost data (claims/service records), with very different research goals in mind (the public health questions underlying each query), and therefore consist of very different algorithms than most other economic, epidemiological or public health studies of this nature.
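The step from gridded counts to a bar-surface can be sketched as follows. This is a generic illustration rather than the NPG formulas, which are not released: the normalization shown (events per 1,000 residents), the function name, and all of the numbers are assumptions.

```python
def rate_columns(cases, population, cell_size, per=1000):
    """Turn per-cell counts into (x, y, height) columns at cell centroids.

    cases, population -- dicts keyed by (column, row) grid index
    cell_size         -- cell edge length in map units
    per               -- rate denominator (events per `per` residents)
    """
    columns = []
    for (col, row), people in sorted(population.items()):
        if people <= 0:
            continue  # no denominator, so no rate can be reported
        rate = per * cases.get((col, row), 0) / people
        # The centroid sits half a cell in from the cell's lower-left corner.
        centroid_x = (col + 0.5) * cell_size
        centroid_y = (row + 0.5) * cell_size
        columns.append((centroid_x, centroid_y, rate))
    return columns

# Hypothetical counts of refusals and of children, per grid cell.
refusals = {(0, 0): 12, (1, 0): 3}
children = {(0, 0): 400, (1, 0): 600, (2, 0): 0}
columns = rate_columns(refusals, children, cell_size=10.0)
```

Each tuple is one bar: its footprint is located by the centroid on the flat plane and its height is the normalized rate, so the list can be handed to any 3D plotting or rotation routine without a base map.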

Very Early Prototype of National Grid Map displayed at Oregon Public Health Association meeting (ca. 2001?), depicting partial data from 7 ICDs, 3 Disease Types (infectious, environmental, sociocultural), surrounding 4 research university settings aside from OHSU/PSU, using a ~90 x ~130 cell grid

[need more recent rendering with lat-long fixes]

This methodology hasn’t yet been applied to areas larger than the U.S. or to areas involving more northern latitudes, such as Canada or Russia. It is presumably applicable to other large areas of the world, however, like Middle or South America, Western Europe, or Africa. Projection issues become a concern for continents spanning a large range of latitudes, like South America and Africa, because grid cells defined in decimal degrees of latitude and longitude do not cover the same ground area at every latitude. Equal area projections are therefore more applicable for the base map design when analyses of these areas are being planned.
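The size of the latitude effect is easy to check. The sketch below computes the ground area of a cell bounded by lines of latitude and longitude, assuming a spherical Earth with a mean radius of 6,371 km (a simplification; an ellipsoidal model would differ slightly). The function name is made up for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical Earth is assumed

def cell_area_km2(lat_south, lat_north, width_deg):
    """Area of a latitude-longitude cell on a sphere, in square kilometers."""
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    return EARTH_RADIUS_KM ** 2 * math.radians(width_deg) * band

equator = cell_area_km2(0, 1, 1)    # 1 x 1 degree cell touching the equator
north = cell_area_km2(60, 61, 1)    # same cell definition at 60 degrees north
ratio = north / equator
```

Under these assumptions a one-degree cell at 60 degrees north covers roughly half the ground of the same cell at the equator, so raw counts per degree cell are not comparable across latitudes unless the cells are laid out on an equal area projection or the values are divided by cell area.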

Phase 2 essentially

Why learn these methods?

In the current marketplace, and in those settings where consumer demands and corporate costs drive most of the activities taking place, there are several advantages to knowing how to apply non-GIS algorithms in the work setting. In SAS, GIS is time and manual labor consuming. This method takes just twenty minutes to run. Standard GIS basemap production can sometimes take days to develop, and hours to days to rerun for a particular set of desired outcomes. The method developed is already capable of running 10 to 20 reports per 12 hr day, one series of base maps every 10 to 20 mins (100-200 maps per report). These reports can detail all or the most important diseases per organ system, per ICD numeric series, per ethnically-linked or related ICD series, per region in the country.

The major advantage of this refined grid method is that there is no need to make use of GIS for your reporting process, for example when generating periodic 50 to 100 page reports detailing 10 maps per section or page, with descriptions of significantly different small area events. This advantage saves both cost and time/manpower. It eliminates the need to proceed with a massive GIS program for producing your desired outcomes when evaluating an entire program (investment company, health insurer, national business, corporate overseer, benefits manager) with millions or tens of millions of people/patients/members. 300 pages of population health tables are replaced by 50 to 100 pages of non-tabular visual depictions of your findings regionally, by ethnic group, and by special ICD and V- or E-code classes.

This doesn’t mean GIS is not necessary to the industry for analyzing overall business success. It simply provides a method to more rapidly produce a report and illustrate the results of this report at a national or large area level.

Phase 3: Two runs on a list developed from data on animal/vector diseases.

These maps focus on those from the USSR or Russia, depending on the year of the data.

A second benefit of this method pertains mostly to the limits of time that are built into any mapping project. The richer the base map for your work, the more time is needed to reconstruct that base map each time a printout is produced. With time, speed will improve in GIS, but for now this is the time-limiting event in any large scale data mapping program. Producing raster or grid maps is always faster than line-arc mapping techniques when small area details are required for the final project. When we first began learning GIS back in the 1990s, and deciding between raster and vector methods was the hot topic, this was true due to the size differences in base data for each of these two methods. This base map data size issue no longer impacts such processes as storage, analysis and production, due to the design of larger storage methods and the development of faster data pull and analytical techniques, including parallel processing techniques at the Big Data level.

For students, the advantage here is that corporations and the like have been incredibly slow at incorporating GIS into their business strategy. Many lack the skill sets needed for good data mining in relation to spatial analyses, and few make any regular use of spatial analysis techniques. By learning how to study Big Data using spatial analysis Big Analysis techniques, you are several years ahead of the current industry standards. If you are already well trained in spatial analysis statistical methods, you are possibly a generation or two ahead of most non-GIS/RS corporations.


Phase 4: Grids

The most important end product of grid mapping versus regular zip code or other forms of small area mapping techniques is the regular layout of the data. If that data is valid and accurate, this allows us to produce highly accurate isolines depicting the various patterns that exist in populations and all of their sub-populations. This method can be used to map out such minutiae as the distribution of rare diseases, genetic traits, culturally-bound syndromes, culturally-linked disease states, and population density based disease and preventive medicine behavioral patterns such as child abuse, skipped blood tests, habitual late refills, or missed cancer screening appointment habits. The current industry-based standards are geared towards income and cost-benefit analysis, not health and population health behavior analyses, which is the primary reason this methodology was developed. Time-related costs for learning a new method and philosophy, along with lack of knowledge and forethought, are more than likely the primary reasons why this kind of analysis is not currently being used at the corporate level.
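Why a regular layout helps with isolines can be shown with a small building block of contouring: locating where a chosen level falls between two neighboring cell centroids by linear interpolation. This sketch is not a full contouring routine (it finds crossing points but does not join them into lines), and the function name and sample rates are assumptions for illustration.

```python
def isoline_crossings(grid, level):
    """Find where a contour at `level` crosses the links between neighbors.

    grid holds one value per cell centroid; positions are in cell units,
    with x as the column index and y as the row index.
    """
    points = []
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0)):   # next column, then next row
                r2, c2 = r + dr, c + dc
                if r2 >= rows or c2 >= cols:
                    continue
                a, b = grid[r][c], grid[r2][c2]
                if (a - level) * (b - level) < 0:   # values straddle the level
                    t = (level - a) / (b - a)       # fraction of the way to b
                    points.append((c + t * dc, r + t * dr))
    return points

# Hypothetical rates per grid cell (for example, events per 1,000 residents).
rates = [
    [2.0, 4.0, 8.0],
    [1.0, 3.0, 6.0],
]
crossings = isoline_crossings(rates, 5.0)
```

With evenly spaced centroids every interpolation spans the same distance, so the crossing points are equally reliable across the map; with irregular zip code centroids the spacing, and therefore the precision of each crossing, varies from place to place.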

A Contemporary National Population Health Grid Map

to see more . . .

I have a sister site for the national population health grid mapping project which, though not as detailed as these pages, is a standalone site that reads better and is easier to navigate.

LINK

. . .

Brian Altonen, MPH, MS