Brian Altonen, MPH, MS

October 24, 2024

Surveillance Projects: An Update

Posted by Brian Altonen, MPH, MS under Uncategorized | Tags: ai, gis, health, healthcare, spatial-epidemiology, surveillance, technology

Recently, I had to review the level of work being done for the healthcare program that employs me. This job is devoted to supporting and advancing the different forms of research being carried out at the NYC health care facilities overseen and/or managed by my chief employer. Most importantly, with COVID-19 outbreaks now as much a part of the past as they are a continuing concern in public health, researchers have been able to “catch their breath,” so to speak, and develop new research projects.

The one major consequence of the COVID-19 outbreak is that it finally demonstrated to the health fields the need to implement GIS in health surveillance and population health analyses, at all public and private levels. Some might remember that 10 years ago, I did a survey of epidemiologists and medical staff across the country, to determine how many were prepared for and using GIS as a part of their hospital programs. Only the experts in GIS chimed in on this. The majority of population health monitoring and reporting programs hadn’t yet taken up the use of GIS to define areas more at risk, or taken the precursors to this high risk, such as low income, into account.

By this time, I had already demonstrated the use of GIS to observe and evaluate the global transport of international diseases like Ebola. I did a full evaluation of this when it began showing signs of making its way to the United States. Few probably remember these cases and the public internet discussions epidemiologists were having as the number of Ebola deaths grew. Those discussions demonstrated the value of, and need for, better disease mapping as a live version of studying disease patterns, not just a retrospective interpretation of these historic events.

Prior to Ebola, we had the various antibiotic-resistant strains of contagion making their way to the United States. These were fairly low numbers of events, but the fact that a medication-resistant strain was able to infect the New York City subway, and other public transportation systems, due to health care givers ignoring the voluntary quarantine imposed upon them, demonstrated as well the need to do whatever it takes to make all people, not just patients and health care workers, conscious of public health related events and biological security practices.

Implementing a GIS devoted to epidemiological surveillance and prediction modeling makes all of the above possible.

This was a very popular “hot topic” in my field. It was most influential on health care programs devoted to public health surveillance. But for some reason, interest in improving our tools and resources fizzled out relatively quickly after MRSA (methicillin-resistant Staphylococcus aureus), MERS (Middle East respiratory syndrome), and then Ebola hit. A good portion of this I blamed, back then, on the lack of adequate experts in the field being used to build our disease surveillance programs. Now, this might seem ludicrous for me to say, but the World Health Organization and especially the US CDC had large parts of their research, work, and services devoted exactly to epidemiological monitoring, and both underreacted and were underprepared for the Ebola outbreak. At the time, the PR for all of this was that ‘in case of outbreaks’ they had people considered the experts in evaluating and managing such events. (At the time, however, what we didn’t know was that these tasks were assigned to new workers, and mostly hired students. Experts in this field, like myself, were not hired.)

West Nile, however, was the reason these programs finally grew. West Nile fever has been a yearly concern ever since Year 1. With so many years having passed since its birth in the winter of 1999-2000, today’s experts in West Nile should be using remote sensing and other advanced RS-GIS skills and processes for such work. This form of study is very different, however, from what’s required of a human population, travel and human behavior kind of study, which COVID was.

Likewise, preceding West Nile, we had the need to engage in surveillance of Lyme disease, which was a much more slowly progressing outbreak pattern, and which became an important example of how to engage in host-vector-case analysis. When the Borrelia for Lyme first re-emerged over by the New York-Connecticut border, its cause and source weren’t fully understood. Looking at historical data on this condition or disease, one can see that what is now called Lyme disease probably originated in the dairy farms in Sweden, decades ago, and so seemed to be reborn when it re-emerged in Lyme, Connecticut, thus resulting in its new name, for its new place of origin. Lyme was a slow migrator, making it easier to develop surveillance programs for it, and to test their validity and adequacy. In a way, Lyme disease ecologic studies served us very well in preparing us for the West Nile outbreaks that emerged later.

West Nile work in turn served to improve our skills in faster moving vector-transmitted diseases, providing us with GIS skills applicable to dozens more vector-transported diseases, and ultimately prepared us for looking at diseases that spread beyond the limits of their vectors, vector ecology, and host/carrier ecology. MRSA events emerged just in time, it seems. MRSA was not a host-vector ecology related issue; it was a human ecology or population-based spatial analysis issue.

But then Ebola diverted the attention back away from internally spread human-to-human contact diseases.

For this reason, we didn’t react much to the two sets of MRSA-like cases (MRSA and the Middle East origins events). As a result, we were not at all ready for observing and researching disease transmission as a live event. It took a fairly simple outbreak, like those of the various strains of COVID that emerged, to show us even more about how to look at disease in relation directly to people. It is hard to say with certainty whether it was the organism, or the improved knowledge looking for a use, that came first for each of these changes in our medical GIS knowledge, awareness, and technology. But, whatever the cause, it is also quite evident that outside the new fields and specialties emerging related to Medical GIS, the health care system as a whole was not ready to employ such studies.

In retrospect, one has to wonder why we were/are not ready. It always takes an embarrassing lesson to make administrators of our systems change their ways and invest more money in the needed infrastructure of health in general. But that is what happened here due to COVID in the past several years. Could this be avoided by departments having analysts begin such work early on, before it is requested?

When COVID-19 commenced, no one was ready for it, yet the insurance companies in particular could have been prepared, based on prior West Nile and MRSA events. The big insurers nationally had the potential at this point to already have a process developed for initiation, should such an outbreak occur. Similarly, we expect big city programs devoted to health care to have plans, at least in their heads, about what to do if that suddenly became necessary. Fortunately, Johns Hopkins had such an idea ready to be implemented. They defined for the world the reason we need to stop ignoring the increasing numbers of deadly disease outbreaks at the global level.

A part of this plan to implement GIS in health care came to be, probably, about 2004, when the ability to manage population health data, and compare and contrast it with institutional internal datasets, became possible, memory- and space-wise, on the desktop computer. Storage capacity was finally >8MB (remember then?), so population health data could be imported and used for mathematical analyses at the institutional, patient health research level.

Between 5 and 8 years ago (this is really an ongoing project, broken into parts, with stages of accomplishments), I began developing a population health database for use in the workplace, for analyzing urban health. A number of 2018-2019 projects had these processes tested, on studies devoted to tuberculosis testing and incidence, drug-resistance bacteria strains development, unhealthy human behavior related problems, and specific health risks found in unique populations.

By this time, I had also added religion to the way in which we “type” and “quantify” particular people and their cultural settings, or living places. Census and family features data were placed in the datasets being developed. National stats for specific common disease rates were used. Income, work, rates and dollar values, were related to specific places, focusing on Census block groups, census block, and zip codes areas, for which these were defined by US Government offices.

When COVID-19 broke out, the residents decided to add the features needed to measure and report on poverty and health, due to the types of medical history and socioeconomic status of our very first fatalities. It perhaps took about 2 to 3 months for us to confirm that this is the path that need be taken when researching this outbreak. Need I say, the general attitude at the time, just short of panic, was impacted greatly by the high degrees of stress most students in medicine (for me, the residents) were experiencing at the time, as we were trying to manage the first hundreds of cases, per emergency room, department, etc.

First Covid project

Source: https://www.researchgate.net/publication/341164552_1254_Covid_articles_For_rapid_review_of_Jan-Mar_2020_Biblioabstracts_pulled_and_evaluated_for_country_type_of_content_major_direction_for_research_Goal_was_to_review_all_articles_with_clinical_procedu

From February to April, I spent the time researching the medical journals worldwide about the outbreak. I did a full review of PubMed and the like for what had been published, and saw the temporal patterns of articles related to this virus, and the countries. About 1200 of these articles had abstracts or full text, and so could be evaluated for the beliefs then being published about the outbreaks. The full bibliography including abstracts was downloaded, at about 15-24 per page, and the available complete articles were read through over the next few weeks, providing certain key finds that could be targeted/labeled for later counts (a traditional qualitative analysis / pseudo-metaanalysis technique for first review, I would add).

It’s interesting to note that reviewing the foreign articles on the outbreak provided a significant amount of insight into how to approach researching this outbreak. China had a lead in many of the findings of this outbreak, which makes sense, being that it is the human geography origin of this outbreak. A couple of weeks into this review, I concluded that the one finding that really stood out was the association of outbreaks with patients with a diabetes history. We already knew that it impacted older people the most. In fact it was very deadly to those patients. But what stood out for me were the kinds of diagnosis histories that seemed to be shared across regions.

France was the first country to publish an article almost directly relating the onset of cases to a diabetes history. That was by around April (may have to look this up later). But among the first research done on this outbreak in China were some incredible analyses of the disease quite early on, in areas far away which were already stricken by similar outbreaks, such as Australia. Since the initial diffusion patterns of a contagious disease are often linked to economic and sociological cross-cultural hierarchical patterns, at the national level, we almost immediately find such a pattern of spread mimicking the centuries-old cholera diffusion patterns, a topic covered quite extensively in my 2000 historical epidemiology thesis. That meant to me that reversed hierarchical diffusion processes were almost immediately taking place.

The next key finding I had pertained to ACE1 and ACE2 binding sites. ACE2 repeatedly appeared to be an underlying background factor in patients who had it. At first, it appeared as though ACE1 binding, when it forced cells to initiate ACE2 production, might in fact be the event that gave COVID the environment it needed to cross the cell membrane and pass into the cell. That was the model a lot of us started to work with regarding a specific cause for the cases being developed. This ACE2 specificity also had to do with certain ACE inhibitors being used, and some speculated that those inhibitors, by binding the ACE1 molecules in the cell membrane, caused the cell to activate its processes for making the substitute or replacement for ACE1, namely ACE2 receptors. This you could tell by simply reading the annotated bibliography I produced, looking up the disease history worldwide.

By the end of April, I had a model in my mind of how the ACE2-COVID-19 relationship defined who could be most susceptible, and seeing who and what types of disease histories were related to this, it was only natural to next go to considering the possibility that this has something to do with the “metabolic syndrome” philosophy we developed in recent decades, in the field of medicine.

Metabolic syndrome is really not anything new to me, being that my family has it due to their ethnic history. This is a genotype-phenotype phenomenon that I began researching repeatedly in 1988, when a medical anthropologist (Weiss) composed his theory about how different cultures evolved this “gene” related to aging and health. Due to decades of reviewing that, I had a good background in this molecular pharmacology scenario and paradigm. This only took a day or two of percolating in my subconscious to lead me to make my first study of COVID-19 cases related to this hypothesis.

The first cases for which data came to me, in May, pertained to the liver functions and blood tests demonstrating the body’s reaction to what was happening in the blood. They didn’t really add to my model, which focused on ACE receptors and ACE2 susceptibility and treatment.

The next set of cases ended up allowing me to pull this entire concept together, and test it in numerous ways. At first, I was told these cases were in fact going to be what the common writings were telling us should be low risk cases. By May it had been decided that age was directly related to deaths due to COVID. Nearly all studies focused on how that was happening.

But my research population was of young adult age cases, and their medical history, and disease history, in relation to how the COVID impacted them (fatal or not). Basic methods of researching something like this were already being published by now. The first article on our local cases, covering thousands of patients afflicted and serviced by a nearby health care system or program, showed using basic stats what increased versus decreased their fatalities and impacted their recoveries and outcomes. My study focused on cases that required ventilators, in relation to the other aspects of care. In essence, the first published article reviewed everything possible, but took these reviews only to what I called, and was teaching to my residents, the intermediate level of analysis. Time and mortality were not taken into account in the best way possible. For example, the French article stated that diabetes appears to be related to a major cause for death. Did those people die quickly, or did it require some time for the effects of that predisposition to impact the COVID and what it might do?

The way to relate this to my group was to see which of the 175 studies had different levels of complexity in their multiple disease histories, and to compare death times and rates for those with more metabolic disorder-like features linked to their susceptibility. The binomials set up for this explored the numbers of days until fatality, relative to disease type and organ. So, for post-cardiac cases with need for ED intervention, did they die in a certain time, versus those without this history but still with several combined heart and blood vessel diseases (hypertension and hyperlipidemia), and did the second group die mostly in 4 days, 5, 7, 9, 10, 14 or even 17 days? All were tested. The results with the lowest p value suggested the pre-death condition was of greater risk than the rest.
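The day-by-day binomial comparison described above can be sketched roughly as follows. This is a minimal illustration in Python, not the original analysis: the two groups, their days-to-death values, and the choice of a two-sided Fisher exact test are all assumptions made for the example.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x "early deaths" in the first group.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table at least as extreme as the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))

def died_within(days_to_death, cutoff):
    """Return (deaths on or before the cutoff day, deaths after it)."""
    early = sum(1 for d in days_to_death if d <= cutoff)
    return early, len(days_to_death) - early

# Hypothetical days-to-death for two disease-history groups (illustrative only).
group_a = [3, 4, 4, 5, 6, 7, 9, 12]
group_b = [8, 10, 11, 14, 15, 17, 18, 20]

# Test each candidate day cutoff as its own binomial (died by day N: yes/no).
results = {}
for cutoff in (4, 5, 7, 9, 10, 14, 17):
    a, b = died_within(group_a, cutoff)
    c, d = died_within(group_b, cutoff)
    results[cutoff] = fisher_exact_two_sided(a, b, c, d)

best_cutoff = min(results, key=results.get)
```

Scanning many cutoffs and keeping the lowest p value inflates the chance of a false positive, so a cutoff found this way is best treated as a hypothesis to confirm on other cases.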

This series of tests, and checking the time frames, in view of what the bibliographic research suggested, suggested to me that there was a critical time for onset of conditions that turned fatal–the critical days were 18-20, when something in the body kicked in, and resulted in a very ill patient, who would either live a long time on the ventilator, or experience very specific long term effects due to their disease history, but recover. Early deaths were due to one set of features; later deaths due to another; all of these somehow related to the ACE2 receptor and ACE1 vs ACE2 drug history and use.

I wrote up some prediction model binomials to test all of the above, and they supported the metabolic syndrome association I speculated about, but was not completely willing to accept the definitions and terminology for, as posed in the medical literature.

So, that test was included in my write up, which included a complete and full review using Kaplan-Meier (KM) survival analysis to research and model the cases and their mortalities over time. I used KM as it related to other factors we might normally associate with it, to show that there are specific high risk factors, diabetes being one of them, but more importantly, the relationship of diabetes to the therapeutics being employed, and the underlying genetics of the patient in terms of ACE1/ACE2 reactions to the COVID infection process.
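For readers unfamiliar with the Kaplan-Meier approach mentioned here, a minimal sketch of the product-limit estimator is shown below in plain Python. The follow-up times and event flags are hypothetical, and the function name is only illustrative; the original work was not done with this code.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each time with at least one death.

    durations: follow-up time in days for each patient
    events:    1 if the patient died at that time, 0 if censored
               (for example, discharged or still hospitalized)
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving this time.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= removed
    return curve

# Hypothetical follow-up data (days, event flag); illustrative only.
durations = [4, 7, 7, 10, 14, 18, 20, 20]
events = [1, 1, 0, 1, 0, 1, 1, 0]
curve = kaplan_meier(durations, events)
```

Running the estimator separately for each history group (for example, with and without diabetes) gives the curves that are then compared visually or with a log-rank test.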

About July 1st is when I saw this in the stats, tables and preliminary graphs being developed. At that time I was being asked ‘are you ready with it yet?’ in terms of submitting the results and article for consideration. I replied, I needed a few more days to research that last series of questions of mine.

The next few days, I confirmed that suspicion, and submitted the final version in early July.

We then waited for acceptance or refusal. It underwent the first two sets of reviews, expected due to all of my math, but lagged past August, which got me worried that somebody else might publish the findings first. But everything got accepted on time. The electronic publication went online by October; the physical form came out by November.

Apparently, this was a good finding. I was not referred to much at first in other articles on COVID. Those references tell me people saw the value in how I made detailed, exceptional use of the survival plotting and prediction modeling. My impression is that this method of analysis usually is done a few months into research, not just 3 weeks into it. You have to fully understand the gist of your first two sets of findings before using them to help you design your more advanced levels of analyses.

To summarize all of this, there was a section of a research methodology proposal I have to always mention to researchers–the really simple basics for explaining how you will analyze data. It is a pretty generic style of writing. I have seen resident teams for varying institutions all seem to state these same things in the Internal Review Board (IRB) papers submitted for project plans or approvals.

What follows that one opening paragraph describes the ways in which the COVID outbreak resulted in major changes in how we can analyze public health at this epidemiological, surveillance level.

Prior to COVID, systems were against spatially modeling diseases, and were even more against the methods I began producing and applying to Big Data projects. The main concern was that this compromises personal health information related security.

Methodology (a standard write up)

Most of the demographic data will be assessed using descriptive and comparative methodologies related to the standard Chi-squared, Pearson’s tests, and Student’s t-test methods, for reviewing continuous and non-continuous variables. Descriptive tables will be generated for information related to gender, race-ethnicity, and patient lifestyle related descriptors such as insurer group and years of experience with conditions under review. Continuous variables such as age, and numeric laboratory outcomes, are tested as individual values (age, years, lab results) and groups (age ranges, relative lab results ranges, normal/abnormal outcomes results). One or more forms of regression analyses are typically performed on the larger variable sets, to review these sets for unique group-related differences, such as impacts due to age, gender, race-ethnicity, insurer class (low or no income vs. employed), and other lifestyle features (zip code defined areas and their socioeconomic status, such as high vs low income areas, race-ethnicity predominant or not, etc.). Where dates or patient’s age are collected, such as length of hospital stay or age at diagnosis, temporal evaluations will be run, namely Kaplan-Meier and/or Cox regressions, to discretely define the highest risk groups. Continuous or parametric numeric results, such as for labs, health scores, and graded medical evaluation outcomes, allow for risk assessments to be performed, determining the most critical ranges where prognoses might be likely to become more reliable, based on p values produced through these tests.
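As an illustration of the base-level comparison named in this write-up, the sketch below computes a Pearson chi-squared statistic and p value for a 2x2 table in plain Python. The counts are hypothetical, no continuity correction is applied, and the function name is only illustrative; in practice this step would normally be run in SAS or SPSS.

```python
from math import erfc, sqrt

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts: rows are two insurer groups, columns are abnormal/normal labs.
stat, p_value = chi_squared_2x2(30, 10, 20, 20)
```

With small expected cell counts, an exact test is usually preferred over this approximation.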

Analytics

Amounts of analyses performed for this work are based upon datum quality and quantity, datum form and content, and datum types and content in terms of format (numbers or not). A standard base level review of health data results in typical tabulations of results, with p values provided wherever possible based on data form, consistency, and content. The most basic data produced from these initial studies focus mostly on demographics and patient medical and health history features, and may be presented mostly as tables and a few basic figures describing the population and depicting the results. An intermediate level of data evaluation includes the implementation of regression analyses and certain cross analyses of select groups identified as important to the study due to applications of these results. These analyses generally contain more tables and more constructive figures, enabling interventions, for example, to be developed and followed up, in order to assess the study for validity and its use in predictive modeling.

A third level of analysis, generally termed “advanced” versus intermediate, consists of highly detailed regression analyses, with continuous and non-continuous variables assessed in multiple ways, including through the use of temporal data, thus resulting in the previously stated Kaplan-Meier (KM) and Cox regression methods, and repeated t-test methods of analyses. These allow for more detailed research plans and programs to be established in terms of the patient care program being evaluated, including post-activity and/or intervention analyses.

The fourth level of analyses makes use of space or place as a variable. This has usually been omitted from nearly all studies engaged in, prior to the development of a more useful, functional geographic information systems (GIS) program for use in population health analysis. The latter method has traditionally had the issue of releasing too much personal identifier data, based on the assumption that if a reader is given ample amounts of details about an individual, including approximately where he or she lives, the actual patient could in theory become identifiable. Due to population density related features, and the irregularity of health insurance and health care programs in terms of planning and patient-level personal identifiers/spatial data definitions, the notion that someone can be identified is a self-imposed bias added to the conclusions reached. The use of regional or areal identifiers reduces the likelihood that such identification is possible. The addition of zip code data to a study, in order to group patients into geographically similar regions, allows patient groups to be used to define the impacts of specific living practices, and allows regionally assessed or summarized lifestyle behaviors and habits to be evaluated for patients who share the service of a single sizeable health care program. The features of zip code defined areas most applicable to this work relate to socioeconomic status (SES) levels of identity, community, culture and income related living behaviors, and personal-professional activities engaged in at the family, neighborhood, and professional contacts levels. This spatial method of patient health analysis is currently in its early initial phases. A sizeable database has been developed to allow such a process to be implemented for all studies performed at H+H, based upon data content. Such an approach has become a standard of health surveillance, engaged in by publicly operated, publicly accessible health service providers and surveillance programs, initiated during the outbreak of COVID-19 in December of 2019.
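The idea of reporting by zip code area rather than by individual can be sketched as below. This is an illustrative Python example only: the record layout, the placeholder area codes, and the small-cell suppression threshold (`MIN_CELL`) are assumptions for the sketch and are not taken from the H+H database.

```python
from collections import defaultdict

MIN_CELL = 11  # hypothetical minimum group size before a rate is reported

def aggregate_by_zip(records, min_cell=MIN_CELL):
    """Summarize patient-level records as one case rate per zip code area.

    records: iterable of dicts with a "zip" key and a 0/1 "case" key.
    Areas with fewer than min_cell patients are reported as None (suppressed),
    so no small, potentially identifiable group appears in the output.
    """
    counts = defaultdict(lambda: {"patients": 0, "cases": 0})
    for rec in records:
        cell = counts[rec["zip"]]
        cell["patients"] += 1
        cell["cases"] += rec["case"]

    summary = {}
    for zip_code, cell in counts.items():
        if cell["patients"] < min_cell:
            summary[zip_code] = None
        else:
            summary[zip_code] = cell["cases"] / cell["patients"]
    return summary

# Hypothetical records using placeholder area codes (not real zip codes).
records = (
    [{"zip": "Z0001", "case": 1}] * 3
    + [{"zip": "Z0001", "case": 0}] * 9
    + [{"zip": "Z0002", "case": 1}, {"zip": "Z0002", "case": 0}]
)
summary = aggregate_by_zip(records)
```

Only the aggregated `summary` would be mapped or shared; the patient-level records stay inside the institution.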

Discussion

In most H+H residency level research, public health maps are not produced using a standard GIS, but can be approximated quite easily using a number of basic SAS and/or SPSS software analysis techniques. For reporting SES and place, relative to population health level case data, both 2-dimensional and 3-dimensional methods have been developed and employed numerous times, since their development in 2004, and implementation in the public health sector beginning 2009, reaching their first levels of success in 2019. To date, area health related SES features are reported as aggregates based upon a series of population level classification systems, merged to form a value that quantifies each regional type. Regions which have similar socioeconomic features (i.e. dominant and/or mixed income, household family member counts, percent employed vs. unemployed and/or on MCD/MCR, marriage/divorce/single parenting rates, food stamp utilization levels) and general demographic features (gender, race-ethnicity) are classified into specific groups. High, medium and low health risk regions were effectively identified using this methodology, just prior to the COVID-19 outbreak, and the methodology was implemented and tested from July 2020 on.
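One simple way to merge several area-level indicators into a single value and then group regions into low, medium and high risk is sketched below in Python. The standardize-and-average scoring, the tertile split, the region names, and the indicator values are all assumptions for illustration; the classification systems actually used in this project are not reproduced here.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize a list of values to mean 0 and standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

def classify_regions(indicators):
    """Merge indicators into one composite score per region and label by tertile.

    indicators: {region: [values]}, each value oriented so higher = more risk.
    Returns (composite_scores, labels).
    """
    regions = list(indicators)
    n_indicators = len(next(iter(indicators.values())))
    # Standardize each indicator across regions so they share a common scale.
    columns = [z_scores([indicators[r][k] for r in regions])
               for k in range(n_indicators)]
    composite = {r: mean(col[i] for col in columns)
                 for i, r in enumerate(regions)}
    # Rank regions by composite score and split the ranking into thirds.
    ranked = sorted(regions, key=composite.get)
    labels = {}
    for rank, region in enumerate(ranked):
        labels[region] = ("low", "medium", "high")[rank * 3 // len(ranked)]
    return composite, labels

# Hypothetical areas: [percent unemployed, percent using food stamps].
indicators = {
    "A": [3, 5], "B": [4, 6], "C": [8, 12],
    "D": [9, 14], "E": [15, 30], "F": [18, 35],
}
composite, labels = classify_regions(indicators)
```

Equal weighting is only one choice; weights reflecting how strongly each indicator relates to the outcome under study could be substituted.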

Final Testing

Since 2010, SQL has been the standard method used to evaluate ‘Big Data’ and develop ways to model the datasets and produce mappable results. When this project began, SQL lacked any easy, low-cost way to transfer data for GIS use, and so datasets were developed in SQL, but then transferred and run in SAS, in order to produce final population level aggregate health data for use in spatial modeling.

At the time this was developed, SAS had produced a crude GIS that could be run in the system. But the results of those SAS-GIS projects were very poor (the worst I have ever seen). After about 5 years of trying to improve their SAS-GIS add-in, that plan was dropped and the standard software companies were invited in, to replace SAS-GIS. During this period of testing the mapping of GIS data using SAS, a series of new algorithms were written. The first was to publish this data as point data on a national zip code related centroid dataset, for use in spatial analysis. A given zip code had the raw data and/or normalized data added to the national map produced. This map was developed by merging two SAS programs into a single program: the first produced the centroid values, and the second transferred those values to a SAS-GRAPHICs model producer, resulting in much more intricate, even colorizable 3D maps of the areal health data. Due to the clearness of this means of mapping spatial centroid data in SAS, this method was immediately tested on the majority of ICDs, ICD classes, and subgroups (by age group, race, gender, etc.), to produce 3D models of US maps. The nature of the SAS programming used allowed for one map to be produced for each set of 3D angles for looking at the US image. A repeated program was written to produce a 3D map at a given degree set for x, y, z, and then the values were changed for each single new run, to produce images in sequence that rotated from 1 degree to just under 360 degrees, in 10, 5 and finally 3 degree increments. Using 360/10, the first set produced an image sequence that rotated crudely around from 1 degree back to 1 degree (37 images). The 5 degree model produced 75 images. The 3 degree rotation models produced 76 plus images.
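The rotation sequence described above amounts to generating one viewing angle per image. A small Python sketch of that bookkeeping follows; the function name, the 1-degree start, and the choice to repeat the starting angle as a closing frame are assumptions, and the counts it gives need not match the image counts reported for the original SAS runs, which may have used a different start/stop convention.

```python
def rotation_angles(step, start=1, closed=True):
    """Viewing angles (in degrees) for one full turn at a fixed increment.

    step:   degrees between consecutive images (assumed to divide 360)
    start:  angle of the first image
    closed: if True, repeat the starting angle as a final frame so the
            sequence visibly returns to where it began
    """
    angles = [(start + k * step) % 360 for k in range(360 // step)]
    if closed:
        angles.append(start % 360)
    return angles

# One map would be rendered per angle; here we only count the frames.
frames_10 = rotation_angles(10)
frames_5 = rotation_angles(5)
frames_3 = rotation_angles(3)
```

Under these assumptions the three increments yield 37, 73 and 121 frames respectively, and each angle would be passed to the map-rendering step in turn.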

The time-consuming part of the SAS formula is mostly related to dataset development and testing. So the production of subsequent figures for each new rendering of the dataset at a new degree of spin only takes a second or less. A 76+ image set runs in just 3 to 5 minutes. A standard run to produce a useful video required a minimum of 300 images, preferably more like 900 to 1000, for a 2-3 minute video to be produced using those maps. SAS had the capability of placing those images in sequence into a video, or a manual process could be used by exporting the SAS product into some other software package that produces the video, which is usually of much smaller size. The size of a SAS-produced video can be 32MB, for example, whereas the exported SAS data, imported into some video program, produces a comparable video of just 3-7.5MB in size, which was much more postable and transferable on the internet in 2010.

The major improvement after producing the zip code centroid rendering of these data was the conversion of zip code centroid data into hexagonal grid datasets, using a model and the math I developed for producing hexagonal grid centroid data. Zip code centroid data were then transferred to hexgrid modeling programs, and this led to my creation (invention) of the first hex grid models of spatial data, using the programming, mathematical formulas, and methodology that I developed in December 2003/January 2004, added to the SAS shell of this work.
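The formulas developed for this project are not given here, so the sketch below uses a generic, textbook construction instead: centroids of a pointy-top hexagonal grid laid over a bounding box, with each point value summed into its nearest centroid. It is written in Python rather than SAS, the function names and sample points are hypothetical, and it assumes coordinates are already in a planar (projected) system rather than raw latitude and longitude.

```python
from collections import defaultdict
from math import hypot, sqrt

def hex_centroids(min_x, min_y, max_x, max_y, size):
    """Centroids of a pointy-top hexagonal grid covering a bounding box.

    size is the distance from a hexagon's center to any of its corners.
    """
    dx = sqrt(3) * size   # horizontal spacing between centroids in a row
    dy = 1.5 * size       # vertical spacing between rows
    centroids = []
    row = 0
    y = min_y
    while y <= max_y + dy:
        # Every other row is shifted half a cell to interlock the hexagons.
        x = min_x + (dx / 2 if row % 2 else 0.0)
        while x <= max_x + dx:
            centroids.append((x, y))
            x += dx
        y += dy
        row += 1
    return centroids

def bin_to_hexes(points, centroids):
    """Sum point values into the nearest hexagon centroid.

    points: iterable of (x, y, value), e.g. zip code centroids with a count.
    On a regular hex lattice, the nearest centroid identifies the hexagon
    that contains the point.
    """
    totals = defaultdict(float)
    for x, y, value in points:
        nearest = min(centroids, key=lambda c: hypot(c[0] - x, c[1] - y))
        totals[nearest] += value
    return dict(totals)

# Hypothetical zip code centroid values on a 10 x 10 planar extent.
centroids = hex_centroids(0.0, 0.0, 10.0, 10.0, 1.0)
points = [(0.1, 0.1, 5.0), (0.2, -0.1, 3.0), (5.0, 5.0, 2.0)]
totals = bin_to_hexes(points, centroids)
```

The brute-force nearest-centroid search is fine for a sketch; a national grid would call for direct coordinate-to-cell arithmetic or a spatial index.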

Years later, in 2009-10, the SPSS method for developing these was initiated, and the formulas and work were published; the method was first used by Canadian Urban Science researchers. It has since been applied (with the appropriate permissions) by a number of foreign institution-based human ecology and population health programs. Any U.S.-based GIS utilization of this technology is by companies that did not ask for permission to “borrow my intellectual property” and run the models. This potential “stealing” of IP began years later, when I first noticed it being used in a taped presentation for an international GIS conference devoted to public health.

Since 2020, more than 200 active studies were used to test the above-stated method of testing population health. Prior to 2020, it was developed and tested using only about 40 such models, with the public data drawn from numerous public data sources, and the health intervention/activity data drawn from various public and professional data sources, mostly in aggregate form based upon data provided for zip code defined areas. Approximately one fourth of these preliminary studies also implemented use of small area census block or block group data and spatial modelling datasets, to compare small area testing with medium sized area testing, using standard GIS (QGIS c2.0) and SAS- and SPSS-produced geographic/spatial equations developed for modeling these data. To date, about 10-20 models are developed per year, for testing and refining this approach to spatial health analysis.

The most important advancements in this project are 1) the testing, validation and improvement of implementing and reporting upon (in limited fashion) the outcomes of zip code area-defined public health patterns, and 2) the growth, development, and repeated re-testing of zip code area defined socioeconomic features and their applicability to using such datasets to develop further, much more detailed insights into the projects engaged in by this department and program. The current SES dataset has a little more than 200 zip code areas defined fully, at the standard US Government, US Census, US population predictions, US Postal Service, and standard MCD/MCR levels, as well as the current educational history, religious history, voters/political party history, small business investment grants history, criminal history, NPO/businesses history, and regional economic/financial history datasets, which were developed and placed on the web for use by planners and developers across the U.S. To date, the SES dataset has 40-50 very reliable datum points used to define areas, but the database has up to 1000 columns if the complete arrays of the above are included in the working datasets applied to spatial analyses.

April 29, 2024

Welcoming the 250th

Posted by Brian Altonen, MPH, MS under Uncategorized

Two hundred and fifty years have passed since the American Revolution.

Although we mostly relate this important period in history to the famous year 1776, 1775 is the more appropriate year to begin acknowledging this celebration. That is because many highly important events, enabling the war to commence, happened before the year in which formal events linked to the war were initiated.

One might even argue that the Revolution had events that took place in 1774 as well (in fact it did), leading next to public controversies and ultimately the official signing of documents linked to our desires for independency. But as with most points of time in history, we try to assign a specific date to the true commencement of an event as serious and important as a war, and try to define the events and time(s) that made this important period in history undeniable and unforgettable.

Ten to fifteen years ago, I did much of my writing on the Revolutionary War and Colonial New York history for this blog, which is devoted to the findings I made over the years and decades prior. And I admit, since then, I have done little to “update” these essays. Why? Because the findings made since those initial 2009 to 2012 posts have been quite extensive, and were worked out at length on my two Facebook pages. Those essays perhaps outshine the initial ones I developed pre-2015. Their type, and the amount of work done as these discoveries were being posted, were much more detailed and lengthier than the previous posts resulting from my work completed over the two decades before.

Why did such large amounts of material appear all of a sudden during this time? It is due to the world wide web, and Europe’s reaction to this new information technology. The United States educational programs were slow and resistant to sharing whatever “knowledge” they felt they “owned”. As usual, financial compensation was then required in order to learn. History was not available to the masses, so long as inquirers never tried to search the world wide web. Fortunately, many of these limitations have now been greatly reduced.

In 1990, you had to have a PhD and the money needed to access rare documents at such special collections as those at the Yale or Harvard libraries. Fortunately, Google made an effort to have many of these documents scanned, to be made available for the world to see. But the scanning, preparation and final release of such rare documents was a slow process indeed. Teaching the world in the way that Google Books could accomplish took ten to fifteen years to evolve into what it is now. As a result, knowing what I needed to see, where to look, and how to translate these important papers, I was hindered only by whether or not they were available on the internet.

The opening of the world libraries made it possible for me to answer questions that remained unanswered in local history for a good fifty to one hundred years. Regarding my study of colonial medicine in Hudson Valley, New York, the most important events that ensued due to references that became available on the internet were as follows:

First, the discovery of George Starkey/Stirk and his work at Harvard University very early in colonial history, on the alchemical principles of what is popularly referred to as the “philosopher’s stone”, the prima vitae or essence of life, and its relationship with two medical recipes or ingredients referred to by Dr. Cornelius Osborn: ens veneris and sal ammoniac.

Second, the actual publication of Cadwallader Colden’s work on the Flora of Coldengham, New York. This was the first such documentation done on New York flora, and revealed important natural history observations by the author on the origins of some of the earliest introduced plants in this region. This was later complemented by the discovery of Jane Colden’s work by Hessian botanists and Carl Linne beginning about 1787, its rediscovery by Anglican scientists when the Linnaean collection was bought by England from Sweden in 1800, and again when the local Orange-Dutchess County Floral society rediscovered her work and finally reviewed it in manuscript form in the late 1950s.

Third, details on the life history of the first local physicians, in particular, examples of writings penned and published by previously unresearched local physicians, like . . .

a) Isaac Vale Van Voorhis, who was forgotten because of his death as the third physician ever residing at Fort Dearborn, Chicago, during its first year of settlement. Dr. Van Voorhis experienced an unfortunate outcome due to his service at the fort. He was one of the first graduates of Columbia University, the third from Columbia who served at Fort Dearborn, but the only one from Fishkill who was there at the worst of times. As a result, he was massacred during the initial hours of the War of 1812. His corpse was not recovered for years. He was mistreated by later individuals during the first major national recognition of this historical event: the descendants of other military leaders in this battle, trying to promote their relatives as “heroes” (some in fact may not have been), retold some very personal tales of what happened during the famous “Chicago massacre”. One in fact restated her family’s claim that the doctor was ‘cowardly’, when in fact he was trying to escort about 40 mothers and children away from the center of the attack about to be made by Indians. These comments have since been interpreted by historians as demeaning and slanderous, but have remained the versions retold in popular press writings published as fictional books, mostly for kids, providing half-truthful cultural remembrances of this encounter. As a result, it has been difficult to nearly impossible for historians to bring back the true story of who Dr. Van Voorhis was, his reason for serving at the fort, and the many misfortunes he seemed to face in signing up to serve there, and to obtain for him the recognition he earned at Columbia for his MD degree and the appropriate respect and recognition he deserved as a very young soldier killed in a massacre. (In the image below, Dr. Van Voorhis is resting on the ground, beneath the 3 still standing in the re-rendering of this encounter produced by a famous sculptor for site recognition day in the late 1800s; it is an object since removed from the site, ca. 1976, due to claims that it portrays the Native American aspect of this encounter in a “negative way.”)

b) the local Dutch and Quaker histories related to the popularization and public acceptance of “electric healing” in two forms: the copper-iron dielectric galvanic device theory that was born just across the border in Connecticut about 1795 but promoted by residents in the Quaker hamlet of Oswego Village, and the static electric generating flat disk therapy popularized by a Quaker removed living in Dover (Jedediah Tallman) and his follower who healed using bimetal calipers in “Fishkill” (now the East Fishkill area). Still, the most intriguing part of this historical tale is the reason why and how a religion like Reformed Quakerism might take on the idea that “electricity” can be used to heal, perhaps even as a sign of “god” to them, which they envisioned and verbalized using the terms “force” and “universal energy.” Now, this isn’t the first time we were “electrified” in the Hudson Valley by the notion of God’s energetic “spirits”. The famous Leyden Jar came to the valley decades earlier, during the periods of Dutch family dominance and even “rule” between 1650 and 1690, with the possession and use of perfected Leyden Jars in this region evidenced by belongings documented in historical records, property belongings penned once the English had taken over New Netherlands, including the upper Hudson Valley, in the very early 1700s.

c) the Quaker/Quaker Reformed doctor Shadrach Ricketson, who was famous for arguing that poppy could be grown and cultivated in North America. He introduced the cow pox vaccination technique to this country around 1802/1803. His famous book, published in 1806, furthered the work on home health, domestic medicine, and healthy living practices, the most widely respected version of which had, until this time, been penned by an English physician well known for his unique entrepreneurial, upper class, “well-endowed”, well-fed (meat and fat, gout-stricken), morbidly obese appearance (400 lbs or more) by the mid 1700s. Many contemporary book dealers in the U.S. honor Shadrach as the first U.S. author to support the role of sports, recreation (indoor and outdoor), and athleticism in personal healthiness; to them, Ricketson’s book on how to live healthily and stay in good health is the first medical book written and printed by an American printer and press.

d) Poughkeepsie’s most important role in the popularization and dissemination of Thomsonian medical philosophy and knowledge, today considered part of the knowledge and popularization of various forms of alternative medicine. The Poughkeepsie printing press made it possible for the preachings of Thomsonism to be printed and dispersed across this region. This press had become one of the most important in the history of this “non-allopathic” medical profession by the late 1830s, and remained one of the most important, and finally the only printer of this discipline for the northeastern U.S., by the mid 1840s. By the late 1840s, this discipline broadened its versions of medical practice and related fields of interest greatly, and nearly every practice of healing ignored by regular doctors was taken on and preached by this rapidly growing “reformed medicine” school of thought. By 1850, the Orson Fowler family promoted how to build the healthy octagon home, and how to practice a very useful form of human psychological diagnosis and treatment known as phrenology, and used their octagon house and local public speaking places to promote these ideologies. Such places allowed Mary Baker Eddyism, Quimbyism, Herbal Medicine preachings, Shakerism, Indian Root Doctoring, Water Cure or hydropathy, Sylvester Graham’s early Vegetarianism-like movement, Mountain/Climate/Fresh Air cures, the wearing of Galvanic Coins and Necklaces, and Andrew Jackson Davis’s teachings in Seancing to become important parts of our local culture. There is but one profession they did miss, however, which had much the same preachings: Hahnemannism, introduced by the nearby German communities in the Allentown area of Pennsylvania. It came to this region by way of the regular doctors, Dr. Vanderbilt for one, convinced this was a valid, useful preaching or teaching.

e) the development of the medical profession and its teachings, from what it was during the early Colonial Period and early Revolutionary War era, through how the State medical associations were formed and their constant changes in education, learning and requirements from the end of the war until this profession became more solidified by the 1820s. The first official medical society meetings, the educational experiences, the discoveries, and the first medical writings of local physicians were documented in their quarterly trade journal for this region.

A total family life medical professions study of Dr. Cornelius Osborn (1722/3 – 1782) demonstrates most of the stages of legal development of the profession. Dr. Osborn himself was trained using Apothecary Guild licensing practices carried out in Europe (England), but probably as a resident of the lower Orange County region near Haverstraw. Evidence suggesting this is the fact that a known highly trained practitioner with the Osborn surname is noted in this area, along with at least some related families. In the Haverstraw church baptismal records, a Cornelius Osborn is found baptized in 1722; a note made about Cors. Osborn by Benson Lossing gives a date, from interviewing family members in the early 1800s, that is one year off. The father of Cors. is named James. His responsibility around 1720 was surveying the region in preparation for building new roads leading to Ulster County to the north, the southern boundary of which then was just north of what is now the northern edge of Newburgh (by Route 84). Just west of this area was the land upon which Cadwallader Colden settled.

Adjacent to the Osborns in the Haverstraw area, just off the Hudson River in Florida, NY, was a family into the household of which a young boy named David Wood would be transferred by his much larger extended family of Methodists, who lived in the Flushing area to the southeast, and across the Hudson River directly to the east. About 20 years later, David Wood learned medicine from a German family residing down near Warwick. During his apprentice years, David Wood was trained by three or four specialists: someone practiced in medicine, another individual learned in apothecary (possibly from across the border in what is now New Jersey), a doctor practicing husbandry and animal health, and finally, a male midwife. These are all detailed in a book on the teachings Wood went through during this time.

Then there was the training of Cadwallader Colden, at the northern end of Orange County, the southern edge of Ulster County. Cad Colden was trained by a physicians’ business in London, about 1707-1710. He received his pre-medical college education from the University of Edinburgh during the three years before.

The doctors of the Hudson Valley learned medicine in various ways. Important to note here is the fact that many if not most practitioners were well trained and learned in medicine; there were more of these than there were self-proclaimed, untrained “healers.” The habit of medical historians is to frequently preach the notion that educated physicians were rare in the early Colonies, which is only right if you take a one-sided, biased look at the profession during this period in local history. Such a claim was made forever popular when it was made by the British writer detailing New York Colonial history, Philip Smith, ca 1750. However, his writing on New York history and the education of New York residents was solely British in nature, and the British were indeed in political and intellectual competition with the Dutch throughout the decade prior. They won an important part of this political “war”, characterized as a series of political skirmishes on different claims and subjects, as the British finally managed to take New York back from the Dutch, twice. The survival of Smith’s statement left subsequent researchers and writers on local medical history robbed of nearly all of the history and Dutch-based truths about this region. Dutch settler history demonstrates that a very large number of settlers and travelers through this region were in fact well educated and referred to as “doctors”. Perhaps as many as 25% of these earlier Dutch patrons passing through this region were trained esquires, religious leaders, politicians, historians, educated in the classic writings, multilingual, engineers, mathematicians, philosophers, and physicians. Such was the nature of the advanced education programs situated throughout eastern, middle and western Europe. British writer Philip Smith was never interested in this fact about their levels of education.

So how learned was Dr. Cornelius Osborn when he was recommended by Col. Abraham Swartwout to be recruited to serve the Fishkill regiment led by Swartwout, with his son James to serve as his Field Physician assistant?

Osborn was granted official acceptance of his medical ideology by Samuel Bard just before the war began, in August 1775. In spite of his probably apothecarian-chymistry focused training, he was very fit for this work, and he was a fairly well trained chemist of numerous uses to the troops as well as their leaders. But he was also trained as a combined practical/philosophical chemist (“new apothecare”). Did Bard ever know this about Dr. Cors. Osborn?

This leads up to how his sons got trained to become doctors; two out of three (James and Thomas) were fully accepted as such.

His oldest son James earned his MD through approval after the war; he was his father’s assistant during the War, and was probably approved as an MD by a local esquire or government leader, ca. 1784. He even left us a manuscript, mimicking (copying) his father’s for the most part, as the written material required for his licensure (to prove his ability to write).

The middle son Peter was less fortunate. The Dutchess County Medical Society was forming and regularly meeting by the time Peter was old enough. The society defined the requirements for teaching, testing and approving those who earned their licenses in medicine. Until then, one learned by being trained by another physician and/or by serving as an assistant to the surgeon for a military regiment. Peter attempted the latter. Peter probably hoped this process would be as easy for him as repeating his father’s work was for James. That wasn’t the case. “Medicine” then was not just practicing your skills in apothecary science, making concoctions of medicines and such. And so after his first 3 years of service, attempting to assist the military surgeon, Peter was required to repeat his position for another 3 years, which he declined, and so he never became a physician.

The youngest son of Cornelius Osborn was Thomas. By the time Thomas was old enough to qualify for medical training, another increasingly famous physician, Bartow White, had moved to Fishkill. Dr. Bartow White was the son of Dr. Ebenezer White of Westchester, who, like Dr. Cornelius Osborn, also served in the Revolution. Ebenezer sent his son to the young medical school in New York City, Columbia, just before they made classes a requirement for obtaining your license. Whether or not Thomas went to Columbia is uncertain, and it seems likely he didn’t. Instead, Bartow may have held classes at his home, for Thomas Osborn is noted as learning alongside Dr. Bartow White. Dr. White was responsible for licensing about two or more other young men per year in medicine between 1797 and 1805. (One of the better known of them was Dr. Isaac Van Voorhis, who died in Chicago. Dr. Van Voorhis wrote and had published an article on his experience next to Dr. White, performing a forensic surgery investigation of a patient’s body.)

Thomas, Peter and James Osborn had a nephew, Cornelius Remsen, who moved in with them after the Revolution. Cornelius Remsen was born in the hamlet of Newton, today, an unknown, unrecognized hamlet located next to Livingston Manor in the Catskills. In past attempts to document Cors. Remsen’s history, all past history and family genealogy writers misidentified his birth place as Newton, “Long Island”, interpreted as Newtown. Reviewing the records of Newtown, Long Island, there were members of the Remsen family there, with a family history published stating this name actually came from a prior surname.

According to early family interviews, the most famous post-Revolution history writer, Benson Lossing, penned “Newton, LI” as the birthplace. But a review of historical papers and ephemera led me to uncover the phrase “Newton, Liv[ingston]” as also the place of a mercantile store in New York, which ended up being where Cors. Remsen was raised. Also in this hamlet there was the Du Bois family, who owned a mercantile business there. They referred to their business address as being in Newton. This was about the time state congressman Samuel Mitchell wrote and had passed an act requiring that plans be made to name all areas that fit the definition of well populated hamlets. Cors. Osborn’s past business ventures, such as lending mortgages, happened to occur in this same region.

At the Osborn family residence, Cornelius Remsen resided in the house located at the Baxtertown Road-Osborn Hill Rd intersection. He became the first Osborn family member to learn fully under Dr. Bartow White, without attending classes at Columbia medical school. He became an MD without necessarily having to travel to NYC for his formal classroom lectures and science training, and served in the War of 1812 right after his licensure as an MD. He is also linked to the importance of Baxtertown in local slave remission history, for he was a major landowner and contributor to the development of the Zionist church raised in the village of Wappingers. His daughters never married or had children.

Interestingly, I learned most of the above by going through books and other printed items: manuscripts, microfilm, etc. All of these documents required visits to the libraries that held them, in New York City, Poughkeepsie, Goshen, Washingtonville, ad nauseam, during the 25 years prior. Once the internet became a possibility, in some ways, the outcomes of my searches improved; but it took almost 20 years to reach the state we are now in. Original documents from the genealogical libraries still cannot be duplicated as well, unfortunately.

During this review of medical documents, articles and the like published during the post-Revolutionary War years, I found that the outbreak of Yellow Fever during this time led to the first production of disease maps in United States history, by Valentine Seaman of the Yonkers-upper Manhattan area. These maps demonstrated the roles of human population features, local commercial practices, natural ecology, and weather-climate-topographic requirements for epidemics to happen and, in some places, for endemic disease patterns to develop.

Due to the availability of journals and popular magazines globally by about 2005, I was able to find and analyze dozens of the first disease maps and, for the first time, document the detailed history and philosophy of medical geography, and of spatial epidemiology as a specialty. I have been published in this field for several of my discoveries and related writings, starting with my MS thesis in 2000. I first became known as a historical medical geographer/epidemiologist following the publication of my international commentary article about John Snow’s disease mapping and the U.S.-produced soil geochemistry map depicting how the environment influences epidemic patterns in the midwest, published by a medical epidemiological journal from England around 2009.

The best example of how I used these research experiences to build upon the knowledge of local medical history for the first time since about 1910 came with the publication of my work on Dr. Cornelius Osborn’s history and his manuscript written in 1763, by the Dutchess County Historical Society in their Yearbook for the year 1990. But that research was penned during the non-internet years. Once I commenced the use of internet resources around 1992, it took me another five years to begin to find the historical documents needed to engage in thorough nationwide research on this topic.

My most important resource then was the series of books documenting the holdings (“Catalogue”) of the Surgeon General’s library in Washington DC, a 35 volume (one per year, A-Z in sequence per printing) encyclopedic set of perhaps all materials ever published on medicine. I spent 3 to 5 years reviewing this set and used this reference to order more than 5000 articles through my University Library (Portland State University) during my years as a graduate student there, 1996 to 1998. I received about 50% of these requests.

The use of this Catalogue also enabled me to review all of the local Hudson Valley, New York physicians of the past, and the books they would have been read in. This enabled me to locate and research many of our local physicians’ references, and especially to see who in New York was published, by state, county, town, surname, school, professional agencies, key topics (outbreaks) they were experienced in, and the related activities they were engaged in. Dr. Osborn was not found in this series, although the chief authors he was read in were. Dr. Cadwallader Colden was thoroughly covered in this important series of references.

As a result, I was able to locate and save on my personal computer much of the work of Dr. Colden. Almost nothing was written about his daughter Jane Colden, but plenty was available on the earliest natural history writings penned by associates of the Coldens in the field of botany and natural history, and in particular, writings by later botanists who made reference to these initial discoverers of the local medical plants.

Up until this time, the only version of Colden’s work on colonial New York flora that I could review was the one reprinted, but untranslated, by the New York Botanical Garden. But viewing those documents was at the time not allowed, except for scholastic purposes, in the form of enrollment in a PhD program. The availability of Colden’s work on the internet by the late 2000s made my research on this document possible without the need to spend time travelling during a sabbatical. I therefore was able to engage in the bulk of this work on the Coldenghamia Flora from about 2009 to the present. My first translation of Colden’s two part article from Latin was completed in 2010, the second by 2012, with Jane Colden’s history added to this work later that year.

Thus, one of the most important outcomes of this work is my ability to translate and add knowledge to the very first publication of the Flora of New York, by Cadwallader Colden. It was submitted by Gov. Colden to one or more of his “associates” in natural science, leading to its hand to hand transfer across botanists, until it reached the desk of Carl von Linne. Over the next two years, it was transformed and readied for publication by Linne, and became a formal written scientific document, published in two parts, the last portion published in 1751. This represented the synopsis of the work Cadwallader and his daughter Jane had done between about 1738 and 1745.

Another highly important discovery I made while researching Colonial to Revolutionary War medicine on the internet was my review of some work engaged in by an underappreciated history of medicine scholar: Dr. Francisco Guerra. His writings in the history of medicine were in Spanish, but he was forced to leave Spain due to political problems in the mid 1900s, and so removed first to parts of Western Europe, then South America and Mexico, where he served as professor in medicine and history, and later transferred to the US, where he became an emeritus professor and history of medicine expert for Yale for his latter years. It was during that stay at Yale that he catalogued the entire newspaper collection of the colonial to early post-colonial U.S. years, focusing upon anything that dealt with health and medicine, published mostly as advertising, medical orders, announcements, notices to potential students, ad infinitum. This end product also had photographic replications of the majority of these ads and notices. This document alone takes a researcher at least a month to thoroughly review for a single colony, province and state. It provides the most widespread details on the practice and learning of medicine for this time. Its lack of review by most history of medicine researchers these past 30 years is why many of the common myths about medicine and colonial doctors, at the time the War began, are repeatedly misstated in the majority of articles and descriptive writings produced about later Colonial-Revolutionary War medicine, with a focus on the practices of Bard, Rush, etc. Much of what historians need to know about Revolutionary War medicine has never been shared, and what is or has been shared remains very Anglocentric.

Being published so early in our local history, Colden’s Flora was in numerous ways the first work detailing numerous things about North American and New York history. It was not only the first “flora” of this region published, it also documented some lingering early introduced species in the Hudson Valley, New York, along with local plant history and ecology details never fully researched or reviewed by botanists to date.

This document, ultimately being linked to the work of daughter Jane Colden as well, was the first to express certain local human culture aspects of local ecology and anthropology features. Jane herself coined the term “Hudsonian” in her work, to describe the unique locally defined habits people living here had taken on, to incorporate the local herbs, edible plant parts, and the like into their normal living routine. The documentation of ethnobotany for our region was thus fully accomplished by Cad Colden, through his 1735 and 1745 works on the Iroquois and their way of living and surviving as a body of people, and through Jane’s manuscript notes kept in her journal and the early plant drawings she developed to accompany her observations and notes.

Anyone learned in Colonial history will probably be familiar with both of these items. For during the next century, as the United States was first formed and then grew in size and complexity, most classes in the first colleges would note much of Colden’s work in nearly every field they were teaching, including politics, the art of war, American Indians, farming and agriculture, anthropology, linguistics, paleontology, geology, economy, morals and human rights, botany, astronomy, climate, epidemiology, diseases, epidemics, and health. Cadwallader Colden’s name is mentioned in literally hundreds of items published between 1750 and 1850 on these topics.

The work of his daughter was lost for a short while, for it never led to an initial publication due to her marriage soon after this work was finished, and her fairly early death just a decade after much of her combined ethnobotany-scientific botany was completed. The notes she had kept left this country completely at the end of the Revolution, about 1783, when her papers were handed over to a young Hessian military leader, scientist, naturalist, military landscape artist, and engineer just before the British Army gave up Manhattan Island. Her papers remained dormant in the place where this professor later developed the field of Forestry as a college or university level ‘natural philosophy’ (science). Their rediscovery in his later years led to the immediate publication of her work and the development of her recognition by middle European schools and scientists, well before the British natural scientists began to honor her accomplishments.

Interestingly, like Cadwallader Colden’s work on New York Flora, that work (the work of Jane and that forester) remained untranslated and untranscribed for a century, was then forgotten in western European science history, then lost and misplaced in the U.S. Ivy league libraries, until Jane’s work was finally rediscovered by the local Garden Club in the 1950s and a portion of her accomplishments, in English, was reprinted. Any related work penned in German may have been printed by the Hessians, but was never honored and translated for use by Anglican colleges and universities. Ethnocentricity robbed many scientists of the right to see, learn from, and know the discoveries made by the scientists over there in Eastern Europe.

In 2024, I successfully translated into English the important work by the Hessian scholar, scientist, botanist, military leader, and economic forester, originally published in old German.

In terms of accomplishments, the above are worth noting. But the opening of the Worldwide Catalogues also made it possible for me to successfully finalize my work on the kinds of medicine that were practiced about the colony, province and later State of New York, and to do the same for the physicians I uncovered who also left us with important writings revealing how and what they learned. This provided individuals interested in the history of pharmacy, medicine, and science a chance for the first time to understand in much more detail the way people became doctors in the early Colonies/U.S., and the many ways they were taught and learned.

History has this habit of categorizing the history of medicine by the use of terms like quacks versus physicians. The quacks referred to in the 1750 History of New York book are the key reason and source for this ever-important stereotyping of doctors, be they ‘trained’ or ‘untrained’ based upon some ethnocentric ideology, and this stereotyping remains the way modern historians teach the history of medicine. They teach their students there is a “right” and a “wrong” way to practice per period in history, an educated versus uneducated version, an official versus non-official way.

Yet, the truth here is best stated as follows: there is a method taught by actively practicing clinical groups, another taught by the theorists, philosophers and academicians in the classroom, another taught by the former barber, now chirurgeon, who is an expert in physiology, anatomy, and cutting and displaying the human body, and still another type of doctor belonging to the apothecary guild who was taught much the same by books, but who also learned the new science–chemistry–of life, animals, drugs, and was scientifically trained and experienced and perhaps only a few steps more than the chymist who became famous because he successfully treated a member of royalty with some unusual potion or magnetic balm or unexplainable fever-breaking bark found on some distant continent. No doctor, from any of these lines of specialty, was smarter or greatly better than the others. All became popular due to their accomplishments, especially those displaying fewer failures when treating the most ill of patients.

The manuscript I reviewed on our local physician Dr. Cornelius Osborn can now be almost fully explained. It is just an 82 page handwritten document. But it has the materia medica, the ways of experimenting and producing medicine, the ways of giving that medicine the potency it needs, the way to energize it naturally, or through the assistance of some higher power–these all are in his writings, and can only be understood by reading as much as possible about all the possible beliefs for the time that he practiced (1735-1782), and understanding the teaching of the multiple kinds of doctoring for several generations before him. Medical philosophy changed back then, quite extensively, every few (7-12) years.

So, in prep for the 250th, there are two items I have not yet fully thought through. The first is, medicine at the dawn of the revolution is vastly different than that practiced at the end. And there are two other periods–one or two years before–and one or two years after–when ideas were also very different, about to undergo change and solidification. Five to seven years after the Revolution, medicine was in its new form in the United States. To some it was trying to add to the British ideology; but to many anti-Anglican families, a new patriotic ideology was emerging, based on the idea that the US has different diseases unlike those of Europe, because the US, its weather, its climate, its Geology, its Paleontology are different. Disease was then thought to be largely a product of nature, and individual family traits or health related histories. Places were so different than each other, that certain places effectively treated or provided the medicines needed for where the ailing person lived, and whether or not that individual could acculturate or not. This ideology did not exist in the same way before the Revolution. The Revolutionary War, and the sharing of medical teachings during the war between doctors from different countries working together (on both sides), enabled doctors to learn where they were performing better and where they had failed, based upon what the others had just taught them.

The preachings about Revolutionary War medicine will most likely take that classical and heavily prejudiced, mistaken approach to teaching, by discussing the notions of there being doctors practicing quackery or not, doctors who went to a European school or not, doctors who performed strange ideology like sweating or not (avoiding as much as possible the mention that nearly everyone bled at the time). Just as the war began in North America, Reformed Medicine was taught and existed in Middle Europe (Hungary). This Reformed medicine questioned the heavy preaching of certain traditions like blood letting–it didn’t condemn it totally, only asked that physicians be more observant of when it did work versus did not.

These Reformed doctors did in fact also take on the old favorite learned from by physicians of all sorts–Thomas Sydenham. Sydenham relied upon clinical observations and experience, not just speculative theory, to make his recommendations–and these were all based upon the epidemics he worked in and through. Doctors of all sorts respected works like those of Sydenham, and many also honored the classics still being published by Hippocrates and Galen. But they relied quite a bit on Anglican ideology over that of Hungarian ideology, even though both were equally wrong and equally right in major parts of their ideology. When the Revolutionary war began, our leader Samuel Bard favored the Anglican teachings, not the Middle European teachings. Thus, the famous William Cullen’s preachings of humours prevailed, as did that line of reasoning for why and how to bleed patients, not the preachings of Slavic-Hungarian writers, whose focus was on observations and experience in the hospital, like de Haec and Stollius.

The Forests of Harlem. The first forestry book – by Friedrich Adam Julius von Wangenheim.

August 30, 2022

2023

Posted by Brian Altonen, MPH, MS under Uncategorized

There are several innovations in my research approaches worth reblogging here. In particular, the numbers of people citing my work in their various fields are quite gratifying. Mind you, some of these mentions involve work that is over 20 years old. But that’s fine. Better late than never, so they say.

The study of Colonial Medicine is the theme for these items posted.

The first document comes from an original copy of a thesis that I purchased in 1990. It is the work of a MD program student from Yale, on a document found at Mystic Seaport museum. It explored an original document kept by an apothecary-physician from England, dated 1720, four pictures from which are posted here.

The second item is a slide presentation on physicians of the Hudson Valley, prior to the Revolutionary War.

OrangeCountyPhysicians1620-1776_06_07_2023_Final_PDF_UPLOAD Download

Brian Altonen, MPH, MS

I designed a way to score health care programs for their GIS savviness using this scoring process, which evaluated the different spatial modeling software tools, methods of analyses, mapping processes, applications of GIS use, types of GIS (point, raster, vector) employed for surveillance and public health monitoring. This survey ran from 2015-2019 and focused on the use of GIS by Health Care Organizations and Facilities.

Middle of Outbreak update.

A number of historic epidemiology researchers, like myself, have stated that more than three years is a lengthy stay for a pandemic event. Yellow fever and cholera tended to stay, and re-emerge, in two or three year clusters during the late 1700s and 1800s. Each had varying numbers of years between their cycles, a source for our reasoning.

The first tendency is to look at past epidemic history and try to assign some sort of natural…


August 13, 2022

Returns . . . on Investments

Posted by Brian Altonen, MPH, MS under Uncategorized

Patience helps.

In health care research, we engage in projects that we usually want to see immediate benefits from.

Those benefits include the completion of a project, then witnessing, documenting and “proving” any changes it is responsible for, then seeing these results published in the literature. That may be something as simple as a news story or article about your work, or the presentation of your results in public and/or professional media, or the printing of a complete review of your work, either by yourself or by a colleague who appreciated it, and who became the first individual to employ what you have discovered.

I have certainly experienced all of these signs of appreciation for my work over the decades. My work on history of plant medicine put me on the television screen as far back as the early 1980s, in the largest newspaper at the time in Long Island, NY, and in innumerable books, newsletters, monthly mailings, education materials devoted to some aspect of local medical history, in New York, and of all places, Oregon, where I spent 20 years building my repertoire in accomplishments, working as a professional and university professor who called his specialty “Medical Botanist and Historian”, for 20+ years.

It was in 1994 when I decided to switch my focus from teaching plant chemistry in a university chemistry department as my specialty, and researching medical history as my second specialty on the side, to studying the underlying geography of these things in life. Interestingly, I went into geography and spatial analysis because of my theories about how and why plants formed their unique chemical products. This theory of mine remains active in the field of phytochemistry. I needn’t search the web far to find individuals, scholars or not, employing one or more of my various theories about how plants form their useful chemicals, or why they do, or how these products came to be and undergo change as a part of the human geography evolution ecology theme, or how and why some become popular before others–the nature of human philosophy and logic, and how medicine relies so heavily upon these relationships we define, for what we learned and “know.”

I came up with theories for how and why plants evolved cancer drugs for example. Easier to understand is the theory as to how plants learn to divert certain chemical pathways in new directions, so as to convert their common chemicals in one specific group, into a new type of chemical formed when two uniquely evolved pathways get merged, into one, and for the first time produce a chemical novelty, important to their survival, yet more important to mankind, or their local ecosystem.

So, all of this thinking and intellectual reasoning was applied to plant medicines and most other ethnobotanical products, and then to a unique interpretation of this theory in relation to what plants existed in certain places, leading finally to my interest in the use of geography and spatial analysis to look at these concepts. One could use topography and spatial technology to map out plants in an area prone to wildfires, like I did for the Tillamook burn area in Oregon; one could then apply that to the chemistry of plants that might aid or even make the plants of that region exceptionally prone to initiating and then effectively spreading wildfires over time. With spatial analysis, and combined plant chemistry geography studies, I could see and write about the parallels that existed between this small part of Oregon, and the much more aggressive wildfire areas seen elsewhere around the world–the main one was in China, due to the “Yang” aspects of the mountains with forests, according to their local philosophers/geographers. Chemically and environmentally speaking, according to a geographer’s or spatial analyst’s mindset, this reasoning did have unique meteorological based reasons as well.

Then came my interest in how certain disease causing insects could survive in only certain places, and the pathogens they might harbor. Hanta Virus, Lyme Disease, “Malaria”, were famous diseases, for which I could learn this use of spatial analysis and geographic information systems, GIS.

At the time, I was working out a concept about which I could write my required thesis. My mentor was kind of a ‘Jack of all trades’ in his field. He was the one who convinced me to apply to Portland State for my work in Geography–“not more of that pure science!” he would say to me. “You have to see the entire picture!”

The second forte of this professor, Larry W. Price, was medical geography. His most unusual, popular book was also famous worldwide–Mountains and Man. He wrote the “bible” for today’s athletes, who were most devoted to tackling the high elevation parts of the world. He recommended I think about medical geography, and referred me to some famous writers on this subject who were “geographers”, not epidemiologists or ecologists.

Medical Geographers kept me out of the scientific realms enough to begin to think about other interpretations of health and disease, and the human environment. They added that “social Darwinist” aspect to the field of epidemiology research, which, to tell you the truth, I really wanted to stay as far away from as possible at the time. It’s interesting to study human behavior, sociology, sociocultural aspects of health. It’s another thing to study things you do not normally think about when learning science, at least back then: the Marxist interpretation of human behaviors and foodways, for example, and how capitalism is the cause for terrible modern diseases of the heart, lungs, various solid organs, and all those tissues that develop cancer today. What the hay!?

The early socialist interpretations of disease patterns and behavior are exactly what we need to consider when interpreting disease patterns today–which is what we do in medical geography. My review of the other interpretations of things having to do with epidemiology, which I learned twenty years earlier after several years of medical schooling, enabled me to develop another view of what was traditionally taught in medicine. This was my impetus for studying the cholera epidemics along the Oregon Trail, in the way that medical geographers interpreted the outbreaks during the early to mid-1800s, as the disease made its way into the United States, and then across to the Great Plains, and finally partway along the trail heading towards Oregon. ‘Why did it stop? Or didn’t it really stop?’ These were the unanswered questions in the literature about Oregon Trail history, and in the medical writings for the time, from about 1839 to 1859. My goal was to go through as many historical writings as I could about Cholera on the Oregon Trail, and define which of these were Asiatic Cholera and which were simply severe diarrhea, or dysentery (amoebic or not), or opportunistic dysentery, or some other fever-generating, dehydrating, sometimes fatal disease brought on by the fatiguing nature of traditional overland journeys.

Disproving the complete diffusion of Asiatic Cholera westward along the trail wasn’t hard to do. But defining what caused the cases of diarrhea and occasional deaths required a thorough review of cholera-like diseases–like I mentioned in my defense of my thesis, August 2000, my study is more appropriately called, “the Geography of Diarrhea”, a section I had to write up for the Appendix of my work.

Present at the defense of my thesis were two colleagues. One was a pathologist, genetic engineer who worked on pathogens at the local CDC lab in the basement of the chemical building where I taught classes on phytochemistry, for twenty years. The other was my future colleague at the time who was deeply interested in the possible use of a full GIS for disease research. Our next three projects we received grants for as a team at the University, School of Community Health, were all devoted to disease ecology and GIS, environmental science and cancer, and socioeconomics and public health, all using GIS.

When West Nile struck this country in late 1999/early 2000, we had initiated our Lyme disease ecology project. Next came the more successful grant funded study on environmental chemistry, exposure to benzene at Superfund sites, and my development of a series of methods to explore the chemistry of these well documented chemical release sites, and to explore the content of each setting, relative to numbers of local cancer cases being documented in the state’s new disease registry. In this project, winter of 2005/6, I saw the need to replace square grid analysis with hexagonal grid analysis, and so developed the math needed to identify the points needed to construct a beehive/hexgrid display on my basemaps. I did the math to demonstrate how and why this produced a more accurate centroid-specific map with which to analyze spatial units (instead of the moving circle windows, which only defined “hot spots” one circle at a time).
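For readers who want to see what "identifying the points needed to construct a hexgrid" can look like, here is a minimal sketch in Python. It is not the math from the 2005/6 project, which is not reproduced in this post; it is a standard textbook layout for a pointy-top hexagonal grid. The function names `hex_centers` and `hex_vertices`, the bounding-box inputs, and the `size` parameter (center-to-vertex distance in map units) are all assumptions made for illustration.

```python
import math

def hex_centers(xmin, ymin, xmax, ymax, size):
    """Centers of a pointy-top hexagonal grid covering a bounding box.

    size is the center-to-vertex distance, in the same units as the box.
    (Hypothetical helper for illustration; not the author's original code.)
    """
    dx = math.sqrt(3) * size   # spacing between centers within a row
    dy = 1.5 * size            # spacing between successive rows
    n_rows = int((ymax - ymin) // dy) + 1
    centers = []
    for row in range(n_rows):
        # Odd rows are shifted half a cell so hexagons interlock.
        offset = dx / 2 if row % 2 else 0.0
        n_cols = int((xmax - xmin - offset) // dx) + 1
        for col in range(n_cols):
            centers.append((xmin + offset + col * dx, ymin + row * dy))
    return centers

def hex_vertices(cx, cy, size):
    """The six corner points of one pointy-top hexagon."""
    return [
        (cx + size * math.cos(math.radians(60 * k + 30)),
         cy + size * math.sin(math.radians(60 * k + 30)))
        for k in range(6)
    ]
```

In this layout every interior center sits the same distance (the square root of 3 times `size`) from each of its six neighbors. A square grid does not have this property, since its diagonal neighbors are farther away than its edge neighbors, and that is one common argument for using hexagonal cells when comparing neighboring spatial units.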

As a number of my other pages review, I developed this methodology at the national level, covering the entire U.S., using a database that detailed the health histories of anywhere from 40 to 90 million people in the US. The base spatial unit for this database was the zip code. I developed the first national maps as zip code area maps of US statistics, and since ICDs were included in this database, I focused that work on ICDs, ICD groups, and any combinations of ICDs one could pull together, in order to map out where diseases, for example from Japan or China, were found in the U.S.; how West Nile was distributed relative to the two mountain ranges, on a per zip code basis; and where certain sociocultural or human behavior diagnoses tended to cluster, like in certain rural versus heavily populated urban settings. This was later converted to a hexgrid mapping algorithm that I developed. (But the zip code maps tend to be more visually pleasing and applicable to real life intervention programs.)
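The per zip code tallies described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (a zip code paired with an ICD code), the helper names `cases_per_zip` and `rate_per_1000`, and the sample values in the usage note are all hypothetical, and nothing here reflects the structure of the actual national database.

```python
from collections import Counter

def cases_per_zip(records, icd_prefixes):
    """Count records per zip code whose ICD code starts with any given prefix.

    records: iterable of (zip_code, icd_code) pairs (assumed layout).
    icd_prefixes: e.g. ["A00"] to select one ICD group.
    """
    prefixes = tuple(icd_prefixes)
    counts = Counter()
    for zip_code, icd in records:
        if icd.startswith(prefixes):
            counts[zip_code] += 1
    return counts

def rate_per_1000(counts, population):
    """Cases per 1,000 residents for every zip code with a known population."""
    return {
        z: 1000 * counts.get(z, 0) / pop
        for z, pop in population.items()
        if pop > 0
    }
```

As a usage example with made-up data, `cases_per_zip([("97201", "A00.0"), ("97201", "A00.1"), ("97202", "J45.0")], ["A00"])` counts two matching records for 97201 and none for 97202. Passing that result to `rate_per_1000` along with a population table gives the values one would join to zip code polygons for mapping.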

My recent work has enabled me to develop this work to a smaller area level–the neighborhood or street level, if I want to.

So, what are my returns on investments?

You can look at the development of these returns as simple “OCD”–obsessive compulsiveness–in other words, it ain’t over until it’s over.

The most amazing example of this happening pertains to the twenty years it took for me to first remember, and second know how to go back to a decades old research question, when the answer to that question finally appeared in the literature. This event is when the source for a local New York physician’s 1763 recipe of an alchemical nature was uncovered by a scholar researching very old manuscripts in Oxford around 1990. That opened the door to my continuation and near finalization of understanding the decade by decade history of medicine in New York, from about 1600 to 1850.

My luck with uncovering the original copies of the old medical geography maps of the world was a second set of discoveries helping me through my research endeavors. In particular, a lot of maps were published in local medical journals, or as their own essays by the same publisher, and these booklets were very common to the mid 19th century. Reviewing these maps provided important insight into how diseases were interpreted topographically, climatically, latitudinally.

November 9, 2019

Project Report – 2015 – 2022. CV 11-2019

Posted by Brian Altonen, MPH, MS under Uncategorized

I designed a way to score health care programs for their GIS savviness using this scoring process, which evaluated the different spatial modeling software tools, methods of analyses, mapping processes, applications of GIS use, types of GIS (point, raster, vector) employed for surveillance and public health monitoring. This survey ran from 2015-2019 and focused on the use of GIS by Health Care Organizations and Facilities.

Middle of Outbreak update.

The first tendency is to look at past epidemic history and try to assign some sort of natural cyclicity to the observations. The most common blame we hear surfacing this way, is “global warming”. Noah Webster is perhaps the first to note this relationship–blaming the outbreaks on climate and weather changes in combination with population growth, dwindling natural resources, and the development of urban areas. (see https://www.smithsonianmag.com/history/americas-first-great-global-warming-debate-31911494/ )

But still, we are now more than two centuries through these patterns of recurring reasoning, none of which seem to have resulted in science and medicine learning anything new about recurring disease outbreaks. This new outbreak gives us a chance to “reconnect with our past”, new opportunities to be published, to be the “first to discover something” within our own personal calendar for learning.

The first large organization/epidemiological research team outside the CDC and WHO to successfully map an outbreak, continuously, came to be due to the Covid outbreak–Johns Hopkins (link: https://coronavirus.jhu.edu/map.html ). Other leaders in this field (I won’t give their names), whose products focus on disease monitoring and surveillance, were caught completely by surprise by Covid, and not at all ready to develop their workstations to fully manage the outbreak about to happen.

With this new outbreak, there are observations we can now make about disease ecology and public health. There are also observations we can make, and the theories that follow, regarding evolution and genetic changes within a pathogen, that could enable us to better understand this natural phenomenon–disease pathogen evolution, migration and diffusion.

Cholera, Yellow Fever, Measles, Small Pox, Covid, all seem to follow similar spatial diffusion patterns. These laws persist, regardless if the organism/pathogen is multicellular, bacterial, or viral. There may in fact be some biological/evolution-based reasoning for the outbreaks we are seeing, but this post is not the place to discuss such a topic, for the moment. Suffice it to say, that like Vibrio comma, whatever newer strains of pathogens evolve, they work towards allowing themselves to spread more quickly, more easily, into more people, preferentially, without killing off their hosts or victims too quickly–that would be against their long term survival.

So for the moment, this addition to the page is only to give an update on the past several years. More will follow. Time to stay “politically” and “scientifically correct.”

Over the past 6 years I produced numerous studies on epidemiology, averaging about 52-60 per year, depending upon how you define a new study when an old one is extended or provided with a new grant. All in all, a total of 417 “projects” were engaged in during a 6.5 year period. 35 per year are completed residents’ work; another 15-17 are residents’ work in the first year or more of a preparatory phase, when I am not engaged much in the project; a handful are in the post-completion, pre-writing or writing phase for a journal article and/or conference abstract proposal. The rest of the projects are department level, and ongoing and/or grant related. These numbers do not include the data mining I regularly engage in, and develop helpful disease registries from.

The value of this seemingly continuous research work is it provides me with opportunities to test new methods, new software, new formulas, and design new methods of analyses. It allows me to test small groups/small numbers theory, in relation to exceptionally large numbers theory. It allows for the confirmation of past theories published in my 2000 thesis (use of the “Reversed Hierarchical Diffusion Theory” model to explain how poverty can be a leading cause of epidemic spread.) It allows me to develop new and more helpful ways of presenting the data (examples follow).


One of my past accomplishments in researching these spatial diffusion patterns of disease is worth noting.

In 1999, for my thesis work, I developed the logic for how, when and where a reversed hierarchical disease diffusion pattern could commence. This was the result of my extensive review of John C. Peters’s work on the Asiatic Cholera outbreaks throughout the 1800s. Peters was assigned a position that was the predecessor to today’s position(s) in charge of monitoring national epidemiology data, peoples’ health behaviors and disease statistics. Peters noted the tendency for cholera to initiate in low income living settings, a pattern repeatedly seen in many big cities impacted by the outbreaks, where mostly low income families resided, in particular the Irish and German immigrants. But we could also see this tendency for low income people to be struck in other cities, with African American poverty stricken communities. Kingston, Jamaica in particular developed its outbreak due to a low income older lady shopping for her family’s food related essentials at the nearby wharfs.

So, today’s recent realization of this outbreak striking mostly the lower income settings–lower socioeconomic status (SES) settings–is simply a rediscovery or re-recognition of a link between poverty and disease that we have known to exist since the first statistical reviews of yellow fever and cholera were published. Political leaders and leaders in health care are just never trained to keep this thought in mind when making their decisions about how and where to divvy up the costs for care, and provide health care services where they are most needed.

ReversedHierarchicalDiffusionPatterns2

Figures Extracted from my 2000 Thesis (source: https://oregontrailcholera.wordpress.com/ ).

The following is from the November 2020 Publication of some of my May to July 2020 work, defining the relationships between mortality and chronic disease (CD) history, with each CD evaluated independently and as a combination risk factor for use in outcomes prediction and risk analysis.

See

Altonen BL, Arreglado TM, Leroux O, Murray-Ramcharan M, Engdahl R. Characteristics, comorbidities and survival analysis of young adults hospitalized with COVID-19 in New York City. PLoS One. 2020 Dec 14;15(12):e0243343. doi: 10.1371/journal.pone.0243343. PMID: 33315929; PMCID: PMC7735602.

https://pubmed.ncbi.nlm.nih.gov/33315929/

Acknowledgements: this work completed courtesy of Ryan Engdahl, MD., Department of Surgery, NYC Health + Hospitals, Harlem and Woodhull Hospitals, and Department of Research Administration, Health+Hospitals/Central Office, New York, NY, 10013, USA.

SurvivalPlots_Top4Comorbidities

SurvivalPlots_NumbersofDiagnoses

SurvivalPlots_Top4Modalities

https://pubmed.ncbi.nlm.nih.gov/?term=Altonen%20B%5BAuthor%5D&cauthor=true&cauthor_uid=34675748

MonthlyLeadExpsoureTestResults

The implementation of GIS as the means to monitor health by many (most) agencies is twenty years overdue. Barriers to implementing GIS theoretically could be defined as simply: lack of understanding of the technology, lack of experience engaging in its use, the absence of leaders in the field with true GIS mathematical experience (most agencies felt specialists in CAD sufficed during the early years–2000 on, ’tis not the same!), the hesitancy to learn this new technology, the fear of losing your place in the workforce as a CAD expert, the cost of initiating a GIS, and, depending upon the decade, the level of technology that was in place at the time, with its storage and costly hardware requirements.

ChildhoodLeadPoisoningofChildrenQGIS

QGIS is a free GIS tool, developed abroad, that is very easy to use and less cumbersome than the products commonly sold in the U.S. In terms of sophistication and power to detect and display valuable epidemiological information, half of this is due to the knowledgebase and skills of the researchers and analysts, with another substantial part of this linked to the GIS tool(s) in use. The disadvantages of the most popular tools out there are:

cost
the value of what they produce (i.e. better unquestionable analyses are still generated more by direct human involvement, than by using/misusing the technology to achieve such gains),
the possibility of “going down a rabbit hole” without knowing there are better options, end products that may be generated in other ways.

The chief factors that led me to search for better options were the slowness of traditional GIS methodologies, and the constant need for experimentation and re-experimentation with new formulas, programs and technologies. The health care field recognizes and accepts certain tools, products, presentations as acceptable standards, and refuses to pay much attention to newer, more diverse, helpful forms and versions of the same technology.

VisitstoPatientsRatiobyReligiousGroups

Unfortunately, the most popular of these technologies are still very primitive in the way they function and operate, how many end products they can produce in a given amount of time. Can these tools produce, for example, a standard Atlas style report of the top 300 ICDs in a given hospital population? produced in just a day? repeated for various comparable subtopics, such as age groups (i.e. children, on up to seniors), gender, race/ethnicity, religion, zipcode area, field of employment, SES, etc.? The capability to accomplish this has been present for at least a decade (or longer).

BirthweightEducationMaternalHistory

It is possible there are just two hindrances to companies becoming overly creative in this field. The first is the rapid turnover of people in various positions, for various reasons, but perhaps largely due to how the HIT field is managed when it comes to implementing complete and thorough Data Management. The turnover of experienced workers, in search of other venues and positions, removes people with important skillsets from the places where they could become exceptional producers of outcomes. The lack of expertise in management with health data is the second reason–if they cannot themselves produce the reports or products they want, then they should not be in the position to request such items. Likewise, the manager of a team should be able to produce the products being requested of his team members. Some differences in skill sets are expected between management and the rest; but differences that are so prevalent that they slow down performance, discovery, recognition of need, and the ability of the team to produce a much needed product at the time, make such managers very inefficient.

ChildhoodLeadExposureComplications_AsthmaADHDAutism

Related to the inability of a team to produce an effective work crew is the tendency for inefficient teams, and especially management, to gloss over their human resources, and to even eliminate some of the most important members of a team when it comes to certain skill sets. This is a regular event in the workplace, no matter how much sites like LinkedIn want to put out some unrealistic representation of how companies are actually engaged in this form of corporate growth due to better use of human resources and staff improvement.

DomesticAbuseCasesReported_2figs

With regards to generating an effective GIS for a program, production should start as soon as three months, with product by month 9, that then gets administrative and department approvals, and fine tuned, all by month 12, and then by the end of the first quarter of the next year, pass the first quarterly reporting administrative tests. It’s an “If I can do this, then why not you?” scenario, for those companies/work environments that need to take two to three years to finally get their act together.

EdwardRTufte

Innumerable facilities like to refer to Edward Tufte’s work on becoming a successful, highly productive business when it comes to data reporting. The late 1980s (to 1990) is when I was first approached by someone who likes referring to Tufte’s work. He was a used bookseller in Portland, OR. Now, nearly thirty five years later, I can say he is still right. And I have been in several positions where I experienced managers who attended conferences and came back swearing they, their team or company, was about to engage in such a venture.

To date, companies are still trying to get one basic data system working completely and effectively, not hindered by the new technology and hardware-software requirements. What prevents discovery are the factors hindering chances of completion and thus slowing down the rates of discovery. Unfortunately, we are still trying to understand and document the same metrics in health care that we were trying to document twenty years ago, in turn never looking fully at the rest of our data. Thus, cancer, diabetes, rheumatism, hyperglycemia, are still being tested and retested, at the cost of making important chronic diseases like epilepsy and MS last on the priority list.

The following is an example of how I applied it to develop my typical video-style presentation of a diagnosis distributed about a region.

(https://brianaltonenmph.com/about/)

Bibliographic Report of Projects PDF 2015-2022 Download

November 3, 2018

The Spatial Modeling of Health is now at the Next Level

Posted by Brian Altonen, MPH, MS under Uncategorized

3dsas4

Hot Spots for Lead Poisoning over the past 10 years (produced in response to a local news release, in 2018)

The Newest Priorities in Health Care must become Spatial Modeling. But to advance to this level within our current system, a number of major changes still need to be made.

The first of these changes is developing the knowledge base needed for successful spatial modeling to commence within health care. In nearly every aspect of health care and epidemiology taught in academia, spatial modeling takes second place to standard 75-year-old methods of dealing with public health and improving health care within the system.

The businesses with the greatest potential for engaging in the more advanced method of analyzing population health–the insurance companies–have never taken on the task of implementing GIS into their day-to-day analytic processes. As a result, they remain dependent mostly upon older formulas and methods for improving programs. And they perform these tasks at very slow rates, so slow that the progress made in health care practices is never a result of their engagement in health and disease monitoring. Most advances in health care come from the technological, chemical/pharmaceutical, and treatment advances made by the businesses devoted to medical equipment. In the meantime, insurance companies, the allies of the health care industry, have done little to advance the quality of care we receive.

Based on ICDs and/or Lab results, Rare Diseases, Genetic Diseases, Genome and Congenital Disease mapping are standard parts of the new Disease and Diagnosis Tracking system

Now, some want to argue that their engagement in prevention and intervention programs is proof that my claim is wrong. However, most of these processes are required of the health insurance industry, not a product of their own creativity and design of new practices.

If insurance companies had taken on GIS when that was first possible, sometime around the Windows 3.1 to 3.2 period, ca. 1995, they would right now be completely engaged in spatial health analysis, and some of the more advanced companies would be engaged in the 3D and prediction modeling of population health, at such levels that urbanization programs and programs designed to improve rural population health could have been established by now.

The Ranking of Medical GIS Implementation within a Care Setting

Back in 2012, I evaluated the levels at which health care businesses were engaged in spatial analysis. I designed a 1 through 10 scoring system for a program, using the various applications of spatial analysis processes and tools to the programs they provide.

Level 1, score 1 engagement was one step over doing nothing at all–it consisted of manually mapping, but mostly tabulating, your health data, much like the public health institutions and agencies have been doing since this process was first put into practice around 1840. Talk about being behind: this puts many insurance companies more than 150 years behind their potential, and behind the analysts and inventors of new research processes.

Level 3 to 4 is about where many smaller businesses engaged in health insurance now work. They are probably trying to define high risk populations and income and race-ethnicity groups, using census data to locate the highest risk areas, and then developing intervention programs targeting these health issues. The most basic of these issues are the standard ones we always hear about–the ones that just won’t be reversed and go away–namely, the low income related problems of poor health and high cost for care, and the simplest of unhealthy human behaviors, like not engaging in preventive health care behaviors such as exercising, ceasing the use of drugs or smoking, reducing your STD risk, or taking part in early recognition programs that work for preventing breast and colon cancer.

But maybe the management of the most basic human behaviors is not the best route for health insurance companies and programs to take when trying to reduce the costs of disease and immediate and long term care.

Another problem that typical health care programs, companies and even leaders have in designing a better program is broadening the focus of what they do. For decades, the same preventive health plans have been at the forefront: smoking, drugs, drinking, nutrition and diet, lack of exercise, and work and unemployment related stress and their impacts upon domestic life and personal safety within the home setting.

PMIhumansphere

But wait a minute, I was just talking about some of these earlier processes–ah yes–the major problem is not that these programs we design do not work; the main problem is that we have failed to redirect our focus toward more diseases and broader prevention programs that better fit many local population health needs, not just the same handful to dozen or two health problems.

Now, again, some are going to argue with how I am wording this criticism. Annual HEDIS reviews of most plans cover thousands of health care related statistics (if you accept the simple monitoring of an RX list as representing over a thousand metrics being monitored). Similarly, annual reviews sponsored by NCQA, as an example, require successful programs to engage in anywhere from a half dozen to two dozen or more monitoring-intervention focused health care and disease prevention programs.

BUT (and yes, that’s a big BUT!), these are pretty much the same programs again and again. There were similar programs in place decades ago, like the annual anti-Tuberculosis campaign that took place nationwide around holiday season, or Jerry Lewis’s Easter Seals program, or the numerous cancer prevention programs that were developed. Some of these programs worked, others remain important to the national public health programs in place.

ChildhoodImmunizations

An example of Analytics is the Immunization Surveillance program. Case outbreaks are evaluated historically and monitored. Outbreaks lead to updates and reevaluation of prior GIS analyses performed. Both ICD related to the disease and refusal to vaccinate are monitored. Cluster analysis is performed and simple analysis of area by population features. Clustering and immunization refusal can be linked to local SES and/or cultural patterns. This helps to define the ranges and density in frequency of refusals required for critical levels to be reached, for an outbreak to begin, or the disease to most likely be spread to a new site. Risk factors can then be analyzed for regions, based upon outbreak numbers and risk.

Reaching Level 6 – Comprehensive Analytics

During the early 1900s, we had the excuse of lack of EMR and no GIS, for not engaging in super-active public health improvement programs nationwide. We don’t have that excuse today.

We need programs that don’t just monitor some token (mostly symbolic and obviously linked to generations of ineffectiveness) public health problems, like the several highest cost programs nationwide devoted to asthma and diabetes prevention. In the 1960s, diabetes programs focused upon genetically produced Diabetes Type 1 patients; today this program has to serve tens to hundreds of thousands more people with the human behavior-induced (obesity-linked) form of diabetes onset.

In the 1960s, it looked as though epilepsy had a chance of being cured in a lifetime, according to Parke-Davis. Large contributions were made to the Epilepsy-Find a Cure campaigns. These campaigns are no longer common on the TV screen, which in part is due to more than 50 years of ineffectiveness. Sure, medications have improved, and some treatment-preventive methods make for safer long term outcomes. But the unknown, possibly gene-based reason for onset has delayed progress in treatment, or at least caused it to come to a standstill. So instead, those who have epilepsy must deal with life in terms of a few things–improved quality of life, including increased engagement in preventive self-administered care practices; improved interactions with friends and neighbors at the community level; and hopefully (but unlikely) reduced cultural based reactions to the person with the diagnosis of epilepsy (it is still hard to find a job that will allow for the breaks or days off that seizures make necessary). From first hand experience (I dealt with this for easily twenty to thirty years, including the controversial surgery before its time), social knowledge and cultural definitions still remain a stumbling block in allowing epileptics to reach their fullest human potential.

So, it is up to any new programs now being developed to fully assess all of the health features of their populations, not just engage in the needed programs devoted to asthma treatment, mental health programs, diabetes and lifestyle, or even programs that ensure patients remain devoted to whatever programs, activities and prescription products are needed for them to stay healthy. Current programs speak about having these programs, but due to cost, knowledge base and staffing, only those select few examples exist.

3dsas5

A certain ICD, a medical condition such as a lab value, or a count/prevalence of individuals with a certain risk can be evaluated based upon: Number of Patients, Visits engaged in for the ICD or risk activity measure, and numbers of procedures (complexity of practice) a patient goes through to receive care for the problem. TB cases, for example, evaluated for average number of visits per patient, provide a more useful rendition of the Patient-Visit-Care data about the clustering of patients and cases, relative to cost and demands upon the system. Plans can be made based on the small clusters of cases with the largest average numbers of visits per case, in other words–those regions and cases with the highest cost. To document the need for change, and later success, a thorough review of the total population for its top ICDs and cost/visit relationships is required. For example, when the epidemiological transition modeling of the ICD data is applied, as in the below figure, the evidence shows that TB patients demonstrated the highest visits per patient over time, but have also undergone one of the most successful reductions in visits per year over the past 12 years.
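The visits-per-patient measure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the row layout `(region, patient_id)`, the function names, and the sample rows are all hypothetical and are not taken from the EMR or the SQL used for this work.

```python
from collections import defaultdict


def visits_per_patient(visit_rows):
    """Return {region: average visits per patient} from (region, patient_id) rows.

    Each row stands for one visit tied to the ICD or risk measure of interest.
    """
    visit_counts = defaultdict(int)
    patients = defaultdict(set)
    for region, patient_id in visit_rows:
        visit_counts[region] += 1
        patients[region].add(patient_id)
    return {
        region: visit_counts[region] / len(patients[region])
        for region in visit_counts
    }


def rank_regions(visit_rows):
    """Order regions from highest to lowest average visits per patient."""
    averages = visits_per_patient(visit_rows)
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Made-up rows for illustration only; not drawn from any real EMR.
sample_rows = [
    ("region_a", "p1"), ("region_a", "p1"), ("region_a", "p2"),
    ("region_b", "p3"),
]
ranked = rank_regions(sample_rows)
```

Regions at the top of `ranked` are the small clusters with the most visits per case, which is where the text suggests planning should start.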

Epitrans_BlackMales Epitrans_BlackFemales

To fully understand the health of its population, an insurance company or managed care system must produce complete reviews of its patients’ health status annually, and apply this same process retrospectively to patients that have been in its care for years. On many of my pages about spatial analysis of health, I defined these various processes.

For a company or business to go above that Level 3-4 that I claim they exist at, they can easily reach what I am calling that Level 5.5–which is characterized by full engagement in environmental and population health monitoring, and experimenting with spatial analysis and the applications of spatial mapping software to the program. To be a successful Spatial Analyst centered program, a company must have developed the knowledge base and programming needed to engage in spatial analysis of anything and everything within the EMR. That constitutes the level 6 position in my 1 to 10 scale scoring system.

Infibulation or Female Genital Cutting/Mutilation is the first Culturally-linked “health” practice analyzed across the system and mapped, before initiating a full scale monitoring program, which includes the development of a plan and policy or set of rules required to produce a HIPAA compliant database (5 levels of compliance or use were defined, from full internal to full external data sharing), the definition and use of different parts of that database: a) name and registry data, levels 1 and 2 use only, b) visit history (temporal data and visit type only, for analyzing counts, frequencies, and ratios), c) procedures history (lab procedures and all other events done in care ranging from education, to diagnostics, to treatment and rx, etc.), d) results/outcomes of special testing (in general), e) Special lab results, microbiology, etc., f) pharma hx and such, g) ICD primary research ICD and allied or related ICDs, h) lifespan ICDs (diagnostics, procs, before during and after the medical experience). Others are pulled based upon purpose and request.

Examples of this that I see at work every day include:

Pulling and massaging the EMR data and then mapping this data (within a half-day or less), in order to see where possible risk areas or subpopulations exist, engaging in this in response to a query from a PCP or some news report (i.e. by development of lead exposure >20 maps, or the maps of kids whose parents refused immunization after the recent measles outbreak in Rockland County.)
The development of standard base Population datasets, for immediately comparing and contrasting prevalence and incidence for a topic, again “on the fly”, or looking at recent changes in patient types in the ER due to a new flux of immigrants.
Finding the hot spots for a given bacteria or disease type that demonstrates seasonal and cultural / SES related patterns of re-eruption, based upon the last ten years of EMR data, in response to the recent Legionnaires’ disease clusters detected, or the new adenovirus outbreak, or concerns that medication resistant TB may still be able to re-emerge in the future due to certain immigrant-rich communities.

This is the model developed for testing all of the EMR data, which includes notes (open text). The data is evaluated three ways: as a non-structured data NLP process, as an NLP process merged with a structured data search and extraction process, and as a purely structured data search and extraction process (the method most often employed). NLP is shown to increase the amount of researchable data pulled from two-fold to ten-fold, depending upon the research questions. Seven data mining and extraction SQL searches were merged into a single data pull run to accomplish this task; the total process typically takes place in three to six queries placed around the initial raw data master query. The data transformation process (deidentifying the data, making the dataset HIPAA compatible at one or more levels) includes a removal, change or recoding of name (removed), patient_id (replaced with a math equation defined fake_id), address (replaced by lat-long, if needed), zip code (may be lat-long instead), facility (unique letter or unique letter-number coded), provider (recoded and deidentified), certain dates (i.e. birth and death dates, which may be substituted with age or replaced by just year, or year and month), and specific rare ICDs (placed into clusters or groups).
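The transformation step can be pictured with a minimal Python sketch. Everything specific here is an assumption made for illustration: the actual fake_id equation is not given in this post, so the affine formula and its constants are hypothetical stand-ins, as are the field names, the facility codes and the rare ICD grouping table.

```python
from datetime import date

# Hypothetical constants; the post does not disclose the real fake_id equation.
MULTIPLIER = 7919
OFFSET = 104729
MODULUS = 10**9 + 7


def fake_id(patient_id: int) -> int:
    """Replace a patient_id with a value derived from a fixed math equation."""
    return (patient_id * MULTIPLIER + OFFSET) % MODULUS


def deidentify(record: dict, facility_codes: dict, rare_icd_groups: dict) -> dict:
    """Return a recoded copy of one record; the name is simply not carried over."""
    return {
        "fake_id": fake_id(record["patient_id"]),
        # Exact birth date reduced to year only (one of the options described).
        "birth_year": record["birth_date"].year,
        # Facility replaced by a unique letter or letter-number code.
        "facility": facility_codes[record["facility"]],
        # Rare ICDs are folded into a group; common ICDs pass through unchanged.
        "icd": rare_icd_groups.get(record["icd"], record["icd"]),
    }


# Made-up record and lookup tables for illustration only.
raw = {
    "name": "Jane Doe",
    "patient_id": 12345,
    "birth_date": date(1980, 5, 17),
    "facility": "Example Hospital",
    "icd": "RARE1",
}
clean = deidentify(raw, {"Example Hospital": "F1"}, {"RARE1": "RARE_GROUP_A"})
```

A simple equation like this one is reversible by anyone who knows the constants, so a real pipeline would need to weigh that against the tier of data sharing involved.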

In effect, there is no excuse for the current status of GIS implementation in health care. At the ESRI level, some changes were made administratively with this goal and the related tasks in mind. There are many examples of GIS implementation that can be produced. And the Health GIS news is always sharing examples of new programs being initiated. Yet, none of these programs demonstrate the level of sophistication I am talking about with amount and types of use for GIS.

GIS needs to be implemented as a preventive health tool, a prediction tool, a cost analysis tool, an administrative planning tool. It can be used to evaluate, routinely, patient-visit practices. It can be used to see if there are peaks in product use or whether a change in medical claims policies (such as defining a new ICD10 resulting in improved DMDD care, for Community A vs Community B) actually changes the practices that are happening.

With the programming and methodology for GIS use established, any program can transfer the data from ongoing research programs into the spatial analysis aspects of health care. In theory, annual reports can even be implemented for monitoring just Hispanic or African American care related needs, or evaluating whether changes are happening at the Culturally-related, Culturally-linked vs. Culturally-bound disease patterns level (which I have explained in detail several times, elsewhere).

FemaleSpouseAbuseMap1Closeup

Most of the programming that is required for the spatial analysis of health exists in some form or fashion across the IT world. Yet, to date, most of the agencies functioning as part of the health care system have failed to develop programs that successfully analyze and report on all that we need to know about health.

In the recent past, a number of Medical GIS people have argued with this statement. In part, they are correct because for each of their systems, some part of an ideal Medical GIS program has been perfected and is ongoing.

But this statement is not about those successful parts of Medical GIS that are put into play. To produce a successful program, an agency or company or corporation has to be able to implement spatial analysis at all levels. This not only includes the basic descriptive analytics that many places are engaged in. It also includes full scale analysis of the population that is managed, its ICDs, its health habits, the Charlson, Elixhauser, and other standardized scores for evaluating patient health risk, the costs for care on an individual in relation to age, ICD, SES, and procedure basis, the frequency of use of the system at all clinical and administrative/preparatory levels, the changes in numbers that occur over time, the development of machine logic NLP review processes for physicians’ and patients’ notes designed to accompany CPT and diagnostic code generated output, prediction models for any or all of the above, and the semi- to fully automated regression analysis methods.

The other concern of high priority relates to HIPAA and patient confidentiality. To deal with this, five tiers of data development and processing were designed, to make data compliant for internal, external, and combined internal-external EMR data users. This includes the design of algorithms that produce effective deidentifiers (fake ID substitutes), and recode certain data forms that otherwise could still allow the identification of a given individual. The degree of data change for each of these levels is focused on things like exact DOB/death date, exact location information, and even exact ICD identification in some cases (aggregates are used instead where appropriate).

Before implementing this new GIS, a number of alternative spatial analysis processes were developed and tested during the past three years using EMR data (examples of which I have posted), using SQL and SAS (non SAS GIS) to develop, utilize and test a variety of new spatial methods to extract data and transform it, making it HIPAA compliant, well before importing it into SAS or GIS. This new GIS demonstrates some of the benefits of still managing complex EMR data within the GIS Workstation and Reporting environment.

Lead Poisoning Cases, Children < 5 yo

Due to the recent reports published in the news about falsifying documents pertaining to health safety inspections performed in low income settings, a means for evaluating the results of Lead Exposure blood tests performed on children under 5 at the time of testing was developed. Only unsafe values above 20 were checked, and mapped for this first run.
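As a rough sketch of that first run, the selection amounts to a two-condition filter ahead of mapping. The field names, the sample values and the coordinates below are hypothetical, and the units of the lead value are left as stated in the text (simply "above 20").

```python
def unsafe_lead_results(results, threshold=20, max_age_years=5):
    """Keep tests on children under max_age_years whose value exceeds threshold."""
    return [
        r for r in results
        if r["age_at_test_years"] < max_age_years and r["lead_value"] > threshold
    ]


# Made-up test results for illustration only.
sample_results = [
    {"fake_id": 1, "age_at_test_years": 3, "lead_value": 27, "lat": 40.70, "lon": -73.90},
    {"fake_id": 2, "age_at_test_years": 4, "lead_value": 12, "lat": 40.80, "lon": -73.95},
    {"fake_id": 3, "age_at_test_years": 7, "lead_value": 31, "lat": 40.60, "lon": -74.00},
]

# Points that would be passed on to the mapping step.
points_to_map = [(r["lat"], r["lon"]) for r in unsafe_lead_results(sample_results)]
```

Only the first sample record passes both conditions, so only its coordinates reach the mapping step.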

This same process may now be used to evaluate any datum or binomial query about NLP evaluated contents of an EMR note, be it of structured or non-structured data form. Three queries were developed based upon structured data defined expectations, a combined structured-non-structured result that may be considered a “hit”, such as a procedure and note indicating the result, and a completely non-structured text entries based NLP data review.

The next level of this work is to design a standard way to report upon the entire histories of a given set of patients. Four key databases, which are independently analyzable, yet linkable, have been identified for this process. They define the patient and population (1 unit), the ICD history (5-7 units per patient avg), the Visits schedules and times (20:1 ratio, by type of visits as well), and the Procedures or Care Events (labs, imagery, educational steps taken, discussion types, interventions) linked to Visits type-time relationships (50-2000:1 patient ratio). In addition, special tests are identified as well to be of value in separate reporting, such as microbial/infectious disease history, psychological/and mental health history, etc.
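One way to picture four independently analyzable yet linkable databases is as four flat tables joined on shared keys. The sketch below is an assumption-laden illustration: the key names (`fake_id`, `visit_id`), the column names and the sample rows are hypothetical and do not reflect the actual EMR schema.

```python
def history_summary(fake_id, icd_rows, visit_rows, procedure_rows):
    """Summarize one patient's history by linking ICD, visit and procedure tables."""
    icds = [row["icd"] for row in icd_rows if row["fake_id"] == fake_id]
    visit_ids = {row["visit_id"] for row in visit_rows if row["fake_id"] == fake_id}
    procedures = [row for row in procedure_rows if row["visit_id"] in visit_ids]
    return {
        "icd_count": len(icds),
        "visit_count": len(visit_ids),
        "procedure_count": len(procedures),
        "procedures_per_visit": len(procedures) / len(visit_ids) if visit_ids else 0.0,
    }


# Made-up rows for illustration only.
patients = [{"fake_id": 1}, {"fake_id": 2}]
icd_history = [
    {"fake_id": 1, "icd": "ICD_A"},
    {"fake_id": 1, "icd": "ICD_B"},
    {"fake_id": 2, "icd": "ICD_A"},
]
visits = [
    {"visit_id": 10, "fake_id": 1, "visit_type": "office visit"},
    {"visit_id": 11, "fake_id": 1, "visit_type": "inpatient"},
    {"visit_id": 12, "fake_id": 2, "visit_type": "office visit"},
]
procedures = [
    {"visit_id": 10, "kind": "lab"},
    {"visit_id": 11, "kind": "imagery"},
    {"visit_id": 11, "kind": "lab"},
    {"visit_id": 12, "kind": "education"},
]

summary = history_summary(1, icd_history, visits, procedures)
```

Each table can still be counted or mapped on its own, while the shared keys make the per-patient ratios described above straightforward to compute.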

The ultimate goal of any Medical GIS program, implemented by an institution or healthcare plan, should be the development of an effective combined intervention-prediction modeling healthcare analytics program, designed to complement and add to the other analytic programs you may already have in place.

September 23, 2018

Next Steps in Population Health work, for progress in this field: Analyze!

Posted by Brian Altonen, MPH, MS under Uncategorized

H+H1

The current Healthcare Management programs nationally have remained in this unfortunate state of limbo over the past decade. I make this statement based on a series of studies I performed in 2012, the results of a survey I posted for 3 years regarding GIS implementation by facilities and programs, and the current state of progress depicted by published Medical Geography articles.

There are some new processes underway, but in spite of their increasing number, not a single facility has emerged as the leader in this field. That is because all research groups involved remain in the experimental stage in their work.

This is depicted in the above process-defining flowchart I established around ten years ago, for scoring where a program may be placed on my 1-10 scale for evaluating GIS-Remote-Sensing-Population Health levels of research, and applying GIS to health, at a large combined complete EMR database, and agency or corporation level.

Back in 2010/2012, some might recall I successfully mapped the entire United States using a national database of data related to 50 million – 120 million patients, depending upon the sets of data being analyzed, mostly focused on age-gender relationships and several thousand ICD groupings, for the most important ICDs in US health care.

In the more recent work, instead of size of population, I took advantage of a combination of population size and time details to evaluate care features. This allowed me to see relationships that exist between types of care, and the details of how patients and physicians engaged in that care. When considered together, the amount of care reviewed at the total (gestalt) level for care per patient, per healthcare problem, per personal health status, etc., ranged from just 1 to 21 years of care, with 1-16 years of care as a better assumption, due to incomplete data entries per patient at a per patient per year “continuous care” related level.

So, of about 12-16 million people reviewed, 50-120 M combined patient-years of data were reviewed. Multiply that value by 7 to get an estimate of the number of visits reviewed, and by 40-100 to get the total number of health care activities evaluated at the clinical level, minus the details, for whatever practice activities physicians were engaged in.
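The arithmetic can be worked through as ranges. The sentence can be read two ways, so this sketch computes both: the 40-100 factor applied to patient-years (the literal reading), and applied to visits (an assumption that treats 40-100 as activities per visit). The function and variable names are illustrative only.

```python
def scale_range(value_range, factor_range):
    """Multiply a (low, high) range by a (low, high) factor range."""
    low, high = value_range
    factor_low, factor_high = factor_range
    return (low * factor_low, high * factor_high)


patient_years = (50_000_000, 120_000_000)

# About 7 visits per patient-year.
visits = scale_range(patient_years, (7, 7))

# Reading 1: 40-100 activities per patient-year.
activities_per_patient_year = scale_range(patient_years, (40, 100))

# Reading 2 (assumption): 40-100 activities per visit.
activities_per_visit = scale_range(visits, (40, 100))
```

Under these figures the visits come to 350 to 840 million, and the activity totals land somewhere between 2 billion and 84 billion depending on the reading.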

With such a huge dataset evaluated during the recent stage of this work, a number of important types of studies emerged. These are some examples.

PhysicalCulturalGroupSubclasses2

PROJECT 1

Epidemiological transition, amongst Religious Groups.

Goals:

Develop general religious group maps of the region
Develop three tables per group and gender: 2015, 2010, 2005
Produce maps for specific health behaviors/ICDs/etc.

Major Groups:

Non-committed (agnostic, atheistic)
Muslim
Jewish
Asian/Cultural
Christian (merge two subgroups)
Natural Theologians
Unknown (include?)

Epitrans_BlackMales Epitrans_BlackFemales

PROJECT 2

Recategorizing ICDs

Goal

Recategorize ICDs into well-known subgroupings, and then develop newer categories for major comparisons

i.e. emergent vs non-emergent, chronic disease versus non-chronic, genomic/genetic related vs others

Steps:

Produce tables of emergent vs non-emergent (with or without gender; replace gender binomial with E vs non-E). Perhaps relate to race-ethnicity-religion groups
Produce maps of the same
Duplicate the first two for Chronic vs Acute
Duplicate the steps 1, 2 and 3 for genomic and genetic/congenital, versus the rest

PROJECT 3

Applications of QGIS

Goal

Produce a variety of QGIS maps for the region, for different levels of data mining, restructuring, analysis and evaluations

Steps

Demonstrate utility of this process across all forms of EMR data, including basic ICD, to Lab results, to risk related lab data features
Produce maps of Height, Weight and BMI from data already pulled
Reassess and remap for age-gender, add race-ethnicity-religion (RER) if time allows.

FGMmapping

PROJECT 4

Cultural Ethnicity Influences

Goal

Identify Cultural-Ethnicity related ICDs of the four types previously defined by past studies: Cultural/Geographic (infectious disease and genetic), Culturally-linked (possibly genetic, systems related), Culturally-bound (mostly behavioral health related), and Culturally-related (high risk due to race-ethnicity in the US).

Steps

ICD analysis
Groupings
Tables and/or mapping

FemaleSpouseAbuseMap

FlowchartStatus


August 29, 2018

The Status of the New Medical GIS Program, and Site Stats

Posted by Brian Altonen, MPH, MS under Uncategorized

I am currently developing a NYC GIS research process for evaluating 15 to 20 years of health data. Unlike most medical GIS programs, the data in this system is quite complex, and includes standard medical data along with case and practice specific structured and unstructured data components.

A preliminary review of the billions of datum rows in the EMR for this project suggests that there is an average of about 1.5 to 2.2 million patients appearing in the EMR due to new healthcare activities per year. Over the years, the total number of patients found in this EMR, which consists of 8 separate parts, is approximately 20 million people. However, there is considerable overlap in the patients from one network to the next, and it is estimated that in fact this data set probably represents 11 million to 15 million people. Only a small percentage (about 7-15%) of these patients have 10 plus years of care. In spite of the lack of a full lifespan’s worth of care for a single patient, these patient groups with 10 years of continuous care can be used to define standard processes involved with health care, of any sort.

The EMR’s components may be grouped for lengthy reviews as:

patient data (mostly structured),
Patient temporal diagnosis (ICD9 or 10) data (chronic and acute, date of onset, completion, etc.)
visits data (structured, mostly numeric, date and time data; used to evaluate type of visit, time/date relationships, length of stay or service, and place(s) of care and their sequence, temporally),
procedures data (structured for the most part; well defined billable actions such as labs, xrays),
events data (structured and non-structured text and notes data; activities may be interpreted like procedures, but they are usually done by the physician as additional care processes, and usually not billed, such as administering a psych test, providing the patient with a pamphlet on STDs),
outcomes of results data (structured and non-structured; highly detailed findings; includes activities engaged in with these items, such as residents reviewing the xray, or students evaluating the blood sample and providing their diagnosis for review by their mentor)
additional outcomes or results data (focused on a unique topic such as pharma, sample testing, histology evaluation results, genomics, or bacteriology)

Approximately 72,000 health care procedures are detailed in this EMR. There are well over 300,000 events (many educational materials) per facility. Since patient visits are temporal, they are crucial to developing any pattern for evaluating patients’ health care processes. The visit types most often used are: office visit, emergent care, inpatient, regular visit, ambulatory surgery (other surgery is in inpatient), other visit (for xray, other unique item), referrals, and communications by phone or mail about the patient’s history.

A typical average visits per patient per year ratio is about 6 to 10.

A typical average procedures engaged in for care, per visit is about 40 to 100, increasing with age and complexity of patient’s health.

A typical inpatient stay can last about a week and consist of 1,000 to 10,000 procedures (with repeated ids), with each procedure generating as many as 1,000 lines of data, including structured results and non-structured additional information (who was engaged, notes on related issues, etc.).

The following diagram is used to depict these datum relationships

basicpatientevaluation

A collection of visits for a single problem may be modeled as follows:

leafstemmodel

Each visit has measurable components and time elements (PVEP = Patient-Visit-Event-Procedures)

measuringavisit

The following is an example of a lifetime of care model (each oval is a visit):

Case4

In March 2018, GIS was added to this patient care-disease modeling program.

— an update on this site’s stats —

00_aNYCHealthGIS

Step 1. Social, Societal, Demographic and Population Base Maps developed

H+H1

Step 2. Early application of data to QGIS

H+H2

Geography of Site Visitors from 2009 to 2018 (so far)

baMPH_01

All Time most popular sites:

baMPH_02

Popular in 2018

baMPH_04

New Mapping Technology videos visited on this site (most are in Youtube):

baMPH_03

H+H2

For more on this technology, see the SAS generated videos produced by programming mapping in SAS (not SAS GIS) about ten years ago; this technology is complemented by the use of the new GIS workstation projects.

For examples of these visits, search through these blog posts, or see the collections of videos put on file at:

Topic:

Childcare

Migratory Disease Patterns

NPHG (National Population Health Grid)

My programming for this work is kept available at the following site, rarely referred to and/or visited:

Managed Care Innovations (SQLs)

https://wordpress.com/view/managedcareinnovations.wordpress.com

June 25, 2018

Implementing a Spatial Analysis Surveillance System

Posted by Brian Altonen, MPH, MS under Uncategorized

00_HowtoimplementMedGIS

The analysis of electronic medical records and the operation of a geographic information system are two unique sets of skills, which when combined require time and finesse to bring the analyses to an advanced level.

In many ways, the complexity of each as a stand-alone is comparable. There are multiple levels of format, complexity and relationship in medical data, which in this EMR system includes “dimensional” databases devoted to

Patient/Demography,
Visits,
Procedures or Events (treated as at least two levels in my models),
Results, outcomes or findings attached to the prior two, and
Location (x, y and z)
Time (t).

In a traditional GIS, this is comparable to the different “layers” that may be overlaid for analysis, and the “dimensions” at which maps or images can be produced and evaluated using this method.

00_TraditionalGISlayers

To develop an effective data storage, surveillance, analysis and presentation tool, equal amounts of time and effort need to be spent at the EMR and GIS ends. You compromise the value and quality of the spatial product when too much time is spent on developing an effective EMR without adequately engaging the GIS-related spatial analysis and presentation potential of the program.

In traditional GIS classes, twenty years ago, the ability for researchers to use GIS to perform their analyses, besides the traditional Excel with Add-ons, or SPSS, or Stata, or SAS, or S+ to perform the analyses. The early impressions were that GIS could very well make these older, very traditional analysis programs obsolete. However, most recent changes in GIS and Statistical Analysis tools demonstrate the inability of any of these traditional software brands to hybridize their technology with other brands out there.

This is exemplified by the recent change in SAS from the traditional SAS (8.* to 9.*) to SAS Enterprise, without the traditional programming atmosphere offered by the older products. (Hopefully the new SAS will improve, but for now it’s overall value has been reduced by one or two application related levels).

When we look, for example, at the years long attempt to add a GIS option for SAS, in the form of SAS-GIS, the results of this product were quite upsetting, due to the quality of the output and figures, no necessarily the meaning and value of the analytics itself. SPSS is now compromised by its level of complexity, and breakdown into multiple sections of the software subscribed separately. Like a number of drawing programs that did the same, obtaining these forms of software became impossible for smaller groups, making it necessary for more affordable, more productive tools to be developed.

00_StepsforcreatingaNYCHealthGIS

Due to these recent changes, there may never be a perfect mapping environment for adequate population health surveillance and analysis; that is, a form of EMR-GIS that is equally valuable and applicable across all potential EMR-GIS settings, be they linked to big business or small business, large or small EMRs, insurance companies or much smaller institutional healthcare settings, large area or small area focused operations, or large NPOs or small NPOs with minimal funding to support their goals and plans.

Transformed Data

Next, it helps to interpret these two large parts of the EMR-GIS system (EMR and GIS) in view of their smaller parts.

The EMR currently in use within my system makes use mostly of SQL and SAS. This two-tiered method of pulling and then evaluating data was successfully developed and implemented to perform the prior Big Data spatial health projects that were posted several years ago, when the national health data was analyzed (varying from 40-120M U.S. patients, 1-2 billion records per year, using SQL in a Teradata datapull work environment), then exported, filtered and turned into quantifiable location data, and then mappings using a SAS polygon-grid mapping program that I invented (it took only 15-30 minutes to produce a national map).

04_DataPINRequirements

In this newer system, more data crunching and redefining is done as part of the initial datapull process. All data are geocoded, and made HIPAA compliant as part of this initial datapull. This means a number of basic features of the data have to be maintained, such as no personal names, SSN, patient or member IDs, phone number, exact address, etc. These are all changes in the initial data pull process.

For example, for research purposes race and ethnicity data are converted to the U.S. Census standards, and missing or unknown values are converted to the appropriate subgroups (for example, as AfrAm or BL, Wh, As, SoAs, Al/NatAmer, HI/Pac Isl, . . . “N/A”, “unknown”, “no response”, “refused to answer”, “other” may be coded together or as their own unique subgroups). As another example, ten to twelve groups were defined for “religion”, referred to as Religion Groups (file column name “Relig_Grp”), and grouped as [I am being quite non-specific and incomplete here]: Christian, Christian-related (sects), Jewish, Muslim, Buddhist, Hindu+, Agnostic, Atheistic, Natural Theologians, . . . None, Other, . . . Unknown, Not noted . . . etcetera. When appropriate, all locations/specific places, proprietary and specific names are also recoded or eliminated.

The end product of this data pull is “transformed data”, which has four levels of HIPAA related compliance. This initial pull generally results in Level 1 data–which means that it would be difficult for an HIT individual to trace it back to the actual individual, without knowledge of the SQL transformation programming that was used. This data may then have to undergo further transformation or aggregation, depending upon its uses and needs.
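The recoding described above amounts to a lookup table plus a rule for missing values. The sketch below is a hypothetical illustration in Python rather than the SQL used in the actual data pull; the raw spellings and the dictionary and function names are assumptions, and only the group codes (BL, Wh, As, SoAs) come from the text.

```python
# Hypothetical lookup from raw EMR entries to research group codes.
RACE_GROUPS = {
    "african american": "BL",
    "black": "BL",
    "white": "Wh",
    "asian": "As",
    "south asian": "SoAs",
}

# Entries treated as missing; they may be pooled or kept as separate subgroups.
MISSING = {"", "n/a", "unknown", "no response", "refused to answer"}


def recode_race(raw, split_missing=False):
    """Return a group code for one raw race/ethnicity entry."""
    key = (raw or "").strip().lower()
    if key in MISSING:
        # Either keep each kind of missing value distinct or pool them.
        return (key or "blank") if split_missing else "Unknown"
    # Anything unrecognized falls into "Other" (an assumption of this sketch).
    return RACE_GROUPS.get(key, "Other")


print(recode_race(" White "))                                 # Wh
print(recode_race(None))                                      # Unknown
print(recode_race("Refused to answer", split_missing=True))   # refused to answer
```

The `split_missing` switch mirrors the choice noted above between coding the missing categories together and keeping them as their own subgroups.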

Transformed Data Levels

Level 1 is intended for internal use and may include generally acceptable information, for example, a list of patients with names, DOBs, MRNs, addresses and phone (never SSN) numbers to contact.
Level 2 is no name, no address (lat-long instead), and preferably DOB converted to decimal age; it is intended, for example, for internal studies that are part of the system/network, but may not be fully active or engaged at the ally facilities.
Level 3 includes the above, plus recodes or removes all facility identifiers and PCP info, as needed, converting these to unique identifiers that can be decoded later when needed; this is intended for external use (but may include location or facility for certain outside NPO activities such as quality assurance or improvement checks and program grading projects).
Level 4 is aggregate data (i.e. adequate for unmonitored course or college level training), with all of the above features, and further limitations applied when needed in complete compliance with location related features, as defined by NIH PHI guidelines.
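The four levels can be read as successive filters applied to one record. The following Python sketch is an illustration under stated assumptions, not the SQL transformation programming itself: the field names, the facility code table and the helper names are all hypothetical, and the record is assumed to carry lat-long already.

```python
from collections import Counter
from datetime import date


def decimal_age(dob, as_of):
    """Age in decimal years, used in place of DOB from Level 2 upward."""
    return round((as_of - dob).days / 365.25, 2)


def transform(record, level, as_of, facility_codes):
    """Return a copy of one record reduced to Level 1, 2 or 3."""
    out = dict(record)
    out.pop("ssn", None)                 # never released at any level
    if level >= 2:
        out.pop("name", None)            # no name
        out.pop("address", None)         # lat-long is kept instead
        out["age"] = decimal_age(out.pop("dob"), as_of)
    if level >= 3:
        out["facility"] = facility_codes[out["facility"]]  # decodable later
        out.pop("pcp", None)
    return out


def aggregate(records, key):
    """Level 4: counts per group only, no individual rows."""
    return Counter(r[key] for r in records)


record = {"name": "Jane Doe", "ssn": "000-00-0000", "address": "1 Main St",
          "lat": 40.81, "lon": -73.94, "dob": date(2010, 1, 1),
          "facility": "Clinic A", "pcp": "Dr. Example"}
codes = {"Clinic A": "F01"}              # hypothetical recode table
level3 = transform(record, 3, date(2018, 1, 1), codes)
print(level3)  # {'lat': 40.81, 'lon': -73.94, 'facility': 'F01', 'age': 8.0}
```

Keeping the recode table (`codes`) separate from the output is what allows Level 3 identifiers to be decoded later when needed.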

03_Requirements

Software

Up to this point, some form of data storage process and a GIS are mentioned as requirements. SAS may be used as a substitute for the GIS, assuming the programming I promoted here and elsewhere (areal, i.e. zip or census block, and grid, namely square or hexagon, spatial analyses, without basemapping) could be implemented (no SAS-GIS add-in is needed). In many cases, the data pulls are done using some internal software and/or SQL. In the above figure, other steps are required to implement some standardized surveillance-analytics program. They define the most basic requirements.

Setting aside the selection process for a GIS for the time being, knowing your potential data and information resources is one of the most essential parts of this process.

01_NYHealthDataLayers

Data

Again, using the NYC setting as the example, there are several sources for basic information data and spatial data available for setting up a surveillance analytics spatial workstation.

03_SpatialData_Resources

The better known sources of data for spatial analysis work are the GIS companies and resources, with ESRI perhaps the best known, and a number of Federal, State, regional and agency-related resources serving as the primary sources for the actual GIS spatial (point, line, polygon) shapefile data. Knowledge-related resources or datasets (directly or indirectly spatial) that can be linked to shapefiles comprise the rest in the above listing.

06_GISdatatypes_Spacevsother

The following sites were used to access the base layer and background mapping data for establishing this EMR-GIS.

NYC Oasis Basemap (to locate sections for a study; to review features of that section): http://www.oasisnyc.net/map.aspx
NYC Open Data: https://opendata.cityofnewyork.us/
NYC Open Data, datasets browsing page : https://data.cityofnewyork.us/browse
NYC Planning (department of transportation) maps and baselayers/data: http://www1.nyc.gov/site/planning/data-maps/open-data.page
NYC Map Tiles (background or basemaps, i.e. older maps digitized): https://maps.nyc.gov/tiles/
NY Orthoimagery maps (also base maps): https://orthos.dhses.ny.gov/ see also http://gis.ny.gov/gateway/mg/napp_download.htm
USGS Earthexplorer: https://earthexplorer.usgs.gov/
Landsat, downloads/purchases: https://landsat.usgs.gov/landsat-data-access
Landsat general page : http://www.landsat.com/
Zip Codes (some ESRI links): https://www.zip-codes.com/zip-code-map-boundary-data.asp

06_GISdatatypes

The NYC Open Data site is the source for most of the shapefiles and spatial data to which the EMR data are linked. There are spatial and non-spatial (location or area related) data available at this site. Most of these data are reliable and usable (they can somehow be linked to a GIS, either by lat-long or x,y coordinate, or by place name, zip code, etc.).

02_NYOrthoimagery

The Orthoimagery page provides rasterized datasets that in some ways are comparable to the use of Landsat imagery (though not as exact as LS or NDVI, etc.), and may be evaluated using some of the same remote sensing methods or strategies.

00_NYHealthData

Medical GIS

To date, there have been a number of barriers preventing the adequate application of GIS within the health sector to the implementation of a facility or program based GIS devoted to monitoring, surveillance, intervention and other health program management activities. At the institutional and insurance company level, this barrier exists due to the lack of knowledge and experience within the Medical Records or EMR data management system. There have been numerous examples of GIS utilization attempted here and there throughout the system. Still, to date, there is no clear-cut leader in the field making use of a combined EMR-GIS data warehouse management practice or procedure. For nearly twenty years, EMR-GIS practices have remained at the experimental level, when interpreted using the process I developed years ago just before analyzing the national U.S. EMR data for the first time at the 50M to 100M patients level almost 10 years ago (see nationalpopulationhealthgrid.com).

The result of this disengagement of GIS expertise in the field is literally the stagnation of health management at the local and regional government program, insurance program level. Is it possible that the limits have been reached for singularly trained people, forming a team of varying experts, who when even working in groups are unable to take their team performance to the next level of achievement?

The barrier to health improvement has often been related to this lack of GIS implementation. The combination of changes in software, hardware, storage technology, data analysis speeds, data build and restructuring speeds, have partially limited the ability of “experts” to make any long lasting changes. Since all of these parts of the EMR data technology undergo changes and development at rapid speeds, by the time a process is developed for such a program, the tools and information have changed, the older knowledge base is outdated, desires to patent or own a particular process get outdated (half of the 17 years patent rights may be gone), and the health of the people may have even changed, making certain areas of focus no longer applicable.

Implementing a GIS at the healthcare level, in particular within the private business or hospital/facility levels, enables more directly targeted, patient and doctor implemented changes to be made. Whereas at the insurance level, the same achievement is theoretically possible, the one or two steps away from the patient-doctor interaction that an insurance company places itself, and the frequent discontent patients suffer due to the lack of helpful or adequate coverage (even lack of coverage) insurers provide, severely hamper any ability of the insurance company to have a timely impact on patients’ health. To implement a GIS at the caregiver’s facility level ensures the facility of its right and ability to make improvements.

01_NYCHealthImprovement

If we look at how a similar series of different care programs was implemented in the past, we see a parallel here. In terms of intervention rates for patients receiving some form of preventive care, such as childhood immunization, passive programs with insurers that fail to interact with patients (i.e. PPV) have much lower rates than HMOs, which in turn have lower rates than Managed Care programs, in which the provider and patient are regularly evaluated and scored for their more interactive relationship. Enabling a program to evaluate its population provides its leaders directly with the opportunity to make decisions that immediately speed up or slow down certain parts of the healthcare process. They need not wait for feedback from their patients’ insurers, and in general, they can do nothing if they rely upon last year’s (or, if lucky, last quarter’s) report posted by the regional public health review.

Finally, it is important to note that the variety of measures that a program can engage in is much greater when the program itself carries out these activities, instead of waiting for regional agencies and scorers of programs to determine what limited measures to use to evaluate an institution or facility’s performance. An effective combination of EMR evaluations and GIS monitoring and surveillance can carry out such processes as fast as on a daily or live basis, instead of retrospective.

The benefits of implementing an internal GIS include enabling a program to surpass its competitors; even the smaller programs may supersede the prior successes of their much larger competitors.

Summary

To succeed in the implementation of an EMR-GIS program at the institutional level (not just the limited research level), the following processes need to take place.

learn the required skills, implement them, and develop the required work habits;
define rules and regulations, and establish/publish policies;
ensure HIPAA compliance, meet related NIH research and PHI requirements;
set up an IRB capable of handling all of the above processes;
produce a task force comprised of experts in these fields;
review and test the EMR data, including routine error analysis checks;
document/detail the Levels 1 through 4 requirements of data transformation, and define the pre-Level 1 data limits (i.e. ‘no SSN release, ever’, etc.)
implement a GIS–first at point-vector level, and then at raster and imagery levels
define the EMR, GIS and combined EMR-GIS-analytic processes (flowcharts)
establish some regular analysis and outcomes reporting standards (mimic your HEDIS, and then some), and semi-automate to fully automate these processes
engage in structured and non-structured text analyses as part of the EMR data analytic processes
develop and initiate qualitative, quantitative and combined research programs
apply the EMR-GIS tools and methods to searching for new grant opportunities and identifying unique population related needs for your program
develop a big data GIS reporting “Atlas” that can be regularly produced for your facility/facilities and the appropriate parts of the program
produce a time comparison report of the ICDs and therapeutic/diagnostic results performed on your facility or institution’s population, comparing three periods of time, in order to define the changes in ICD rankings that have taken place over time (last year, vs. 6 yrs ago, vs. 11 yrs ago) at the patient, visit, and visit:patient ratio (VPR) levels. (Also consider repeating this for special subgroups, i.e. just child age diseases, or just chronic disease rates, or just infectious diseases.)
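The last item in this list, the three-period ICD ranking comparison, is simple to prototype. The sketch below is a minimal Python illustration and not the SQL/SAS reporting code; the row layout `(patient_id, visit_id, icd, year)`, the function names and the sample rows are all assumptions made for the example.

```python
from collections import defaultdict


def icd_summary(rows, year):
    """Patients, visits and visit:patient ratio (VPR) per ICD for one year."""
    patients, visits = defaultdict(set), defaultdict(set)
    for pid, vid, icd, yr in rows:
        if yr == year:
            patients[icd].add(pid)
            visits[icd].add(vid)
    return {icd: {"patients": len(patients[icd]),
                  "visits": len(visits[icd]),
                  "vpr": len(visits[icd]) / len(patients[icd])}
            for icd in patients}


def rank_by(summary, metric):
    """Rank ICDs from highest to lowest on one metric (ties broken by code)."""
    ordered = sorted(summary, key=lambda icd: (-summary[icd][metric], icd))
    return {icd: i + 1 for i, icd in enumerate(ordered)}


def compare_periods(rows, years, metric="patients"):
    """Rank of each ICD in each year; None where the ICD did not occur."""
    ranks = {y: rank_by(icd_summary(rows, y), metric) for y in years}
    icds = sorted(set().union(*ranks.values()))
    return {icd: [ranks[y].get(icd) for y in years] for icd in icds}


# Made-up sample rows: (patient_id, visit_id, icd, year)
rows = [(1, 100, "E11", 2017), (1, 101, "E11", 2017), (2, 102, "E11", 2017),
        (3, 103, "J45", 2017), (1, 200, "J45", 2012), (2, 201, "J45", 2012),
        (3, 202, "E11", 2012)]
print(compare_periods(rows, [2017, 2012, 2007]))
# {'E11': [1, 2, None], 'J45': [2, 1, None]}
```

Passing `metric="visits"` or `metric="vpr"` gives the visit-level and VPR-level rankings, and filtering `rows` beforehand gives the subgroup versions (child age, chronic, infectious).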

00_aNYCHealthGIS

June 10, 2018

A New Surveillance System for Managed Care

Posted by Brian Altonen, MPH, MS under Uncategorized

For the past few years, researchers accessing the electronic medical records system have been most devoted to very basic forms of observation, surveillance, monitoring, and reporting. GIS has been theoretically applied for the most part, or to experiment with this analytic process, to supplement processes already underway for quality improvement activities, and/or to use GIS to produce basic spatial expressions of the data researchers are working with. The best use of GIS is to apply it to explain why certain things happen, for predictive modeling, and to evaluate change at some fairly sophisticated, detailed level of analysis. Yet for the most part, we primarily envision and apply GIS in health research to explain something that happened, not why and where it will happen.

The public health and quality improvement practices have already developed GIS in order to monitor and report upon such basic public health data as STD rates, or watching for infectious disease outbreaks, or monitoring the HIV incidence for suspicious sub-populations that serve as some nidus for some outbreaks. Public health and population health management programs employ it for reporting and planning purposes, such as for evaluating and recording childhood immunizations and to intervene where changes are most needed, or to study Chlamydia rates in 18 to 24 year olds who are sexually active and demonstrate high rates for unexpected pregnancies.

Health improvement programs may use GIS to report on annual diabetes well visit rates or to show spatial relationships that might exist for high and low A1c, LDL and BP areas. In the background, some parts of the New York City GIS teams have been able to provide potential researchers with helpful baseline spatial data to use for developing new spatial surveillance systems, by providing important supporting spatial datasets for this work, such as those relating the placement of clinics, offices or similar service facilities, to increase the engagement of patients with these programs.

A number of years ago, I graded the level of accomplishment a program had in spatial epidemiology applications as typically as high as level 5.5 to 5.75. This was based upon the different practices these groups normally engage in with GIS, using it to record, develop a history of, experiment with, and research how to implement this form of practice for basic health and safety concerns. This also has attached to it the supposition that for the most part, health care programs engage GIS at only some basic level. To determine just how much a program is engaged in the use of this tool, one need only reflect on how many measurable factors or details contained within an EMR system are being analyzed. Is the EMR being used to its fullest extent?

An example of the 3D Mapping of cases using SAS 8.* and 9.* (non-SAS-GIS), applied for surveillance purposes since 2007

For the most part, GIS in health has been very slow in advancing in the actual implementation of this means for reporting during the past ten years. We are certainly more familiar with the potential uses of GIS and its possible applications outside the already well-established realms defined by environmental health, public health, population health, and quality of service/intervention teams. For the most part, these projects remain single examples of what is being done. Very few programs engage GIS at the level of big data reporting, such as mapping all 999 ICD9s at the spatial-temporal-age-gender-race level, per program, per facility, per larger unit (i.e. insurance programs, companies, NIH funded, SES focused programs) responsible for providing that care.

That is now changing within the population health surveillance and research activities I engage in on a regular basis.

For the past 2+ years, time was spent exploring the complexity of a complete EMR.

When we all first learn about ‘big data’, what we see as examples do not accurately demonstrate the details, length and complexity of the health data that reside in even the most basic, first or second generation EMR system. There are not just 1, 2 or 3, or even 4 to 6 special tables within which all data are recorded. Each group of data that can be placed within an EMR forms its own table, with multiple rows entered per unit of activity or metric being evaluated. These multiple row tables are modified or reconstructed into one-to-one or one-to-a-few formats.

When all patient data are entered, for example, these data, which are stored initially as rows, get converted to columns, with patient identifiers (or their numeric assignments) as the index column(s) for this work. Each patient can then have columns that depict name, gender, dob, dod, mother’s name, address, state, zipcode, race, religion, insurer(s), etc. In the system I use, these tables are called dimensions and provide the most important personal, family and demographic data that exist for any given patient.

When a patient interacts with the health care system, visits happen. Some programs call these “interactions” between health care staff and another entity involved with the patient, such as the patient, a parent, another provider, a previous care giver, or another facility. In this review, I term these actions “visits”, but they have other common names elsewhere in the QOC/QI system.

This second dimension of care, the Visits, has only a few basic elements that define it, such as location (coded even down to the extreme, such as a bed in a room), date, time (at day-hour-minute-second level) that something starts, ends or happens, time of closure or completion, etc. The study of the Visits Dimension for a patient’s care process provides the dimensions needed to correlate events over space and time, allowing for a review of practitioner or systems logic, and identifying situations where changes may need to be made, through rule-setting, policy, procedure, assignment of place for the event to happen, or implementation of different programs for poorly performing teams, groups or places. Without even looking at what practices were performed for a patient in the health care setting, we can see where further investigation may need to be made because higher failure or death rates are seen for a given program. The details of what was done and which of these went wrong haven’t even played a role yet in the health care process.
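The rows-to-columns conversion described here is a pivot keyed on the patient identifier. A minimal sketch in Python follows, assuming a hypothetical long format of `(patient_id, attribute, value)` rows; the real system does this in SQL, and the function name is made up for the example.

```python
def pivot_patient_rows(rows):
    """Turn (patient_id, attribute, value) rows into one record per patient."""
    wide = {}
    for pid, attr, value in rows:
        # A later row for the same attribute overwrites the earlier one.
        wide.setdefault(pid, {})[attr] = value
    return wide


long_rows = [
    (1, "gender", "F"),
    (1, "zipcode", "10037"),
    (2, "gender", "M"),
]
print(pivot_patient_rows(long_rows))
# {1: {'gender': 'F', 'zipcode': '10037'}, 2: {'gender': 'M'}}
```

Attributes that can repeat for one patient, such as insurer(s), would need a list per column rather than a single value; that case is left out of this sketch.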

The third level or dimension of care pertains to the details of what events ensue during a given visit. The definition of each step in the care process also enables a time element to be defined for the care process. This means the patient may come in at time1, see the MD at time2, receive an injection at time3, be seen by a specialist at time4, undergo an MRI at time5, be evaluated at time6, be admitted for inpatient care at time7, and then undergo nearly a thousand more time-defined processes over the next few days of treatment, recovery, and then discharge. Other temporal processes that can be evaluated here include time till initiation, overall time elapsed, time to recovery, and even post-inpatient time in relation to unwanted readmission events.

measuringavisit

FIGURE 1
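Once each step of a visit carries a timestamp, the temporal measures mentioned above (time till initiation, overall time elapsed, and so on) reduce to differences between timestamps. The Python sketch below is an illustration only; the step labels, the sample times and the function names are hypothetical.

```python
from datetime import datetime


def elapsed_minutes(steps, start_label, end_label):
    """Minutes between two labelled steps of one visit."""
    times = dict(steps)
    return (times[end_label] - times[start_label]).total_seconds() / 60


def step_gaps(steps):
    """Minutes between each consecutive pair of steps, in time order."""
    ordered = sorted(steps, key=lambda s: s[1])
    return [(a[0], b[0], (b[1] - a[1]).total_seconds() / 60)
            for a, b in zip(ordered, ordered[1:])]


# Made-up visit: (step label, timestamp)
visit = [
    ("arrival", datetime(2018, 6, 1, 9, 0)),
    ("seen by MD", datetime(2018, 6, 1, 9, 40)),
    ("injection", datetime(2018, 6, 1, 9, 55)),
    ("MRI", datetime(2018, 6, 1, 11, 30)),
]
print(elapsed_minutes(visit, "arrival", "MRI"))  # 150.0
print(step_gaps(visit)[0])  # ('arrival', 'seen by MD', 40.0)
```

The same two functions applied across many visits give the distributions needed to compare facilities or teams on, for example, time until the MRI was done.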

In this review of the care processes, what happens with a “visit” is generally interpreted as events or procedures. Events are what happens to a person that is typically considered part of the care process. Procedures are practice-related events that typically involve additional skills and are often coded with a procedure identifier because that identifier may be linked to the cost of care and billing. As a general rule, events are not charges; procedures are. Events carried out by a clinician are considered in defining the bill for the visit. Procedures carried out by the clinicians and/or technicians are often charged per routine, not per visit. But as always, we have exceptions, such as procedures that are free but documented as part of the visit. One of the most common of these within the system I operate is the Vital Signs taken and related medical history questions asked and entered during each visit event.

The value of Procedures and Events coding is that the kinds of services being offered are considered, along with their relation to the overall timing and sequence of activities engaged in for the care process.

The “bread and butter” of all health care processes are the results of these procedures (and sometimes events). “Results” is the term applied to these data elements here. And results are typically more than just the “result” of a test.

Typical results entered into a data warehouse include such data as (with semicolons as separators): Yes; T; 1; 20; 3.45; “2,4,5,3,6,1,8”; Complete; French; 13450; John Smith Sr.; “>150”; “168/96”; phq9; “above normal range”; “Dr. Chase”; etc.

Over the past few years, extensive reviews were carried out of the size and numerical relationships between these four core “dimensional” datasets (patient, visit, event/procedure, and results). A general accounting of these figures, for just one visit and its linked events, is about 1:7-10:40-400:4000-40,000. I term this ratio P:V:E or P:R.

Patients are their own unique number, Visits are their own unique number, but for a health related happening (a diagnosis linked to the visits), one patient may have 7-10 visits per year related to it (directly or indirectly). Each visit in turn results in various events and procedures (vitals taken, labs ordered, educational materials provided, referrals given, etc.); even the most basic, simplest visit, such as a 9 month old well visit, will have Procedures entered for several immunizations, several health and safety checks with the mom, height and weight measures, pulse, an overall visual health evaluation of the child done for scoring the child’s development, etc. Therefore, 40 to 400 events (educating the mother about breast feeding) and procedures (labs, health metrics) are not atypical to any system.
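Because one results column can hold flags, counts, decimals, thresholds, paired readings, sequences and free text, any analysis has to begin by sorting entries into types. The sketch below shows one possible way to do this in Python with regular expressions; the type labels and the function name are assumptions, and an entry such as "1" stays ambiguous (count or flag) without knowing which procedure it belongs to.

```python
import re


def parse_result(raw):
    """Classify one raw result entry and return (type label, parsed value)."""
    text = raw.strip().strip('"“”').strip()
    if re.fullmatch(r"-?\d+", text):
        return ("integer", int(text))
    if re.fullmatch(r"-?\d+\.\d+", text):
        return ("decimal", float(text))
    m = re.fullmatch(r"([<>]=?)\s*(\d+(?:\.\d+)?)", text)
    if m:                                    # e.g. ">150"
        return ("threshold", (m.group(1), float(m.group(2))))
    m = re.fullmatch(r"(\d+)\s*/\s*(\d+)", text)
    if m:                                    # e.g. a blood pressure reading
        return ("pair", (int(m.group(1)), int(m.group(2))))
    if re.fullmatch(r"\d+(?:\s*,\s*\d+)+", text):
        return ("sequence", [int(x) for x in text.split(",")])
    if text.lower() in {"yes", "no", "y", "n", "t", "f"}:
        return ("flag", text.lower() in {"yes", "y", "t"})
    return ("text", text)                    # names, languages, notes, etc.


for entry in ["Yes", "3.45", "“2,4,5,3,6,1,8”", "“>150”", "“168/96”", "phq9"]:
    print(entry, "->", parse_result(entry))
```

Everything that falls through to the "text" type is a candidate for the non-structured text analyses discussed elsewhere in these posts.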
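The ratio itself is just the row count of each dimension divided by the number of patients. A minimal Python sketch, with a hypothetical function name and made-up counts that are not figures from the actual system:

```python
def dimension_ratio(n_patients, n_visits, n_events, n_results):
    """Express the four dimension sizes per patient (patient = 1)."""
    if n_patients <= 0:
        raise ValueError("need at least one patient")
    counts = (n_patients, n_visits, n_events, n_results)
    return tuple(round(n / n_patients, 1) for n in counts)


# Made-up counts for illustration only.
print(dimension_ratio(1000, 8500, 120000, 9000000))
# (1.0, 8.5, 120.0, 9000.0)
```

Running this per facility or per program, rather than once for the whole system, is one way to see which subcomponents document far more or far less than the norm.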

The key to understanding each program, each system, requires a complete evaluation of these different measurables, numerically and percent wise, to see what the norm is for the system, and to see how its various subcomponents perform and document the same duties. So, for a single institution, we might assume that all follow a protocol, and that each one could have different time related findings, but all within institutional standards, such that all of the products of that type of visit are the same (i.e. vitals documented and entered, immunizations that are due were completed, all of this occurring in less than one hour).

“Results” is the next dimension, but the data content of this is actually best considered multidimensional. The basic format of results data should be qualitative or quantitative, structured or non-structured, parametric or non-parametric. Different institutions may store subparts of these data into separate places, such as grouping all pulses into one dataset, or all lab measures and results into a single laboratory results file, or all xrays taken into a single xrays database, with dates, times, procedure taken, amount of energy administered, time in and time out, results, initial interpretation, final interpretation, etc.

Results are any outcome or happening linked to a procedure or sometimes an event. Therefore, results can also be evaluated as relating to any of several groups of data entries:

process or procedure related info, such as exposure time, amount of xray administered, test tube/sample number, type of test, numeric sequence of sample taken, frequency, drug administered Y/N, type of test administered, units of measurement, US or metric values,
true results, like positive diagnosis (structured or non-structured), amount of energy read, size of nodule noted, amount of radioactive substance detected within tissue, estimated cells per cc, percentile ranking of height, LDL, BP, and id number for organism identified
events, activities, notes, that follow and/or relate to those results, such as normal range, maximum range allowed, viewed by PCP #, diagnosis approved by department head (Y/N), reliability of results (0/1), event closed or not (0/1).
general or non-specific non-structural data, such as words, text, impressions, notes of normal range, etc. entered as free text into a cell designed for this (Comment, Other, or Note cell) and used by the practitioner to provide additional notes, which may or may not be specifically related to the procedure at hand.

basicpatientevaluation

FIGURE 2

The data evaluation, up to this point, focuses on just the Visit as the chief event, or research and analysis unit. We can look at one visit and all that happens in relation to it, be it a well visit, an inpatient stay, an emergency event followed by hospitalization, a referral to a specialist, a meeting with a social worker. You can analyze the time component, the sequence, and/or the length of time until a certain point is reached (how long until the MRI was done?).

The data evaluation can also assess all of these events, over time, for a single patient, in relation to his/her medical history, and onset of new diagnoses or ICDs.

In the following model, the sequential visits related to single problems are assessed, such as diagnosis of heart disease, leading up to valve replacement. Each one of the processes as defined in Figure 1 above is represented by an oval in this figure.

leafstemmodel

FIGURE 3

Over time, these processes of care escalate and can have a cascading effect on patient health care needs.

therapeuticprocesses_fig4

FIGURE 4

In the more complex, lifespan models, all of the diagnoses and actions taken to care for someone may be placed into this model, to define lifelong related population health processes and individual health care experiences.

therapeuticprocesses_casestudy

In an application of this model to a personal medical history model, to relate the cost of care to the overall results of that care, for someone with a long history of epilepsy, the following cost analysis models were developed. They predict the relationships of rising costs for care in patients well controlled, not controlled, and those who underwent some intervention care (such as neurosurgery), versus those who didn’t. [These were all covered earlier; the arguments for costs depicted here are found on those blog pages].

20_lifetimecosts

The Spatial Modelling Dimension

The next level of implementing this process for evaluating health care involves the application of these above statistical processes to data that may be linked to GIS research processes.

All data in an EMR, structured or not, parametric or not, numeric or text, can be converted to fully quantitative data by adding several simple spatial elements to the project.

The common comparisons between facilities and clinics, or health care between races and neighborhoods, for example, are informally spatial in nature, and more formally best referred to as geographic, since latitude and longitude, distances, time, and spatial relationships are not a part of their formal numbers-based evaluation processes. By adding the location-distance relationship, such as through the use of centroids, or space area analyses, or patient place (lat-long) data, any and all health EMR data become quantitative in nature. A fully text based, non-structured content analysis, or 50 people undergoing a rare experience, is made spatial by adding lat-long to their analyses (although this one metric alone benefits more from other non-parametrics such as race, gender and age).

Health care analyses that become replicable and semi- to fully-automated in EMR analyses can also be semi-automated or manually interpreted using GIS. The values of this application for GIS to healthcare monitoring are fairly easy to visualize.
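One concrete form of the location-distance relationship is the great-circle distance from a patient's lat-long to each facility. The Python sketch below uses the standard haversine formula; the facility names and coordinates are made up, and the function names are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat-long points."""
    r = 6371.0088  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def nearest_facility(patient, facilities):
    """Return (distance_km, name) of the facility closest to the patient."""
    return min((haversine_km(*patient, *loc), name)
               for name, loc in facilities.items())


# Made-up coordinates for illustration only.
facilities = {"Clinic A": (40.81, -73.94), "Clinic B": (40.70, -73.94)}
distance, name = nearest_facility((40.80, -73.95), facilities)
print(name, round(distance, 2))
```

With a distance attached to every record, even a purely text-based dataset gains a numeric variable that can be summarized, compared across groups, or mapped.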

By implementing a GIS at this time for this surveillance program, a second process for evaluating health spatially is now in operation.*

This process of spatially evaluating data in SAS was developed a few years ago. The means for producing videos of these 3D models of health in an urban area were perfected, in SAS Basic and SAS Graph (no SAS GIS was developed).

0_8_Multiple3Dmodeling

Which is fortunate, since SAS GIS has recently been turned over to a new workstation format for spatial analysis using SAS–SAS Enterprise with ESRI ArcGIS extension. At the institutional level, this doubles the cost for implementing such a program at the QI/QOC level for Managed Care programs, like the ones I have worked with.

The spatial SAS methods applied serve in the analysis and projection/display process, with the animation of results that can be developed for rotating 3D model imagery the major benefit of this spatial analysis method. [see Below] We can further improve upon this by smoothening out the shapefile centroid data used to produce these models, by converting irregular shapefile (zip code) data into more regular square cell grid data (the algorithms for this I presented numerous times elsewhere). We can further smoothen these presentations with a hexgrid modeling algorithm I developed (also detailed elsewhere; no example here, for now).
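The idea of moving from irregular zip code centroids to a regular square grid can be illustrated by binning each centroid into a cell and pooling its counts. The sketch below is a simplified Python stand-in, not the SAS grid algorithm referred to above: cells are defined in degrees (so they are not true squares on the ground), and the function names, cell size and sample points are assumptions.

```python
import math
from collections import defaultdict


def grid_cell(lat, lon, cell_deg):
    """Index of the grid cell containing a point, for a given cell size."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def bin_to_grid(points, cell_deg=0.01):
    """Pool (lat, lon, cases, population) points into square grid cells."""
    cells = defaultdict(lambda: [0, 0])
    for lat, lon, cases, pop in points:
        cell = cells[grid_cell(lat, lon, cell_deg)]
        cell[0] += cases
        cell[1] += pop
    return {key: {"cases": c, "population": p, "rate": c / p if p else None}
            for key, (c, p) in cells.items()}


# Made-up centroids: (lat, lon, cases, population)
centroids = [
    (40.8151, -73.9449, 3, 100),
    (40.8199, -73.9401, 2, 150),
    (40.8251, -73.9449, 1, 50),
]
for cell, stats in sorted(bin_to_grid(centroids).items()):
    print(cell, stats)
```

The first two sample centroids land in the same cell and are pooled, while the third falls in the next cell to the north; a smaller `cell_deg` gives a finer grid.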

With the addition of a regular GIS workstation to the analytics process for evaluating 20 years of health data on 11 million people, this work environment enables higher levels of the scoring system above (see initial figure) to be reached: Level 6 and then Level 7. Because the data pulls and reconfiguration are based upon automated or semi-automated processes, often SQL followed by SAS macros, it is possible to run these evaluations for numerous new types of studies. For example, past research projects and questions across the system can be re-evaluated by focusing on any form or group of ICDs, labs, diagnostics, psych test results, demographics, Age-SES-Race-Ethnicity-Religion (SAS-RER) groupings, neighborhoods (lat-long), NYC healthy area polygons, nearest office visits (location theory/distance), inpatient stay patterns, or logistic regression and Kaplan-Meier derived life expectancy patterns.

The following is an early example:

[Figures: H+H1, H+H2]

NOTE: Pb = Lead, Px = poisoning, Hx = history. This is for 0-9.99 year olds

Future postings will review these processes in more detail, cover the theory, review the programming and statistical methodologies, and provide various types of examples.

See

Altonen BL, Arreglado TM, Leroux O, Murray-Ramcharan M, Engdahl R. Characteristics, comorbidities and survival analysis of young adults hospitalized with COVID-19 in New York City. PLoS One. 2020 Dec 14;15(12):e0243343. doi: 10.1371/journal.pone.0243343. PMID: 33315929; PMCID: PMC7735602.

Acknowledgements: This work was completed courtesy of Ryan Engdahl, MD, Department of Surgery, NYC Health + Hospitals, Harlem and Woodhull Hospitals, and Department of Research Administration, Health+Hospitals/Central Office, New York, NY, 10013, USA.

Other recent professional journal articles by myself are referenced on the following NIH search page:

Hot Spots for Lead Poisoning over the past 10 years (produced in response to a local news release, in 2018)

Based on ICDs and/or Lab results, Rare Diseases, Genetic Diseases, Genome and Congenital Disease mapping are standard parts of the new Disease and Diagnosis Tracking system

The Ranking of Medical GIS Implementation within a Care Setting

Lead Poisoning Cases, Children < 5 yo

PROJECT 1

PROJECT 2

PROJECT 3

PROJECT 4

Transformed Data

Level 1 is intended for internal use and may include generally acceptable information, for example, a list of patients with names, DOBs, MRNs, addresses and phone numbers (never SSNs) for contact.

Level 2 has no name and no address (lat-long instead), and preferably has DOB converted to decimal age. It suits, for example, internal studies that are part of the system/network but may not be fully active or engaged at the ally facilities.

Level 4 is aggregate data (i.e. adequate for unmonitored course or college level training), with all of the above features, and further limitations applied when needed in complete compliance with location related features, as defined by NIH PHI guidelines.
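The Level 2 transformation described above can be sketched in a few lines of Python. This is only an illustration under assumed field names (`name`, `address`, `dob`, `lat`, `lon`), not the program's actual data structure, and it is not a statement of what any PHI guideline requires.

```python
from datetime import date

def decimal_age(dob, as_of):
    """Age in decimal years on the as_of date, rounded to two places."""
    return round((as_of - dob).days / 365.25, 2)

def to_level2(record, as_of):
    """Keep only lat-long and decimal age; name, address and DOB are dropped."""
    return {
        "lat": record["lat"],
        "lon": record["lon"],
        "age": decimal_age(record["dob"], as_of),
    }

# Hypothetical Level 1 record, for illustration only.
level1 = {
    "name": "Example Patient",
    "address": "1 Example Street",
    "dob": date(2015, 1, 1),
    "lat": 40.81,
    "lon": -73.94,
}
level2 = to_level2(level1, as_of=date(2020, 1, 1))
```

Building the Level 2 record from a short list of allowed fields, rather than deleting fields from the Level 1 record, means that any identifier added to the source later is left out by default.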

Software

Data

Medical GIS

Summary


*Thanks to my research assistant Terrence Calistro for installing and developing the GIS.

