The people in healthcare are the patients, the physicians, the nurses and other allied health caregivers, the administrative staff, the individuals in charge of facilities upkeep: pretty much every person who sets foot in a healthcare facility or setting.

There are also a few people who are virtually present within a healthcare environment.  The most important of these are the people who make up an individual’s past medical or health history.  This includes past physicians and surgeons, and all of their associates who saw your name, your medical history, your state of being, your financial state and healthcare debt status.

The indirect, almost invisible people, also present in the healthcare equivalent of Big Data and its collection of people’s health trivia and other unexpected tidbits of personal knowledge, are the financiers, investors, innovators (if there are any), outside thinkers, members of the Boards and Committees who have minimal business relationships with the system, and as many other people as one can think of who contribute to the quality of life and health in the health care setting.  There are the brownies perhaps, or the dog breeders with their patient-friendly doses of animal medicine, or the volunteer groups that often interact with patients on the children’s leukemia floor, or in the post-op epilepsy wards where patients need to readjust to their recent physical and possibly cognitive changes.

Some of the events I mention are very rare.  Others are quite common, so common that they happen numerous times every day.

The rarest examples are those outlier cases that come into a facility or unit, the likes of which may never be seen again by that team because of their low incidence rate.  There is that hemophilia patient, whose cost amounts to more than 1 million dollars per year, just for the costly pharmaceuticals.  Such a cost may be covered personally, or by some government program like Medicaid, or by the patient’s insurance agency (rather rarely so, however).  There are those 6 infants per year who present with a fracture of their arm near the elbow, for which there is usually only one cause: child abuse.  There are those people whose health history demonstrates an exceptional outcome, so exceptional that the system has to determine how to treat such an event, quietly or supportively and even outwardly, whichever way the patient chooses.


The rarest medical events occur on a scale of a few per 100,000.  There are even rarer events, but for this review, we will stick with the handful of cases per 100,000 patients as a realistic prospect when you are reviewing two million people.

A population of two million people contains 20 groups of 100,000 people.  If the incidence of something is 3 per 100,000, that means for your 2 million population, you have 2,000,000/100,000, or 20, times 3 possible cases that might exist in your patients’ list.  That’s 60 cases, which is possibly enough for a mixed study, in the quantitative-qualitative research sense.
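The arithmetic above can be written as a small helper.  This is only a sketch of the calculation in the text; the function name `expected_cases` is made up for illustration and is not part of any library.

```python
def expected_cases(population: int, cases_per_100k: float) -> float:
    """Expected number of cases when the incidence is given per 100,000 people."""
    groups_of_100k = population / 100_000   # 2,000,000 people -> 20 groups
    return groups_of_100k * cases_per_100k  # 20 groups x 3 cases each

print(expected_cases(2_000_000, 3))  # 60.0
```

The result is an expectation, not a guarantee: the actual count in any given patient list may be somewhat higher or lower.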

The more common events documented in medical records are those in the n/10,000 ratio, one order of magnitude greater than the previous example.  These patients provide researchers of that population with the most valuable insight once the 2 million begin to be studied.  These kinds of events (diagnoses, complications, etc.) can be situations which are too often ignored or not reviewed.  Again, with a mixed approach to this sort of study, you could produce a very valuable set of insights into this population, enough to advance the special programs defined for these individuals further along.

The second benefit of this incidence and the 2 million population size is that we now have 10 times more examples than the previous example.  Instead of 60, we have 600 cases.  Perhaps the entire population can be reviewed, for general features such as age range and gender, but now also more specific features such as income level, neighborhood setting, type of job.


The n/1000 group provides us with still better opportunities to explore health and health related events at the small subgroup level.  For a 2 million population, with the same incidence noted before but increased tenfold, we have 3/1000 or 0.3% of the population eligible for the review–totalling 6000 cases (from 0.003 x 2 million).
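To see the three incidence levels discussed so far side by side, here is a minimal sketch that repeats the same calculation for each one.  The constant and function names are illustrative assumptions; the numbers (a 2 million population and a "3 per ..." incidence) come from the text.

```python
POPULATION = 2_000_000
NUMERATOR = 3  # the "3 per ..." incidence used as the running example

def cases_by_tier(population: int, numerator: int, denominators: list) -> dict:
    """Map each rate denominator (e.g. 100,000) to the expected case count."""
    return {d: population * numerator // d for d in denominators}

tiers = cases_by_tier(POPULATION, NUMERATOR, [100_000, 10_000, 1_000])
for denominator, cases in tiers.items():
    print(f"{NUMERATOR} per {denominator:,}: {cases:,} expected cases")
```

Each step down in the denominator multiplies the number of available cases by ten, which is what shifts the study from reviewing every case to selecting among them.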

Now we definitely have to sample for our study, unless we are dealing with only EMR/EHR data.  In that case we can review all 6000 cases, explore their features, and mine their other records in search of unique cases and/or outliers.  A common example of this kind of scenario is studying the epileptic patient, the frequency of which is about 4 out of 1000.  A two million patient population will no doubt have about 3000 to 6000 such patients; if we select the most active and present examples, we are still provided with a large enough population to apply mixed methods research to.  The question of how to deal with this possible study population size can be answered as follows:

  1. First, you need to decide specifically what it is that you want to study, and how big the related subgroups are.  With epilepsy, we could divide these people into socioeconomic groups and/or ethnic groups, and probably come up with a completely new insight into managing these patients.
  2. Second, you need to know what topics can be studied that can be linked to an intervention or improvement process.  For the treatment of epilepsy, these could include such metrics as waiting room time, frequency of hospitalizations, comparisons of health and performance of patients between major treatment facilities or groups, and comparisons of the smaller subgroups to each other (the many kinds of epilepsy), to see which programs are effective, and which are not.
  3. Third, you need to evaluate when, where and how qualitative reviews will carry your study further into the unknown.  Case studies, focus group activities, and surveys are all methods that can be used with these patients to further explore their care process, and how well it meets their needs.
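One way to combine the sampling mentioned earlier with the subgroups from the first step is to draw a fixed number of patients from each subgroup.  The sketch below assumes this stratified approach, which the text does not prescribe; the field names (`patient_id`, `group`), the subgroup labels, and the records themselves are synthetic placeholders, not real data.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, per_group, seed=0):
    """Draw up to `per_group` records from each subgroup defined by `key`."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)
    sample = {}
    for name, members in groups.items():
        k = min(per_group, len(members))  # small subgroups are taken whole
        sample[name] = rng.sample(members, k)
    return sample

# Synthetic placeholder records standing in for about 6000 patients
records = [{"patient_id": i, "group": "ABC"[i % 3]} for i in range(6000)]
picked = stratified_sample(records, key="group", per_group=50)
print({group: len(rows) for group, rows in picked.items()})
```

Drawing the same number from each subgroup keeps small subgroups visible in the qualitative work; a proportional draw would be the alternative if the goal were to mirror the population as a whole.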


For the n/100 group. . . that seems like it could be too many cases for your clinical teams to deal with for a study.  Selection is definitely needed, or requests for voluntary participation.  But at the EMR/EHR level, these groups also allow for a stabilization of data quality in the EMR/EHR world.  One can evaluate all of these patients, filter down to smaller groups by finding the percent of good versus bad records, and then come up with a set of rules for evaluating this kind of population in general.  Examples of this would be the diabetics and heart disease patients common to healthcare programs.

It could also be demonstrated as certain forms of poor patient compliance, poor physician performance, poor follow-up activities, poor long term quality of life consequences.  Again, the mixed approach to studying this group is possible.  Defining subgroups or specific aspects of the care process that you wish to improve can be added to the overall study design.  This is the situation in which performance improvements can be made in quality of care offered for certain ethnicities or minority groups.   Such work on these patients will also significantly impact the cost of care overall, first by increasing healthcare system engagement processes, themselves requiring more money, and then secondly by targeting and actively engaging doctors and patients in this quality of service, quality of life related process.  As before, sampling may help, but it is more useful to subdivide your patients into smaller groups, define your priorities, and continue this study with an emphasis on the differences between the forms of patient care provided for each of these sets of patients.
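The idea of finding the percent of good versus bad records for the n/100 group can be sketched as follows.  The rule used here (a record is "good" when a few required fields are filled in) is an assumption made for illustration, as are the field names and the four sample records; a real review would define its own rules.

```python
REQUIRED_FIELDS = ("age", "gender", "diagnosis")  # hypothetical rule, adjust per study

def is_good(record: dict) -> bool:
    """A record counts as good when every required field is present and non-empty."""
    return all(record.get(field) not in (None, "") for field in REQUIRED_FIELDS)

def quality_summary(records: list) -> dict:
    """Count good and bad records and report the percent that are good."""
    total = len(records)
    good = sum(1 for r in records if is_good(r))
    pct_good = 100.0 * good / total if total else 0.0
    return {"total": total, "good": good, "bad": total - good, "pct_good": pct_good}

# Made-up example records, not real patient data
sample_records = [
    {"age": 54, "gender": "F", "diagnosis": "diabetes"},
    {"age": None, "gender": "M", "diagnosis": "heart disease"},
    {"age": 61, "gender": "", "diagnosis": "diabetes"},
    {"age": 47, "gender": "M", "diagnosis": "heart disease"},
]
print(quality_summary(sample_records))
```

Running the same summary on each subgroup separately would show whether record quality itself differs between the sets of patients being compared.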


So, the number 2 million is certainly a benefit to a healthcare system, if and when one can evaluate such a population size.

Next, we have to answer the question: ‘What are the ways to initiate this form of research?’