I felt like I'd discovered New Zealand's beauty all over again upon hearing Tricia Greenhalgh describe her work on the evaluation of big IT projects yesterday at the HINZ conference in Wellington. It's wonderful to find someone with her eyes wide open who can illuminate the difficulties of policy analysis when it is applied to interventions, such as shared Electronic Health Record systems, in complex adaptive social environments.
I haven't done my homework yet and read the key papers she refers to, so that I can integrate my experience with them -- that's on my new to-do list -- but I have evolved a few concepts that help me in sense-making, and I wanted to get them down and share them first. They may spark some better ideas among other readers. I'll presume that there might be value in totally fresh eyes brought to this subject by someone like me crossing fields (and oceans), and beg apologies for seeming to take credit for ideas that others have already presented.
Let me state that in 1976-77 I had the joy of working with a group of people who "ported" the patient record system from Massachusetts General Hospital, probably the best electronic health record system in the world at the time, to the New York State College of Veterinary Medicine at Cornell University in Ithaca, New York, USA. We had a fully functioning EHR, with 200 terminals, subsecond response time, automated decision support, etc. -- 34 years ago. The technology was able to do this then, on machines with far less capacity than a single 1 Gigahertz $400 laptop computer today.
I met with the head of BAZIS, a group responsible for computing for hospitals in the Netherlands, in 1989 at a SCAMC conference (precursor to AMIA) in San Francisco. They were running 2,500-bed hospitals, with sub-second response time, on a single microVax computer -- again, with less processing power than a single $400 laptop has today.
On the basis of those experiences, and everything I've seen in the last 34 years, I feel confident in asserting that the issues around Electronic Health Records have nothing to do with technology.
There are no show-stopping technical problems. The key problems are organizational, political, psychological, and social.
THEREFORE -- it doesn't MATTER what kind of technology or computer hardware or database system or data architecture a vendor can supply, aside from the fact that if these are BAD choices, this can KILL a project. These things, however, cannot MAKE a PROJECT succeed. This is a crucial distinction.
That little thing called "implementation" of a change, in a clunky-but-functioning large social system, turns out to be huge. It also turns out to be something that I.T. people, in their little technical silos, seldom care about, know almost nothing about, and have no methodology for dealing with.
Here's a test of a steering committee -- count the number of psychologists, group psychologists, behavioral change specialists, and anthropologists on the steering committee. My assertion would be, if this count is ZERO, you should pack your bags and walk away. Again, presence of these perspectives will not guarantee success --- but absence of them will guarantee failure.
I've spent the last 40 years working in IT and one thing is clear to me, captured in this basic rule of thumb: There is no computer system so perfect that hostile users cannot make it fail.
In fact, getting much further away from the purely technical paradigm of "health records", I'd suggest a long review of what the poet T.S. Eliot meant, in Choruses From the Rock, when he said:
They constantly try to escape
From the darkness outside and within
By dreaming of systems so perfect that no one will need to be good.
But the man that is shall shadow
The man that pretends to be.
So, lest this be too long, here are my own observations on this area of evaluating the evaluation of social-scale IT projects, making an effort to use language that is accessible at the risk of being less precise. In other words, the question is whether our method of evaluation itself needs more evaluation. I'll use "principle" and "implications" bullets.
(1) Social reality has dynamic, interconnected, multidimensional feedback processes that are continuous and unbounded in space and time.
Our attempts to "evaluate" an "intervention" typically impose upon this reality a digital, discrete "event" model with sharp boundaries in both space and time of "what" it is we are evaluating. These boundaries can dramatically alter the results and need to be justified, or at least made explicit. At a minimum we need to ask if different boundaries in space and time would have given different answers to our questions.
Example -- WHEN do you evaluate "the impact" of something on society? How long do you have to wait before you are relatively sure that all of the longer-term impacts of an intervention have had a chance to percolate through the system and show themselves?
This is a classic problem in evaluating teacher competence -- how do you compare a teacher who students just love, but is quickly forgotten, to one they intensely dislike, but, a decade or two later, they realize was the most important beneficial influence on their lives of all their teachers?
To paraphrase John Sterman in Business Dynamics, "There are no side-effects, only effects."
Which of those effects we are willing to acknowledge and measure, and which we deny responsibility for and refuse to measure, totally alters the picture. We know from social wisdom that it is possible to "win the battle but lose the war". We have heard the quip "The operation was a success -- but the patient died."
I'd affirm my own belief that the impact of an intervention's decision and implementation process upon society is at least as important as "the intervention" itself. At the end of the day, have we put in some sort of technical fix, but done it in such a way that we've alienated everyone and created new polarizing rifts or driven away good people in a way that will take years to heal? It's a key principle in Baha'i consultation for example that the social impact is far more important than "the decision" in guiding discussion. Better to make a "wrong decision" and preserve working relationships, than to make a "right" decision which destroys working relationships -- because a healthy living social structure can recover from a wrong decision, and will, but a destroyed social structure cannot benefit from a "right" decision down the road as it has become non-adaptive and will soon crash the bus.
(2) The measure of an intervention (good versus bad) is generally a function of SCALE, and we should expect that it is not a monotonic value but can easily be one that changes with scale.
This type of concept was very hard for my MBA students to grasp when comparing investments by "net present value", a metric under which the question of which investment was "better" depended upon what rate of inflation or time-value-of-money you used. At 0%, investment A might be better. At 5%, investment B might be better. At 10%, investment A might be better again. This sort of non-monotonic comparison is rampant in life, but unrecognized in our decision-making processes, and often so startling to people that they become incapacitated upon beholding it.
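Here's a small sketch of that idea, with cash flows I've simply made up for illustration: two hypothetical investments whose ranking by net present value flips as the discount rate changes (and with more contrived cash flows, the ranking can flip back yet again).

```python
# A toy NPV comparison -- all cash flows are invented for illustration.
def npv(rate, cashflows):
    """Net present value of yearly cash flows, year 0 first."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

A = [-100, 0, 0, 0, 160]     # hypothetical: one big payoff, late
B = [-100, 35, 35, 35, 35]   # hypothetical: steady payoffs, early

for rate in (0.00, 0.05, 0.10, 0.15):
    better = "A" if npv(rate, A) > npv(rate, B) else "B"
    print(f"rate {rate:4.0%}:  NPV(A) = {npv(rate, A):6.1f}   NPV(B) = {npv(rate, B):6.1f}   better: {better}")
```

At low rates A "wins"; at higher rates B "wins". Which one is "better" is a property of the ruler, not of the investments.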
Similarly, we can find interventions that measured immediately are "good" by some metric, but that measured over a two year period are "bad" by the same metric, and that measured over a ten year period are "good" again, by exactly the same metric. We cannot assume that "goodness" is independent of the size ruler we use to measure it.
(3) When we are comparing two or more alternatives with multiple dimensions, there is no reason that the term "better" is even meaningful.
Martin Gardner, in his Scientific American Mathematical Games column, did a wonderful job of popularizing the concept of a set of 4 "non-transitive dice" where die A "beats" die B 2/3 of the time. Similarly B beats C 2/3 of the time. And C beats D 2/3 of the time. And D beats A 2/3 of the time, closing a "strange loop". There is in fact no "best" die. Again, this concept is infuriating to many people who cannot accept it. (see YouTube video)
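For anyone who wants to see it with their own eyes, here is a quick simulation using Efron's classic set of four such dice (the face values below are the standard ones Gardner described):

```python
# Efron's non-transitive dice: A beats B, B beats C, C beats D, and D beats A,
# each about 2/3 of the time -- so there is no "best" die.
import random

DICE = {
    "A": [4, 4, 4, 4, 0, 0],
    "B": [3, 3, 3, 3, 3, 3],
    "C": [6, 6, 2, 2, 2, 2],
    "D": [5, 5, 5, 1, 1, 1],
}

def win_rate(x, y, trials=100_000):
    """Estimate how often die x rolls strictly higher than die y."""
    wins = sum(random.choice(DICE[x]) > random.choice(DICE[y]) for _ in range(trials))
    return wins / trials

for x, y in [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]:
    print(f"{x} beats {y} about {win_rate(x, y):.0%} of the time")   # each ~67%
```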
Despite this incredibly inconvenient fact, our society is just rife with occasions on which people struggle to identify the "best" wife / employee / location / house / cell-phone / etc. when the term is simply non-applicable.
(4) When faced with the inconvenient truth that a popular process is non-applicable, people will almost always nod in agreement, then proceed to apply the process as if it were applicable.
We see this all the time when people attempt to apply the General Linear Model of statistical reasoning, which assumes there are no closed-feedback paths between "cause" and "effect", to social systems which clearly have feedback paths between the two.
I'd allege that ALL social systems and large-scale interventions have such feedback paths, and therefore ALL attempts to apply flat GLM-based statistical techniques to them are invalid, inapplicable, and wrong, wrong, wrong. That said, people will go ahead and do them anyway.
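To make that concrete, here is a toy simulation (all coefficients invented) of a system in which the "effect" feeds back into the "cause". The true effect of x on y is 0.5, but the naive straight-line fit, which assumes no feedback, reports something quite different:

```python
# Simulated feedback loop:  y = b*x + u  and  x = c*y + v  at the same time.
# The naive slope of y-on-x is biased whenever the feedback gain c is nonzero.
import random

random.seed(0)
b, c, n = 0.5, 0.8, 100_000        # b = true effect, c = feedback strength (invented)
xs, ys = [], []
for _ in range(n):
    u = random.gauss(0, 1)         # shock hitting y
    v = random.gauss(0, 1)         # shock hitting x
    x = (c * u + v) / (1 - b * c)  # solve the simultaneous pair for x...
    y = b * x + u                  # ...then y follows
    xs.append(x)
    ys.append(y)

mx, my = sum(xs) / n, sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
var = sum((x - mx) ** 2 for x in xs) / n
print("true effect:", b, "  naive fitted slope:", round(cov / var, 2))  # ~0.79, not 0.5
```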
(5) When the non-applicability of popular techniques is blatant and brazen, researchers will simply alter the data until the "problem" with their mental toolbox "goes away."
Example. My wife and I attended a conference on "Self Regulation of Health Behaviors" at the University of Michigan in the US a few years ago. All the top researchers around the world in health behavior were there and presenting. As the conference went on, I kept noticing that people were giving examples of successful interventions, such as paying a woman's CHILDREN, rather than the woman herself, ten dollars for every pound she lost -- interventions in which the actors were not the "self" of the conference title, but other people in a contextual support group.
At the panel at the end of the conference, I called the question and asked: was I correct in noticing that, to a person, every single researcher had noted that the most successful interventions in "self-regulation" were, in fact, interventions that involved other people? After conferring briefly, the panel agreed.
In my typical annoying way, I then asked WHY it was that NONE of these examples were in the papers these people had published. Again they conferred, and agreed that these interventions were difficult to measure, and impossible to compute a "p-value" or confidence value for, so they left out those data points.
In other words, for the single most expensive component of health costs in the USA (behaviorally mediated), the single most effective strategy (social support) was simply erased from the data set because it "didn't fit" the way the researchers felt they had to evaluate the data. They agreed to this, nodded, and then went on as if this was no big deal.
Hmm.
(6) There are many laws of physics and mathematics, as it were, that apply to feedback-controlled systems. These are generic laws of "control theory" and as valid as Newton's law of F=ma. These apply to ANY system with feedback, whether it is animal, vegetable, mineral, social, mechanical, etc. (See for example this post)
These laws are not contested in the field of control systems theory and practice. They are well burned in, and have tool-kits in, say, MATLAB, which can be used to apply them. They are used to design our cars, elevators, electric motors, airplanes, bridges, etc. Textbooks on them are in their 5th edition.
This is not magic, nor is it wild speculation. This is solid, boring engineering.
The problem is, this work is in a silo and essentially unknown in the medical and policy and IT worlds. Therefore, the basic principles are violated or partially rediscovered on a daily ad hoc basis.
Control theory deals, among other things, with issues such as "stability" of a system after application of an "intervention". It deals with "rise time" -- how long it takes a system to respond to an intervention. We can stop here and realize that these concepts would already be extremely valuable if they were even on the radar of discussions of policy interventions in society, but they are not on the table. For example, in any active system, if you change something, will it (a) tend to change back to where it was, (b) tend to amplify your change and change even further (e.g., fall over entirely once tipped), or (c) generate a huge firestorm of protest and cause you to remove your bloody stump from the sacred stones you just touched? (Stir up a hornets' nest you cannot put back in the hive.)
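As a tiny illustration (the gains below are picked arbitrarily), here is the simplest possible feedback loop being nudged toward a new target. The same few lines exhibit rise time, overshoot, and outright instability, depending only on how hard the controller pushes:

```python
# A first-order feedback loop: each step, correct some fraction of the error.
def simulate(gain, target=1.0, steps=12):
    state, history = 0.0, []
    for _ in range(steps):
        state += gain * (target - state)   # proportional correction
        history.append(round(state, 2))
    return history

print("gain 0.2:", simulate(0.2))   # stable -- rises smoothly toward 1.0 ("rise time")
print("gain 1.8:", simulate(1.8))   # overshoots and rings, but eventually settles
print("gain 2.2:", simulate(2.2))   # unstable -- every "correction" makes things worse
```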
Here's a truly inconvenient fact then:
(7) You cannot push a social system to change faster than a certain intrinsic speed. If you attempt to push faster than that, instead of changing the system, you will rip, tear, shred, or destroy the system. There is no reason that this maximum rate of change is conveniently within the term of office of a particular political party that would like to "see the results" of their intervention before the next election.
Again, we expect people to sagely nod agreement, then get right back to their plans for Big IT that they will "implement" within the "next 5 years". This time frame is typically not chosen on the basis of evidence that the social fabric can support a change of this magnitude at that speed, but because we have 5 fingers on each hand and 5 years is before the next election.
(8) If you are trying to change the state of a system from one mountain top (locally optimized in some metric) to a higher mountain top (even more optimized in that metric), you almost certainly will have to go down into the valley in-between on your journey.
Anytime you replace one functioning system with another, there will be a "settling-time" period in-between where the pain of the new system and lack of system-wide coherence is evident and even dominant.
In short, things have to get worse on the pathway to getting better.
This is a topic that is not popular for discussion. How much worse is acceptable? How high are the stakes, and the costs of issues that you previously had "under control" the old way, which have to go "out of control" for a while before they come "under better control" the new way?
(9) In a medical environment, during this instability and disrupted time, there will be more mistakes, more errors, higher mortality and morbidity for a period of time, until the parts of the system reacquire "phase lock" and start working smoothly again as a unit with larger-scale coherence.
In aviation, for example: in the Patient Safety course at Johns Hopkins, we were given the figure that 74% of all commercial airline "accidents" occur on the first day that a flight crew is assembled and has to "work as a team." The people are all trained professionals, highly competent, "error free", but the STRUCTURE OF THE TEAM, THE METASTRUCTURE, the SYSTEM has not yet "found itself" and established "phase lock" (my choice of words).
This is a very general phenomenon.
This is going to occur any time you change a complex system. There is nothing you can do to get around this problem. It will not go away if you pretend it is not there.
The coordination signals, the feedback loops have to "ramp up", fill the buffers, take up the slack, and run a few entire cycles before each part of the system becomes "aware" of the changes to the overall environment due to the ACTIVE involvement of each other agent in the overall system.
All systems have this property. The identity of each agent is a function of every other agent, and until they have detected, by action, the torques generated by the new presence of new agents elsewhere in the system, they will mis-compute and mis-construe what action they need to take in order to accomplish a certain outcome. They will push a lever, as it were, and turn around to find that someone else has now ALSO pushed the same lever, applying an impulse twice, or giving a drug twice. Or someone else, thinking it is their job, will UNpull the lever. Etc.
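Here is a crude sketch of the "two people push the same lever" failure, with entirely hypothetical names; the only point is that until agents actually consult a shared record of what has already been done, each one acts on its own stale picture of the system:

```python
# Hypothetical example: two agents administer a drug without / with checking
# a shared record of what has already been done.
doses_given = set()   # the shared record -- the coordination signal

def give_dose(agent, patient, drug, check_shared_record):
    if check_shared_record and (patient, drug) in doses_given:
        print(f"{agent}: {drug} already given to {patient} -- skipping")
        return
    doses_given.add((patient, drug))
    print(f"{agent}: giving {drug} to {patient}")

# During the burn-in period, nobody yet trusts or consults the shared record:
give_dose("Nurse A", "patient-1", "heparin", check_shared_record=False)
give_dose("Nurse B", "patient-1", "heparin", check_shared_record=False)  # the lever is pushed twice

# Once the coordination loop is actually used, the duplicate is caught:
give_dose("Nurse A", "patient-2", "heparin", check_shared_record=True)
give_dose("Nurse B", "patient-2", "heparin", check_shared_record=True)   # skipped
```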
It is not a "FAILING" of the system that this period of nascent-presence occurs, but it can well be a FAILING of the implementation team if they have not provided massive external temporary auxiliary support sensors and actors to detect and mitigate these cross-signal pathways during this "burning in" period. It will be typified my massive amounts of "But, I thought YOU were going to handle that!" or "Then what DID you mean when you said you "had it"?"
(10) Shutting off sensors does not cause the problem they were sensing to "go away" in a lasting sense, regardless of how good it appears or feels in the short run.
It appears extremely common that change managers implement a policy of "I don't want to hear about problems! Make this work!". Often, people on the front lines with eyes will continue to raise the fact of problems, and management's resolution of this is to remove them from "the table", not allow them to voice their "negative opinions", or even to fire them entirely to get their "negative" voice and "opposition" out of the system. This typically only makes the underlying problem that had been sensed WORSE, because everyone else who was not fired now knows that they are not allowed to even mention this problem, so it has to become enormous as the "elephant in the room" before management will be faced with it, often via multiple or class-action lawsuits.
In other words, you cannot FIX a piece of software by refusing to print the "error log".
(11) A GOOD system implementation will be characterized by discovering, surfacing, and dealing with a huge number of coordination problems and local issues in a fractal world -- issues that were invisible and unsuspected when someone higher up, or too far away to see or appreciate the devil in the details, said "Just do it."
If someone tells you "things are going smoothly" and "there were no issues", your sensors are broken. There is no way you can change a complex set of processes and meta-processes without there being "issues".
(12) To a good first approximation, all social systems have the "wicked-II" property.
"Wicked" problems, have been defined as "
"Wicked problem" is a phrase originally used in social planning to describe a problem that is difficult or impossible to solve because of incomplete, contradictory, and changing requirements that are often difficult to recognize. Moreover, because of complex interdependencies, the effort to solve one aspect of a wicked problem may reveal or create other problems.I have defined an even worse category, "wicked-2" or "wicked-II", which is problems that are visibly wicked at a local level, but APPEAR from far off, or from management's over-simplified view of what it is their employees do all day, to be SIMPLE.
C. West Churchman introduced the concept of wicked problems in a "Guest Editorial" of Management Science (Vol. 14, No. 4, December 1967).
They are almost always described from above using the term "Just", with some exasperation, as in "Why don't you just merge those two data sets and be done with it!" followed by NOT remaining around to HEAR the answer or refusing to believe that anything so "obviously simple" can possibly "take so long" to do. "Surely" the employees are malingering, or heel-dragging, or trying to sabotage the project by refusing to cooperate.
(13) Rephrased for importance -- All complexity APPEARS to "go away" if you stand back far enough from it, or get high enough in an organization that the successive series of oversimplifications in management reports or PowerPoint slides have taken their toll.
This, however, does not mean that the complexity has gone away. It only APPEARS to have gone away, or not to have been there in the first place. Thus, from a distance, it seems "EASY" to "JUST" combine patient records from these 8 different practices into a single combined Clinical Data Repository, and, voila, we're done! What can possibly be so hard about just doing that! (Followed by laughter at the simplicity of it all, and at the obstruction of the lower-level fools who balk when asked to just do this.)
What is less clear from above is that one of the data sets is only 20% accurate, and that when the 8 are merged with no trail as to where data came from, the final data set will only be about 92% accurate, which is not good enough to be reliable and will result in massive incorrect issues, followed by explosions of anger and blame, followed by refusal to use the data "until these problems in data quality are fixed."
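The back-of-the-envelope arithmetic, with record counts invented purely to illustrate, looks something like this:

```python
# Eight practices merged into one repository -- all numbers are made up.
accuracies = [1.00] * 7 + [0.20]       # seven clean sources, one only 20% accurate
sizes      = [9_000] * 7 + [7_000]     # hypothetical record counts per practice

correct = sum(a * n for a, n in zip(accuracies, sizes))
print(f"merged accuracy: {correct / sum(sizes):.0%}")   # about 92%
# ...and with no trail of origin, a user cannot tell a clean record from a bad one.
```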
(14) Large-scale IT is not just large -- it spans multiple levels and contexts in a social system. This means that not only does it APPEAR to be different things to different people, it actually IS different things to different people. The phrase "it", implying that "it" is a "single" thing, is therefore misleading.
In fact, it would be far more instructive to leave the "it" OUT of the equation when talking about how people respond to "it" -- and focus instead entirely on the processes that are affected and impacted OUTSIDE the box.
This is, of course, exactly the OPPOSITE of what classical I.T. management and thinking will do, if they are providing leadership to "the project." I.T. managers would have many people concerned with what is INSIDE the box -- at least in the view of clinicians.
Now, as one who has labored in the vineyards INSIDE the box for the last few decades, I can state confidently that IT managers often do not look INSIDE the box either. The concerns from computer science perspectives about little things like "architecture" are often taken as "minor" voices from an "interest group" of "stakeholders", not as definitive statements about a different type of physical laws and principles that totally determine how the data flows within the system will behave.
For one example, a coherent "data architecture" will have the property that some "fact" that is true in one part of the system will also be true everywhere in the system -- at least, it will be true once the truthiness (thank you, Stephen Colbert, for that term) of it has propagated from the "source of truth", the "trusted gold-standard source", outward, across hill and dale, across silo boundaries and messaging links, to all the different component legacy systems that "make up" the larger simulated "coherent system", and, within each legacy system, to the far internal corners and all the places where this "fact" is stored, or represented, or has affected what else is stored there.
For those outside the box, this process may seem "instantaneous", occurring when a user "inputs the data." For those who labor in this vineyard, the process is anything but atomic or instantaneous, and is fraught with pitfalls and dark spaces and places where this new fact, this new state of the world, is at first glance rejected and denied. "I'm sorry," the system may say (right), "but you are not authorized to change this piece of information." Now we have a hopefully-transient state where half of the sub-systems think the patient is, say, "male" and the other half think the patient is "female."
It is a "demand" of the architecture group and the database group that this kind of state should not be allowed to exist long enough to be visible to the users. Management may consider this a negotiable demand, failing to grasp the implications of such data inconsistencies on decisions based on "the system."
(14) Here is the "deal" however -- for a single truth to be maintained (after this invisible-to-user transition period) it is ABSOLUTELY REQUIRED that "the system", the greater system, MUST be able to OVERRIDE any particular sub-system and INSIST that a data field be changed from what the legacy sub-system thinks it is to some new value.
Well, this is no small matter. In fact, this is a HUGE matter. You have no idea of the implications and ramifications of this on people, processes, responsibility, and behavior.
Example -- after 5 years of incredible effort, a particular subsystem has FINALLY managed the data quality effort and has the data 99.99% correct. This is almost certainly because some person, call her Edith, personally cares about and owns the data, and you should fear for your life should you attempt to reach into "her" system and arbitrarily enter unclean, unvalidated, or inconsistent data!
Edith will not be a happy camper when told that, now, her almost perfect system and household is going to be MERGED with the much larger and much more conspicuously sloppy neighbor's household, with "shared responsibility" for data quality. Suddenly, everywhere Edith turns in her once perfect home, there are dirty socks on the floor, old T-shirts on the chairs, and half-empty boxes of food on the tables. Not only that, but if this mess is cleaned up, tomorrow she will arrive and find it has been messed up yet again.
So, what will Edith do? Think about this. How is a somewhat detail-obsessive, compulsive person going to deal with this situation? First, she will oppose the merge violently, vehemently. This will be overturned for political reasons, and she will be told it HAS to happen, there is no choice. At this point either (a) Edith will leave the organization because she refuses to put up with this clutter, or (b) Edith will stay, but she will psychologically leave the organization and STOP CARING whether the house is clean any more.
So, what we have here is a sort of Gresham's law (bad money drives out good) applied to data quality. Everyone who cares deeply about data quality (always a minority!) will be consistently over-ridden, in the short run, by the "demands of the greater good to merge these data sources."
Here we see the system moving into the valley of the shadow of death, as it were, between the older optimization mountain of high quality maintained by Edith, and the new mountain, still perhaps a year or so away, when the NEW management finally comprehends WHY the older staff were so anal-compulsive about data quality, and comes up with new processes and mandates about data quality. In between times, even the databases that used to be highly reliable will become LESS reliable.
This will result in some extreme cases where users of the data, placing reliance on the database that turns out to be unjustified in hindsight, will produce very bad patient outcomes (e.g., death), which will produce very large social firestorms, which will result in many users now refusing to use the new system "until these problems are fixed." Well. When will that be, now that Edith is gone?
By the way, the accuracy required is very high. How many times would you need to look up a blood type and get a definitive but incorrect value before you would stop using that source? Not very many high-profile instances, shared at mortality conferences, are required to cause clinicians to refuse to use this system and boycott it.
(16) Similarly, the overall reliability of the system depends strongly on this thing called "architecture", and in bad architectures, the overall reliability is determined by the WORST SINGLE COMPONENT in the entire system.
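A toy calculation (availability figures invented) shows why. When every request must traverse a serial chain of components, the availabilities multiply, and the single weakest link drags down the whole chain:

```python
# Serial chain of components -- uptimes are invented for illustration.
component_uptime = {
    "network":       0.999,
    "interface_hub": 0.995,
    "legacy_lab_db": 0.90,    # the one weak component
    "ehr_front_end": 0.999,
}

overall = 1.0
for name, uptime in component_uptime.items():
    overall *= uptime          # every request depends on every component

print(f"overall availability: {overall:.3f}")   # about 0.89 -- roughly the worst part
```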
What is desired, in computer terms, is an actual "distributed operating system" and a "distributed data architecture" that maintains truth regardless of where data elements are stored or accessed.
What is often used instead, in fact, what is almost ALWAYS used instead, because managers and policy makers don't understand the crucial distinction, is a collection of legacy systems connected to each other with some kind of "messaging" links, typically HL7.
Here is a fact of life -- a collection of systems connected by messaging links is NOT automatically the distributed single-truth system which is implicitly required. Some very strong statements can be made about such systems, based on computer science.
The odds that a collection of systems will "form" or "emerge as" a coherent super-system of its own accord are identical to the odds that a group of people from different cultures and stakeholder groups, put into a room, will form a "working team" with coherent vision, goals, processes, terminologies, and methodologies. In other words -- ZERO.
Even with a highly-paid "coach" person, and extensive "practice" sessions, and strategy sessions, and reviews of prior actions, a collection of sportsmen, even individually very professional, will not form a "team" capable of spectacular, or even non-laughable teamwork. This doesn't "just happen".
Management telling the players "We MUST work together!" does not produce this outcome -- would that it were so easy.
This is a problem I discuss extensively in this blog, the problem of "unity with diversity". The specialty sub-systems have to maintain individual identity and their own "Ediths" for quality control, or their contribution to the overall mix will go to zero. At the same time, each system has to be capable of being OVERRIDDEN when the greater system is CERTAIN that some data element, regardless of Edith's diligent efforts, is incorrect.
This is no longer a "technical" problem -- this has now become a social problem.
(17) The odds of success are inversely proportional to the number of users of the system whose perspectives and needs are considered "marginal" or "slack elements" -- users who are expected to simply "adapt" their work to the new system's demands, on top of their already overburdened lives.
In particular, I'm trying to recall any system put in that did not have the phrase in the summary "We didn't really consult with nurses before doing X, and it turns out we should have."
Hmm. It's fascinating that this very same lesson is encountered over and over and over with zero social learning. I would suggest that there is a very strong paradigm / myth / belief in place, held by GPs and managers and often government administrators, that "nurses don't matter" and "whatever it is they do, we don't need to plan around it."
I'd suggest someone do an actual count of the number of times the post-hoc evaluation report says that "nurses should have been consulted more, earlier, in greater depth than they were." This TRUE FACT needs some kind of extraordinary PUSH from LEADERSHIP to make it into control of action, when push comes to shove, apparently. It seems to be a VERY forgettable fact, the true baby in the bath-water, one that is almost ALWAYS discarded very early in implementation plans, and then almost always SERIOUSLY regretted only later -- again, apparently, because it flies in the face of the internal belief system of the de-facto power brokers, who are not themselves nurses.
So, let me put that as a separate point:
(18) NURSES MATTER AS MUCH AS DOCTORS for the success of an EHR implementation. If nurses don't have an equal voice at the table, in both numbers and volume and power, you might as well plan on system failure, walk away, and cut your losses. Study after study reveals this, and time after time the next guys on the block reject and ignore this advice.
In the USA the Institute of Medicine has finally realized this in a report this year on the future of health care in the USA, with a call for nurses, not doctors, to lead the IT direction. Not too surprisingly, this has outraged many doctors who feel they either speak for nurses, or feel they understand all the key issues that nurses require. From what I've seen, nurses would disagree with this, and believe they have important issues that are simply invisible to doctors, issues that will not be addressed by even wise and well-intentioned doctors.
Again, simply by reading the after-action reports on prior installations, or failed implementations of EHR's, there are clearly some key success factors for EHR's that are NOT visible before-the-fact to the standard methods for managing such projects, but that seem to be visible AFTER the fact to those managers, who, sadly, are unable to communicate this lesson to the NEXT group of people attempting exactly the same kind of effort, and about to make exactly the same mistakes.