Wednesday, April 08, 2020

Tips for reliable operations during the Covid-19 crisis

I read recently about a lineman who reached out and touched a live high-voltage wire without his insulating gloves on and was killed instantly. At dawn one day in 2006, an aircraft crew taxied out to the wrong runway, one that was far too short, and tried to take off. The resulting crash killed all the passengers and all but one of the crew. (See the post The Crash of Comair 5191.)

These "accidents" were not caused by the people involved being stupid, uninformed, or careless. They were caused by a series of factors that added up to just the wrong mix, at a moment when the last line of defense, the awareness or mindfulness of the crew, had been brought down by overwork, fatigue, and things not being where they always were -- an unexpected change in context.

Novices and outsiders are quick to jump on the "obvious stupidity" of the men involved, but experienced pilots, surgeons, etc. will tell you sadly that there are only two kinds of people in their field -- people who have made a mistake like that already, and people who will.

I want to cry as I look at the number of doctors, nurses, first-responders, and corporate and military decision makers trying to survive a fire-hose of high-stress, high-stakes decisions, day after day after day, with little sleep, none of their usual ways to relax, and many of the normal things they count on being broken or absent or changed. On top of the usual review process is a "cancel culture" that believes "one strike and you're out!" and a legal culture of blame that often ignores the context, the structural features, and the management policies and decisions that contributed heavily to errors happening. The front-line worker is the one who ends up being fired, sued, or condemned.

Novices in today's "cancel culture" believe that if a person makes a serious mistake, that should spell the end of their career. In fields where it takes 5-10 years or more to get professionals trained for a job, you can't simply discard them because they made a mistake. We don't have enough of them, and it won't fix "the problem" regardless of how good it might feel at the time to blame and sacrifice them.

In other posts in this blog,  I go into a longer discussion of some of the theory of how errors occur in general, in the long run,  and what can be done to reduce them, including some links to literature on the subject. Some are at the end of that piece on Comair 5191.   If you search this blog for "error" or "reliability" you'll find them.

For now I want to focus on:

Some useful rules-of-thumb for operation in crisis situations, especially extended situations where fatigue is the norm, the stakes are high (possibly life and death), and all manner of little things that one is used to are out of place, broken, unavailable, or replaced by unfamiliar substitutes.


* REALIZE that a person's effective IQ can be seriously reduced by fatigue, stress, overload, and distractions, so that someone whose IQ is normally 160 could end up operating at an effective IQ of 100 or maybe 80 -- below average.

* REALIZE that the effective IQ of a person goes down even more once they make a mistake and are trying to continue operating, or have to continue operating despite that.

* REALIZE that as a person becomes impaired, their judgment typically goes down faster than their ability. Even people who are used to being able to self-assess correctly can end up thinking they are fit for service when they should really be taken off the front-lines for everyone's sake. You see this in the number of people arrested for being drunk who are unaware they are seriously tanked, are offended by anyone treating them that way, and are shocked to see the booking videos of themselves the next day.

* THEREFORE --  you have to assume that you are one of the impaired people. Even if you don't notice it. Even if you think you can man up and focus and push your way through  it.

*  If your coworkers all tell you you need to take a break,   take a break.  No one wins if you kill people by accident.

* Anywhere you possibly can, buddy up. Work in pairs. Never work alone. Get a second person to spot for you, and you for them. It's amazing how many stupid things a truly exhausted person might do that could be caught by a second person watching and going "Wait -- what are you doing?!!"

Buddies also can cover for you, or cover you,  or help you get back off the street if you tripped and broke your ankle while relocating across the street in a rush. 

Some people protest that buddying up makes a shortage of staff worse, because now you have only half as many people. Possibly not true! In software development, the same argument is raised against "pair programming", where a team of two people write code together, side by side, working on the same screen with one driving and one watching. Studies of pair programming suggest that pairs often produce much higher-quality work, with fewer defects, and that the extra staff time costs far less than the doubling you might expect.

* Checklists are your friends -- use them!

It takes way less time to check items off as you go, regardless of how compulsively anal that might seem, than to pick up the pieces after a disaster because you missed a step.

Humans very often rely on shortcuts. For example, the way you remember to do some report may be that the mail-cart comes by, and that triggers your action. If the mail-cart doesn't come, it's very likely nothing else will trigger your action, and you won't do something you should have. You may not even realize how you cue up triggers and use them, but when many things are out of place, all your normal triggers may be broken.

*  Don't punt.   As the surviving pilots say:  Plan the flight, then fly the plan.  And along with prior comments,  have a second person vet the plan and tell you what you forgot.   It takes way less time to get a review than it does to get part way somewhere and realize you forgot something.


* Don't trust your memory for anything. Have some way to write things down and check them off. You will be subject to far more distractions than you are used to. Things you put in short-term memory will be long gone by the time you get back from the interruption. Even walking through a door frame may make you blank out completely on what you came there to do.


* Don't assume things will be where they should be.    Someone probably moved them. Or they've all run out when you weren't looking. Or someone stole the whole bunch of them since the last time you checked.

* Don't assume you will be where you should be.   Your whole base of operations may suddenly get shifted,  and things you counted on being in the cabinet or just down the hall won't be.    Get a go-bag of your critical resources and be prepared to grab it and go in an instant if everyone has to relocate.  Figure out in a calm time what should be in it.

* Hard as it is -- leave extra time for everything. It's faster to do things right the first time, even if it seems to take longer, than to do them over. Nothing will burn up staff time faster than having to do things over.

* IF the computer is working at all, expect the system or network or database to be really, really slow.

* In a widespread crisis, if the cell-towers are still operational, expect a long wait for a dial-tone or a long-distance connection.

* Don't assume communications are received.  

Never assume someone got your message. Lost and never-received messages are a classic cause of large-scale disasters and of massive short-term arguments between people, one of whom thinks the message got through while the other didn't even know there ever was a message. If it matters that the message gets through, request or demand an acknowledgement, and check that the acknowledgements actually arrive.

* Never assume someone correctly interpreted your message. In fast-changing circumstances, what vague references like "this", "that", "the plan", or "the latest directive" point to may have shifted between the time you wrote the message and the time the recipient receives or reads it. Take the extra 30 seconds to be specific.

* Never assume the recipient is even there anymore.    Stuff happens.

* WHEN you make a mistake, tell your team and your boss as soon as possible! Don't compound things by waiting. Get it over with. It doesn't get better -- it only gets worse, especially if someone else could fix things right away but no one can once you have waited. More people end up being fired for cover-ups than for the original problem.

The guideline for private pilots who make the mistake of getting lost, or flying into deteriorating weather, is CCC :   Climb, Communicate, and Confess.
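
For the software-minded, the "request an acknowledgement, and check those" rule above can be sketched as a tiny tracker. This is only an illustration under assumptions of my own: the names `Outbox`, `send`, `acknowledge`, and `overdue`, the five-minute deadline, and the sample messages are all made up for the example, and a paper log with check marks would do the same job.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    recipient: str
    text: str
    sent_at: float       # seconds on whatever clock you are using
    acked: bool = False  # flipped only when the recipient confirms


@dataclass
class Outbox:
    """Hypothetical log of important messages awaiting acknowledgement."""
    ack_deadline: float = 300.0  # assumed: chase anything silent for 5 minutes
    messages: dict = field(default_factory=dict)
    _next_id: int = 1

    def send(self, recipient, text, now):
        """Record a message that needs an acknowledgement; return its id."""
        msg_id = self._next_id
        self._next_id += 1
        self.messages[msg_id] = Message(recipient, text, now)
        return msg_id

    def acknowledge(self, msg_id):
        """Mark a message as confirmed received by the recipient."""
        self.messages[msg_id].acked = True

    def overdue(self, now):
        """Ids of messages still unacknowledged once the deadline has passed."""
        return [i for i, m in self.messages.items()
                if not m.acked and now - m.sent_at >= self.ack_deadline]


# Made-up example: two messages go out, only the first is acknowledged.
box = Outbox(ack_deadline=300)
a = box.send("night shift", "Supplies moved to the room by the stairs", now=0)
b = box.send("pharmacy", "Restock needed before morning rounds", now=60)
box.acknowledge(a)
print(box.overdue(now=400))  # the pharmacy message needs chasing
```

The point of the sketch is that "sent" and "received" are recorded separately, so a silent recipient shows up on a list instead of being assumed to have heard you.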

Of all of the above,  I suspect a guideline of never operating solo and always having a buddy is the most valuable if you can possibly pull it off.

OK, your turn: what did I miss? What other rules of thumb do you find invaluable? Do you have links to other great sources of rules of thumb for crisis situations for civilians? Share your ideas here in the comments section!! Please!!

3 comments:

Wade said...

A friend pointed out that some of us are near the limit of our ability to cope, near the edge of absolute frustration. This is a different facet of psychology than what I referred to above as a lower IQ. This anxiety means that people need to be cut a lot of slack and given extra room and extra time to make decisions. There is a serious risk of missing signals that you are sending in new information too fast, or asking for decisions too quickly.

There is serious difficulty focusing and concentrating, possibly made worse by actual injury and pain one is fighting through. There is serious risk of toppling this person's carefully balanced stack of things to do in what order, making things much worse.

Current fighter-jets, as I understand it, are built to monitor the pilot's vital signs and if the plane's Artificial Intelligence system determines the pilot is getting overloaded, it drastically simplifies the displays and the decisions it is asking the pilot to make, and stops bugging the pilot for a response with incessant alarms. This is probably a good model for us to emulate.

Watch for signs of overload or a person getting near their threshold of breaking down or of "losing the picture." Be gentle. Be patient. Don't send signals that you are in a huge rush and to hurry up.

One thing that happens with pilots is that there is a clear understanding and agreement with the Tower to assume and accept that the pilot is busy with something important. The Tower will never start reading the clearance to the pilot until the pilot indicates they are "ready to copy" -- and they certainly do not read the clearance to possibly empty space and then assume the pilot got it. In fact, in many circumstances, a "readback" is requested or demanded to double check that what the pilot got is what the Tower wanted them to get.

Again, that's a good model, based on tens of thousands of instances where bad things happened because communications didn't work.

Wade said...

Two other thoughts.
(1) Machines like Copiers are like horses, as they have anxiety detectors in them which cause them to malfunction if you are anxious or in a hurry. The closer you are to a deadline, the more likely the copier ( or your Windows 10 computer ) will be to do something totally bizarre, go off in a corner to sulk, or put up a spinning cursor with no explanation and an interminable "please wait..." message.

If it's an industrial-size copier, it will choose this moment to get a paper jam, or run out of toner and refuse to print, even in black and white, although what you are out of is blue.

Alex B said...

These are invaluable "heads-up" points - for any life situation!
Personally, I find a daily (spiritually inspired!) *meditation* routine, morning and evening, most helpful in orienting and sorting out my priorities and "mission" in life. As the scripture says, "Bring thyself to account each day..." (The Hidden Words of Baha'u'llah)