Monday, November 29, 2010

Is text better than video for communicating messages?

There was a question on the NMC forum about video versus text. Here are my thoughts. I reduced them to a few lines of text (!) as my response there.

======
 (The image is from Cognitive Behavior Therapy Self Help Resources at http://www.getselfhelp.co.uk/interpersonal1.htm )


Summary of points:
1)   Many people graduating from college today have never read an entire book. They simply don't know how to process large blocks of well-written, structured running text with complex logic, either to read it or to write it.  Beyond a threshold length, which is surprisingly short, they won't even try.  They were educated with the largest concept restricted to the size of a PowerPoint slide.

2)  People do not have the leisure of time,  let alone non-multitasking time, let alone an uninterrupted stretch of time.

3)  With semi-literate international workforces, you may be limited to a vocabulary of 2,500 words or fewer, and some of those may be misunderstood.  People tend not to raise their hand to point out that they have no idea what a given word means.

4)  With interrupted attention, it might be safe to figure that at least 10% of what you say is missed entirely, and the gap is filled in with whatever the listener thinks you probably meant.  These gaps and fill-ins are silent on both sides, but can surface later when they try to reconstruct what it is you said in context.  Text is far more vulnerable than images to damage when a portion is missing or wrong.  If you miss one step in a set of directions to my house, the directions are useless.  It's hard to put an X on a map, on the other hand, in a location that doesn't even exist.  It's easy to make that mistake in words.

5)  The retention rate for text, alone, after 48 hours is probably close to zero.   If the words tell a human story they can relate to, retention may be much higher.      If you make a video that tells a human story,  retention is higher still, even years later.

6)   Video is far more likely to convey tacit knowledge than text.   Watching a group of people contemplating a new idea, and then accepting it, with all the associated body language and non-verbal signals, is a much more powerful experience than reading about a new idea.


=====


detailed discussion:


Interesting discussion topic.  (taking another munch on my cookie and  a sip of coffee...)

I'd like to push the envelope back further, or perhaps fall entirely "out of the box", and share some thoughts I've had over the past 15 years since my master's degree program in computer science, specifically distributed artificial intelligence, with a focus on the problem of collaborating "agents" attempting to make sense of a scene.  For the record, I have a US Patent as well in the area of image processing and data-fusion, so I've thought a great deal about this sort of thing.  These are not new thoughts, nor "off the cuff" ideas, but substantial core issues.

These thoughts are quite relevant to the question of text-based communication in business, if you can bear with me. Of course, the length of this post is part and parcel of the whole problem.  Nobody can stick around long enough for anyone to actually get into deep thinking about an issue.  We aren't all down to a maximum attention span of the 140 characters of a Twitter message, but we are definitely heading in that direction.

One on-going battle in computer science is the question that might be phrased in English as:  "What fraction of important information can be expressed in words?"  Alan Turing did some key work on the irreducible core nature of computation, in the abstract, in general, back in the 1930s, a few years before he was cracking German ciphers for the British during World War II.  He dealt with questions of what kind of thing was "computable" at all, given infinite time, and what kind of thing was simply not computable.

This is beyond Whorf's question about whether what we think, or CAN think, is limited and shaped by the language we are using.  Can you think things in a largely parallel language, like Chinese, say, that you cannot think, let alone articulate, in a serial language like English?  Another interesting question.  Turing's question was, heck with WHICH language, are there things that are important that cannot be expressed in ANY language?

Again, this goes beyond but is related to the faddish myth that crippled Western Science for the last 100 years or so,  in a worshiping of mathematics, that said that if you couldn't put it in an equation, it wasn't real.  I think, finally, we are getting more and more cases of important things that "count, but cannot be counted."    That, in my mind, is good.

Anyway, cutting through all that, Turing focused on what sort of problems can be expressed as strings of symbols, and then solved by manipulating those symbols by ANY "computer" of any kind, shape, color, or architecture, man-machine hybrids included.  I suppose that is a super-set of what kinds of problems can be articulated in words and then solved by "thinking about them" in a logical fashion.

Turing's work, however, and his model of a completely general computer, involved the concept of an infinite "tape" of ones and zeros, moving past a read/write head, for unlimited time, as the machine "worked on" the problem.

My objection to that work, and the whole school of Computer Science which evolved from Simon and Newell's ground-breaking work at Carnegie Mellon in, oh, the 1960s I guess, was that, at the time, the idea of "image processing" had not yet taken shape.  I was part of a new wave of "Young Turks" who argued that, had Simon and Newell owned an image processing engine, they would have based all their work on it instead, and discarded the linear-symbol-string model of "everything important".

The crux of the matter is this:  in the real world,  the one human beings and societies operate within,  there is not an unlimited amount of time to work on a problem.  We do not have unlimited budgets in either space or time.    In fact, the total time the average serious researcher has to work on "a problem" is probably under a decade or two in total,  which has to be interrupted by activities of daily living,   so maybe, say,   5 years of actual thinking time is a rough upper limit.  That's the upper limit of the upper limit.

For normal mortals, dealing with social issues around us, maybe 200 hours is closer to an upper limit, and 40 hours (a work week) is more than most real-world problems get allocated.  The sad truth is that most people graduating from college today have never actually read an entire book.  Ever.  I kid you not.  They are not given Mortimer Adler's "How to Read a Book."  They are not trained in how to tease apart a complex argument with detailed sub-branches of logic from its expression in linear speech in "a book." They are not trained in how to take a complex thought and articulate it in said format.

Some might question (and do) whether they are even capable of entertaining a complex thought, regardless of input and output considerations.

So, a more relevant question than Turing's grand question about what is ultimately expressible and computable in symbols (or words) is this:  What kind of stuff can human beings process in 4 hours, clock time, start to finish?

Sadly, the higher in the "chain-of-command" a human being is,  the less time they have to address any specific problem.  This, again, is a crucial piece of information.      While it is conceivable  that you could get a freshman to spend 40 hours on a particular problem,    it is not conceivable that you could get the University President to spend 40 hours on a particular problem, let alone the President of the USA, or the CEO of any large corporation, or the head of any military war effort.

I'll assert that without proof, but I think your experience probably supports it too, as a good first approximation to life in 2010.

Try, for example, to imagine an MD spending 40 hours on "your case".   The idea is absurd.  Nobody gets 40 hours.   Maybe, at the outside,  for a really complex, challenging, and compelling case,   if you had a really good health care system,  you could get 4 hours.    One hour is more likely.  Let's say the doctor really cares about you, the health system will accept the time spent this way, and you are able to get one hour of a doctor's thinking attentional time,  to the extent that every other important case in their mind is not "taking up RAM" or "background cycles."

Even that hour is unlikely to be "an hour".  It is more likely that you will get 12 minutes here, 5 minutes there, 8 minutes somewhere else, etc., that could "add up" mathematically to "60 minutes".  Whether it adds up in terms of "effective equivalent of undivided attention time" is a different (but important) question.

Still,  the reality today is that "attentional time" from pretty much anyone is fragmented,  and, for the most part, plagued by hundreds of other competing problems that lurk just below consciousness and suck up energy keeping them at bay.  Surgeons may learn how to totally "be where they are" (thank you Buddha),  but most of us,  given the slightest excuse or pause during "a meeting",   find ourselves immediately pulled away to some other problem in the hopper.

I used to do stage magic.   It's a fact known to magicians that adults in an audience spend, maybe, at most,  1 second out of every 20 actually present and looking at what you are doing.    They sample and "fill in the gaps", while their head is actually busy working on something else.    They see you lift the scissors towards a rope,  go off somewhere in thought,  then see two ends of rope drop as the scissors move away, and they will swear afterwards that they "saw" you "cut the rope".     Our heads are great at "filling in the gaps" so even we are unaware of the fact that we do this.  Kids, by the way, are terrible audiences to work with, because they tend to actually be present and watching,  not zombies  like their parents.

Back again to our question.  How can you communicate with people who have at most one hour of divided attention time to give you, and who do not have a large number of complex mental structures you can simply tap to resonate with?  There are no shared classics, only shared TV shows and songs and movies, none of which have great depth or complexity.

Those are the strings of the instrument of their mind that you must work with in order to play your song in their head.

So, there are some choices here.    You can go with blocks of text.   You can try equations.   You can try graphics like charts and diagrams.  You can go with "PowerPoint" slides.  You can go with short YouTube videos.  You can drag them into virtual reality and give them an immersive experience in a different and possibly far more colorful, interactive, and exciting world.

For explicit knowledge,  words might work,  but again,  back to Turing's work and reality,  it is not enough to play your song in their head, hard as that is.   You must play it in such a way that,  48 hours later,  there is a residual change in their head from the way they would have been had you never played your song.

Otherwise, the whole enterprise is pointless, at least in terms of education or of "getting somewhere" in an extended social discussion about, say, social policy issues or anything more complex than "what channel should we watch?"

Summarizing so far: Let's face reality here:
    *  You have very finite time -- maybe an hour of clock-time.  Less if the person matters, in the hierarchy of power.
    *  You have an audience or conversation partner or business associate who has many OTHER pressing problems,
         and whose attention you only partly have.
    *  You have problems that do not easily lend themselves to being described in words.   If you attempt to be accurate,  the number of words grows explosively, because you lack a common shared shorthand with the audience, and must try to define all terms and their nuances.    If you abandon accuracy, at the risk of being called on this "error" later,  you may be able to "boil down" a complex problem into a sufficiently over-simplified cartoon that fits on slides in a one-hour presentation.
    *  You have to assume that the audience is distracted, and is going to miss some of your key points but fill them in with their own thinking about what you must have meant or said in there, thereby distorting your message silently. 
    *   If you have an international audience or workforce, you may need to limit yourself to a vocabulary of the 2,500 most common words in English. 
    *   You STILL have the problem that there are things we have no words for (but might someday),  as well as things that there simply cannot possibly ever be words for, that won't fit through this keyhole you're trying to talk through.

A brief plug for image processing.  Images have two dimensions, and text typically has only one.  The implications are profound in terms of noise-correction and robustness, and on both counts images hold up far better.

If I give you "directions to the party", and it turns out one of the roads is blocked, or I get one of the instructions wrong or ambiguous, the whole set of directions becomes useless.  If I give you a map of the area and mark on it the location of the party, and if you know how to read maps (possibly a big if), then you are in a much stronger position.  From that image you can derive linear word instructions ("turn left here"), but you can ALSO derive other instructions if the first set fails. It is also very hard to put an "X" on a map in a location that is not on the map, but it is trivial to write instructions that direct you to an impossible location.

This is a profound issue.     Serial strings are inherently vulnerable to "point errors".  Attempting to correct such errors in advance results in a word-explosion so that now you have a string of words which is too long to be processed in the finite time available.
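Here is a rough sketch of a "point error", in Python, for anyone who wants to see it happen.  Everything in it is made up for illustration: the city grid, the way the directions are encoded ("L", "R", or a number of blocks to walk), and the function name follow.  It only shows that one wrong turn early in a serial list throws off every step after it.

```python
# Hypothetical illustration: turn-by-turn directions on a city grid.
# Headings as (dx, dy) unit vectors in clockwise order: north, east, south, west.
HEADINGS = [(0, 1), (1, 0), (0, -1), (-1, 0)]


def follow(directions, start=(0, 0)):
    """Walk the grid following serial directions and return the end point."""
    x, y = start
    heading = 0  # start out facing north
    for step in directions:
        if step == "L":
            heading = (heading - 1) % 4
        elif step == "R":
            heading = (heading + 1) % 4
        else:
            blocks = int(step)  # any other step is a number of blocks to walk
            dx, dy = HEADINGS[heading]
            x, y = x + dx * blocks, y + dy * blocks
    return (x, y)


directions = ["3", "R", "5", "L", "2", "R", "4"]

# A single point error: the first turn is given as left instead of right.
garbled = list(directions)
garbled[1] = "L"

party = follow(directions)
wrong = follow(garbled)

# How many blocks away from the party the garbled directions leave you.
miss = abs(party[0] - wrong[0]) + abs(party[1] - wrong[1])
print("party at", party, "but you arrive at", wrong, "which is", miss, "blocks off")
```

With the map, you would be holding the party's coordinates themselves, so from wherever you ended up you could still work out a new route.  The serial list gives you no such fallback.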

I can take a photograph of  George Washington and randomly change 30% of the bits (pixels) from whatever they are to pure black, or pure white, or a mix of the two,  and you can still recognize that it is a picture of George Washington. The "image" is robust against that kind of point noise.      If I take a set of equations and randomly change 30% of the symbols,  all I have left is garbage.   In fact, if I get ONE symbol wrong, it may be garbage, or, worse,  silently wrong in such a way that it still looks correct. If I take a book or text and randomly change 30% of the words to something else,  it is very unlikely that the intended meaning will shine through on the far end.
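The image half of that claim can be sketched in a few lines of Python.  This is only a toy under stated assumptions: the "photograph" is a synthetic square rather than a portrait, and a simple 3x3 majority vote stands in for what the eye and brain do.  It is not how human recognition actually works, and the names make_image, corrupt, majority_filter and agreement are invented for this sketch.  The point is just that each pixel's neighbors carry redundant information that a string of symbols lacks.

```python
import random


def make_image(size=40):
    """A synthetic stand-in for a photograph: a filled square on a blank field."""
    lo, hi = size // 4, 3 * size // 4
    return [[1 if lo <= r < hi and lo <= c < hi else 0 for c in range(size)]
            for r in range(size)]


def corrupt(image, fraction, rng):
    """Overwrite roughly `fraction` of the pixels with random black or white."""
    return [[rng.randint(0, 1) if rng.random() < fraction else px for px in row]
            for row in image]


def majority_filter(image):
    """Replace each pixel with the majority value of its 3x3 neighborhood."""
    size = len(image)  # assumes a square image
    out = []
    for r in range(size):
        row = []
        for c in range(size):
            window = [image[rr][cc]
                      for rr in range(max(0, r - 1), min(size, r + 2))
                      for cc in range(max(0, c - 1), min(size, c + 2))]
            row.append(1 if 2 * sum(window) > len(window) else 0)
        out.append(row)
    return out


def agreement(a, b):
    """Fraction of pixels on which two images agree."""
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


rng = random.Random(0)  # fixed seed so the run is repeatable
original = make_image()
noisy = corrupt(original, 0.30, rng)
restored = majority_filter(noisy)

raw_score = agreement(original, noisy)
restored_score = agreement(original, restored)
print("agreement with original, noisy image:   ", round(raw_score, 3))
print("agreement with original, after the vote:", round(restored_score, 3))
```

The neighbors vote most of the damaged pixels back into place.  There is no comparable trick for an equation, because the symbols on either side of a damaged symbol tell you almost nothing about what it used to be.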

FACT:  The multiplexed, distracted audience that your communication is intended for WILL miss or mis-interpret a significant fraction of what you "sent".

So there are four questions. 
(1)  How much can you send, in terms of volume and complex structure in a very short window?
(2)  How much of that will be received correctly through a noisy channel?
(3)  What will the residual of that message look like after 48 hours?
(4)  Can you convey, consciously and intentionally, tacit knowledge through this medium?

If you can use video that tells a human story,  you will be far better off on all counts than if you try this using text only.

=====



Wade
