Several smart people have written about choice as a paralyzing force in
western consumer economies. A recent experience reminded me of a
consulting parable that has stuck with me for nearly 20 years, and the
two together prompted some thoughts about the costs we bear for
excessive choice.
1) Many of you know Jim Gilmore's work; along with Joe Pine, he wrote
The Experience Economy. A long time ago, I worked with him at my first
consulting-firm job. He told a great story that I can't find online,
and given how many years have passed, I'm no doubt
getting some of it wrong. If you read this and I got something wrong,
sorry Jim, but the insight is still a good one, and I trust I'm true to
the spirit of the tale:
Several businesspeople sit down in a hotel bar in New Orleans and a
waitress approaches the table. "Hello everyone - may I bring you
anything to drink?" One guy speaks up. "Yes, I'd like a draft beer." The
waitress had heard this before, and responded semi-automatically.
"I'm sorry, but we don't have any beer on tap. We have a great list of
bottled beers though," and rattled off a long list of macro- and
micro-brews. "Which of those may I get you?"
There's a reason consultants get a reputation, but knowing the guys in
question, I doubt the following exchange was done in a snarky fashion.
"I asked for a draft beer, and you don't have draft beers," the man
confirmed. "No, we don't, but we have bottled beers,"
and the waitress recited the list of bottled brews again.
At the end of said recitation, the patron said gently, "You asked if you
could bring us something to drink. I asked for a draft beer. Right
across the street, I see the Hyatt bar has Bud on tap. Is it possible to
bring a beer over from there?" Never having
heard this before, the waitress had to go ask her manager, but a few
minutes later, the table got its draft beer, ensuring the anonymous
server immortality in consulting lore, not to mention a generous tip for
being such a good sport.
2) I was thinking of this story this weekend as I was painting my
kitchen. Armed with a fat Benjamin Moore color-swatch book, the lady and
I held up tiny paint chip after tiny paint chip, settling on a
promising candidate. Not wanting to buy multiple quarts
of to-be-discarded colors, I bought a gallon of the
I-hope-it-will-look-good chip color. After bringing it home to try on
the wall, though, the paint was . . . wrong. We decided to try to
lighten it to the next higher swatch on the card of six colors. At
the store, I was surprised to discover that I would need FOUR gallons
of white to dilute my gallon one shade, but the paint guy (at
Sherwin-Williams, who used the Moore colors with no problem) was able to add
green tint to make the color less rosy in just
the gallon I'd already bought.
Back at home, the color was true to a neighboring chip -- the clerk did a
great job, at no charge -- but still not right. After some online
grazing of kitchen color advice, we found the Moore book had a tiny
subsection called America's Colors and in that limited
palette of maybe 25 colors, there was a color that looked good in
pictures, looked good on a swatch, and, I can report, looks good on the
walls (not to mention my hands, calves, and hair).
3) In both of these vignettes, the point is the same, and echoes the
provocative TED talk by Barry Schwartz on the tyranny of too much choice
(http://www.ted.com/talks/barry_schwartz_on_the_paradox_of_choice.html).
People don't want infinite choice, and not only because of the
mental-health implications Schwartz outlines: they want what they want.
What does this idea have to do with technology and business? Two main
ideas come to mind. First, extensive choice adds to the customer service
burden: ordering a beer from a list of 75 is harder, takes more time,
and requires more guidance than ordering from
a list of, say, 10 brews. Restaurant servers get asked all the time,
"what's the Basement Brewery Old Wheaties taste like?" and skilled
waiters and waitresses have a variety of useful answers at the ready: a)
compare it to something less obscure, b) offer
a personal testimonial or diplomatic warning, or c) offer to bring you a
sample. Each of these makes his or her job harder than it might
otherwise be, but the burden of service is higher in a high-choice
scenario, whether in wedding dresses, paints, restaurants,
or car shopping: before Ford rationalized the optioning process, the
2008 F-150 pickup came in more than a billion possible combinations.
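The combinatorial arithmetic behind a number like that is simple: independent options multiply. A quick sketch (the option categories and counts below are invented for illustration, not Ford's actual order sheet):

```python
from math import prod

# Hypothetical, illustrative option categories for a configurable truck.
# Because independent choices multiply, totals explode quickly.
options = {
    "cab style": 3,
    "bed length": 3,
    "engine": 4,
    "trim level": 6,
    "exterior color": 12,
    "interior": 4,
    "drivetrain": 2,
    "towing package": 2,
    "wheels": 4,
    "audio": 3,
    "binary add-ons": 2 ** 11,  # eleven independent yes/no options
}

total = prod(options.values())
print(f"{total:,} possible configurations")  # over a billion
```

A handful of menus plus a dozen checkboxes is all it takes to cross the billion mark, which is why rationalizing the option list cut the count so dramatically.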
Recommendation engines can help cut through the noise. Amazon's
recommendations are so good they sometimes hit too close to home;
eBay's are pretty generic and/or too obviously linked to my last visit,
which may have been a one-off (e.g., 2005 Toyota Camry
door handle). Netflix and iTunes are working hard on this set of
technologies, but for domains outside of media, it's hard to build a
sufficiently robust profile. No great recommendation engines jump to
mind for financial services, consumer electronics, or
other expensive, highly complex purchases.
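For the curious, the core idea behind an item-to-item recommender is not exotic; what is hard is assembling a rich enough profile. A minimal sketch, with invented users and ratings, using cosine similarity over a ratings matrix (real engines use far richer signals and far more sophisticated models):

```python
import math

# Toy item-based recommender. Users, items, and ratings are all
# invented for illustration.
ratings = {
    "alice": {"ipa": 5, "stout": 4, "lager": 1},
    "bob":   {"ipa": 4, "stout": 5},
    "carol": {"lager": 5, "pilsner": 4},
}

def item_vector(item):
    """Collect each user's rating of one item: the item's 'profile'."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

# "People who liked stout also liked..." - rank other items by similarity.
target = item_vector("stout")
scores = {i: cosine(target, item_vector(i))
          for i in ("ipa", "lager", "pilsner")}
best = max(scores, key=scores.get)
print(best)  # "ipa" - the item most co-rated with stout
```

With three users the answer is obvious; the point is that the method only works when the ratings matrix is dense, which is exactly what media and retail have and financial services and consumer electronics lack.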
Another alternative to requiring more and more skilled service is to
emphasize design. Apple has basically three mobile devices, with varying
amounts of memory/storage, usually two colors, and a few connectivity
options. Android, meanwhile, offers more than
4,000 different devices. While one may be EXACTLY what I need, cutting
through the noise to find it is non-trivial at point of purchase. In the
supply chain, meanwhile, keeping parts, contractors, documentation, and
manufacturing expertise all current (never
mind optimized) for so many devices to be sold in so many geographies
is a significant challenge. Furthermore, the network effects that result
from a shared platform are constrained somewhat because of Android
sub-speciation. An app built for a touch screen
display of size x won't render quite the same way on a size y display,
and keyboard-driven devices don't enjoy full reciprocity with the glass
keyboard on the touchscreen models. I often do an experiment in my
classes: swap Android devices with a stranger.
Now try to open a familiar app, send a text, or call someone. Very few
Android users feel comfortable on kindred but non-identical devices.
Apple, meanwhile, has made iPods, iPhones, and iPads look and feel like
siblings; the learning curve, while cumulatively considerable (think
about being handed an iPad in 1999), builds continuously: skills learned
on one device carry over to the next. The
exception is the current Apple TV remote: unintuitive,
slow, too easily lost, not predictably responsive. I mention this
because Microsoft's next Xbox, revealed last week, was demoed with a
convincing, lag-free voice interface. All the Google Glass jokes
notwithstanding, voice control will have its place -- but
in the living room, most likely, rather than on the subway, in
restrooms, or at the grocery.
Voice control, as anyone who has used IVR trees can attest, is
non-trivial to get right, but does have the potential to render the
negative implications of device proliferation less onerous. In the
meantime, especially when dealing with the digital Swiss army
knives of our era, consumers will continue to face a bewildering array
of choices until Samsung and/or Google get a handle on simplification.
At the macro level, choices are often easy for vendors to generate,
particularly insofar as they are increasingly defined
by software (adding another menu item is "free" to the developer), but
consumer frustration would suggest that the insights of the Don Normans,
the Barry Schwartzes, the Bruce Schneiers, and the Jared Spools of the
world are falling on infertile ground. In
plenty of situations, less is truly more.
Early Indications is the weblog version of a newsletter I've been publishing since 1997. It focuses on emerging technologies and their social implications.
Tuesday, June 25, 2013
Early Indications June 2013: What gets measured, gets . . .
It has become a business truism that “what gets measured, gets managed,”
a line commonly attributed to the great Peter Drucker. (There is no
citation, however, and original credit may belong to Lord Kelvin, who stated
that “If you cannot measure it, you cannot improve it.”) In the “big data” era,
it has become an article of faith that the more measurements we can gather and
presumably analyze, the more we can optimize behavior that drives medical
outcomes, social welfare, and corporate profitability. While I believe we
will see some extremely positive validations of this hypothesis, there are
enough cautionary tales to suggest that skepticism is warranted before
taking the big data evangelists' promises at face value.
Five unrelated examples combine to suggest an alternative
mantra:
what gets measured, gets gamed.
That is, the scorecard gets attention _at the expense of_ the
nominal task that was being measured in the first place.
Example 1: A former student reported that forecasting tools
in a consumer products company were generating remarkably consistent
projections, regardless of seasonality, competitors’ new product launches, or
other visible alterations to the landscape. After some investigation, it was
determined that a specific forecast curve had become popular (whether with
procurement, finance, marketing, or plant managers was not made clear). To
generate the “acceptable” forecast month after month, analysts took to
essentially defeating the forecast algorithms by adjusting past actual
quantities: to get the future curve they wanted, employees rewrote history.
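The mechanics of that trick are easy to see with even the simplest forecasting method. A sketch, assuming a three-month moving average (the actual tool's algorithm was surely fancier, but the principle is the same):

```python
# Sketch of the gaming mechanism: with a moving-average forecast,
# rewriting past "actuals" lets an analyst hit any target forecast.
# The three-month window and the numbers are assumptions for illustration.
def moving_average_forecast(history, window=3):
    return sum(history[-window:]) / window

actuals = [100, 140, 90]                   # real demand: noisy
print(moving_average_forecast(actuals))    # 110.0 - not the "acceptable" number

target = 120.0
# To force the forecast to the target, back-solve the most recent "actual."
adjusted = actuals[:2] + [target * len(actuals) - sum(actuals[:2])]
print(moving_average_forecast(adjusted))   # 120.0 - history rewritten to order
```

Because the forecast is a deterministic function of history, anyone who knows the function can invert it, and the scorecard stays green no matter what the market does.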
Sales forecasting is gamed by definition, given the way commissions,
market uncertainty, and expectation management affect the process. Numerous
attempts have been made to induce “best-guess” estimations by the sales people,
but even those companies that deployed prediction markets reported mixed
results.
Example 2: For a time there was a breed of financial planner
who was paid not on the basis of his or her clients’ rate of return, but by
commissions generated by equities trades. Not surprisingly, clients did not get
advice based on the long-term growth of their portfolio, but on the hottest
stock of the moment. Moving clients in and out of different equities based on
magazine cover stories proved to be good business for the planners, and only
incidentally and accidentally profitable for the clients.
Example 3: A former colleague of mine recently analyzed the
marketing activities of a large technology company. Even though the company
sells B-to-B with a direct sales force, an executive dashboard someplace
measures website clicks. The word came down through marketing that each product
group had to “win the dashboard,” in this case, piling up web clicks through
heavy ad placement even though this behavior could in no way be tied to
revenue, customer satisfaction, or even lead generation.
Example 4 comes from closer to home. Course evaluations have
become the focus of many universities’ professional assessments of non-research
faculty, trying to ensure that students feel the instructor did his or her job.
At Penn State the forms are called not “course evaluations” but SRTEs: Student
Ratings of Teacher Effectiveness, though I doubt I am alone in believing the E
stands for Entertainment. In my last teaching job, now 20 years ago, I was
known to game the evaluation process, bringing cookies for the class on the
last day before passing out paper evaluation forms. In our modern age, however,
the assessment has gone online, so students are able to fill out the forms at
their convenience and administrators can get scores reported in days rather
than the months it took to code paper instruments.
At Penn State, the move toward paperless assessment has
coincided with a startling drop in the completion rate. Like some other
schools, we have an institute for the advancement of teaching skills. Upon
seeing the drop in SRTE completion, our center undertook a project to try to
improve compliance with the assessment. Note that these efforts do nothing to
improve pedagogy or understand why compliance is dropping; the focus on the
course assessment process is completely unrelated to helping students learn. Once
again, the tail is wagging the dog.
Example 5: Information technology has become the backbone of
most modern organizations. Grading the performance of the IS group, however, is
extremely difficult. In many IS shops, system uptime is readily
quantifiable and usually scores in the high 90s. (For reference, 99.5% is a
great score on a test but in this context it means the system was down for
almost two full days a year.) What is much more difficult to measure, yet more
important to business performance, is whether the right applications were
running in the first place, how much inefficiency in the data center was
required to get the gaudy uptime number, or how good the data was that the
system delivered. Information quality is one of those metrics that is incredibly hard
(and sometimes embarrassing) to measure, hard to improve, and hard to justify
in terms of conventional ROI. Yet while it is, more often than not, truly
critical for business performance, information quality was not in years past a component
of a CIO’s performance plan. I’m told the situation is changing, although measuring
application portfolio management – how well IS gets the right tools into
production and the old ones retired – remains a challenge.
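The downtime arithmetic behind that parenthetical is worth making explicit; a one-function sketch:

```python
def annual_downtime_hours(uptime_pct: float) -> float:
    """Hours of downtime per year implied by an uptime percentage."""
    return (1 - uptime_pct / 100) * 365 * 24

for pct in (99.5, 99.9, 99.99):
    print(f"{pct}% uptime -> {annual_downtime_hours(pct):.1f} hours down/year")
```

At 99.5%, the system is down about 44 hours a year, nearly two full days; each additional "nine" is roughly a tenfold improvement, which is why gaudy uptime numbers can still hide real pain.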
The five examples, along with many others from your own
experience, suggest two important lessons. First, more data will by definition
– thank you Claude Shannon – contain more noise. As Nassim Taleb notes in his
critique of uncritical big-data love, more data simply means more
cherry-picking (and not, Nate Silver would add, better hypothesis generation).
Second, in the domain of human management, incentive
structures remain hard to get right, so there will be more and more temptations
to let “numbers speak for themselves.” Such attitudes favor the most readily
measured phenomena, often measures of activity rather than outcomes – web clicks are
easier to count than conversions; sales calls are easier to generate than
revenues; incoming SAT scores are easier to average than student loan debt or
job placement rates of the graduating class.
One would hope that getting the measurements right, even though the right
numbers usually look less flattering, would dictate
performance assessment. Given so much evidence from the worlds of medicine,
commerce, sports, the military (remember Robert McNamara's "kill ratios"?), and academia to the contrary, however, it would
appear that these games will forever be with us.