Tuesday, June 25, 2013

Early Indications May 2013: First-world problems: Too much choice

Several smart people have written about choice as a paralyzing force in western consumer economies. A recent experience reminded me of a great consulting parable that has stuck with me for nearly 20 years, and those combined to raise some thoughts about some of the costs we bear for excessive choice.

1) Many of you know Jim Gilmore's work; along with Joe Pine, he wrote The Experience Economy. A long time ago, I worked with him at my first consulting-firm job. He told a great story that I can't find on line, and given that this is years later, I'm no doubt getting some of it wrong. If you read this and I got something wrong, sorry Jim, but the insight is still a good one, and I trust I'm true to the spirit of the tale:

Several businesspeople sit down in a hotel bar in New Orleans and a waitress approaches the table. "Hello everyone - may I bring you anything to drink?" One guy speaks up. "Yes, I'd like a draft beer." The waitress had heard this before, and responded semi-automatically. "I'm sorry, but we don't have any beer on tap. We have a great list of bottled beers though," and rattled off a long list of macro- and micro-brews. "Which of those may I get you?"

There's a reason consultants get a reputation, but knowing the guys in question, I doubt the following exchange was done in a snarky fashion. "I asked for a draft beer, and you don't have draft beers," the man confirmed. "No, we don't, but we have bottled beers," and the waitress recited the list of bottled brews again.

At the end of said recitation, the patron said gently, "You asked if you could bring us something to drink. I asked for a draft beer. Right across the street, I see the Hyatt bar has Bud on tap. Is it possible to bring a beer over from there?" Never having heard this before, the waitress had to go ask her manager, but a few minutes later, the table got its draft beer, ensuring the anonymous server immortality in consulting lore, not to mention a generous tip for being such a good sport.

2) I was thinking of this story this weekend as I was painting my kitchen. Armed with a fat Benjamin Moore color-swatch book, the lady and I held up tiny paint chip after tiny paint chip, settling on a promising candidate. Not wanting to buy multiple quarts of to-be-discarded colors, I bought a gallon of the I-hope-it-will-look-good chip color. After bringing it home to try on the wall, though, the paint was . . . wrong. We decided to try to lighten it to the next higher swatch on the card of six colors. At the store, I was surprised to discover that I would need FOUR gallons of white to dilute my gallon one shade, but the paint guy (at Sherwin-Williams, who used the Moore colors with no problem) was able to add green tint to make the color less rosy in just the gallon I'd already bought.

Back at home, the color was true to a neighboring chip -- the clerk did a great job, at no charge -- but still not right. After some on-line grazing of kitchen color advice, we found the Moore book had a tiny subsection called America's Colors and in that limited palette of maybe 25 colors, there was a color that looked good in pictures, looked good on a swatch, and, I can report, looks good on the walls (not to mention my hands, calves, and hair).

3) In both of these vignettes, the point is the same, and echoes the provocative TED talk by Barry Schwartz on the tyranny of too much choice (http://www.ted.com/talks/barry_schwartz_on_the_paradox_of_choice.html). People don't want infinite choice (not just because of the mental-health implications Schwartz outlines). No, people do not want infinite choice: they want what they want.

What does this idea have to do with technology and business? Two main ideas come to mind. First, extensive choice adds to the customer service burden: ordering a beer from a list of 75 is harder, takes more time, and requires more guidance than ordering from a list of, say, 10 brews. Restaurant servers get asked all the time, "what's the Basement Brewery Old Wheaties taste like?" and skilled waiters and waitresses have a variety of useful answers at the ready: a) compare it to something less obscure, b) offer a personal testimonial or diplomatic warning, or c) offer to bring you a sample. Each of these makes his or her job harder than it might otherwise be, and the burden of service is higher in any high-choice scenario, whether in wedding dresses, paints, restaurants, or car shopping: before Ford rationalized the optioning process, the 2008 F-150 pickup came in more than a billion possible combinations.
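
For a sense of how quickly option lists multiply into numbers like that, here is a minimal sketch. The option groups are hypothetical (thirty independent yes/no choices), not Ford's actual order sheet; the point is only that the number of builds is the product of the choices in each category.

```python
from math import prod

# Hypothetical option groups: each entry is the number of choices in one
# independent configuration category. This is NOT Ford's actual option list.
choices_per_group = [2] * 30  # thirty independent yes/no options


def count_configurations(groups):
    """Return the number of distinct builds: the product of choices per group."""
    return prod(groups)


total = count_configurations(choices_per_group)
print(f"{total:,} possible configurations")
```

Thirty yes/no decisions are already enough to pass a billion, and real option sheets mix in categories with many more than two choices.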

Recommendation engines can help cut through the noise. Amazon's recommendations are so good they sometimes hit too close to home; eBay's are pretty generic and/or too obviously linked to my last visit, which may have been a one-off (e.g., 2005 Toyota Camry door handle). Netflix and iTunes are working hard on this set of technologies, but for domains outside of media, it's hard to build a sufficiently robust profile. No great recommendation engines jump to mind for financial services, consumer electronics, or other expensive, highly complex purchases.

Another alternative to requiring more and more skilled service is to emphasize design. Apple has basically three mobile devices, with varying amounts of memory/storage, usually two colors, and a few connectivity options. Android, meanwhile, offers more than 4,000 different devices. While one may be EXACTLY what I need, cutting through the noise to find it is non-trivial at point of purchase. In the supply chain, meanwhile, keeping parts, contractors, documentation, and manufacturing expertise all current (never mind optimized) for so many devices to be sold in so many geographies is a significant challenge. Furthermore, the network effects that result from a shared platform are constrained somewhat because of Android sub-speciation. An app built for a touch screen display of size x won't render quite the same way on a size y display, and keyboard-driven devices don't enjoy full reciprocity with the glass keyboard on the touchscreen models. I often do an experiment in my classes: swap Android devices with a stranger. Now try to open a familiar app, send a text, or call someone. Very few Android users feel comfortable on kindred but non-identical devices.

Apple, meanwhile, has made iPods, iPhones, and iPads look and feel like siblings; the learning curve, while cumulatively considerable (think about being handed an iPad in 1999), is continuously compatible. The exception is the current Apple TV remote: unintuitive, slow, too easily lost, not predictably responsive. I mention this because Microsoft's next Xbox, revealed last week, was demoed with a convincing, lag-free voice interface. All the Google Glass jokes notwithstanding, voice control will have its place -- but in the living room, most likely, rather than on the subway, in restrooms, or at the grocery.

Voice control, as anyone who has used IVR trees can attest, is non-trivial to get right, but does have the potential to render the negative implications of device proliferation less onerous. In the meantime, especially when dealing with the digital Swiss army knives of our era, consumers will continue to face a bewildering array of choices until Samsung and/or Google get a handle on simplification. At the macro level, choices are often easy for vendors to generate, particularly insofar as they are increasingly defined by software (adding another menu item is "free" to the developer), but consumer frustration would suggest that the insights of the Don Normans, the Barry Schwartzes, the Bruce Schneiers, and the Jared Spools of the world are falling on infertile ground. In plenty of situations, less is truly more.

Early Indications June 2013: What gets measured, gets . . .

It has become a business truism that “what gets measured, gets managed” after the great Peter Drucker allegedly wrote it. (There is no citation, however, and it may be that original credit goes to Lord Kelvin, who stated that “If you cannot measure it, you cannot improve it.”) In the “big data” era, it has become an article of faith that the more measurements we can gather and presumably analyze, the more we can optimize behavior that drives medical outcomes, social welfare, and corporate profitability. While I believe that we will see some extremely positive validations of this hypothesis, there are also enough cautionary tales that suggest some skepticism is warranted before accepting the promises of the big data evangelists as articles of faith.

Five unrelated examples combine to suggest an alternative mantra:

what gets measured, gets gamed.

That is, the scorecard gets attention _at the expense of_ the nominal task that was being measured in the first place.
  
Example 1: A former student reported that forecasting tools in a consumer products company were generating remarkably consistent projections, regardless of seasonality, competitors’ new product launches, or other visible alterations to the landscape. After some investigation, it was determined that a specific forecast curve had become popular (whether with procurement, finance, marketing, or plant managers was not made clear). To generate the “acceptable” forecast month after month, analysts took to [essentially] defeating the forecast algorithms by adjusting past actual quantities: to get the future curve they wanted, employees rewrote history.

Sales forecasting is gamed by definition, given the way commissions, market uncertainty, and expectation management affect the process. Numerous attempts have been made to induce “best-guess” estimations by the sales people, but even those companies that deployed prediction markets reported mixed results.

Example 2: For a time there was a breed of financial planner who was paid not on the basis of his or her clients’ rate of return, but by commissions generated by equities trades. Not surprisingly, clients did not get advice based on the long-term growth of their portfolio, but on the hottest stock of the moment. Moving clients in and out of different equities based on magazine cover stories proved to be good business for the planners, and only incidentally and accidentally profitable for the clients.

Example 3: A former colleague of mine recently analyzed the marketing activities of a large technology company. Even though the company sells B-to-B with a direct sales force, an executive dashboard someplace measures website clicks. The word came down through marketing that each product group had to “win the dashboard,” in this case, piling up web clicks through heavy ad placement even though this behavior could in no way be tied to revenue, customer satisfaction, or even lead generation.

Example 4 comes from closer to home. Course evaluations have become the focus of many universities’ professional assessments of non-research faculty, trying to ensure that students feel the instructor did his or her job. At Penn State the forms are called not “course evaluations” but SRTEs: Student Ratings of Teacher Effectiveness, though I doubt I am alone in believing the E stands for Entertainment. In my last teaching job, now 20 years ago, I was known to game the evaluation process, bringing cookies for the class on the last day before passing out paper evaluation forms. In our modern age, however, the assessment has gone online, so students are able to fill out the forms at their convenience and administrators can get scores reported in days rather than the months it took to code paper instruments.

At Penn State, the move toward paperless assessment has coincided with a startling drop in the completion rate. Like some other schools, we have an institute for the advancement of teaching skills. Upon seeing the drop in SRTE completion, our center undertook a project to try to improve compliance with the assessment. Note that these efforts do nothing to improve pedagogy or understand why compliance is dropping; the focus on the course assessment process is completely unrelated to helping students learn. Once again, the tail is wagging the dog.

Example 5: Information technology has become the backbone of most modern organizations. Grading the performance of the IS group, however, is extremely difficult. In many IS shops, measuring system uptime is readily quantifiable and usually scores in the high 90s. (For reference, 99.5% is a great score on a test but in this context it means the system was down for almost two full days a year.) What is much more difficult to measure, yet more important to business performance, is whether the right applications were running in the first place, how much inefficiency in the data center was required to get the gaudy uptime number, or how good the data was that the system delivered. Information quality is one of those metrics that is incredibly hard (and sometimes embarrassing) to measure, hard to improve, and hard to justify in terms of conventional ROI. Yet while it is, more often than not, truly critical for business performance, information quality was not in years past a component of a CIO’s performance plan. I’m told the situation is changing, although measuring application portfolio management – how well IS gets the right tools into production and the old ones retired – remains a challenge.
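
The uptime arithmetic above is easy to check. A minimal sketch, assuming a system that is expected to run around the clock for a 365-day year (the function name and the list of percentages are mine, chosen for illustration):

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def annual_downtime_hours(uptime_percent):
    """Hours per year a system is down at the given uptime percentage."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


# Compare a few commonly quoted uptime figures.
for pct in (99.0, 99.5, 99.9, 99.99):
    hours = annual_downtime_hours(pct)
    print(f"{pct}% uptime: {hours:.1f} hours down per year ({hours / 24:.2f} days)")
```

At 99.5% the result is about 43.8 hours, a little under two days, which is where the "almost two full days a year" figure comes from.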

The five examples, along with many others from your own experience, suggest two important lessons. First, more data will by definition – thank you Claude Shannon – contain more noise. As Nassim Taleb notes in his critique of uncritical big-data love, more data simply means more cherry-picking (and not, Nate Silver would add, better hypothesis generation).

Second, in the domain of human management, incentive structures remain hard to get right, so there will be more and more temptations to let “numbers speak for themselves.” Such attitudes can emphasize the most readily measured phenomena, often of activity rather than outcomes – web clicks are easier to count than conversions; sales calls are easier to generate than revenues; incoming SAT scores are easier to average than student loan debt or job placement rates of the graduating class.

One would hope that getting the assessments right, even though it usually means counting something that doesn’t look as good, should dictate performance assessment. Given so much evidence from the worlds of medicine, commerce, sports, the military (remember Robert McNamara's "kill ratios"?), and academia to the contrary, however, it would appear that these games will forever be with us.