Early Indications: January 2016

Saturday, January 30, 2016

Early Indications January 2016: Shocks

The past month has been marked by a series of extraordinary events that would have been completely unforeseen only a year ago, or even in mid-summer. (In June, West Texas Intermediate crude oil futures contracts were selling at $60 a barrel, roughly twice the current price.) While this may be an unusual month, the larger question remains: how can human institutions evolve to better address both sudden and glacial change, in both positive and negative directions? Put another way, if we see what keeps surprising us, maybe we can adapt our practices and assumptions to be surprised less often, less acutely, or both.

Oil is certainly big news. While the dynamics of a global market, controlled by a wide range of political and business players, remain fascinating, “common knowledge” in energy markets shifts dramatically. Recall how recently talk of “peak oil” was common: according to Google Trends, searches for the phrase spiked in August 2005 and, at a slightly lower index, May 2008. After 2011, interest dwindled to baseline noise, and today we wrestle with the problems of sub-$2.00 gasoline. The precise events coming into play right now have complex origins: innovations in drilling technology, geopolitical forces (including bitter national and ethnic rivalries), and national budgets whose planning assumptions have been obliterated. Saudi Arabia, for example, can produce a barrel of oil for about $3 but needs $93 to break even for budget purposes given its economic monoculture; Venezuela needs $149 a barrel to break even, to take the most extreme example. At $30, budgets in many places (including Alaska) are a mess.

Given that oil is such big business in so many parts of the world, considerable expertise is deployed in forecasting. Yet the industry’s record, with regard to both estimates of oil reserves and now prices, is consistently poor. Perhaps the lesson is that complex systems cannot be predicted well, so the best answer might be to shorten planning horizons — a tough call in light of the magnitude of investment and concomitant project lead time required.

The next “shock” is in some ways predictable: U.S. infrastructure investment has lagged for so long that calamities on bridges, railroads, and water supplies are unfortunately overdue. The particular politics of Flint, Michigan’s mismanagement are also not surprising given the nature of both large, overlapping bureaucracies and the governor’s high priority on municipal budget repair to be performed by unelected “emergency managers.” The competing agendas are difficult: if bondholders lose trust, investment becomes prohibitive. At the same time, the dismissal of known test results and risks, and the human consequences thereof, are criminal: GM stopped using Flint water because it was destroying auto parts while Flint’s citizens had to keep drinking it.

The pattern in Flint is not all that unusual, except in its impact: given the size of federal and state governments, it’s hard to imagine who voters could hold accountable for substandard ports, roads, and airports. Many are in poor repair, but the constituencies are diffuse and/or politically marginal, and so can be ignored. Who can one complain to (or vote out) regarding connections inside Philadelphia’s airport, or Amtrak’s unreliability, or Detroit’s crumbling schools? Conversely, what good came to the Detroit mayor who supported that airport’s modernization? Who is the primary constituency that benefits from New Jersey’s extremely heavy spending on roads ($2 million per state-controlled mile) that are consistently graded as among the nation’s worst (at both the Interstate and local arterial levels)? Rather than planning horizons, the issue here appears to center on accountability. The interconnections of race, poverty, and party politics can also fuel tragedy: decisions were made in Flint that would be unthinkable in more affluent Detroit suburbs. (Another water issue, the one in California, could also amplify class conflicts in the event the El Nino snowpack melts to last summer’s levels in coming years.)

The third shock is a positive one. Google’s DeepMind unit (acquired for $400 million in 2014) announced that it had used machine learning to develop a computer capable of defeating the European champion at Go, the ancient Chinese game of strategy. AlphaGo, DeepMind’s program, will now play a higher-ranked champion in March. If the machine can win, another cognitive milestone will have been achieved with AI, about ten years faster than had been generally predicted. Interestingly, Facebook had previously announced that it had made significant progress at Go in a purely machine tournament, but the Google news swamped the magnitude of Facebook’s achievement.

To their credit, DeepMind’s team published the algorithmic architecture in Nature. Two distinct neural networks are built: one, the “policy network,” limits its scope to a small number of attractive options for each move, while the “value network” rates the proximate choices in the context of 20 moves ahead. It’s likely the technology will be tested outside abstract board games, potentially in climate forecasting, medical diagnostics, and other fields.
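
The division of labor between the two networks can be illustrated with a toy. This is a minimal sketch, not DeepMind's implementation: the published system pairs its networks with tree search, which is omitted here, and the game, `toy_policy`, `toy_value`, and `top_k` are all invented for illustration. The policy function narrows the candidate moves; the value function rates the positions those few moves lead to.

```python
from typing import Callable, Dict, List

def choose_move(
    state: int,
    legal_moves: List[int],
    policy: Callable[[int, List[int]], Dict[int, float]],
    value: Callable[[int], float],
    top_k: int = 3,
) -> int:
    """Pick a move: the policy prunes, the value function ranks the survivors."""
    priors = policy(state, legal_moves)
    # Keep only the top_k moves the policy considers most promising.
    shortlist = sorted(legal_moves, key=lambda m: priors[m], reverse=True)[:top_k]
    # Score the position each shortlisted move leads to and take the best one.
    return max(shortlist, key=lambda m: value(state + m))

# Hypothetical toy game: the state is a running total, a move adds 1..9,
# and totals closer to 21 (without going over) are better.
def toy_policy(state: int, moves: List[int]) -> Dict[int, float]:
    raw = {m: 1.0 / (1 + abs(21 - (state + m))) for m in moves}
    total = sum(raw.values())
    return {m: p / total for m, p in raw.items()}

def toy_value(state: int) -> float:
    return -1.0 if state > 21 else state / 21

best = choose_move(15, list(range(1, 10)), toy_policy, toy_value)
print("chosen move:", best)
```

The point of the split is economy: the expensive evaluation runs on three candidates rather than nine, and in Go the gap between "all legal moves" and "a small number of attractive options" is vastly larger.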

In this case, the breakthrough is so unexpected that nobody, including the scientists involved, knows what it means. Even though Deep Blue won at championship chess and Watson won at Jeopardy, neither advancement has translated into wide commercial or humanitarian benefit even though the game wins were in 1997 and 2011 respectively. This is by no means a critique of IBM; rather, turning technology breakthroughs in a specific domain into a more general-purpose tool can in some cases be impossible when it is not merely hard.

Elsewhere, however, giant strides are possible: Velodyne lidar, the spinning sensor atop the first generation Google car, has dropped from $75,000 per unit to a smaller unit costing under $500, with further economies of mass production to come. Even more astoundingly, the cost of human genomic sequencing continues to plummet: the first human DNA sequence cost $2.7 billion, for the entire research program. Shortly after, the cost was about $100,000 as of 2002; today it’s approaching $1,000, outpacing Moore’s law by a factor of thousands (depending on how one calculates) in a 15-year span.
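
The "depending on how one calculates" caveat is easy to make concrete. The sketch below uses only the $2.7 billion and $1,000 figures quoted above; the two Moore's-law doubling periods (18 and 24 months) are my assumptions, and taking the whole research program as the baseline is itself a judgment call.

```python
def fold_improvement(start_cost: float, end_cost: float) -> float:
    """How many times cheaper end_cost is than start_cost."""
    return start_cost / end_cost

def moore_fold(years: float, doubling_years: float) -> float:
    """Improvement predicted by a fixed doubling period over `years`."""
    return 2 ** (years / doubling_years)

YEARS = 15
sequencing_fold = fold_improvement(2.7e9, 1_000)  # program cost vs. today's target

# Assumed doubling periods: 1.5 and 2.0 years.
ratios = {d: sequencing_fold / moore_fold(YEARS, d) for d in (1.5, 2.0)}

for doubling, ratio in ratios.items():
    print(f"doubling every {doubling} years: sequencing outpaces Moore's law by {ratio:,.0f}x")
```

With an 18-month doubling period the gap is roughly 2,600-fold; with a 24-month period it is closer to 15,000-fold. Either way the answer lands in the thousands, but the spread shows how much the headline number depends on the yardstick.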

In each of these technological instances, people have yet to invent large markets, business models, or related apparatus (liability law, quality metrics, etc) for these breakthroughs. As the IBM example showed in regard to AI, this is in some ways normal. At the same time, I believe we can create better scaffolding for technology commercialization: patent law reform comes immediately to mind. Erik Brynjolfsson and Andrew McAfee suggest some other ideas in their essential book, The Second Machine Age.

Education is of course a piece of the puzzle, and there’s a lot of discussion regarding STEM courses, including why more people should learn to code. I’ve seen several people make the case that code is already the basis of our loss of privacy, and there will be more deep questions emerging soon: who owns my genomic information? who controls my digital breadcrumbs? should big-data collection be opt-in or opt-out? Yes, knowing _how to_ code can get you a job, but more and more, knowing _about_ code will be essential for making informed choices as a citizen. The widespread lack of understanding of what “net neutrality” actually entails serves as a cautionary tale: few people understand the mechanics of peering, CDNs, and now mobile ad tech, so much of the debate misses the core issue, which is lack of competition among Internet service providers. “Broadband industry consolidation” isn’t on anyone’s top-5 agenda in the U.S., yet even comedian John Oliver identified it as the major nut to crack with regard to information access.

In the end, humans will continue to see the future as looking much like the present, driven by psychological patterns we now understand better than ever. As shocks increase in magnitude, for many reasons including climatic ones, and impact, because so many aspects of life and commerce are interconnected, it may be time to rethink some of our approaches to planning for both the normal and the exceptional.

Monday, January 25, 2016

Early Indications November 2015: Broad thoughts on the Internet of Things

Current state

The notion of an Internet of Things is at once both old and new. From the earliest days of the World Wide Web, devices were connected so people could see the view out a window, traffic or ski conditions, a coffee pot at the University of Cambridge, or a Coke machine at Carnegie Mellon University. The more recent excitement dates to 2010 or thereabouts, and builds on a number of developments: many new Internet Protocol (IP) addresses have become available, the prices of sensors are dropping, new data and data-processing models are emerging to handle the scale of billions of device "chirps," and wireless bandwidth is getting more and more available. At a deeper level, however, the same criteria -- sense, think, act -- that define a robot for many working in the field also characterize large-scale Internet of Things systems: they are essentially meta-robots, if you will. The GE Industrial Internet model discussed below includes sensors on all manner of industrial infrastructure, a data analytics platform, and humans to make presumably better decisions based on the massive numbers from the first domain crunched by algorithms and computational resources in the second.

Building Blocks
The current sensor landscape can be understood more clearly by contrasting it to the old state of affairs. Most important, sensor networks mimicked analog communications: radios couldn't display still pictures (or broadcast them), record players couldn't record video, newspapers could not facilitate two- or multi-way dialog in real time. For centuries, sensors in increasing precision and sophistication were invented to augment human senses: thermometers, telescopes, microscopes, ear trumpets, hearing aids, etc. With the 19th century advances in electro-optics and electro-mechanical devices, new sensors could be developed to extend the human senses into different parts of the spectrum (e.g., infrared, radio frequencies, measurement of vibration, underwater acoustics, etc.).

Where they were available, electromechanical sensors and later sensor networks 
*stood alone
*measured one and only one thing
*cost a lot to develop and implement
*had inflexible architectures: they did not adapt well to changing circumstances. 
Sensors traditionally stood alone because networking them together was expensive and difficult. Given the lack of shared technical standards, to build a network of offshore data buoys for example, the interconnection techniques and protocols would be uniquely engineered to a particular domain, in this case, salt water, heavy waves, known portions of the electromagnetic spectrum, and so on. An agency seeking to connect sensors of a different sort (such as surveillance cameras) would have to start from scratch, as would a third agency monitoring road traffic.

In part because of their mechanical componentry, sensors rarely measured across multiple yardsticks. Oven thermometers measured only oven temperature, and displayed the information locally, if at all (given that perhaps a majority of sensor traffic informs systems rather than persons, the oven temperature might only drive the thermostat rather than a human-readable display). Electric meters only counted watt-hours in aggregate. Fast forward to today: a consumer Global Positioning System (GPS) unit or smartphone will tell location, altitude, compass heading, and temperature, along with providing weather radio.

Electromechanical sensors were not usually mass-produced, with the exception of common items such as thermometers. Because supply was limited, particularly for specialized designs, the combination of monopoly supply and small order quantities kept prices high.

The rigid architecture was a function of mechanical devices’ specificity. A vibration sensor was different from a camera was different from a humidistat. Humidity data, in turn, was designed to be moved and managed in a particular analog domain (a range of zero to 100 per cent), while image recognition in the camera’s information chain typically featured extensive use of human eyes rather than automated processing.

Ubiquity
Changes in each of these facets combine to help create today’s emerging sensor networks, which are growing in scope and capability every year. The many examples of sensor capability accessible to (or surveilling) the everyday citizen illustrate the limits of the former regime: today there are more sensors recording more data to be accessed by more end points. Furthermore, the traffic increasingly originates and transits exclusively in the digital domain.

*Computers, which sense their own temperature, location, user patterns, number of printer pages generated, etc.
*Thermostats, which are networked within buildings and now remotely controlled and readable
*Telephones, the wireless variety of which can be understood as beacons, bar-code scanners, pattern-matchers (the Shazam application names songs from a brief audio sample), and network nodes
*Motor and other industrial controllers: many cars no longer have mechanical throttle linkages, so people step on a sensor every day without thinking as they drive by wire. Automated tire-pressure monitoring is also standard on many new cars. Airbags rely on a sophisticated system of accelerometers and high-speed actuators to deploy the proper reaction for collision involving a small child versus a lamp strapped into the front passenger seat.
*Vehicles: the OBD II diagnostics module, the toll pass, satellite devices on heavy trucks, and theft recovery services such as Lojack, not to mention the inevitable mobile phone, make vehicle tracking both powerful and relatively painless
*Surveillance cameras (of which there are over 10,000 in Chicago alone, and more than 500,000 in London)
*Most hotel door handles and many minibars are instrumented and generate electronic records of people’s and vodka bottles’ comings and goings.
*Sensors, whether embedded in animals (RFID chips in both household pets and race horses) or gardens (the EasyBloom plant moisture sensor connects to a computer via USB and costs only $50), or affixed to pharmaceutical packaging.

Note the migration from heavily capital-intensive or national-security applications down-market. A company called Vitality has developed a pill-bottle monitoring system: if the cap is not removed when medicine is due, an audible alert is triggered, or a text message could be sent.

A relatively innovative industrial deployment of vibration sensors illustrates the state of the traditional field. In 2006, BP instrumented an oil tanker with "motes," which integrated a processor, solid-state memory, a radio, and an input/output board on a single 2" square chip. Each mote could receive vibration data from up to ten accelerometers, which were mounted on pumps and motors in the ship’s engine room. The goal was to determine if vibration data could predict mechanical failure, thus turning estimates—a motor teardown every 2,000 hours, to take a hypothetical example—into concrete evidence of an impending need for service.
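
To make the idea of "concrete evidence" tangible, here is a minimal sketch of one common approach: compare the vibration energy in a window of accelerometer samples against a healthy baseline. The source does not say what analysis BP actually ran; the RMS measure, the `ratio` threshold, and the sample values are all assumptions.

```python
import math
from typing import Sequence

def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of one window of accelerometer samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def needs_service(window: Sequence[float], baseline_rms: float, ratio: float = 2.0) -> bool:
    """Flag the machine when vibration exceeds `ratio` times its healthy baseline."""
    return rms(window) > ratio * baseline_rms

# Invented readings: a pump when new, and the same pump after wear.
healthy = [0.1, -0.1, 0.1, -0.1]
worn = [0.3, -0.25, 0.35, -0.3]

baseline = rms(healthy)
print("healthy pump flagged:", needs_service(healthy, baseline))
print("worn pump flagged:", needs_service(worn, baseline))
```

A fixed 2,000-hour teardown treats every motor alike; a rule like this one schedules service only for the motor whose readings have drifted.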

The motes had a decided advantage over traditional sensor deployments in that they operated over wireless spectrum. While this introduced engineering challenges arising from the steel environment as well as the need for batteries and associated issues (such as lithium’s being a hazardous material), the motes and their associated sensors were much more flexible and cost-effective to implement compared to hard-wired solutions. The motes also communicate with each other in a mesh topology: each mote looks for nearby motes, which then serve as repeaters en route to the data’s ultimate destination. Mesh networks are usually dynamic: if a mote fails, signal is routed to other nearby devices, making the system fault-tolerant in a harsh environment. Finally, the motes could perform signal processing on the chip, reducing the volume of data that had to be transmitted to the computer where analysis and predictive modeling were conducted. This blurring of the lines between sensing, processing, and networking elements is occurring in many other domains as well.
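
The fault tolerance of a mesh can be sketched in a few lines. This is an illustration under assumptions, not the motes' actual routing protocol: the topology and mote names are invented, and a plain breadth-first search stands in for whatever discovery scheme the hardware used.

```python
from collections import deque
from typing import Collection, Dict, List, Optional

def route(links: Dict[str, List[str]], src: str, dst: str,
          failed: Collection[str] = ()) -> Optional[List[str]]:
    """Breadth-first search for the fewest-hop path that avoids failed motes."""
    if src in failed or dst in failed:
        return None
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for neighbor in links.get(node, []):
            if neighbor not in seen and neighbor not in failed:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no surviving path

# Hypothetical engine-room topology: each mote lists the motes within radio range.
links = {
    "pump_mote": ["mote_a", "mote_b"],
    "mote_a": ["pump_mote", "gateway"],
    "mote_b": ["pump_mote", "mote_c"],
    "mote_c": ["mote_b", "gateway"],
    "gateway": ["mote_a", "mote_c"],
}

normal = route(links, "pump_mote", "gateway")
rerouted = route(links, "pump_mote", "gateway", failed={"mote_a"})
print("normal path:  ", normal)
print("mote_a failed:", rerouted)
```

When `mote_a` drops out, the data takes the longer path through `mote_b` and `mote_c` rather than being lost, which is the behavior the paragraph above describes.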

All told, there are tens of billions of items that can connect and combine in new ways. The Internet has become a common ground for many of these devices, enabling multiple sensor feeds (traffic camera, temperature, weather map, and social media reports, for example) to combine into more useful, and usable, applications. Hence the intuitive appeal of "the Internet of Things." As we saw earlier, network effects and positive feedback loops mean that considerable momentum can develop as more and more instances converge on shared standards. While we will not discuss them in detail here, it can be helpful to think of three categories of sensor interaction:

*Sensor to people: the thermostat at the ski house tells the occupants that the furnace is broken the day before they arrive, or a dashboard light alerts the driver that the tire pressure on their car is low
*Sensor to sensor: the rain sensor in the automobile windshield alerts the antilock brakes of wet road conditions and the need for different traction-control algorithms
*Sensor to computer/aggregator: dozens of cell phones on a freeway can serve as beacons for a traffic-notification site, at much lower cost than helicopters or "smart highways."
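
The third category is the easiest to sketch. Assuming each phone reports a road segment and a speed (the segment names, the readings, and the 25 mph congestion threshold below are all invented), an aggregator needs little more than a group-by and an average.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

def segment_speeds(pings: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average the reported speeds for each road segment."""
    by_segment = defaultdict(list)
    for segment_id, speed_mph in pings:
        by_segment[segment_id].append(speed_mph)
    return {segment: mean(speeds) for segment, speeds in by_segment.items()}

def congested(speeds: Dict[str, float], threshold_mph: float = 25) -> List[str]:
    """Segments whose average speed falls below the (assumed) threshold."""
    return sorted(seg for seg, mph in speeds.items() if mph < threshold_mph)

# Hypothetical pings: (segment id, speed in mph) from phones on a freeway.
pings = [
    ("segment_12", 62), ("segment_12", 58),
    ("segment_13", 18), ("segment_13", 22), ("segment_13", 20),
]

speeds = segment_speeds(pings)
print("average speeds:", speeds)
print("congested:", congested(speeds))
```

No single phone knows there is a traffic jam; the insight exists only at the aggregator, which is what distinguishes this category from the first two.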

An "Internet of Things" is an attractive phrase that at once both conveys expansive possibility and glosses over substantial technical challenges. Given 20+ years of experience with the World Wide Web, people have long experience with hyperlinks, reliable inter-network connections, search engines to navigate documents, and wi-fi access everywhere from McDonald's to mid-Atlantic in flight. None of these essential pieces of scaffolding has an analog in the Internet of Things, however: garage-door openers and moisture sensors aren't able to read; naming, numbering, and navigation conventions do not yet exist; low-power networking standards are still unsettled; and radio-frequency issues remain problematic. In short, as we will see, "the Internet" may not be the best metaphor for the coming stage of device-to-device communications, whatever its potential utility.

Beyond the Web metaphor
Given that "the Internet" as most people experience it is global, searchable, and anchored by content or, increasingly, social connections, the "Internet of Things" will in many ways be precisely the opposite. Having smartphone access to my house's thermostat is a private transaction, highly localized and preferably NOT searchable by anyone else. While sensors will generate volumes of data that are impossible for most humans to comprehend, that data is not content of the sort that Google indexed as the foundation of its advertising-driven business. Thus while an "Internet of Things" may feel like a transition from a known world to a new one, the actual benefits of networked devices separate from people will probably be more foreign than saying "I can connect to my appliances remotely."

Consumer applications
The notion of networked sensors and actuators can usefully be subdivided into industrial, military/security, or business-to-business versus consumer categories. Let us consider the latter first. Using the smartphone or a web browser, it is already possible to remotely control and/or monitor a number of household items:

•    slow cooker
•    garage-door opener
•    blood-pressure cuff
•    exercise tracker (by mileage, heart rate, elevation gain, etc)
•    bathroom scale
•    thermostat
•    home security system
•    smoke detector
•    television
•    refrigerator.

These devices fall into some readily identifiable categories: personal health and fitness, household security and operations, entertainment. While the data logging of body weight, blood pressure, and caloric expenditures would seem to be highly relevant to overall physical wellness, few physicians, personal trainers, or health insurance companies have built business processes to manage the collection, security, or analysis of these measurements. Privacy, liability, information overload, and, perhaps most centrally, outcome-predicting algorithms have yet to be developed or codified. If I send a signal to my physician indicating a physical abnormality, she could bear legal liability if her practice does not act on the signal and I subsequently suffer a medical event that could have been predicted or prevented.

People are gradually becoming more aware of the digital "bread crumbs" our devices leave behind. Progressive Insurance's Snapshot campaign has had good response to a sensor that tracks driving behavior as the basis for rate-setting: drivers who drive frequently, or brake especially hard, or drive a lot at night, or whatever could be judged worse risks and be charged higher rates. Daytime or infrequent drivers, those with a light pedal, or people who religiously buckle seat belts might get better rates. This example, however, illustrates some of the drawbacks of networked sensors: few sensors can account for all potentially causal factors. Snapshot doesn't know how many people are in the car (a major accident factor for teenage drivers), if the radio is playing, if the driver is texting, or when alcohol might be impairing the driver's judgment. Geographic factors are delicate: some intersections have high rates of fraudulent claims, but the history of racial redlining is also still a sensitive topic, so data that might be sufficiently predictive (ZIP codes traversed) might not be used out of fear it could be abused.
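
A toy version of sensor-based rate-setting shows both the appeal and the blind spots. This is emphatically not Progressive's algorithm: the `TripSummary` fields mirror the three behaviors mentioned above, and every weight is invented.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TripSummary:
    miles: float
    hard_brakes: int
    night_miles: float

def rate_multiplier(trips: List[TripSummary], base: float = 1.0) -> float:
    """Scale a base premium by mileage, hard braking, and night driving."""
    total_miles = sum(t.miles for t in trips)
    if total_miles == 0:
        return base
    brakes_per_100_miles = 100 * sum(t.hard_brakes for t in trips) / total_miles
    night_share = sum(t.night_miles for t in trips) / total_miles
    # Invented weights, chosen only to make the example readable.
    surcharge = 0.02 * brakes_per_100_miles + 0.20 * night_share + 0.0001 * total_miles
    return round(base + surcharge, 3)

daytime_driver = [TripSummary(miles=20, hard_brakes=0, night_miles=0)]
night_driver = [TripSummary(miles=50, hard_brakes=2, night_miles=0),
                TripSummary(miles=50, hard_brakes=3, night_miles=25)]

print("daytime driver multiplier:", rate_multiplier(daytime_driver))
print("night driver multiplier:", rate_multiplier(night_driver))
```

Notice what the data structure cannot hold: passengers, texting, alcohol. A model can only price the factors its sensor records.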

The "smart car" applications excepted, most of the personal Internet of Things use cases are to date essentially remote controls or intuitively useful data collection plays. One notable exception lies in pattern-recognition engines that are grouped under the heading of "augmented reality." Whether on a smartphone/tablet or through special headsets such as Google Glass, a person can see both the physical world and an information overlay. This could be a real-time translation of a road sign in a foreign country, a direction-finding aid, or a tourist application: look through the device at the Eiffel Tower and see how tall it is, when it was built, how long the queue is to go to the top, or any other information that could be attached to the structure, attraction, or venue.

While there is value to the consumer in such innovations, these connected devices will not drive the data volumes, expenditures, or changes in everyday life that will emerge from industrial, military, civic, and business implementations.

The Internet(s) of [infrastructure] Things
Because so few of us see behind the scenes to understand how public water mains, jet engines, industrial gases, or even nuclear deterrence work, there is less intuitive ground to be captured by the people working on large-scale sensor networking. Yet these are the kinds of situations where networked instrumentation will find its broadest application, so it is important to dig into these domains.

In many cases, sensors are in place to make people (or automated systems) aware of exceptions: is the ranch gate open or closed? Is there a fire, or just an overheated wok? Is the pipeline leaking? Has anyone climbed the fence and entered a secure area? In many cases, a sensor could be in place for years and never note a condition that requires action. As the prices of sensors and their deployment drop, however, more and more of them can be deployed in this manner, if the risks to be detected are high enough. Thus one of the big questions in security -- in Bruce Schneier's insight, not "Does the security measure work?" but "Are the gains in security worth the costs?" -- gets difficult to answer: the costs of IP-based sensor networks are dropping rapidly, making cost-benefit-risk calculations a matter of moving targets.
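
Schneier's question reduces to an expected-value comparison, and a sketch shows why falling prices make it a moving target. Every number below is invented for illustration; the model ignores discounting, false alarms, and the cost of responding to alerts.

```python
def net_benefit(annual_incident_prob: float, loss_per_incident: float,
                detection_rate: float, install_cost: float,
                annual_operating_cost: float, years: int) -> float:
    """Expected losses avoided minus total cost of the sensor network."""
    avoided = annual_incident_prob * loss_per_incident * detection_rate * years
    cost = install_cost + annual_operating_cost * years
    return avoided - cost

# Hypothetical pipeline-leak scenario, evaluated at two hardware price points.
scenario = dict(annual_incident_prob=0.02, loss_per_incident=500_000,
                detection_rate=0.9, annual_operating_cost=2_000, years=10)

at_old_prices = net_benefit(install_cost=120_000, **scenario)
at_new_prices = net_benefit(install_cost=40_000, **scenario)

print(f"net benefit at old prices: {at_old_prices:,.0f}")
print(f"net benefit at new prices: {at_new_prices:,.0f}")
```

The risk has not changed between the two runs, only the hardware price, yet the answer flips from "not worth it" to "worth it." A calculation done two years ago may already be stale.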

In some ways, the Internet of Things business-to-business vision is a replay of the RFID wave of the mid-aughts. Late in 2003, Wal-Mart mandated that all suppliers would use radio-frequency tags on their incoming pallets (and sometimes cases) beginning with the top 100 suppliers, heavyweight consumer packaged goods companies like Unilever, Procter & Gamble, Gillette, Nabisco, and Johnson & Johnson. The payback to Wal-Mart was obvious: supply chain transparency. Rather than manually counting pallets in a warehouse or on a truck, radio-powered scanners could quickly determine inventory levels without workers having to get line-of-sight reads on every bar code. While the 2008 recession contributed to the scaled-back expectations, so too did two powerful forces: business logic, and physics.

To take the latter first, RFID turned out to be substantially easier in labs than in warehouses. RF coverage was rarely strong and uniform, particularly in retrofitted facilities. Noise -- in the form of everything from microwave ovens to portable phones to forklift-guidance systems -- made reader accuracy an issue. Warehouses involve lots of metal surfaces, some large and flat (bay doors and ramps), others heavy and in motion (forklifts and carts): all of these reflect radio signals, often problematically. Finally, the actual product being tagged changes radio performance: aluminum cans of soda, plastic bottles of water, and cases of tissue paper each introduce different performance effects. Given the speed of assembly lines and warehouse operations, any slowdowns or errors introduced by a new tracking system could be a showstopper.

The business logic issue played out away from the shop floor. Retail and CPG profit margins can be very thin, and the cost of the RFID tagging systems for manufacturers that had negotiated challenging pricing schedules with Wal-Mart was protested far and wide. The business case for total supply chain transparency was stronger for the end seller than for the suppliers, manufacturers, and truckers required to implement it for Wal-Mart's benefit. Given that the systems delivered little value to the companies implementing them, and given that the technology didn't work as advertised, the quiet recalibration of the project was inevitable.

RFID is still around. It is a great solution to fraud detection, and everything from sports memorabilia to dogs to ski lift tickets can be easily tested for authenticity. These are high-value items, some of them scanned no more than once or twice in a lifetime rather than thousands of times per hour, as on an assembly line. Database performance, industry-wide naming and sharing protocols, and multi-party security practices are much less of an issue.

While it's useful to recall the wave of hype for RFID circa 2005, the Internet of Things will be many things. The sensors, to take only one example, will be incredibly varied, as a rapidly growing online repository makes clear. Laboratory instruments are shifting to shared networking protocols rather than proprietary ones. This means it's quicker to set up or reconfigure an experiment, not that the lab tech can see the viscometer or Geiger counter from her smart phone or that the lab will "put the device on the Internet" like a webcam.

Every one of the billions of smartphones on the planet is regularly charged by its human operator, carries a powerful suite of sensors -- accelerometer, temperature sensor, still and video cameras/bar-code readers, microphone, GPS receiver -- and operates on multiple radio frequencies: Bluetooth, several cellular, WiFi. There are ample possibilities for crowdsourcing news coverage, fugitive hunting, global climate research (already, amateur birders help show differences in species' habitat choices), and more using this one platform.

Going forward, we will see more instrumentation of infrastructure, whether bridges, the power grid, water mains, dams, railroad tracks, or even sidewalks. While states and other authorities will gain visibility into security threats, potential outages, maintenance requirements, or usage patterns, it's already becoming clear that there will be multiple paths by which to come to the same insight. The state of Oregon was trying to enhance the experience of bicyclists, particularly commuters. While traffic counters for cars are well established, bicycle data is harder to gather. Rather than instrumenting bike paths and roadways, or paying a third party to do so, Oregon bought aggregated user data from Strava, a fitness-tracking smartphone app. While not every rider, particularly commuters, tracks his mileage, enough do that the bike-lane planners could see cyclist speeds and traffic volumes by time of day, identify choke points, and map previously untracked behaviors.

Strava was careful to anonymise user data, and in this instance, cyclists were the beneficiaries. Furthermore, cyclists compete on Strava and have joined with the expectation that their accomplishments can show up on leader boards. In many other scenarios, however, the Internet of Things' ability to "map previously untracked behaviors" will be problematic, for reasons we will discuss later.

Industrial scenarios
GE announced its Industrial Internet initiative in 2013. The goal is to instrument more and more of the company's capital goods -- jet engines are old news, but also locomotives, turbines, undersea drilling rigs, MRI machines, and other products -- with the goal of improving power consumption and reliability for existing units, and to improve the design of future products. Given how big the company's footprint is in these industrial markets, 1% improvements turn out to yield multi-billion-dollar opportunities. Of course, instrumenting the devices, while not trivial, is only the beginning: operational data must be analyzed, often using completely new statistical techniques, and then people must make decisions and put them into effect.

This holistic vision is far-sighted on GE's part and transcends the frequent technology-centric marketing messages that often characterize Silicon Valley rhetoric. That is, GE's end-to-end insistence on sensors AND software AND algorithms AND people is considerably more nuanced and realistic than, for example, Qualcomm's vision:

“the Internet of Everything (IoE) is changing our world, but its effect on daily life will be most profound. We will move through our days and nights surrounded by connectivity that intelligently responds to what we need and want—what we call the Digital Sixth Sense. Dynamic and intuitive, this experience will feel like a natural extension of our own abilities. We will be able to discover, accomplish and enjoy more. Qualcomm is creating the fabric of IoE for everyone everywhere to enable this Digital Sixth Sense.”

Not surprisingly, Cisco portrays the Internet of Things in similar terms; what Qualcomm calls "fabric" Cisco names "connectivity," appropriately for a networking company:
“These objects contain embedded technology to interact with internal states or the external environment. In other words, when objects can sense and communicate, it changes how and where decisions are made, and who makes them.

The IoT is connecting new places–such as manufacturing floors, energy grids, healthcare facilities, and transportation systems–to the Internet. When an object can represent itself digitally, it can be controlled from anywhere. This connectivity means more data, gathered from more places, with more ways to increase efficiency and improve safety and security.”

The other striking advantage of the GE approach is financial focus: 1% savings in a variety of industrial process areas yields legitimately huge cost savings opportunities. This approach has the simultaneous merits of being tangible, bounded, and motivational. Just 1% savings in aviation fuel over 15 years would generate more than $30 billion, for example.
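
It is worth running the 1% figure backwards to see what it presumes. The sketch below assumes flat annual spending and no discounting, neither of which the source specifies, and simply inverts savings = spend × rate × years.

```python
def cumulative_savings(annual_spend: float, savings_rate: float, years: int) -> float:
    """Total saved over `years`, assuming flat spending and no discounting."""
    return annual_spend * savings_rate * years

def implied_annual_spend(total_savings: float, savings_rate: float, years: int) -> float:
    """Invert the relationship to see what annual spend the claim presumes."""
    return total_savings / (savings_rate * years)

# Figures from the text: 1% savings, 15 years, $30 billion.
spend = implied_annual_spend(30e9, 0.01, 15)
print(f"implied annual fuel spend: ${spend / 1e9:,.0f} billion")
```

Under those simplifying assumptions, the claim implies an industry fuel bill on the order of $200 billion a year, which conveys why a single percentage point is worth chasing.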

But to get there, the GE vision is notably realistic about the many connected investments that must precede the harvesting of these benefits. 
    1) The technology doesn't exist yet. Sensors, instrumentation, and user interfaces need to be made more physically robust, usable by a global work force, and standardized to the appropriate degree.
    2) Information security has to protect assets that don't yet exist, containing value that has yet to be measured, from threats that have yet to materialize.
    3) Data literacy and related capabilities need to be cultivated in a global workforce that already has many skills shortfalls, language and cultural barriers, and competing educational agendas. Traditional engineering disciplines, computer science, and statistics will merge into new configurations.

Despite a lot of vague marketing rhetoric, the good news is that engineers, financial analysts, and others are recognizing the practical hurdles that have yet to be cleared. Among these are the following:

1) Power consumption 
If all of those billions of sensors require either hard-wired power or batteries, the toxic waste impact alone could be daunting. Add to this requirement the growing pressure of the electric-car industry on the worldwide battery supply, and the need for new power management, storage, and disposal approaches becomes clear.

2) Network engineering 
It's easy to point to all those sensors, each with its own IP address, and make comparisons to the original Internet. It's quite another matter, however, to make networks work when the sensor might "wake up" only once a day -- or once a month -- to report status. Other sensors, as we saw with jet engines, have the opposite effect, that of a firehose. Some kind of transitional device will likely emerge, either collecting infrequent heterogeneous "chirps" or consolidating, error-checking, compressing, and/or pre-processing heavy sensor volumes at the edge of a conventional network. Power management, security, and data integrity might also be in some of these devices' job description.
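What such a transitional device might do can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any shipping product: the EdgeGateway class, its batch size, and the message layout are all hypothetical, payloads are assumed to be short UTF-8 text, and CRC32 stands in for whatever error-checking a real device would use.

```python
import json
import zlib


class EdgeGateway:
    """Hypothetical edge device that error-checks and batches sensor chirps."""

    def __init__(self, batch_size=3):
        self.batch_size = batch_size
        self.pending = []

    def receive(self, sensor_id, payload, checksum):
        """Accept one chirp; return a compressed batch when one is ready."""
        if zlib.crc32(payload) != checksum:
            return None  # corrupted in transit, so drop it
        self.pending.append({"sensor": sensor_id, "data": payload.decode("utf-8")})
        if len(self.pending) >= self.batch_size:
            return self.flush()
        return None

    def flush(self):
        """Compress everything pending into a single uplink message."""
        if not self.pending:
            return None
        blob = zlib.compress(json.dumps(self.pending).encode("utf-8"))
        self.pending = []
        return blob


gateway = EdgeGateway(batch_size=2)
chirp = b"temp=21.5"
gateway.receive("mote-1", chirp, zlib.crc32(chirp))      # accepted, held
gateway.receive("mote-2", chirp, zlib.crc32(chirp) ^ 1)  # bad checksum, dropped
uplink = gateway.receive("mote-3", chirp, zlib.crc32(chirp))  # batch is full
print(json.loads(zlib.decompress(uplink)))
```

The same skeleton serves both extremes: a large batch size suits a sensor that chirps once a month, while a firehose source would call for filtering or summarizing before compression.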

3) Security 
As the Stuxnet worm illustrated, the Internet of Things will be attacked by both amateur and highly trained people writing a wide variety of exploits. Given that Internet security is already something of a contradiction in terms, and given widespread suspicion that the NSA has engineered back doors into U.S. firms' technology products, market opportunities for EU and other IoT vendors might increase. In any event, the challenge of making lightweight, distributed systems robustly secure without undue costs in selling price, operational overhead, interoperability, or performance has yet to be solved at a large scale. In 2014 the security firm Symantec reported that every exercise monitor it tested was insecure.

4) Data processing
The art and science of data fusion is far from standardized in fields that have been practicing it for decades. Context, for instance, is often essential for interpretation but difficult to guarantee during collection. Add to the mix humans as sensor platforms, intermittent and hybrid network connectivity, information security requirements outside a defense/intelligence cultural matrix, and unclear missions -- many organizations quite reasonably do not know why they are measuring what they are measuring until after they try to analyze the numbers -- and the path of readings off the sensors and into decision-making becomes complicated indeed.

5) Cost effectiveness 
The RFID experiment foundered in part on the price of the sensors, which even when measured in dimes became an issue when the volumes of items to be tracked ranged into the millions. With past hardware investments in memory, for example, still stinging some investors, the path to profitability for ultra-low-power, ultra-low-cost devices will be considerably different from the high-complexity, high-margin world that Intel so successfully mastered in the PC era.

6) Protocols 
The process by which the actual day-to-day workings of complex systems get negotiated makes for good business-school case studies, but challenging investment and decision-making. The USB standard, for example, had substantial industry "convening power" being exercised by Intel, and the benefits have been widely shared. For the IoT, it's less clear which companies will have a similar combination of engineering know-how, intellectual property (and a management mandate to form a profitless patent pool), industry fear and respect, and so on. As the VHS/Betamax, high-resolution audio CD, and high-resolution DVD standards wars have taught many people, it's highly undesirable to be stranded on the wrong side of an industry protocol. Hence, many players may sit out pending identifiable winners in the various standards negotiations.

7) APIs and middleware
The process by which device chirps become management insights requires multiple handoffs between sensors and PCs or other user devices. Relatively high up the stack are a variety of means by which processed, analyzed data can be connected to and queried by human decision makers, and so far, enterprise software vendors have yet to make a serious commitment to integrating these new kinds of data streams (or trickles, or floods) into management applications.

8) System management 
The IoT will need to generate usage logs, integrity checks, and all manner of tools for managing these new kinds of networks. Once again, data center and desktop PC systems management tools simply are not designed to handle tasks at this new level of granularity and scale. What will an audit of a network of "motes" look like? Who will conduct it? Who will require it?

Conclusion
As this note has hinted, the label "Internet of Things" could well steer thinking in unproductive directions. Looking at the World Wide Web as a prototype has many shortcomings: privacy, security, network engineering, human-in-the-loop advantages that may not carry over, and even the basic use case. At the same time, the habit of thinking of sensor networks in the same proprietary, single-purpose terms that have dictated generations of designs is also overdue for revision.

Beyond the level of the device, data processing is being faced with new challenges -- in both scope and kind -- as agencies, companies, and NGOs (to name but three interested parties) try to figure out how to handle billions of cellphone chirps, remote-control clicks, or GPS traces. What information can and should be collected? By what entity? With what safeguards? For how long? At what level of aggregation, anonymization, and detail? With devices and people opting in or opting out? Who can see what data at what stage in the analysis life cycle?

Once information is collected, the statistical and computer science disciplines are challenged to find patterns that are not coincidence, predictions that can be validated, and insights available in no other way. Numbers rarely speak for themselves, and the context for Internet of Things data is often difficult to obtain or manage given the wide variety of data types in play. The more inclusive the model, however, the more noise is introduced and must be managed. And the scale of this information is nearly impossible to fathom: according to IBM Chief Scientist Jeff Jonas, mobile devices in the United States alone generated 600 billion geo-tagged transactions every day -- as of 2010.

In addition to the basic design criteria, the privacy issues cannot be ignored. Here, the history of Google Glass might be instructive: whatever the benefits that accrue to the user, the rights of those being scanned, identified, recorded, or searched matter in ways that Google has yet to acknowledge. Magnify Glass to the city or nation-state level (recall that England has an estimated 6 million video cameras, but nobody knows exactly how many), as the NSA revelations appear to do, and it's clear that technological capability has far outrun the formal and informal rules that govern social life in civil society.

Early Indications October 2015: Of colleges, jobs, and analytics

It's funny how careers unfold. As a result of being in a particular place at a particular time, I find myself teaching analytics, supply-chain management, and digital strategy, mostly at the master's level. Not only did I not study any of these subjects in graduate school, none of these disciplines existed under its current name as recently as 20 years ago or so. What follows are some reflections on careers, skills, and patterns in education prompted by my latest adventures as well as some earlier ones.

1) What should I major in?
Across the globe, parents and students look at the cost of college, salary trends, layoffs, predilections, and aspirations, then take a deep breath and sign up for a major. I have seen this process unfold multiple times, and people sometimes miss some less obvious questions that are tough to address, but even more costly to ignore.

The seemingly relevant question, "what am I good at," is tough to answer with much certainty: we require students to declare a major before they've taken many (sometimes any) courses in it, and coursework and salaried work are of course two different things as well. While it's tempting to ask, "who's hiring," it's much harder to ask "where will there be good jobs in 20 years?" Very few Chief Information Officers in senior positions aspired to that title in college, mostly because it didn't exist. Now that CIOs are more common, it's unclear whether the title and skills will be as widely required once sensors, clouds, and algorithms improve over the next decade or two.

It's even more difficult to extrapolate what the new "hot" jobs will be. In the late 1990s, the U.S. Bureau of Labor Statistics encouraged students to go into desktop publishing, based on projected demand. In light of smartphones, social networks and "green" thinking, the demand for paper media never materialized; then tablets, e-readers, and wearables cut into demand still further. It's easy to say the Internet of Things or robotics will be more important in 20 years than they are today, but a) will near-term jobs have materialized when the student loan payments come due right after graduation, and b) are there enough relevant courses at a given institution? One cause of a nursing shortage that emerged about 15 years ago was a shortfall in the number of nursing professors: there were unfilled jobs, and eager students, but not enough capacity to train sufficient numbers of people to ease the hiring crunch.

2) English (or psychology, or fill in the blank) majors are toast

Many politicians are trying to encourage STEM career development in state universities and cite low earning potential for humanities graduates as a reason to cut funding to these fields. As Richard Thaler would say, it matters whether you make French majors pay a premium, or give chemical engineers a discount: the behavioral economics of these things are fascinating. The University of Florida led the way here about three years ago, but it's hard to tell how the experiment panned out.

At the same time, the respected venture investor Bill Janeway wrote a pointed piece in Forbes this summer, arguing that overcoming the friction in the atoms-to-bits-to-atoms business model (Uber being a prime example) demands not just coding or financial modeling, but something else:

"Unfortunately for those who believe we have entered a libertarian golden age, freed by digital technology from traditional constraints on market behavior, firms successful in disrupting the old physical economy will need to have as a core competency the ability to manage the political and cultural elements of the eco-systems in which they operate, as well as the purely economic ones. . . .

In short, the longer term, sustainable value of those disrupters that succeed in closing the loop from atoms to bits and back to atoms will depend as much on successful application of lessons from the humanities (history, moral philosophy) and the social sciences (the political economy and sociology of markets) as to mastery of the STEM disciplines."

http://www.forbes.com/sites/valleyvoices/2015/07/30/from-atoms-to-bits-to-atoms-friction-on-the-path-to-the-digital-future/

On the whole, as the need for such contrarian advice illustrates, we know little beyond the stereotypes of college majors. The half-life of technical skills is shrinking, so learning how to learn becomes important in building a career rather than merely landing an entry-level position. Evidence for the growing ability of computers and robots to replace humans is abundant: IBM agreed to buy the Weather Company's digital assets in part to feed the Watson AI engine, Uber wants robotic cars to replace its human drivers, and even skilled radiologists can be outperformed by algorithms. A paper by Carl Frey and Michael Osborne at Oxford convincingly rates most career fields by their propensity to be automated. It's a very illuminating, scary list (skip to the very end):

http://www.oxfordmartin.ox.ac.uk/downloads/academic/The_Future_of_Employment.pdf

To bet against one's own career, in effect short-selling an occupational field, requires insight, toughness, and luck. At the same time, the jobs that require human interaction, memory of historical precedent, and tactile skills will take longer to automate. Thus the liberal arts orientation toward teaching people how to think rather than how to be a teacher, accountant, or health-club trainer will win out, I believe. This is a long-term bet, to be sure, and in the interim, there will be unemployed Ivy Leaguers looking with some envy at their more vocationally focused state-school kin. Getting the timing right will be more luck than foresight.

3) What is analytics anyway?
As I've developed both a grad course and a workshop for a client in industry, I'm coming to understand this question differently. A long time ago, when I taught freshman composition, it took a few semesters to understand that while effective writing uses punctuation correctly, an expository writing (as it was called) course was an attempt to teach students how to think: to assess sources, to take a position, and to buttress an argument with evidence. All too frequently, however, colleges see the labor-intensive nature of freshman writing seminars as a cost to be cut, whether through using grad students, adjuncts, automation, or bigger section sizes. Each of these detracts from the close reading, personal attention, and rigorous exercises that neither scale well nor are done capably by many grad students or overworked adjuncts.

I'm seeing similar patterns in analytics. Once you get past the initial nomenclature, the two disciplines look remarkably similar: while courses are nominally about different things (words and numbers), each seeks to teach the skills related to assessing evidence, sustaining a point of view, and convincing a fair-minded audience with analysis and sourcing. To overstate, perhaps, analytics is no more a matter of statistics than writing is about grammar: each is a necessary but far from sufficient element of the larger enterprise. Numbers can be made to say pretty much whatever the author wants them to say, just as words can. In this context, the recent finding that only 39% of published research findings in psychology could be replicated stands as a cautionary tale. (https://www.washingtonpost.com/news/speaking-of-science/wp/2015/08/27/trouble-in-science-massive-effort-to-reproduce-100-experimental-results-succeeds-only-36-times/) Unfortunately, American numeracy -- quantitative literacy -- is extremely low, rendering millions of people incapable of managing businesses, households, and retirement portfolios. Being able to write sound academic research, meanwhile, looks to be even rarer than we've thought.

A paradox emerges: at the moment when computational capability is effectively free and infinite relative to an individual's needs, the skills required to deploy that power are highly unequally distributed, with little sign of improvement any time soon. How colleges teach, whom we teach, what we teach, and how it gets applied are all in tremendous upheaval: it's exciting, for sure, but the collateral damage is mounting (in the form of student loan defaults and low completion rates at for-profit colleges, to take just one example). Are we nearing a perfect storm of online learning, rapidly escalating demand for new skills, sticker shock or even buyer refusal to pay college tuition bills, abuses of student loan funding models, expensive and decaying physical infrastructure (much of it built in the higher-education boom of the 1960s), and demographics? Speaking of paradoxes, how soon will the insights of analytics -- discovering, interpreting, and manipulating data to inform better decisions -- take hold in this domain?