Monday, January 25, 2016

Early Indications November 2015: Broad thoughts on the Internet of Things



Current state

The notion of an Internet of Things is at once both old and new. From the earliest days of the World Wide Web, devices were connected so people could see the view out a window, traffic or ski conditions, a coffee pot at the University of Cambridge, or a Coke machine at Carnegie Mellon University. The more recent excitement dates to 2010 or thereabouts, and builds on a number of developments: many new Internet Protocol (IP) addresses have become available, the prices of sensors are dropping, new data and data-processing models are emerging to handle the scale of billions of device "chirps," and wireless bandwidth is becoming more widely available. At a deeper level, however, the same criteria -- sense, think, act -- that define a robot for many working in the field also characterize large-scale Internet of Things systems: they are essentially meta-robots, if you will. The GE Industrial Internet model discussed below includes sensors on all manner of industrial infrastructure, a data analytics platform, and humans to make presumably better decisions based on the massive numbers from the first domain crunched by algorithms and computational resources in the second.

Building Blocks
The current sensor landscape can be understood more clearly by contrasting it to the old state of affairs. Most important, sensor networks mimicked analog communications: radios couldn't display still pictures (or broadcast them), record players couldn't record video, newspapers could not facilitate two- or multi-way dialog in real time. For centuries, sensors of increasing precision and sophistication were invented to augment human senses: thermometers, telescopes, microscopes, ear trumpets, hearing aids, etc. With the 19th-century advances in electro-optics and electro-mechanical devices, new sensors could be developed to extend the human senses into different parts of the spectrum (e.g., infrared, radio frequencies, measurement of vibration, underwater acoustics, etc.).

Where they were available, electromechanical sensors and later sensor networks

*stood alone
*measured one and only one thing
*cost a lot to develop and implement
*had inflexible architectures: they did not adapt well to changing circumstances.

Sensors traditionally stood alone because networking them together was expensive and difficult. Given the lack of shared technical standards, to build a network of offshore data buoys for example, the interconnection techniques and protocols would be uniquely engineered to a particular domain, in this case, salt water, heavy waves, known portions of the electromagnetic spectrum, and so on. An agency seeking to connect sensors of a different sort (such as surveillance cameras) would have to start from scratch, as would a third agency monitoring road traffic.

In part because of their mechanical componentry, sensors rarely measured across multiple yardsticks. Oven thermometers measured only oven temperature, and displayed the information locally, if at all (given that perhaps a majority of sensor traffic informs systems rather than persons, the oven temperature might only drive the thermostat rather than a human-readable display). Electric meters only counted watt-hours in aggregate. Fast forward to today: a consumer Global Positioning System (GPS) unit or smartphone will tell location, altitude, compass heading, and temperature, along with providing weather radio.

Electromechanical sensors were not usually mass-produced, with the exception of common items such as thermometers. Because supply was limited, particularly for specialized designs, the combination of monopoly supply and small order quantities kept prices high.

The rigid architecture was a function of mechanical devices’ specificity. A vibration sensor was different from a camera was different from a humidistat. Humidity data, in turn, was designed to be moved and managed in a particular analog domain (a range of zero to 100 per cent), while image recognition in the camera’s information chain typically featured extensive use of human eyes rather than automated processing.

Ubiquity
Changes in each of these facets combine to help create today’s emerging sensor networks, which are growing in scope and capability every year. The many examples of sensor capability accessible to (or surveilling) the everyday citizen illustrate the limits of the former regime: today there are more sensors recording more data to be accessed by more end points. Furthermore, the traffic increasingly originates and transits exclusively in the digital domain.

*Computers, which sense their own temperature, location, user patterns, number of printer pages generated, etc.
*Thermostats, which are networked within buildings and now remotely controlled and readable
*Telephones, the wireless variety of which can be understood as beacons, bar-code scanners, pattern-matchers (the Shazam application names songs from a brief audio sample), and network nodes
*Motor and other industrial controllers: many cars no longer have mechanical throttle linkages, so people step on a sensor every day without thinking as they drive by wire. Automated tire-pressure monitoring is also standard on many new cars. Airbags rely on a sophisticated system of accelerometers and high-speed actuators to deploy the proper response to a collision involving a small child versus a lamp strapped into the front passenger seat.
*Vehicles: the OBD II diagnostics module, the toll pass, satellite devices on heavy trucks, and theft recovery services such as Lojack, not to mention the inevitable mobile phone, make vehicle tracking both powerful and relatively painless
*Surveillance cameras (of which there are over 10,000 in Chicago alone, and more than 500,000 in London)
*Most hotel door handles and many minibars are instrumented and generate electronic records of people’s and vodka bottles’ comings and goings.
*Sensors, whether embedded in animals (RFID chips in both household pets and race horses) or gardens (the EasyBloom plant moisture sensor connects to a computer via USB and costs only $50), or affixed to pharmaceutical packaging.

Note the migration from heavily capital-intensive or national-security applications down-market. A company called Vitality has developed a pill-bottle monitoring system: if the cap is not removed when medicine is due, an audible alert is triggered, or a text message could be sent.
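
The reminder logic described here is simple enough to sketch in a few lines of Python. This is an illustration only, not Vitality's implementation: the function name, the 30-minute grace window, and the two-hour escalation from an audible alert to a text message are all assumptions made for the example.

```python
from datetime import datetime, timedelta

def reminder_action(dose_due, last_cap_removal, now,
                    grace=timedelta(minutes=30),
                    escalate_after=timedelta(hours=2)):
    """Decide what a monitored pill bottle should do right now.

    grace and escalate_after are hypothetical tuning values.
    """
    # Treat the dose as taken if the cap came off shortly before or after it was due.
    taken = last_cap_removal is not None and last_cap_removal >= dose_due - grace
    if taken or now < dose_due:
        return "none"
    # Escalate from a local alert to a text message if the dose stays missed.
    if now - dose_due >= escalate_after:
        return "text_message"
    return "audible_alert"

due = datetime(2016, 1, 25, 8, 0)
print(reminder_action(due, None, datetime(2016, 1, 25, 8, 10)))
print(reminder_action(due, None, datetime(2016, 1, 25, 10, 30)))
```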

A relatively innovative industrial deployment of vibration sensors illustrates the state of the traditional field. In 2006, BP instrumented an oil tanker with "motes," which integrated a processor, solid-state memory, a radio, and an input/output board on a single 2" square chip. Each mote could receive vibration data from up to ten accelerometers, which were mounted on pumps and motors in the ship’s engine room. The goal was to determine if vibration data could predict mechanical failure, thus turning estimates—a motor teardown every 2,000 hours, to take a hypothetical example—into concrete evidence of an impending need for service.
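
To make the contrast between scheduled and evidence-based maintenance concrete, here is a minimal Python sketch of one common approach: compare the root-mean-square (RMS) vibration of a machine against its own healthy baseline. The source does not say what analysis BP actually used, so the RMS measure, the function names, and the 2x threshold are assumptions for illustration.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a window of accelerometer samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def needs_service(samples, baseline_rms, ratio_limit=2.0):
    """Flag a pump or motor whose vibration energy is well above its baseline.

    ratio_limit is a hypothetical tuning value, not a figure from the BP trial.
    """
    return rms(samples) > ratio_limit * baseline_rms

healthy = [0.1, -0.1, 0.1, -0.1]   # made-up readings from a machine in good order
worn = [0.5, -0.5, 0.5, -0.5]      # made-up readings from a machine vibrating harder
baseline = rms(healthy)
print(needs_service(healthy, baseline))  # False
print(needs_service(worn, baseline))     # True
```

A rule like this replaces "tear down every 2,000 hours" with "tear down when the readings say so," which is the shift the trial was meant to test.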

The motes had a decided advantage over traditional sensor deployments in that they operated over wireless spectrum. While this introduced engineering challenges arising from the steel environment as well as the need for batteries and associated issues (such as lithium’s being a hazardous material), the motes and their associated sensors were much more flexible and cost-effective to implement compared to hard-wired solutions. The motes also communicated with each other in a mesh topology: each mote looked for nearby motes, which then served as repeaters en route to the data’s ultimate destination. Mesh networks are usually dynamic: if a mote fails, signal is routed to other nearby devices, making the system fault-tolerant in a harsh environment. Finally, the motes could perform signal processing on the chip, reducing the volume of data that had to be transmitted to the computer where analysis and predictive modeling were conducted. This blurring of the lines between sensing, processing, and networking elements is occurring in many other domains as well.
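
The fault tolerance of a mesh can be illustrated with a short Python sketch. It is a centralized simplification: real motes discover neighbors and choose routes with distributed protocols, whereas this example runs a breadth-first search over a map of which motes can hear each other. The mote names and the layout are hypothetical.

```python
from collections import deque

def find_route(links, source, sink, failed=frozenset()):
    """Breadth-first search for the fewest-hop path that avoids failed motes."""
    if source in failed or sink in failed:
        return None
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == sink:
            return path
        for neighbor in links.get(node, ()):
            if neighbor not in visited and neighbor not in failed:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no surviving path to the sink

# Hypothetical engine-room layout: which motes are within radio range of each other.
links = {
    "pump_mote": ["relay_a", "relay_b"],
    "relay_a": ["pump_mote", "gateway"],
    "relay_b": ["pump_mote", "relay_c"],
    "relay_c": ["relay_b", "gateway"],
    "gateway": ["relay_a", "relay_c"],
}
print(find_route(links, "pump_mote", "gateway"))
print(find_route(links, "pump_mote", "gateway", failed={"relay_a"}))
```

With every mote working, the data takes the two-hop path through relay_a; when relay_a fails, the search falls back to the longer path through relay_b and relay_c.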

All told, there are tens of billions of items that can connect and combine in new ways. The Internet has become a common ground for many of these devices, enabling multiple sensor feeds -- traffic camera, temperature, weather map, social media reports, for example -- to combine into more useful, and usable, applications. Hence the intuitive appeal of "the Internet of Things." As we saw earlier, network effects and positive feedback loops mean that considerable momentum can develop as more and more instances converge on shared standards. While we will not discuss them in detail here, it can be helpful to think of three categories of sensor interaction:

*Sensor to people: the thermostat at the ski house tells the occupants that the furnace is broken the day before they arrive, or a dashboard light alerts the driver that the tire pressure on their car is low
*Sensor to sensor: the rain sensor in the automobile windshield alerts the antilock brakes of wet road conditions and the need for different traction-control algorithms
*Sensor to computer/aggregator: dozens of cell phones on a freeway can serve as beacons for a traffic-notification site, at much lower cost than helicopters or "smart highways."
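
The third category, sensor to aggregator, can be sketched in a few lines of Python. This is an illustration rather than a description of any real traffic service: the segment names, the speed reports, and the three-report minimum are made up for the example.

```python
from collections import defaultdict
from statistics import median

def segment_speeds(reports, min_reports=3):
    """Median reported speed per road segment, skipping thinly covered segments."""
    by_segment = defaultdict(list)
    for segment, speed_mph in reports:
        by_segment[segment].append(speed_mph)
    return {
        segment: median(speeds)
        for segment, speeds in by_segment.items()
        if len(speeds) >= min_reports
    }

# Hypothetical (segment, speed) pings from phones on a freeway.
reports = [
    ("mile 12", 62), ("mile 12", 58), ("mile 12", 65),
    ("mile 13", 12), ("mile 13", 9), ("mile 13", 15),
    ("mile 14", 70),  # a single phone is too little evidence to publish
]
print(segment_speeds(reports))
```

The median is used instead of the mean so that one phone in a parked car or on a passing train does not distort the estimate for a segment.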

An "Internet of Things" is an attractive phrase that at once both conveys expansive possibility and glosses over substantial technical challenges. Given 20+ years of experience with the World Wide Web, people have long experience with hyperlinks, reliable inter-network connections, search engines to navigate documents, and wi-fi access everywhere from McDonald's to mid-Atlantic in flight. None of these essential pieces of scaffolding has an analog in the Internet of Things, however: garage-door openers and moisture sensors aren't able to read; naming, numbering, and navigation conventions do not yet exist; low-power networking standards are still unsettled; and radio-frequency issues remain problematic. In short, as we will see, "the Internet" may not be the best metaphor for the coming stage of device-to-device communications, whatever its potential utility.

Beyond the Web metaphor
Given that "the Internet" as most people experience it is global, searchable, and anchored by content or, increasingly, social connections, the "Internet of Things" will in many ways be precisely the opposite. Having smartphone access to my house's thermostat is a private transaction, highly localized and preferably NOT searchable by anyone else. While sensors will generate volumes of data that are impossible for most humans to comprehend, that data is not content of the sort that Google indexed as the foundation of its advertising-driven business. Thus while an "Internet of Things" may feel like a transition from a known world to a new one, the actual benefits of networked devices separate from people will probably be more foreign than saying "I can connect to my appliances remotely."

Consumer applications
The notion of networked sensors and actuators can usefully be subdivided into industrial, military/security, or business-to-business versus consumer categories. Let us consider the latter first. Using the smartphone or a web browser, it is already possible to remotely control and/or monitor a number of household items:

•    slow cooker
•    garage-door opener
•    blood-pressure cuff
•    exercise tracker (by mileage, heart rate, elevation gain, etc.)
•    bathroom scale
•    thermostat
•    home security system
•    smoke detector
•    television
•    refrigerator.

These devices fall into some readily identifiable categories: personal health and fitness, household security and operations, entertainment. While the data logging of body weight, blood pressure, and caloric expenditures would seem to be highly relevant to overall physical wellness, few physicians, personal trainers, or health insurance companies have built business processes to manage the collection, security, or analysis of these measurements. Approaches to privacy, liability, and information overload, and, perhaps most centrally, outcome-predicting algorithms, have yet to be developed or codified. If I send a signal to my physician indicating a physical abnormality, she could bear legal liability if her practice does not act on the signal and I subsequently suffer a medical event that could have been predicted or prevented.

People are gradually becoming more aware of the digital "bread crumbs" our devices leave behind. Progressive Insurance's Snapshot campaign has had good response to a sensor that tracks driving behavior as the basis for rate-setting: drivers who drive frequently, or brake especially hard, or drive a lot at night, or whatever could be judged worse risks and be charged higher rates. Daytime or infrequent drivers, those with a light pedal, or people who religiously buckle seat belts might get better rates. This example, however, illustrates some of the drawbacks of networked sensors: few sensors can account for all potentially causal factors. Snapshot doesn't know how many people are in the car (a major accident factor for teenage drivers), if the radio is playing, if the driver is texting, or when alcohol might be impairing the driver's judgment. Geographic factors are delicate: some intersections have high rates of fraudulent claims, but the history of racial redlining is also still a sensitive topic, so data that might be sufficiently predictive (ZIP codes traversed) might not be used out of fear it could be abused.

The "smart car" applications excepted, most of the personal Internet of Things use cases are to date essentially remote controls or intuitively useful data collection plays. One notable exception lies in pattern-recognition engines that are grouped under the heading of "augmented reality." Whether on a smartphone/tablet or through special headsets such as Google Glass, a person can see both the physical world and an information overlay. This could be a real-time translation of a road sign in a foreign country, a direction-finding aid, or a tourist application: look through the device at the Eiffel Tower and see how tall it is, when it was built, how long the queue is to go to the top, or any other information that could be attached to the structure, attraction, or venue.

While there is value to the consumer in such innovations, these connected devices will not drive the data volumes, expenditures, or changes in everyday life that will emerge from industrial, military, civic, and business implementations.

The Internet(s) of [infrastructure] Things
Because so few of us see behind the scenes to understand how public water mains, jet engines, industrial gases, or even nuclear deterrence work, there is less intuitive ground to be captured by the people working on large-scale sensor networking. Yet these are the kinds of situations where networked instrumentation will find its broadest application, so it is important to dig into these domains.

In many cases, sensors are in place to make people (or automated systems) aware of exceptions: is the ranch gate open or closed? Is there a fire, or just an overheated wok? Is the pipeline leaking? Has anyone climbed the fence and entered a secure area? In many cases, a sensor could be in place for years and never note a condition that requires action. As the prices of sensors and their deployment drop, however, more and more of them can be deployed in this manner, if the risks to be detected are high enough. Thus one of the big questions in security -- in Bruce Schneier's insight, not "Does the security measure work?" but "Are the gains in security worth the costs?" -- gets difficult to answer: the costs of IP-based sensor networks are dropping rapidly, making cost-benefit-risk calculations a matter of moving targets.
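
One way to see why falling sensor prices turn this into a moving target is to write the cost-benefit question down as arithmetic. The Python sketch below is a deliberately crude model, not Schneier's: it weighs the expected annual loss avoided against the annualized cost of the sensors, and every parameter name and number in it is an assumption for illustration.

```python
def net_annual_benefit(incident_prob, incident_cost, risk_reduction,
                       sensor_cost, lifetime_years, upkeep_per_year):
    """Expected yearly loss avoided minus the yearly cost of the sensor network.

    All inputs are hypothetical estimates; in practice each is uncertain.
    """
    avoided_loss = incident_prob * incident_cost * risk_reduction
    annual_cost = sensor_cost / lifetime_years + upkeep_per_year
    return avoided_loss - annual_cost

# Same risk, two price points for the hardware.
expensive = net_annual_benefit(0.02, 500_000, 0.5, 30_000, 5, 1_000)
cheap = net_annual_benefit(0.02, 500_000, 0.5, 10_000, 5, 1_000)
print(expensive < 0)  # at the higher price the deployment costs more than it saves
print(cheap > 0)      # at the lower price the same deployment pays for itself
```

Nothing about the threat changes between the two cases; only the price does, which is why an answer computed one year may not hold the next.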

In some ways, the Internet of Things business-to-business vision is a replay of the RFID wave of the mid-aughts. Late in 2003, Wal-Mart mandated that all suppliers would use radio-frequency tags on their incoming pallets (and sometimes cases) beginning with the top 100 suppliers, heavyweight consumer packaged goods companies like Unilever, Procter & Gamble, Gillette, Nabisco, and Johnson & Johnson. The payback to Wal-Mart was obvious: supply chain transparency. Rather than manually counting pallets in a warehouse or on a truck, radio-powered scanners could quickly determine inventory levels without workers having to get line-of-sight reads on every bar code. While the 2008 recession contributed to the scaled-back expectations, so too did two powerful forces: business logic, and physics.

To take the latter first, RFID turned out to be substantially easier in labs than in warehouses. RF coverage was rarely strong and uniform, particularly in retrofitted facilities. Noise -- in the form of everything from microwave ovens to portable phones to forklift-guidance systems -- made reader accuracy an issue. Warehouses involve lots of metal surfaces, some large and flat (bay doors and ramps), others heavy and in motion (forklifts and carts): all of these reflect radio signals, often problematically. Finally, the actual product being tagged changes radio performance: aluminum cans of soda, plastic bottles of water, and cases of tissue paper each introduce different performance effects. Given the speed of assembly lines and warehouse operations, any slowdowns or errors introduced by a new tracking system could be a showstopper.

The business logic issue played out away from the shop floor. Retail and CPG profit margins can be very thin, and the cost of the RFID tagging systems for manufacturers that had negotiated challenging pricing schedules with Wal-Mart was protested far and wide. The business case for total supply chain transparency was stronger for the end seller than for the suppliers, manufacturers, and truckers required to implement it for Wal-Mart's benefit. Given that the systems delivered little value to the companies implementing them, and given that the technology didn't work as advertised, the quiet recalibration of the project was inevitable.

RFID is still around. It is a great solution to fraud detection, and everything from sports memorabilia to dogs to ski lift tickets can be easily tested for authenticity. These are high-value items, some of them scanned no more than once or twice in a lifetime rather than thousands of times per hour, as on an assembly line. Database performance, industry-wide naming and sharing protocols, and multi-party security practices are much less of an issue. 

While it's useful to recall the wave of hype for RFID circa 2005, the Internet of Things will be many things. The sensors, to take only one example, will be incredibly varied, as a rapidly growing online repository makes clear. Laboratory instruments are shifting to shared networking protocols rather than proprietary ones. This means it's quicker to set up or reconfigure an experiment, not that the lab tech can see the viscometer or Geiger counter from her smart phone or that the lab will "put the device on the Internet" like a webcam.

Every one of the billions of smartphones on the planet is regularly charged by its human operator, carries a powerful suite of sensors -- accelerometer, temperature sensor, still and video cameras/bar-code readers, microphone, GPS receiver -- and operates on multiple radio frequencies: Bluetooth, several cellular, WiFi. There are ample possibilities for crowdsourcing news coverage, fugitive hunting, global climate research (already, amateur birders help show differences in species' habitat choices), and more using this one platform.

Going forward, we will see more instrumentation of infrastructure, whether bridges, the power grid, water mains, dams, railroad tracks, or even sidewalks. While states and other authorities will gain visibility into security threats, potential outages, maintenance requirements, or usage patterns, it's already becoming clear that there will be multiple paths by which to come to the same insight. The state of Oregon was trying to enhance the experience of bicyclists, particularly commuters. While traffic counters for cars are well established, bicycle data is harder to gather. Rather than instrumenting bike paths and roadways, or paying a third party to do so, Oregon bought aggregated user data from Strava, a fitness-tracking smartphone app. While not every rider, particularly commuters, tracks his mileage, enough do that the bike-lane planners could see cyclist speeds and traffic volumes by time of day, identify choke points, and map previously untracked behaviors.

Strava was careful to anonymize user data, and in this instance, cyclists were the beneficiaries. Furthermore, cyclists compete on Strava and have joined with the expectation that their accomplishments can show up on leader boards. In many other scenarios, however, the Internet of Things' ability to "map previously untracked behaviors" will be problematic, for reasons we will discuss later.
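
As a rough illustration of what aggregated, anonymized data can look like, the Python sketch below counts distinct riders per road segment and hour, then drops any cell with too few riders to hide an individual. This is not Strava's actual method, which the source does not describe; the field layout and the five-rider minimum are assumptions.

```python
from collections import defaultdict

def riders_by_cell(rides, min_riders=5):
    """Count distinct riders per (segment, hour), suppressing small cells."""
    riders = defaultdict(set)
    for rider_id, segment, hour in rides:
        riders[(segment, hour)].add(rider_id)
    return {
        cell: len(ids)
        for cell, ids in riders.items()
        if len(ids) >= min_riders
    }

# Hypothetical ride records: (rider id, segment, hour of day).
rides = [(f"rider{i}", "Main St", 8) for i in range(6)]
rides += [("rider1", "Main St", 8)]                    # a repeat trip by the same rider
rides += [("rider7", "Elm St", 23), ("rider8", "Elm St", 23)]
print(riders_by_cell(rides))
```

The busy commuter segment survives as a count with no rider identifiers attached, while the late-night cell with only two riders is withheld because it could point to specific people.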

Industrial scenarios
GE announced its Industrial Internet initiative in 2013. The plan is to instrument more and more of the company's capital goods -- jet engines are old news, but also locomotives, turbines, undersea drilling rigs, MRI machines, and other products -- in order to improve power consumption and reliability for existing units and to improve the design of future products. Given how big the company's footprint is in these industrial markets, 1% improvements turn out to yield multi-billion-dollar opportunities. Of course, instrumenting the devices, while not trivial, is only the beginning: operational data must be analyzed, often using completely new statistical techniques, and then people must make decisions and put them into effect.

This holistic vision is far-sighted on GE's part and transcends the technology-centric marketing messages that often characterize Silicon Valley rhetoric. That is, GE's end-to-end insistence on sensors AND software AND algorithms AND people is considerably more nuanced and realistic than, for example, Qualcomm's vision:

“the Internet of Everything (IoE) is changing our world, but its effect on daily life will be most profound. We will move through our days and nights surrounded by connectivity that intelligently responds to what we need and want—what we call the Digital Sixth Sense. Dynamic and intuitive, this experience will feel like a natural extension of our own abilities. We will be able to discover, accomplish and enjoy more. Qualcomm is creating the fabric of IoE for everyone everywhere to enable this Digital Sixth Sense.”

Not surprisingly, Cisco portrays the Internet of Things in similar terms; what Qualcomm calls "fabric" Cisco names "connectivity," appropriately for a networking company:
“These objects contain embedded technology to interact with internal states or the external environment. In other words, when objects can sense and communicate, it changes how and where decisions are made, and who makes them.

The IoT is connecting new places–such as manufacturing floors, energy grids, healthcare facilities, and transportation systems–to the Internet. When an object can represent itself digitally, it can be controlled from anywhere. This connectivity means more data, gathered from more places, with more ways to increase efficiency and improve safety and security.”

The other striking advantage of the GE approach is financial focus: 1% savings in a variety of industrial process areas yields legitimately huge cost savings opportunities. This approach has the simultaneous merits of being tangible, bounded, and motivational. Just 1% savings in aviation fuel over 15 years would generate more than $30 billion, for example.
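
The scale behind that aviation figure is easy to check with back-of-the-envelope arithmetic. The Python sketch below works backward from the quoted numbers to the fuel spending they imply, under two simplifying assumptions of mine: spending stays flat from year to year, and future savings are not discounted.

```python
def implied_annual_spend(total_savings, savings_rate, years):
    """Annual spend implied by a cumulative saving, assuming flat spending and no discounting."""
    return total_savings / (savings_rate * years)

# Figures quoted above: $30 billion saved by a 1% improvement over 15 years.
fuel_spend = implied_annual_spend(30e9, 0.01, 15)
print(f"Implied annual fuel spend: ${fuel_spend / 1e9:,.0f} billion")
```

In other words, $30 billion over 15 years is $2 billion a year, which is 1% of roughly $200 billion in annual fuel spending. The size of the base, not the size of the improvement, is what makes the number large.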

But to get there, the GE vision is notably realistic about the many connected investments that must precede the harvesting of these benefits.

    1) The technology doesn't exist yet. Sensors, instrumentation, and user interfaces need to be made more physically robust, usable by a global work force, and standardized to the appropriate degree.
    2) Information security has to protect assets that don't yet exist, containing value that has yet to be measured, from threats that have yet to materialize.
    3) Data literacy and related capabilities need to be cultivated in a global workforce that already has many skills shortfalls, language and cultural barriers, and competing educational agendas. Traditional engineering disciplines, computer science, and statistics will merge into new configurations.

Despite a lot of vague marketing rhetoric, the good news is that engineers, financial analysts, and others are recognizing the practical hurdles that have yet to be cleared. Among these are the following:

1) Power consumption

If all of those billions of sensors require either hard-wired power or batteries, the toxic waste impact alone could be daunting. Add to this requirement the growing pressure of the electric-car industry on the worldwide battery supply, and the need for new power management, storage, and disposal approaches becomes clear.

2) Network engineering

It's easy to point to all those sensors, each with its own IP address, and make comparisons to the original Internet. It's quite another matter, however, to make networks work when the sensor might "wake up" only once a day -- or once a month -- to report status. Other sensors, as we saw with jet engines, have the opposite effect, that of a firehose. Some kind of transitional device will likely emerge, either collecting infrequent heterogeneous "chirps" or consolidating, error-checking, compressing, and/or pre-processing heavy sensor volumes at the edge of a conventional network. Power management, security, and data integrity might also be in some of these devices' job description.
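
A transitional device of this kind might look something like the following Python sketch. It is a toy under stated assumptions, not a description of any shipping product: the EdgeGateway class, its method names, and the choice of CRC-32 for error checking and zlib for compression are all mine.

```python
import json
import zlib

class EdgeGateway:
    """Collects infrequent sensor chirps, rejects corrupted ones, and forwards compressed batches."""

    def __init__(self):
        self.pending = []
        self.rejected = 0

    def receive(self, sensor_id, payload: bytes, crc: int) -> bool:
        # Error-check: drop the chirp if its checksum does not match its payload.
        if zlib.crc32(payload) != crc:
            self.rejected += 1
            return False
        self.pending.append({"sensor": sensor_id, "data": payload.decode("utf-8")})
        return True

    def flush(self) -> bytes:
        # Consolidate everything received so far into one compressed upload.
        batch = json.dumps(self.pending).encode("utf-8")
        self.pending = []
        return zlib.compress(batch)

gateway = EdgeGateway()
reading = b"moisture=41"
gateway.receive("garden-1", reading, zlib.crc32(reading))   # accepted
gateway.receive("garden-2", b"moisture=9", 12345)            # checksum mismatch, rejected
upload = gateway.flush()
print(json.loads(zlib.decompress(upload)))
```

The conventional network only ever sees one well-formed, compressed batch from the gateway, no matter how irregular or unreliable the individual chirps behind it were.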

3) Security

As the Stuxnet worm illustrated, the Internet of Things will be attacked by both amateur and highly trained people writing a wide variety of exploits. Given that Internet security is already something of a contradiction in terms, and given widespread suspicion that the NSA has engineered back doors into U.S. firms' technology products, market opportunities for EU and other IoT vendors might increase as a result. In any event, the challenge of making lightweight, distributed systems robustly secure without undue costs in selling price, operational overhead, interoperability, or performance has yet to be solved at a large scale. In 2014 the security firm Symantec announced that all exercise monitors tested were found to be insecure.

4) Data processing
The art and science of data fusion is far from standardized in fields that have been practicing it for decades. Context, for instance, is often essential for interpretation but difficult to guarantee during collection. Add to the mix humans as sensor platforms, intermittent and hybrid network connectivity, information security requirements outside a defense/intelligence cultural matrix, and unclear missions -- many organizations quite reasonably do not know why they are measuring what they are measuring until after they try to analyze the numbers -- and the path of readings off the sensors and into decision-making becomes complicated indeed.

5) Cost effectiveness

The RFID experiment foundered in part on the price of the sensors, which even when measured in dimes became an issue when the volumes of items to be tracked ranged into the millions. With past hardware investments in memory, for example, still stinging some investors, the path to profitability for ultra-low-power, ultra-low-cost devices will be considerably different from the high-complexity, high-margin world that Intel so successfully mastered in the PC era.

6) Protocols

The process by which the actual day-to-day workings of complex systems get negotiated makes for good business-school case studies, but challenging investment and decision-making. The USB standard, for example, had substantial industry "convening power" being exercised by Intel, and the benefits have been widely shared. For the IoT, it's less clear which companies will have a similar combination of engineering know-how, intellectual property (and a management mandate to form a profitless patent pool), industry fear and respect, and so on. As the VHS/Betamax, high-resolution audio CD, and high-resolution DVD standards wars have taught many people, it's highly undesirable to be stranded on the wrong side of an industry protocol. Hence, many players may sit out pending identifiable winners in the various standards negotiations.

7) APIs and middleware
The process by which device chirps become management insights requires multiple handoffs between sensors and PCs or other user devices. Relatively high up the stack are a variety of means by which processed, analyzed data can be connected to and queried by human decision makers, and so far, enterprise software vendors have yet to make a serious commitment to integrating these new kinds of data streams (or trickles, or floods) into management applications.

8) System management

The IoT will need to generate usage logs, integrity checks, and all manner of tools for managing these new kinds of networks. Once again, data center and desktop PC systems management tools simply are not designed to handle tasks at this new level of granularity and scale. What will an audit of a network of "motes" look like? Who will conduct it? Who will require it?

Conclusion
As this note has hinted, the label "Internet of Things" could well steer thinking in unproductive directions. Looking at the World Wide Web as a prototype has many shortcomings: privacy, security, network engineering, human-in-the-loop advantages that may not carry over, and even the basic use case. At the same time, the habit of thinking of sensor networks in the same proprietary, single-purpose terms that have dictated generations of designs is also overdue for retirement.

Beyond the level of the device, data processing is being faced with new challenges -- in both scope and kind -- as agencies, companies, and NGOs (to name but three interested parties) try to figure out how to handle billions of cellphone chirps, remote-control clicks, or GPS traces. What information can and should be collected? By what entity? With what safeguards? For how long? At what level of aggregation, anonymization, and detail? With devices and people opting in or opting out? Who can see what data at what stage in the analysis life cycle?

Once information is collected, the statistical and computer science disciplines are challenged to find patterns that are not coincidence, predictions that can be validated, and insights available in no other way. Numbers rarely speak for themselves, and the context for Internet of Things data is often difficult to obtain or manage given the wide variety of data types in play. The more inclusive the model, however, the more noise is introduced and must be managed. And the scale of this information is nearly impossible to fathom: according to IBM Chief Scientist Jeff Jonas, mobile devices in the United States alone generated 600 billion geo-tagged transactions every day -- as of 2010.

In addition to the basic design criteria, the privacy issues cannot be ignored. Here, the history of Google Glass might be instructive: whatever the benefits that accrue to the user, the rights of those being scanned, identified, recorded, or searched matter in ways that Google has yet to acknowledge. Magnify Glass to the city or nation-state level (recall that England has an estimated 6 million video cameras, but nobody knows exactly how many), as the NSA revelations appear to do, and it's clear that technological capability has far outrun the formal and informal rules that govern social life in civil society.