Monday, June 01, 2009

May 2009 Early Indications: Clouded Over

The proliferation of so-called cloud computing platforms has been rapid. Because there is so much material available that defines the phenomenon, we'll move here to an examination of some of the unexpected consequences and complicated implications of moving some or all of a computing environment to offsite, third-party environments.

To get the problematic and inevitable definitional question out of the way, here is one from InformationWeek's John Foley: "Cloud computing is on-demand access to virtualized IT resources that are housed outside of your own data center, shared by others, simple to use, paid for via subscription, and accessed over the Web."

There are of course other contending definitions, but Foley's is mercifully brief. Even so, it leaves open the questions of private clouds, how small a cloud can be before it starts being something else, and how individual uses of clouds (I don't own a data center but hit on most of the other conditions) differ from and overlap with corporate ones. It does get us started in more or less the right direction.

First, here are some resources to get up to speed:

The Economist special report from last October

Accenture Cloud homepage

Amazon Elastic Compute Cloud

Google App Engine

HP Cloud Assure

IBM Cloud Computing

Microsoft Azure platform

Rather than handicap the vendors, or the vendors' definitions, I'd like to focus a bit farther out. In a series of conversations with our corporate and university advisors, a number of questions have surfaced. In particular, I'm building on remarks by Messrs. Smith and Parkinson at our member meeting earlier this month.

1) What is a vendor's profit path? What can be differentiated and thus generate margins? Compared to the conventional model of data centers, which is often measured in $10,000 or $100,000 units, cloud computing usage at Amazon is measured in dimes.

2) How will incumbents respond? If I have an established business selling hardware as capital expenditure, and a competing model shifts MIPS to an operating-expense model, presumably I don't stand still. Oracle's plans for Sun will be relevant here.

3) How does cloud lock-in vary from existing software (a la classical Microsoft) or hardware (the vintage IBM model) variants?

4) As with so much of the world's infrastructure, what is the incentive to invest in "pipes" when the value-add lies elsewhere, or nowhere?

5) If for legal or other reasons I need performance, security, and/or reliability guarantees, how do I get them if I cannot see or physically access my assets?

6) There are no free lunches. Every one of the Web's elite destinations has suffered from major outages at some point. Just weeks ago, Google suffered a technical breakdown about which the company released few particulars, but it managed to slow down service to millions (or more) of users for several hours on May 14. Gmail also failed at scale in February. In light of that history, what does a fault-tolerant cloud environment look like, require, and cost? (For an amazing graphic of the "Great GoogleLapse," see here)

7) Can there be "one throat to choke" in a virtual environment? Just as outsourcers are arbitraging labor rates by shifting contracted work to other shores, so too will cloud vendors assemble services from multiple entities to create bundled offerings. What will be the unexpected consequences for customers?

8) How does optimization work in a cloud? The vendor may be managing to power consumption, say, while customer A wants stable (not necessarily fast, but predictable) transaction times for a shopping cart scenario. Customer B needs fast compute capability despite big and frequent reads and writes to disk. How can all three parties go home happy at the end of the day?

9) How can virtual, hybrid environments be tested before major real-world events: a quarterly close, a consumer promotion, a currency meltdown? While there will be some greenfield successes, a big question relates to how well clouds can integrate with existing data centers and other assets. (What constitutes unit testing in a cloud?)

10) What can I as a customer ask for by way of customization? Who can and will provide it, and at what costs in money and performance? The price points reflect commodity economics, but sooner or later most of us stumble upon needs that surpass plain vanilla.

11) Long ago, factory layouts (and locations) changed as power shifted from waterwheels that drove a central shaft around which looms were arranged, to individual electric motors for each machine. White-collar offices after the rise of the PC no longer feature typing pools. What will be the organizational innovations that cloud computing makes possible? Focusing on power savings in the data center is a useful first step, but the technology will have many other implications for the ways people come together to achieve goals.

12) The PC architecture flourished in part because of its interoperability: I could choose a big Maxtor hard drive or a faster Seagate, a Dell LCD or Sony CRT display, and my hardware maker could buy the cheapest CD drives, RAM, and power cords on a given day. USB made the platform more flexible yet. Once I choose a cloud provider, how must I choose my ISP, my system management vendor, my billing system? In short, what are the dependencies introduced by a cloud instance?

13) Companies don't switch casually from CA Unicenter to BMC Patrol, HP OpenView, or IBM Tivoli, much less a promising startup, because the complexity issues are enormous. Will my Tivoli/OpenView/whatever console be able to instrument both my owned hardware and my virtual assets, or do I rely only on the cloud vendor -- who will have good reasons for not exposing too much operational information? The various answers here will have implications for lock-in, for innovation, for risk management.

14) Cloud computing is a coherent-sounding phrase, but computing in turn has many facets. Think about the different time scales relating to

-network latency
-the laws of physics regarding hard drive access
-the laws of physics regarding hard drive failure
-various data structures (think of MapReduce versus SQL)
-load-balancing, failover, and other necessary housekeeping
-core vs. edge workload allocation.

At the end of the day, orchestrating all of those sets of events, each with its own time scale, in a virtual world is a really, really tough technical and managerial problem. Getting the systems to work doesn't even scratch the surface of the questions of profitability, liability, audit and related requirements, etc.
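
To make the gap between two of those time scales a bit more concrete, here is a rough back-of-envelope sketch in Python. Everything in it is an assumption chosen for illustration, not a measurement: a hypothetical 1,000 km distance to the offsite data center, a signal speed in fiber of roughly two-thirds the speed of light, and a 7,200 rpm hard drive. It computes only the physical floor for each delay (propagation for the network, half a platter revolution for the disk) and ignores routing, queuing, seek time, and everything else that real systems add on top.

```python
# Back-of-envelope time scales. Every input below is an illustrative assumption.
SPEED_IN_FIBER_KM_PER_S = 200_000  # roughly two-thirds of light speed in a vacuum
DISTANCE_KM = 1_000                # hypothetical distance to an offsite data center
DISK_RPM = 7_200                   # assumed spindle speed for a hard drive


def network_round_trip_ms(distance_km, speed_km_per_s=SPEED_IN_FIBER_KM_PER_S):
    """Lower bound on round-trip time: propagation delay only, no routing or queuing."""
    return 2 * distance_km / speed_km_per_s * 1_000


def avg_rotational_latency_ms(rpm):
    """On average the platter turns half a revolution before the data is under the head."""
    seconds_per_revolution = 60 / rpm
    return seconds_per_revolution / 2 * 1_000


if __name__ == "__main__":
    print(f"network round trip (floor): {network_round_trip_ms(DISTANCE_KM):.2f} ms")
    print(f"disk rotational latency (avg): {avg_rotational_latency_ms(DISK_RPM):.2f} ms")
```

Under these assumptions both floors land in the single-digit to low double-digit millisecond range, which is part of why moving a disk-bound workload offsite can change its timing behavior in ways that are hard to predict from either number alone.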

The question is not, will cloud computing happen, but rather, how will this tendency unfold, and how will organizations, regulators, and other actors respond? Until the rhetoric and, more important, the base of experience move beyond the current state of pilots and vaporware, the range of potential outcomes is too vast to bet on with any serious money.