Thursday, September 29, 2011

Amazon Silk: split browser architecture

Amazon Silk

Content Delivery Networks and WAN optimization provided a generic acceleration solution to get common content closer to the client device, but on mobile devices the delivery performance of the last mile was still a problem. Many websites still do not have mobile-optimized content, and sucking down a 3Mpixel JPG and rendering it on a 320x240 pixel display is just plain wrong. With the introduction of Amazon Silk, which uses the cloud to aggregate, cache, precompile, and predict, the client-side experience can now be optimized for the device that everybody clamors for: the tablet.

This is going to create an even bigger disconnect between the consumer IT experience and the enterprise IT experience. On Amazon's Kindle Fire you will be able to pull up, nearly instantaneously, common TV video clips and connect to millions of books. But most enterprises will find it difficult to invest in WAN optimization gear that would replicate that experience on the corporate network for your day-to-day work.

Amazon Silk is another example of the power that the cloud provides: doing the heavy computes and caching in the data center enables low-capability devices to roam.

Wednesday, September 14, 2011

Trillion Triple Semantic Database

The Semantic Web captures the semantics, or meaning, of data so that machines can interact with that metadata. It is an idea of WWW pioneer Tim Berners-Lee, who observed that although search engines index much of the Web's content, keywords can only provide an indirect association to the meaning of an article's content. He foresees a number of ways in which developers and authors can create and use the Semantic Web to help context-understanding programs better serve knowledge discovery.

Tim Berners-Lee originally expressed the vision of the Semantic Web as follows:
I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The intelligent agents people have touted for ages will finally materialize.

The world of semantic databases just got a little bit more interesting with the announcement by Franz, Inc. and Stillwater SC of having reached a trillion triple semantic data store for telecommunication data.

http://www.franz.com/about/press_room/trillion-triples.lhtml
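
For readers new to the term, a triple is a (subject, predicate, object) statement, and a semantic database answers queries by matching patterns against those statements. The following is a minimal in-memory sketch of that idea only. It is not how Franz's product is implemented, and the class name, predicate names, and telecom-flavored sample data are all hypothetical.

```python
from typing import List, Optional, Set, Tuple

# A triple is a (subject, predicate, object) statement.
Triple = Tuple[str, str, str]


class TripleStore:
    """Toy in-memory store; a real semantic database adds indexes and persistence."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        """Return all triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t
            for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


# Hypothetical sample data, invented purely for illustration.
store = TripleStore()
store.add("subscriber:alice", "placedCallTo", "subscriber:bob")
store.add("subscriber:alice", "usesHandset", "handset:x100")
store.add("subscriber:bob", "placedCallTo", "subscriber:carol")

# Who placed a call to anyone?
callers = [s for s, _, _ in store.match(predicate="placedCallTo")]
print(callers)
```

A linear scan like this is fine for three statements; at a trillion, the interesting engineering is in the indexing and storage layout, which this sketch deliberately leaves out.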

The database was constructed with an HPC on-demand cloud service and occupied 8 compute servers and 8 storage servers. The compute servers contained dual-socket Xeons with 64GB of memory, connected through a QDR IB network to a 300TB SAN. The trillion triple data set spanned roughly 100TB of storage. It took roughly two weeks to load the data, but after that the database provided interactive query rates for knowledge discovery and data mining.

The gear on which this result was produced is traditional HPC gear that emphasizes scalability and low-latency interconnects. As a comparison, a billion triple version of the database was created on Amazon Web Services, but the performance was roughly 3-5x slower. Creating a trillion triple semantic database on AWS would have cost $75k and would have taken 6 weeks to complete.
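
To put those figures in perspective, here is a back-of-the-envelope calculation using only the numbers quoted above. It assumes decimal terabytes, treats "roughly two weeks" as exactly 14 days, and assumes the load was spread evenly across the 8 compute servers, so the results are order-of-magnitude estimates rather than measurements.

```python
# Figures quoted in the post (all approximate).
TRIPLES = 10**12                  # one trillion triples
STORAGE_BYTES = 100 * 10**12      # ~100TB, assuming decimal terabytes
LOAD_SECONDS = 14 * 24 * 3600     # "roughly two weeks", taken as 14 days
COMPUTE_SERVERS = 8
AWS_COST_USD = 75_000
AWS_LOAD_WEEKS = 6
HPC_LOAD_WEEKS = 2

# Average on-disk footprint per triple, including any index overhead.
bytes_per_triple = STORAGE_BYTES / TRIPLES

# Sustained load rate, overall and per compute server (assumes an even spread).
load_rate = TRIPLES / LOAD_SECONDS
per_server_rate = load_rate / COMPUTE_SERVERS

# AWS estimate relative to the HPC run.
aws_slowdown = AWS_LOAD_WEEKS / HPC_LOAD_WEEKS
aws_cost_per_billion = AWS_COST_USD / (TRIPLES / 10**9)

print(f"{bytes_per_triple:.0f} bytes per triple")
print(f"{load_rate:,.0f} triples/s overall, {per_server_rate:,.0f} per server")
print(f"AWS load estimate: {aws_slowdown:.0f}x longer, ${aws_cost_per_billion:.0f} per billion triples")
```

The 6-week AWS estimate is three times the two-week HPC load, which sits at the low end of the 3-5x slowdown observed on the billion triple run.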

Monday, July 4, 2011

What would you do with infinite computes?

Firing up a 1000-processor deep analytics cluster in the cloud to solve a market segmentation question regarding your customer orders during Christmas 2010, or to run a sentiment analysis of your company's Facebook fan page, now costs less than having lunch in Palo Alto.

The cloud effectively provides infinite computes, and to some degree infinite storage, although the costs of non-ephemeral storage might muddy that analogy a bit. So what would you do differently now that you have access to a global supercomputer?

When I pose this question to my clients, it quickly reveals that their business processes are ill-prepared to take advantage of this opportunity. We are roughly half a decade into the cloud revolution, and at least a decade into the 'competing on analytics' mindset, but the typical enterprise IT shop is still unable to make a difference in the cloud.

However, change may be near. Given the state of functionality in software stacks like RightScale and Enstratus we might see a discontinuity in this inability to take advantage of the cloud. These stacks are getting to the point that an IT novice is able to provision complex applications into the cloud. Supported by solid open source provisioning stacks like Eucalyptus and Cloud.com, building reliable and adaptive software service stacks in the cloud is becoming child's play.

What I like about these environments is that they are cloud agnostic. For proper DR/BCP a single cloud provider would be a single point of failure and thus a non-starter. But these tools make it possible to run a live application across multiple cloud vendors, thus solving the productivity and agility requirements that come with the territory of an Internet application.

Saturday, November 27, 2010

Why is there so little innovation in cloud hardware?

The explosion of data, and the need to make sense of it all on a smartphone, is creating an interesting opportunity. Mobile devices need high performance at low power, and Apple seems to be the only one that has figured out that having your own processor team and IP is actually a key advantage. The telcos will need Petascale data centers to manage content, knowledge management, and operational intelligence, and the performance per Watt of general purpose CPUs from IBM, Intel, and AMD is at least two orders of magnitude away from what is possible. So why is there so little innovation in cloud hardware?

The rule of thumb for creating a new chip venture is $50-75M. Clearly the model where your project is just an aggregation of third party IP blocks is not a very interesting investment, as it would create no defensible position in the marketplace. So from a differentiation point of view, early stage chip companies need to have some unique IP. And this IP needs to be substantial. This creates the people and tool costs that make chip design expensive.

Secondly, to differentiate on performance, power, or space you have to be at least close to the leading edge. When Intel is at 32nm, don’t pick 90nm as a feasible technology. So mask costs are measured in the millions for products that try to compete in high-value silicon.

Thirdly, it takes at least two product cycles to move the value chain. Dell doesn’t move until it can sell 100k units a month, and ISVs don’t move until there are millions of units of installed base. So the source of the $50M-$75M needed for a fabless semi is this: creating new IP is a $20-25M problem if presented to the market as a chip, it takes two cycles to move the supply chain, and it takes three cycles to move the software.

The market dynamics of IT have created this situation. It used to be the case that the enterprise market drove silicon innovation. However, the enterprise market is now dragging the silicon investment market down. Enterprise hardware and software are no longer the driving force: the innovation is now driven by the consumer market. And that game is played and controlled by the high volume OEMs. First, their cost constraints and margins make delivering IP to these OEMs very unattractive: they hold all the cards and attenuate pricing so that continued engineering innovation is hard to sustain for a startup. Second, an OEM is not interested in unique IP created by a third party: it would deleverage them. So you end up getting only the non-differentiable pieces of technology and a race to the bottom.

Personally, however, I believe that there is a third wave of silicon innovation brewing. When I calculate the efficiency that Intel gets out of a square millimeter of silicon and compare that to what is possible, I see a thousandfold difference. So there are tremendous innovation possibilities from an efficiency point of view alone. Combining that with the trend to put intelligence into every widget and connect them wirelessly provides the application space where efficient silicon that delivers high performance per Watt can really shine AND have a large potential market. Mixed-signal and new processor architectures will be the differentiators, and the capital markets will at some point recognize the tremendous opportunity to create a next generation technology that powers these intelligent platforms.

Until then, those of us who are pushing the envelope will continue to refine our technologies so we can be ready when the capital market catches up with the opportunities.

Monday, July 19, 2010

OpenStack: potential for a cloud standard

Today, Rackspace open sourced its cloud platform and announced a collaborative effort that includes NASA, Citrix, and Dell to build an open source cloud platform, dubbed OpenStack. Finally, the world of cloud computing gets some weight behind a potential standard. Google, Microsoft, Amazon, and big SaaS players like Salesforce.com and Netsuite are getting too isolated and too powerful to credibly drive any type of standard for interoperability in the cloud. An open source stack could really break this open.

If Dell is true to its word to distribute OpenStack with its storage and server products, and Citrix is true to its word to drive OpenStack into its customer base, this effort has some real power players behind it that could provide the counterweight needed to stop the economic lock-in of the pioneers.

This announcement is very powerful as it provides a platform to accelerate innovation, particularly from university research, where long-tailed projects simply do not get the time of day from Google or Microsoft. By offering a path to get integrated in an open source cloud platform, applications and run-times for genetics and proteomics, and deep computational engineering problems like material science and molecular dynamics, get an opportunity to leverage cloud computing as a collective.

Most of the innovation in these verticals is delivered through open source and university research. By building solutions into a collective, the university research groups can build momentum that they could not build by adopting solutions from commercial vendors like Amazon AWS, Google Apps, or Microsoft Azure.

It also opens up great innovation opportunities for university IT shops that have to manage clusters themselves. Grid computing has proven to be very complicated and heavy-handed for these IT teams, but hopefully an effort like OpenStack with backing from Rackspace, NASA, Dell, and Citrix can give these teams a shot in the arm. The university clusters can be run with utmost efficiency and tailored to the workload set of the university, and OpenStack gear at Rackspace or participating data centers can be used to deal with demand spikes without any modification to the cloud applications.

These types of problems will always exist, and only a cloud computing standard will be able to smooth the landscape. Let's hope that OpenStack with its backers can be the first step towards that level playing field.

This is exciting news for cloud computing developers and users.

Friday, July 16, 2010

The Intercloud

Found a wonderful post by Greg Papadopoulos in which he postulates the trend towards interclouds. Greg argues that Amazon's AWS BYOS/IaaS (Bring Your Own Stack) is such a perfect marriage of simplicity and functionality that it will be with us for a long time. SaaS is the new delivery norm of software, and PaaS is the needed productivity layer to hide the complexity of IaaS. The proliferation of SaaS on top of PaaS on top of IaaS is the price of early technology adoption, when most of the functionality is still in its infancy.

As Greg writes:
"Productive and in-production are different concepts, however. And as much as AWS seems to have found the lowest common denominator on the former with IaaS, how at-scale production will actually unfold will be a watershed for the computing industry.

Getting deployed and in production raises an incredible array of concerns that the developer doesn't see. The best analogy here is to operating systems; basic sub-systems like scheduling, virtual memory and network and storage stacks are secondary concerns to most developers, but are primary to the operator/deployer who's job it is to keep the system running well at a predictable level of service.

Now layer on top of this information security, user/service identity, accounting and audit, and then do this for hundreds or thousands of applications simultaneously and you begin to see why it isn't so easy. You also begin to see why people get twitchy about the who, where, and how of their computing plant.

Make no mistake, I have no doubt that cloud (nee network, grid) computing will become the organizing principle for public and private infrastructure. The production question is what the balance will be. Which cloud approach will ultimately win? Will it be big public utility-like things, or more purpose-built private enterprise ones?

The answer: yes. There will be no more of a consolidation to a single cloud than there is to a single network."


He goes on to say the cloud will organize much like the energy business, with a handful of very large networks supported by hundreds of regional and national companies. Greg also finds an analogy in the development of internetworking. Connecting all those federated entities together has created tremendous value, and thus it is reasonable to expect that the cloud will organize as a federated system as well. But to accomplish that, the cloud community needs to develop the right standards, just like the Internet community did for internetworking, so that nobody has to be afraid of becoming an isolated island.

Putting on my developer's hat, this fear of becoming isolated is what is holding me back from committing to Google Apps or Microsoft Azure: they feel too proprietary in this age of open source and federated clouds. One of my core requirements for cloud applications is that the application can migrate from private cloud to public cloud and vice versa. When the economic value of the application goes up or down, I want to be able to allocate it on the right infrastructure to maximize profit or minimize cost. Closed environments like Salesforce.com, Google Apps, or Microsoft Azure are a concern as these environments create a fierce lock-in for my application. Encapsulating it in an OVF virtual appliance provides me much greater flexibility at the cost of not having productive elastic computing services. That capability is maturing as we speak, as its value is as obvious as it is needed.

Wednesday, July 14, 2010

Amazon IT moves into AWS

Amazon.com attempts IT switch to cloud computing

Amazon's e-commerce site is planning to move into Amazon Web Services. Jen Boden is Amazon's e-commerce IT director.

"Boden said her organization is in the preliminary stages of moving into AWS -- she started with some simple, homegrown applications, such as a list maintained for HR, which her team moved to AWS successfully. Larger sections of IT operations will move later with the financials likely to be last, since they are the most sensitive to security and compliance needs. Planning began last year, and the whole process might take another year and a half."

The article offers an interesting confirmation that Amazon's own enterprise IT needs to go through the same transformation as any other enterprise IT team that has decades of old applications and data silos to support.

This effort is a great shot in the arm for cloud computing. Over the past year, most enterprise-class IT shops have started pilot programs to figure out how to incorporate cloud storage and cloud computing into their roadmaps. When leaders like Amazon can point to solutions that others can follow, the laggards will come on board and we can finally move to the next phase in cloud computing, and that is standards. With standards, utility computing will come one step closer.