Sunday, April 24, 2011

Learning from Amazon's cloud collapse (Mashable)

(Mashable) -- Call it Cloudgate, Cloudpocalypse or whatever you'd like, but the extended collapse of Amazon Elastic Compute Cloud (EC2) is both a setback for cloud computing and an opportunity for us to figure out how to stop it from happening again.
Amazon may be best-known for its online shopping site, but it also has a substantial cloud computing business. It provides a scalable, flexible and particularly efficient solution for companies to store and deliver massive amounts of content.
Its model of only paying for what you consume was a radical innovation when it launched in 2006.
In fact, Amazon Web Services has been so affordable and reliable that thousands of companies from Foursquare to Netflix utilize the company's cloud computing technology and servers to run their businesses.
They put their faith in Amazon's cloud because there was no reason to think that it would falter. One of cloud computing's key tenets is reliability through redundancy of both servers and data centers.
Then on Wednesday, Amazon's northern Virginia data center started experiencing problems that caused major latency and connectivity issues.
The trouble was apparently due to excessive re-mirroring of its Elastic Block Store (EBS) volumes. This essentially created countless new backups of the EBS volumes, which took up Amazon's storage capacity and triggered a cascading effect that caused downtime on hundreds (or more likely thousands) of websites for almost 24 hours.
The collapse took its share of victims. Among the most prominent companies affected were Foursquare, Quora, Hootsuite, SCVNGR, Heroku, Reddit and Wildfire, though hundreds of other companies big and small were affected.
Luckily, one of Amazon's most prominent customers, Netflix, didn't experience problems because its service is built to survive the loss of an entire data center. Companies relying on Amazon's four other global data centers didn't experience too many issues either.
A learning moment
FathomDB founder Justin Santa Barbara has a detailed post on his blog about what may be the biggest problem to come out of this week's collapse: Amazon's cloud redundancies failed to stop a mass outage.
Its Availability Zones are supposed to be able to fail independently without bringing the whole system down. Instead, there was a single point of failure that shouldn't have been there.
This week's disaster in the cloud is a reminder to startups to build redundancy into their applications and their own systems, but as Santa Barbara points out, most startups don't have the time or resources to engineer for multiple cloud systems (each Amazon global region/data center has its own rules and features, making a simple "switch" to another center difficult).
These companies trusted Amazon to keep them online, and Amazon failed to deliver.
Catastrophic issues will always occur, but in the pre-cloud era, downtime only affected a single computer or website. Today, a catastrophic event takes down thousands of websites, causing millions or even billions of dollars in lost revenue and productivity.
This incident is no reason for us to shun cloud computing, though. Its benefits (scalability, cost reduction, device independence, performance and more) far outweigh its cons.
We do need to take a hard look at how we structure our cloud infrastructure, though, and find new ways to either prevent single points of failure or move content off failing clouds faster, especially as the world's computing power is consolidated into fewer and fewer systems.
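One way to picture "moving off a failing cloud" is a client that checks several regions in order and uses the first healthy one. The Python sketch below is only an illustration of that idea, not how Amazon or any of the affected companies actually handle failover. The region names, the `STATUS` table and the `probe` function are hypothetical stand-ins for real health checks.

```python
from typing import Callable, Iterable, Optional


def pick_region(regions: Iterable[str],
                is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first region whose health check passes, or None."""
    for region in regions:
        try:
            if is_healthy(region):
                return region
        except Exception:
            # A health check that raises is treated as an unhealthy region.
            continue
    return None


# Hypothetical status table standing in for real health probes.
# The region names are placeholders, not real provider identifiers.
STATUS = {"us-east": False, "us-west": True, "eu-west": True}


def probe(region: str) -> bool:
    """Look up the simulated health of a region (KeyError if unknown)."""
    return STATUS[region]


chosen = pick_region(["us-east", "us-west", "eu-west"], probe)
print(chosen)  # prints "us-west", since "us-east" is marked unhealthy
```

The hard part in practice is what this sketch leaves out: keeping data and configuration in sync across regions so that the fallback actually has something to serve.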
Cloud computing is still in its infancy, and today's events make it clear that we still have a lot of work to do. It could be a whole lot worse next time if we aren't prepared.
© 2010 MASHABLE.com. All rights reserved.
