The Chicken Coop

Cloud computing – AWS vs GAE and Azure


So, over the break I’ve been looking at what it would take to use Amazon EC2 for some of our sites, and I’ve been meaning to look into the likes of Google App Engine (GAE) and Azure for a while anyway.

What I found wasn’t really what I expected.

Side note: for those who don’t know, there are basically three types of “cloud platform” being pushed at the moment:

  • Use someone else’s app, hosted on their servers rather than mine. Think Gmail/Zoho vs Exchange, Xero vs QuickBooks, salesforce.com vs SAP etc.
  • Run my code on someone else’s servers, using their APIs. This is the GAE and Azure model. I’m sure there are others, too. I don’t know, or care, what the DB in the back end is. Or if it’s backed up. Or what the load balancer up the front is doing. I just care that it works and that it scales for me.
  • Run my MACHINE on someone else’s servers, using whatever API I want. This is the model from Amazon AWS, Rackspace, (mt) Media Temple, and even the likes of Dreamhost, to some degree.

In this case, I only care about the last two, even though I make a lot of use of the first one.

My opinion, going in, was that if you wanted to make a little app, maybe something that works on Facebook or the like, then Google App Engine and Azure were a great option. Write to their APIs, deploy it on their servers, and you are done. I thought: sure, you are giving up a lot of flexibility, but it’s maybe worth it for the ease of it.

And to some degree, that’s correct. But what I didn’t factor in was lock-in.

Normally, I don’t care much about lock-in. If the company doesn’t look like it’s going to be around, then I’m not going to use them from day 1. Sorry, small-company-doing-something-cool, but it’s just not going to happen. Google and Microsoft are not small companies, regardless of how you look at it.

So, here are the options, and my opinions on when they are useful:

The Azure/GAE option

I think this is a good option if you love (not like) their API set, and if the project fails, you don’t care too much. If it grows big, fantastic, it might replace your day job, but that’s not the focus. You just want to get this cool idea out there, quickly, and see what happens. GAE and Azure are fantastic for this. Personally, I rather like Django plus the persistence engine they have on GAE, which is why I’ve stayed interested in them.

But I don’t see anyone rewriting large applications using either of these tools: the ecosystem around a large app is too big, and usually not compatible with the restrictions placed on the sandbox. A large app needs to do more than serve web pages.

The biggest problem is lock-in. If, for some reason, you can’t use the Microsoft or Google platform (eg, you get too big, you run into an API problem that can’t be solved, they price you out etc), then you are screwed. Totally. Their APIs don’t run on any other servers. GAE can kinda run in an EC2 instance, but it’s neither scalable nor supported. Azure is even less portable. You are stuck with their architecture, and if you outgrow it, it’s all over.

So, while it’s very easy to get started, and the APIs are rather nice, if you need to get out, you need to be thinking about a full rewrite. And that’s just nasty.

Oh, and Azure does support background processes, which is one up on GAE, but I doubt that advantage is going to last long.

The Amazon* option

* and (mt), Rackspace and Dreamhost. And most other players out there.

Even though they are usually talked about in the same sentence, AWS is almost the polar opposite of GAE/Azure. Amazon is about making servers into a utility commodity. Need 10 this week? No problem. A thousand more in 30 mins? Also not really a problem.

But from a developer’s point of view, this is very close to the normal application development story. Want to use Rails? No problem. .NET talking to SQL Server? Also, no problem. Hadoop cluster? Yup. Run an Asterisk PABX? Yeah, why not. As long as your credit card is valid.

But with great power comes great responsibility. Need to scale your app out? Better write it so you can spin up a new VM and deploy automatically, because manual configuration of even 10 servers gets old quickly, let alone 1000. Need a load balancer (and you will)? Better get familiar with Pound, Apache, Squid or similar. Do you know how to make MySQL into a write-master-with-reader-slaves? What about memcached? You control the architecture, so if something goes wrong with it, you are the only one to blame. But you can do almost anything.
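To give a feel for what the write-master-with-reader-slaves setup asks of your application code, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the `ReadWriteRouter` class and the host names are made up for this example, the "connections" are plain strings rather than real MySQL connections, and classifying a statement by its first keyword is a simplification.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical helper: send writes to the master, spread reads over replicas."""

    # Statements that change data must go to the master (simplified list).
    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "REPLACE", "CREATE", "ALTER", "DROP")

    def __init__(self, master, replicas):
        self.master = master
        # With no replicas configured, reads fall back to the master.
        self._replicas = itertools.cycle(list(replicas) or [master])

    def pick(self, sql):
        """Return the server that should run this SQL statement."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb in self.WRITE_VERBS:
            return self.master
        # Round-robin across the read replicas.
        return next(self._replicas)


# Host names below are placeholders, not real servers.
router = ReadWriteRouter("db-master", ["db-replica-1", "db-replica-2"])
print(router.pick("INSERT INTO hits (page) VALUES ('home')"))  # db-master
print(router.pick("SELECT COUNT(*) FROM hits"))                # db-replica-1
print(router.pick("SELECT page FROM hits"))                    # db-replica-2
```

Even this toy version hints at the responsibility part: replicas can lag behind the master, so a read issued straight after a write may return stale data, and deciding when that matters is your problem, not the hosting provider’s.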

With my $6/month Dreamhost account, I can happily run most of that too: at least Rails, PHP, MySQL, installing my own software and so on. It’s a good solution for me as I don’t want to pay a large amount of money for it. But that’s not something which will scale much beyond the 500 or so hits a day this site gets.

But what happens when you need a lot of servers, quickly, and only for a short amount of time? Amazon is really the only option: you can go from 10 servers to 1500 in hours to handle a Facebook spike. Then scale it back down when the spike goes away.
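The scale-up-then-scale-down decision is mostly arithmetic. The sketch below is a rough illustration, not a real autoscaler: the function name, the 50 requests per second per server, and the 25% headroom are all assumptions picked so that the numbers land on the 10 and 1500 servers mentioned above.

```python
import math


def servers_needed(requests_per_sec, capacity_per_server, min_servers=1, headroom=0.25):
    """Estimate how many servers to run for a given load.

    capacity_per_server is the requests/sec one server can handle (assumed figure).
    headroom is the spare capacity kept in reserve, as a fraction.
    """
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    raw = requests_per_sec * (1 + headroom) / capacity_per_server
    # Never drop below the baseline fleet, even when traffic is quiet.
    return max(min_servers, math.ceil(raw))


# Illustrative numbers only: each server is assumed to handle 50 requests/sec.
print(servers_needed(400, 50, min_servers=10))     # 10   (a normal day)
print(servers_needed(60000, 50, min_servers=10))   # 1500 (the spike)
print(servers_needed(400, 50, min_servers=10))     # 10   (spike over, scale back down)
```

The calculation is the easy bit. Having deployment scripts that can actually bring up the extra servers, and take them away again, is where the work is.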

AWS also does a good job of avoiding lock-in. The machine you are (usually) using is a generic Linux VM, running in the Xen hypervisor-based virtualization environment. Or a generic Windows VM on the same hypervisor. Nothing special about that, at all. Want to move it to your Parallels VM on your Mac? Go right ahead. Got a big ESX cluster at work and want to move the servers in-house? Again, not a problem. Even easier if you have nice deployment scripts.

And therein lies the major difference. GAE/Azure is the don’t-make-me-think, developer option. AWS is the I-have-a-business-here option, and you need the skills to run that, and you pay for it. AWS is not the cheap option (though it’s a load less expensive than the likes of Rackspace managed hosting), but it gives you a huge amount of flexibility and nearly zero lock-in, while still allowing you to scale quickly if you need to. There is a logical progression from your laptop to a cheap host (eg Dreamhost), to AWS and back. With GAE or Azure, you can develop on your laptop, but your only option for deployment is Google or Microsoft. Period.

6 Comments

  1. Simone
    12-30-2008

    Personally I think that AWS EC2 and Azure/GAE have different usages.
    One is made to run “normal” applications on a variable number of nodes.
    It means that if you want to take advantage of the processing power of all of them, you have to write your app to scale up.

    With Azure/GAE you write an app, and it’s scalable by default. Sure, you have to write your app to their API, but once you have done it, it’s scalable by default.

    As with everything in IT, they both have pros and cons. Personally I’m leaning more toward Azure, since it also allows you to write any kind of application, web apps, background apps, both .NET and unmanaged, and not only web applications written in Python.

  2. Nic Wise
    12-30-2008

    Yup, you are right – it will scale by default (well, we hope it will once it comes out of CTP!) But if it doesn’t (or you get a problem like GAE has or had where you can’t fetch > 5000 records out of a GQL statement), then you are stuck – you need to work around it or….. well, rewrite.

    I don’t think AWS is perfect, but I think in a lot of larger cases, it’s a safer option, as I’d have more choices if I wanted to move. But the other two are still good :)

  3. Steven Quick
    01-05-2009

    With Microsoft, Google and Amazon grabbing the headlines lately it’s easy to forget there are a number of quality VPS hosts with solid APIs that have been doing scale out/up for 2 years+.

    eg: Slicehost, Linode etc

  4. Nic Wise
    01-05-2009

    Yes, there are. However, they are usually quite small players, and don’t appear to have the immediacy of Amazon, or the program-to-our-APIs model of Microsoft or Google.

    But you are right – they are not the only ones out there.

  5. John
    06-10-2009

    Do you even know what you’re talking about?

    John.

  6. Nic Wise
    06-10-2009

    Yup, I think I do. Of course, if you decided to post something more specific than a drive-by troll, I might be able to answer.