In previous blog posts, I’ve talked about some of the patterns you can use to build your apps for the cloud, including Task-Queue-Task and de-normalizing your data using that pattern. Now I want to talk about scaling out.
When you are building apps in the cloud, you have to remember you are running in a shared environment and have no control over the hardware.
Let’s think about that for a moment.
In Windows Azure and SQL Azure, your code runs on hardware that we choose for you. Information on roughly what to expect can be found here (scroll down and expand compute instances), but here is a table of the compute part of Windows Azure:
| Compute Instance Size | CPU | Memory | Instance Storage | I/O Performance |
|---|---|---|---|---|
| Small | 1.6 GHz | 1.75 GB | 225 GB | Moderate |
| Medium | 2 x 1.6 GHz | 3.5 GB | 490 GB | High |
| Large | 4 x 1.6 GHz | 7 GB | 1,000 GB | High |
| Extra large | 8 x 1.6 GHz | 14 GB | 2,040 GB | High |
So how fast is the memory? What kind of CPU caching do we have? How fast are the drives? What about the network?
For SQL Azure we don’t even tell you what it’s running on, although you can watch this to get a better idea of how “shared” you are.
The point I’m going to make is that when you control the hardware, you can measure things like the throughput of your disk controllers, CPU, and memory, and based on that knowledge create filegroups for databases that span multiple drives, install more cores, faster drives, more memory, or faster networking – all to improve performance. You are scaling up.
In the cloud, things work differently – you have to scale out. You have lots of little machines doing little chunks of work. There are no more 32-way servers at your disposal to crank through that huge workload; instead you need 32 x 1-way servers to crank through it. There are no filegroups and no 15,000 rpm drives – just lots of cheap little servers ready for you whenever you need them.
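The scale-out idea – many small workers each taking a chunk of a big job instead of one big box doing it all – can be sketched roughly like this (Python used purely for illustration; the chunking helper, workload, and worker count are all assumptions, not an Azure API):

```python
# Illustrative sketch: split one big job into small chunks that many
# cheap workers can process independently (scaling OUT, not UP).

def chunk(items, n_workers):
    """Divide a workload into roughly equal chunks, one per worker."""
    size = (len(items) + n_workers - 1) // n_workers  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]

work = list(range(128))      # stand-in for the "huge workload"
chunks = chunk(work, 32)     # 32 x 1-way servers instead of one 32-way box

# In a real system each chunk would be handed to a separate small
# instance, e.g. as a queue message that a worker role picks up.
print(len(chunks))           # → 32
```

The key design point is that no chunk depends on any other, so adding a 33rd cheap server speeds things up without touching the existing ones.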
I get a lot of questions about the performance of this, that, and the other, and while I understand that information can be useful, I think they somewhat miss the point – and the potential – of using the cloud.
I don’t need 15,000 rpm drives and 8 cores to handle my anticipated peak workload. Instead I can have 20 servers working my data at peak times and 5 servers the rest of the time. So stop thinking about how fast the memory is, and start figuring out how you can use as many servers as you need – when you need them.
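The 20-at-peak / 5-off-peak idea boils down to a simple scaling rule. Here is a minimal sketch, assuming queue depth as the demand signal and made-up throughput numbers (none of this is an Azure API):

```python
# Illustrative sketch: pick an instance count from current demand
# instead of provisioning for the anticipated peak.

BASELINE = 5               # instances during quiet periods (assumed)
PEAK = 20                  # instances during peak load (assumed)
ITEMS_PER_INSTANCE = 100   # rough throughput target per instance (assumed)

def target_instances(queue_depth):
    """Scale out with demand, but stay within the baseline/peak band."""
    needed = -(-queue_depth // ITEMS_PER_INSTANCE)  # ceiling division
    return max(BASELINE, min(PEAK, needed))

print(target_instances(50))     # quiet → 5
print(target_instances(1500))   # busy → 15
print(target_instances(9000))   # spike → capped at 20
```

In practice a rule like this would run periodically and call the service management API to add or remove instances; the point is that capacity tracks demand rather than the worst case.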
Remember OUT not UP.