Windows Azure Development
Design to fail
Apr 20th
I was chatting to a customer earlier about a solution they had built for Azure. They had implemented a thingythangy that stored a few hundred requests in memory before dumping them into a blob. My immediate reaction was: “What happens when your role gets recycled? Do you lose the cached requests?”
Windows Azure does not have an SLA for restarting your service, but if it did, it would be 100%. Restarting is just a reality. Hardware fails, OSes get patched, etc. At some point you will be restarted, maybe with a warning, but maybe not.
This, by the way, is no different from running your service on the server under your desk. At some point it will get restarted, lose power, or suffer some other such calamity.
One thing you need to think about when writing good code for the cloud is how to deal with this. There are a few choices:
Ignore it and carry on
My Dad used to say “nothing to see here, move along”. Sometimes it really doesn’t matter. You can ignore some things that happen twice. As an example, if you were counting web site hits, an extra “count” here or there isn’t really going to change the outcome or purpose. However, if the action is generating a patient’s prescription, that extra Hydrocodone tablet will probably make a significant difference.
Write code to handle failure
Write code to detect whether the action has already been completed, and write code to recover incomplete actions. You can do things like:
- Check how many times a message has been dequeued. A message that has been dequeued more than once is either a poison message or was the subject of a failure (or both).
- Check the ETag and timestamp on data from Windows Azure storage. Has it been updated recently? Is the message you are trying to process older than the last update?
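Here is a rough sketch of what those two checks might look like in Python. Everything in it is an assumption for illustration: the `Message` class, the `decide` function and the `POISON_THRESHOLD` value are hypothetical and are not part of any Azure SDK. In real code the dequeue count and the timestamps would come from the queue message and the stored data.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Assumed cut-off for giving up on a message; tune it for your workload.
POISON_THRESHOLD = 3


@dataclass
class Message:
    """Hypothetical stand-in for a queue message (not a real SDK type)."""
    body: str
    dequeue_count: int     # times the message has been handed to a worker, this attempt included
    inserted_at: datetime  # when the message was placed on the queue


def decide(message: Message, last_updated: Optional[datetime]) -> str:
    """Return what to do with a message: 'poison', 'stale', 'recover' or 'process'."""
    # Dequeued too many times: park it rather than fail on it forever.
    if message.dequeue_count > POISON_THRESHOLD:
        return "poison"
    # The stored data changed after this message was queued, so the
    # message describes an older state and can be skipped.
    if last_updated is not None and message.inserted_at < last_updated:
        return "stale"
    # Seen before but never finished: an earlier attempt probably died midway.
    if message.dequeue_count > 1:
        return "recover"
    return "process"


# Arbitrary example timestamps.
queued = datetime(2020, 1, 1, 9, 0, tzinfo=timezone.utc)
updated = datetime(2020, 1, 1, 9, 5, tzinfo=timezone.utc)

print(decide(Message("order-1", 1, queued), None))     # first attempt
print(decide(Message("order-1", 2, queued), None))     # retry after a failure
print(decide(Message("order-1", 2, queued), updated))  # stored data is already newer
print(decide(Message("order-1", 5, queued), None))     # probably poison
```

The order of the checks is a design choice: poison messages are filtered first so that a message which keeps crashing the worker never reaches the processing path again.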
Essentially you are trying to write code that is idempotent. Idempotent code is code that can be executed multiple times and still produce the same outcome as executing it once. There are a bunch of techniques which I’ll cover over the coming weeks (with code), but the bottom line is:
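To make the idea concrete, here is a minimal sketch using the prescription example from earlier. It assumes every request carries a unique ID (the `request_id` parameter and the `PrescriptionStore` class are hypothetical) and it uses an in-memory dictionary as a stand-in for durable storage. A real service would need to persist that record, and do so atomically, for it to survive a restart.

```python
class PrescriptionStore:
    """In-memory stand-in for durable storage (hypothetical, for illustration)."""

    def __init__(self):
        self._issued = {}  # request_id -> number of tablets issued

    def issue(self, request_id: str, tablets: int) -> int:
        """Issue a prescription once per request_id.

        Replaying the same request returns the original result
        instead of issuing a second prescription.
        """
        if request_id in self._issued:
            return self._issued[request_id]
        self._issued[request_id] = tablets
        return tablets

    def total_tablets(self) -> int:
        """Total tablets issued across all distinct requests."""
        return sum(self._issued.values())


store = PrescriptionStore()
store.issue("rx-1001", 30)
store.issue("rx-1001", 30)  # same message delivered again after a restart
store.issue("rx-1002", 10)
print(store.total_tablets())  # 40, not 70
```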
Your code will fail – make sure you handle that.
THIS POSTING IS PROVIDED “AS IS” WITH NO WARRANTIES, AND CONFERS NO RIGHTS, EVEN IF YOU SAY PLEASE