OTHER THINGS ALSO CHANGE
This post is mainly based on the ‘Windows Azure Programming Model’ white paper by David Chappell, available at http://www.microsoft.com/windowsazure/whitepapers/default.aspx
The changes are mainly in how role instances interact with three things:
- Operating system
- Persistent storage
- Other role instances.
INTERACTIONS WITH THE OPERATING SYSTEM
In Windows Azure, the administrator of all of the servers is the fabric controller. It decides when VMs or machines should be rebooted, and for Web and Worker roles (although not for VM roles), the fabric controller also installs patches and other updates to the system software in every instance. This is very different from a normal Windows machine, where the machine’s administrators are in control: they can reboot VMs or the machine they run on, install Windows patches, and do whatever else is required to keep it available.
This approach creates restrictions. As the fabric controller can modify the operating system at will, there is no guarantee that changes a role instance makes to the system it’s running on won’t be overwritten. Besides, the specific virtual (and physical) machines an application runs in change over time. This implies that any changes made to the default local environment must be made each time a role instance starts running. Anybody creating a Windows Azure application needs to understand what the fabric controller is doing, then design applications accordingly.
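To make the "redo local changes on every start" point concrete, here is a minimal Python sketch. Everything in it is hypothetical (the `REQUIRED_ENV` settings, the `on_instance_start` function, the directory layout); it is not a Windows Azure API. The point is that the setup routine is idempotent, so it can safely run each time an instance starts, whether on a fresh VM or on one whose earlier changes were overwritten.

```python
import os

# Hypothetical settings this application expects in its local environment.
REQUIRED_ENV = {"APP_CACHE_DIR": "cache", "APP_LOG_LEVEL": "INFO"}


def on_instance_start(local_root, env):
    """Re-apply every local change; safe to call on each start."""
    # Assume nothing survived from a previous run: set every value again.
    for name, value in REQUIRED_ENV.items():
        env[name] = value
    # exist_ok=True makes the call harmless if the directory is still there.
    cache_dir = os.path.join(local_root, env["APP_CACHE_DIR"])
    os.makedirs(cache_dir, exist_ok=True)
    return cache_dir
```

Calling `on_instance_start` twice with the same arguments gives the same result as calling it once, which is the property you want from anything that runs at instance startup.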
INTERACTIONS WITH PERSISTENT STORAGE
Applications use data, and the way that data is stored and accessed must also change to make applications more available and more scalable. The big changes are these:
- Storage must be external to role instances. Even though each instance is its own VM with its own file system, data stored in those file systems isn’t automatically made persistent. If an instance fails, any data it contains may be lost. This implies that for applications to work correctly in the face of failures, data must be stored persistently outside role instances. That way, another role instance can still access data that would have been lost had it been stored locally on the failed instance.
- Storage must be replicated. Just as a Windows Azure application runs multiple role instances to allow for failures, Windows Azure storage must provide multiple copies of data. Without this, a single failure would make data unavailable, something that’s not acceptable for highly available applications.
- Storage must be able to handle very large amounts of data. Traditional relational systems aren’t necessarily the best choice for very large data sets. Since Windows Azure is designed in part for massively scalable applications, it must provide storage mechanisms for handling data at this scale.
To meet these requirements, Windows Azure provides blobs for storing binary data, along with a non-SQL approach called tables for storing large structured data sets.
While applications see a single copy, Windows Azure storage replicates all blobs and tables three times.
This improves the application’s availability, since data is still accessible even when some copies are unavailable. And because persistent data is stored outside any of the application’s role instances, an instance failure loses only whatever data it was using at the moment it failed.
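As a toy illustration of why replication helps availability, the Python sketch below keeps three copies of each value and serves reads from any copy that is still reachable. The `ReplicatedStore` class and its methods are invented for this post and say nothing about how Windows Azure storage is actually implemented; the application-facing idea is only that it sees one logical copy.

```python
class ReplicatedStore:
    """Toy model: one logical store backed by several copies."""

    def __init__(self, copies=3):
        self.replicas = [dict() for _ in range(copies)]
        self.available = [True] * copies

    def write(self, key, value):
        # Simplification: every copy is updated on each write.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Serve the read from the first copy that is still reachable.
        for replica, up in zip(self.replicas, self.available):
            if up and key in replica:
                return replica[key]
        raise IOError("no available replica holds " + repr(key))
```

With three copies, the data stays readable after two copies become unavailable; only losing all three makes the read fail.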
The Windows Azure programming model requires an application to behave correctly when a role instance fails. To do this, every instance in an application must store all persistent data in Windows Azure storage or another external storage mechanism (such as SQL Azure, Microsoft’s cloud-based service for relational data).
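Here is a rough Python sketch of what "store all persistent data externally" buys you. `ExternalStore` is an in-memory stand-in for Windows Azure storage or SQL Azure, and `WorkerInstance` and its `fail_after` parameter are hypothetical names used only to simulate a crash. The worker checkpoints its progress in the external store after every item, so a replacement instance can pick up where the failed one stopped.

```python
class ExternalStore:
    """In-memory stand-in for durable storage outside any role instance."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


class WorkerInstance:
    def __init__(self, store):
        self.store = store
        self.scratch = []  # local state: gone if this instance fails

    def process(self, job_id, items, fail_after=None):
        # Resume from the last checkpoint recorded in external storage.
        done = self.store.get(job_id, 0)
        for index in range(done, len(items)):
            if fail_after is not None and index >= fail_after:
                raise RuntimeError("simulated instance failure")
            self.scratch.append(items[index].upper())
            self.store.put(job_id, index + 1)  # checkpoint after each item
        return self.store.get(job_id, 0)
```

If one instance fails after two of three items, a new `WorkerInstance` given the same store processes only the third item. Whatever sat in the failed instance's `scratch` list is lost, which matches the point above: a failure costs only the data the instance was using at that moment.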
There is one more, recently introduced option: Windows Azure drives. Normally, any data an application writes to the local file system of its own VM can be lost when that VM stops running. Windows Azure drives change this, using a blob to provide persistent storage for the file system of a particular instance. These drives have some limitations (for example, only one instance at a time may both read from and write to a particular Windows Azure drive, with all other instances in the application allowed only read access), but they can be useful in some situations.
INTERACTIONS AMONG ROLE INSTANCES
When an application is divided into multiple parts, those parts commonly need to interact with one another. In a Windows Azure application, this is expressed as communication between role instances. For example, a Web role instance might accept requests from users, and then pass those requests to a Worker role instance for further processing.
The way this interaction happens isn’t identical to how it’s done with ordinary Windows applications. Once again, a key fact to keep in mind is that, most often, all instances of a particular role are equivalent—they’re interchangeable. This means that when, say, a Web role instance passes work to a Worker role instance, it shouldn’t care which particular instance gets the work. In fact, the Web role instance shouldn’t rely on instance-specific things like a Worker role instance’s IP address to communicate with that instance. More generic mechanisms are required.
The most common way for role instances to communicate in Windows Azure applications is through Windows Azure queues.
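Windows Azure queues don’t support transactional reads, and so they don’t guarantee exactly-once, in-order delivery. A message may therefore be handed out more than once, or out of order, and the receiving instance has to be written to tolerate both.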
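Without exactly-once delivery, the usual defence is to make message handling idempotent. The Python sketch below uses a toy `SimulatedQueue` (not the real Windows Azure queue API) in which a message stays in the queue until it is explicitly deleted, so a worker that fails before deleting it causes a redelivery. The `handle` function, the message IDs, and the `processed_ids` set are all assumptions for illustration; in a real application the set of processed IDs would itself have to live in external storage.

```python
from collections import deque


class SimulatedQueue:
    """Toy at-least-once queue: a message stays queued until deleted."""

    def __init__(self):
        self._messages = deque()

    def put(self, message_id, body):
        self._messages.append((message_id, body))

    def get(self):
        # Return the next message without removing it; None if empty.
        return self._messages[0] if self._messages else None

    def delete(self, message_id):
        self._messages = deque(
            m for m in self._messages if m[0] != message_id
        )


def handle(queue, processed_ids, results, crash_before_delete=False):
    """Process one message; return False if the queue was empty."""
    message = queue.get()
    if message is None:
        return False
    message_id, body = message
    if message_id not in processed_ids:  # skip work already done
        results.append(body.upper())
        processed_ids.add(message_id)
    if crash_before_delete:
        return True  # simulated failure: the message will be seen again
    queue.delete(message_id)
    return True
```

If a worker "crashes" after doing the work but before deleting the message, the next call sees the same message again, recognises its ID, and only deletes it, so the work is recorded once.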
Most of the time, queues are the best way for role instances within an application to communicate. It’s also possible for instances to interact directly, however, without going through a queue. To allow this, Windows Azure provides an API that lets an instance discover all other instances in the same application that meet specific requirements, then send a request directly to one of those instances. In the most common case, where all instances of a particular role are equivalent, the caller should choose a target instance randomly from the set the API returns. Instances aren’t always equivalent, though: a Worker role might implement an in-memory cache with each role instance holding specific data, in which case the caller must access a particular instance. Most often, though, the right approach is to treat all instances of a role as interchangeable.
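The "pick a random equivalent instance" rule could look roughly like this in Python. The `discover_instances` function is a hypothetical stand-in for the discovery API mentioned above (here it just reads from a dictionary), and the instance names are made up; only the selection logic is the point.

```python
import random


def discover_instances(role_name, topology):
    """Hypothetical stand-in for the platform's instance discovery API."""
    return list(topology.get(role_name, []))


def pick_target(role_name, topology, rng=random):
    """Choose one instance of a role at random, treating all as equivalent."""
    candidates = discover_instances(role_name, topology)
    if not candidates:
        raise LookupError("no running instances for role " + repr(role_name))
    return rng.choice(candidates)
```

Because the caller looks up the candidates on every call and never remembers a specific instance, it keeps working as instances come and go. The in-memory cache case would need a different rule, for example mapping a key to a particular instance, which this sketch deliberately leaves out.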
The next post on Azure will again be based on David Chappell’s paper, with some bits from my own experience: some tips on how to port existing Windows Server applications to Azure.