Versioning in a heterogeneous environment
Thursday, June 29th, 2006
I work on Enterprise software systems. These systems tend to have many components distributed across several machines. These machines are joined by networks, but separated by time and space.
Some of the machines are under our control. Some are not. We control the machines in the datacenter: web servers, application servers, and database servers, among others. We do not control the customers' machines that run our client software, the customer datacenters that replicate our tables, or the third-party integration servers that call our web services.
We run heterogeneous versions of our software every day. We can't force someone else to upgrade their machines when we upgrade ours. And even on the machines that we control, we have to tolerate heterogeneous versions during the rollout. This is a 24x7 environment. We can't take the datacenter down, roll out all the upgrades, then bring it back up. We have to roll out while live. And we have to be careful. So a rollout may take a week or more.
We have found a few patterns that help us cope with heterogeneous versions of our software running on several machines. Here are a few of them.
When making changes to the database, we can only add to the schema. We can add tables, add columns to existing tables, add stored procedures, add parameters to the end of existing stored procedures, and add columns that stored procedures return. And all new parameters must have defaults. We cannot take away or rearrange anything. All database access is accomplished through stored procedures.
These rules allow us to upgrade the database before the application servers. The older code will work just fine with the new procs.
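To make the additive rules concrete, here is a minimal sketch in Python using the standard sqlite3 module. SQLite has no stored procedures, so plain Python functions stand in for them here. The table, columns, and function names (customer, region, add_customer, get_customer) are invented for illustration and are not taken from our actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be read by column name

# Version 1 schema (hypothetical table).
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Version 2 upgrade, applied before any caller is upgraded.
# It only adds a column, and the new column has a default.
conn.execute(
    "ALTER TABLE customer ADD COLUMN region TEXT NOT NULL DEFAULT 'unknown'"
)

def add_customer(name, region="unknown"):
    """Stand-in for a version 2 proc: 'region' was appended at the end
    of the parameter list and has a default, so old calls still work."""
    cur = conn.execute(
        "INSERT INTO customer (name, region) VALUES (?, ?)", (name, region)
    )
    return cur.lastrowid

def get_customer(customer_id):
    """Stand-in for a version 2 proc: returns one extra column, 'region'."""
    return conn.execute(
        "SELECT id, name, region FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()

# Version 1 caller: passes only the original argument and reads only
# the original column, by name.
old_id = add_customer("Acme")
old_view = get_customer(old_id)["name"]

# Version 2 caller: uses the new parameter and the new column.
new_id = add_customer("Globex", "EMEA")
new_view = get_customer(new_id)["region"]
```

The old caller never sees the change, which is the whole point. Note that it reads columns by name; a caller that depended on column position or column count would be more fragile.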
Since messages flow both ways between clients and servers, we can't get by with the same rules that we use for the database. So we use loose data types for client-server boundaries. Our favorite data type is the property bag. A property bag simply contains named properties of various types. Bags can contain bags, and arrays of bags, allowing for fairly complex data structures. But these data structures are not strict. If either the client or the server doesn't find a property that it expects, it has to assume a reasonable default.
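As a rough illustration, a property bag can be modeled in Python as a dictionary whose values are scalars, other bags, or lists of bags. This is only a sketch of the idea, not our actual implementation. The property names (id, priority, customer, lines, gift_wrap) and the read_order function are made up for the example.

```python
def read_order(bag):
    """Read an order from a property bag, assuming a default for any
    property that is missing and ignoring any property it doesn't know."""
    customer = bag.get("customer", {})  # nested bag, may be absent
    lines = bag.get("lines", [])        # array of bags, may be absent
    return {
        "id": bag.get("id", 0),
        # 'priority' is a property that only newer senders include.
        "priority": bag.get("priority", "normal"),
        "customer_name": customer.get("name", ""),
        "line_count": len(lines),
    }

# A message from an older sender: no 'priority' property.
old_message = {
    "id": 7,
    "customer": {"name": "Acme"},
    "lines": [{"sku": "A1"}, {"sku": "B2"}],
}

# A message from a newer sender: adds 'priority', plus 'gift_wrap',
# which this reader has never heard of.
new_message = {**old_message, "priority": "rush", "gift_wrap": True}

from_old = read_order(old_message)
from_new = read_order(new_message)
```

The same reader handles both messages. The cost is that a misspelled or missing property silently becomes a default instead of an error, which is why this pattern suits components we ship ourselves better than third-party integrations.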
Property bags allow us to upgrade distributed components on varying schedules, but they are loose data types. Strict data types are better for third-party integration. So we have a different pattern for that. We usually do this sort of integration through web services. So we will publish a certain version of the WSDL that includes the strict data types that that version expects. We will create a virtual folder for that web service that is named according to the WSDL version. When we need to make a change, we will publish new WSDL to a new virtual folder. Even if only one data type changed, we will make that an entirely new WSDL deployment.
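The versioned-folder idea can be sketched without any web service machinery. In the Python sketch below, each version of the contract gets its own strict type and its own path, and an old path keeps serving the old contract after a new one is published. The paths, type names, and handlers are hypothetical, and a dictionary lookup stands in for the web server's virtual folders. The dataclasses here are strict about field names only, not about value types.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OrderV1:
    order_id: int
    amount: float

@dataclass(frozen=True)
class OrderV2:
    # One field was added, so the whole contract is republished as v2.
    order_id: int
    amount: float
    currency: str

def handle_v1(payload):
    # Raises TypeError if a field is missing or unexpected.
    order = OrderV1(**payload)
    return f"v1 order {order.order_id}"

def handle_v2(payload):
    order = OrderV2(**payload)
    return f"v2 order {order.order_id} in {order.currency}"

# Stand-in for virtual folders named after the contract version.
ENDPOINTS = {
    "/services/orders/v1": handle_v1,
    "/services/orders/v2": handle_v2,
}

def dispatch(path, payload):
    """Route a request to the handler published at that versioned path."""
    return ENDPOINTS[path](payload)

v1_result = dispatch("/services/orders/v1", {"order_id": 1, "amount": 9.5})
v2_result = dispatch(
    "/services/orders/v2", {"order_id": 2, "amount": 9.5, "currency": "USD"}
)
```

An integrator that was built against v1 keeps calling the v1 path and never sees the v2 types. Sending a v2 payload to the v1 path fails loudly rather than being quietly reinterpreted, which is what we want from a strict contract.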
These are just three of the patterns that we've learned over time. Post comments and let me know what patterns you use.