Archive for the ‘Contracts’ Category
Friday, April 3rd, 2009
It's a common refrain among the DBAs that I've worked with. We talk about a new feature, and they describe how it can be done in the database. They could write a stored procedure, a calculated column, a view, and a series of triggers to get exactly the behavior required. Their toolset is not limited to tables and indexes. They want to do more than just storage.
I respect that. I see what databases are capable of, and I want to take advantage of those capabilities where appropriate. On the other hand, I sympathize with arguments about separation of concerns. View logic is in the view, business logic is in the middle tier, and data access logic is in the database. Putting view logic or business logic into the data tier leads to trouble.
The business has asked for a rich security model for our eCommerce system. They've identified several privileges that map to features of the site. They want an admin user to create roles containing those privileges. The admin user can then assign those roles to other users of the system.
Occasionally, there will be a one-off privilege that you want to explicitly grant or deny to a user. So after assigning a user some roles, the admin should be able to specify overrides. The ERD appears on the right.
The UI for managing roles and privileges is somewhat complex. The admin user searches for a user, and is presented with a list of roles. The admin clicks checkboxes next to the roles to assign them to the user.
Then the admin can navigate to another page where they are presented with a list of privileges. A green icon indicates default privileges -- the privileges that are part of a role to which the user is assigned. A checkbox indicates whether the user is granted that privilege. If there are no overrides, the green icons and the checkboxes agree. By checking and unchecking the privileges, the admin can create overrides.
In more formal notation, here is the dependent behavior that interprets the tables.
- user_in_role(User u, Role r) = there exists UserRole ur where ur.user = u and ur.role = r
- privilege_in_role(Privilege p, Role r) = there exists RolePrivilege rp where rp.privilege = p and rp.role = r
- is_default_privilege(User u, Privilege p) = there exists Role r such that user_in_role(u, r) and privilege_in_role(p, r)
- user_has_privilege(User u, Privilege p) =
    if there exists Override o such that o.user = u and o.privilege = p -> o.isGranted
    else -> is_default_privilege(u, p)
The green icon reflects is_default_privilege, and the checkbox reflects user_has_privilege.
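The rules above can also be read as ordinary code. What follows is a minimal sketch, assuming in-memory maps in place of the UserRole, RolePrivilege, and Override tables. The class name PrivilegeModel and its methods are hypothetical and do not come from the actual system.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for the UserRole, RolePrivilege, and Override tables.
class PrivilegeModel {
    // user -> roles assigned to that user (UserRole)
    private final Map<String, Set<String>> userRoles = new HashMap<>();
    // role -> privileges contained in that role (RolePrivilege)
    private final Map<String, Set<String>> rolePrivileges = new HashMap<>();
    // user -> (privilege -> isGranted) (Override)
    private final Map<String, Map<String, Boolean>> overrides = new HashMap<>();

    void assignRole(String user, String role) {
        userRoles.computeIfAbsent(user, k -> new HashSet<>()).add(role);
    }

    void addPrivilegeToRole(String privilege, String role) {
        rolePrivileges.computeIfAbsent(role, k -> new HashSet<>()).add(privilege);
    }

    void override(String user, String privilege, boolean isGranted) {
        overrides.computeIfAbsent(user, k -> new HashMap<>()).put(privilege, isGranted);
    }

    // is_default_privilege: some role of the user contains the privilege.
    boolean isDefaultPrivilege(String user, String privilege) {
        for (String role : userRoles.getOrDefault(user, Collections.emptySet())) {
            if (rolePrivileges.getOrDefault(role, Collections.emptySet()).contains(privilege)) {
                return true;
            }
        }
        return false;
    }

    // user_has_privilege: an override wins; otherwise fall back to the default.
    boolean userHasPrivilege(String user, String privilege) {
        Boolean granted = overrides.getOrDefault(user, Collections.emptyMap()).get(privilege);
        return granted != null ? granted : isDefaultPrivilege(user, privilege);
    }
}
```

In this sketch the green icon would bind to isDefaultPrivilege and the checkbox to userHasPrivilege. A denied override leaves the default untouched, so the two can disagree.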
Calculating privileges in a view
The DBA, upon seeing these requirements, designed a view that calculates privileges for a user. Each row is a privilege. One bit column indicates whether the privilege is a default for the user, and another indicates whether the privilege is granted to the user. Looking at the formal notation, you can see that SQL is a natural language in which to describe this behavior.
But what happens when the user checks a checkbox? We want to either create or delete an Override. If the application code consumes the view, why should it need to also know about this logic?
The DBA also took care of this. He created an INSTEAD-OF UPDATE trigger on the view. When user_has_privilege is updated, an Override is either created or deleted. Now all of the logic for interpreting this table structure is encapsulated in a view. The view is the contract with the database.
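To make the create-or-delete decision concrete, here is a small sketch of the same rule written in Java rather than as a trigger. It assumes that setting the checkbox back to the role default deletes the override, and that any other value creates or updates one. That reading is inferred from the description above, not taken from the DBA's actual trigger. OverrideEditor and setGranted are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the decision made when a checkbox changes, for one user.
class OverrideEditor {
    // privilege -> isGranted, standing in for this user's rows in the Override table
    final Map<String, Boolean> overrides = new HashMap<>();

    // Called when the admin sets the checkbox for a privilege.
    // isDefault is the value of is_default_privilege for this user and privilege.
    void setGranted(String privilege, boolean isDefault, boolean desired) {
        if (desired == isDefault) {
            // The checkbox agrees with the roles again, so no override is needed.
            overrides.remove(privilege);
        } else {
            // The checkbox disagrees with the roles: record an explicit grant or denial.
            overrides.put(privilege, desired);
        }
    }
}
```

Whichever tier owns this rule, the caller only states the desired checkbox value and never handles Override rows directly.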
Appropriate for our architecture?
This solution would be appropriate for a two-tier application. The page could bind directly to the view. But we have chosen a three-tier architecture. Between the page and the database is application logic. This approach is less desirable when a middle-tier is available.
The view combines all of the information into one result set, which takes away the ability for the application tier to cache privileges. Privileges do not change, unless we deploy a new version of the code. So caching them indefinitely is appropriate. The combined result set is bigger than the source data that could change, so it is less efficient.
The view is not a complete contract. The application still needs the ability to create and delete roles as individual entities. While this view encapsulates some of the business logic, it cannot encapsulate all of it. It is a leaky abstraction.
What is your opinion? Is this a data access contract? Is it business logic? Or is it perhaps UI logic? Which tier should handle this behavior?
Monday, June 30th, 2008
I've started a new project using Spring MVC. In doing so, I've had to invert my thinking.
Spring is an inversion of control container, which means that you don't code dependencies of one class directly upon another. Instead you put all of your dependencies into one configuration file and keep your code as loosely coupled as possible. This one configuration file creates a graph of objects, each with references to the others. Since the configuration file specifies the classes and references, the code for one class doesn't need to know the names of other classes.
Why is this important?
The dependency inversion principle tells us that it is better to depend upon something abstract than something concrete. This helps us to change how something is done without breaking the things that need it to be done. If your code depends upon an interface, you can change the thing that implements that interface without changing your code.
There are other reasons which have come to light since Robert Martin published his paper back in 1996. Dependency inversion makes unit testing easier, because you can replace the components that a unit calls with mock objects. Dependency inversion makes distributed computing easier, because you can replace business objects with proxies that call business logic on remote servers. In general, dependency inversion is a good goal.
How does Spring do it?
So if your class depends upon an interface rather than a concrete class, how does it get a reference to an object that implements that interface? It can't use "new" to create an instance, because "new" needs a concrete class name. To invert dependency, you have to move all of your "new"s into one place. That place is an inversion of control (IoC) container.
Spring reads an XML file that contains a bunch of object descriptions. You can think of each of these as a call to "new". Each one specifies a class name, an instance name, and a set of properties. These are write-once properties (what I like to call definitive), and should be initialized and never changed. These properties can include references to other object instances, thus forming a graph.
Spring MVC combines the dependency inversion principle with the model-view-controller pattern to create a pretty compelling web framework. The controllers and URL mappings are all configured through the IoC container. The URL mapper has a reference to each of the controllers, so it knows how to delegate the handling of a request. Because dependency has been inverted, the URL mapper doesn't know about the concrete classes that are the controllers; it only knows about an interface. So you can use the out-of-the-box URL mapper with your own custom controllers.
But it just so happens that Spring has a quaint mechanism for database access. In your XML file, you can configure a data source by providing a driver class name and a connection string to an instance of "org.apache.commons.dbcp.BasicDataSource". Then you can use this data source to execute queries using "org.springframework.jdbc.core.simple.SimpleJdbcTemplate". I wanted to use this technique from my data access layer. However, the XML file that defines the object graph is way up in my web layer. How can I push that object graph, or at least the data source, down through the business logic layer and into the data access layer?
That's when the full realization of dependency inversion hit me. I was thinking about the web layer depending upon the business logic layer, which then depended upon the data access layer. This is not the Spring way. Instead, the web, business logic, and data access layers are all independent. The IoC container depends upon them all. The individual components within these layers only depend upon interfaces.
So the one XML file declares a data source. It then declares a data access component and gives it a reference to the data source. Next comes a business logic component with a reference to the data access component. Then comes the controller, with a reference to the business logic component. Finally, the URL mapper gets a reference to the controller. As more URLs, controllers, and components are added, this chain widens into a graph.
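The chain described above can be sketched in plain Java, with one wiring method standing in for the XML file. This is only an illustration of the shape of the graph. It does not use Spring, it uses constructor arguments where the XML would set properties, and every interface and class name in it is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Each layer sees only these interfaces, never a concrete class from another layer.
interface ConnectionSource { String url(); }
interface OrderDao { int countOrders(String customer); }
interface OrderService { String summary(String customer); }
interface Controller { String handle(String customer); }

class FixedConnectionSource implements ConnectionSource {
    public String url() { return "jdbc:example:orders"; }  // made-up connection string
}

class StubOrderDao implements OrderDao {
    private final ConnectionSource source;
    StubOrderDao(ConnectionSource source) { this.source = source; }
    // A real implementation would query the database reached through source.
    public int countOrders(String customer) { return 2; }
}

class DefaultOrderService implements OrderService {
    private final OrderDao dao;
    DefaultOrderService(OrderDao dao) { this.dao = dao; }
    public String summary(String customer) {
        return customer + " has " + dao.countOrders(customer) + " orders";
    }
}

class OrderController implements Controller {
    private final OrderService service;
    OrderController(OrderService service) { this.service = service; }
    public String handle(String customer) { return service.summary(customer); }
}

class UrlMapper {
    private final Map<String, Controller> routes = new HashMap<>();
    void register(String path, Controller controller) { routes.put(path, controller); }
    String dispatch(String path, String customer) { return routes.get(path).handle(customer); }
}

// The only place that names concrete classes, playing the role of the container.
class Wiring {
    static UrlMapper build() {
        ConnectionSource source = new FixedConnectionSource();
        OrderDao dao = new StubOrderDao(source);
        OrderService service = new DefaultOrderService(dao);
        Controller controller = new OrderController(service);
        UrlMapper mapper = new UrlMapper();
        mapper.register("/orders", controller);
        return mapper;
    }
}
```

Swapping StubOrderDao for a real implementation touches only Wiring.build(), which is the point of moving every "new" into one place.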
You can't pick and choose which pieces of your application use dependency inversion. Please don't try. Once you start down the Spring path, all dependencies are inverted. The graph that's defined at the highest layer of your application delves deep into the lowest, and touches all layers in between. Consider dependency inversion for your next project, and think carefully about the consequences.
Tuesday, December 11th, 2007
When discussing topics such as definitiveness or dependency, I'll use the word "behavior" when others might say "state". The difference might seem academic, but it has some practical ramifications that become apparent later.
The behavior of an object is what you can observe. It can be measured using the object's public interface alone. The interface is a contract that constrains an object's behavior, and that contract doesn't allow you to see anything else. This is the object-oriented concept of information hiding.
The state of an object is what it contains. You need to pierce the public interface or use a debugger to observe an object's state. A loose contract (a.k.a. a "leaky abstraction") might let some of that state show through. The object uses its own state to implement its behavior. This is the object-oriented concept of encapsulation.
So the behavior of an object is implemented in terms of state. But behavior can be described without invoking state at all. I can say that the behavior of a business object is dependent upon the behavior of a row in a database. There may be caching going on, and that cache may have state that can change. But that doesn't make it dynamic. A properly implemented cache is still dependent, because it behaves that way.
When considering the public interface of an object, behavior is all that matters. So when I want to say something about an object and its interaction with other objects, I'll limit myself to talking about behavior. I'll ignore state until it's time to code.
The behavior of numbers
Consider the number 3. I can represent this number in any of a number of ways. I can draw three vertical lines. I can hold up three fingers. I can draw the numeral 3. Or I can store the bits 00000011. These are different states that represent the object 3, depending upon its implementation.
Mathematicians have defined integers based entirely on their behavior. They say that there exists an integer called zero. They also say that if you increment or decrement any integer, you will get another integer. This is a sufficient definition, and it has nothing to do with representation.
So you can define the number 3 as the integer that you get when you increment zero, increment the result, and increment again. It doesn't matter how you store the state "3"; a correct implementation will exhibit this behavior.
If we use vertical lines, you increment by drawing one more. If we use fingers, you hold up one more. If we use binary, you flip bits from right to left, starting at the least significant bit and stopping at the first zero-to-one transition. These are all implementation details based on state. But the behavior is always the same.
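As a small illustration, here are two of these representations behind one interface. This is a sketch under assumptions: the names Natural, Tally, and Binary are hypothetical, only increment is modeled, and the eight-bit version wraps around to zero after 255.

```java
// Hypothetical interface: the behavior of a small non-negative integer.
interface Natural {
    Natural increment();
    int toInt();   // only for observing the result in a test
}

// State: a string of vertical lines.
class Tally implements Natural {
    private final String marks;
    Tally() { this(""); }
    private Tally(String marks) { this.marks = marks; }
    public Natural increment() { return new Tally(marks + "|"); }
    public int toInt() { return marks.length(); }
}

// State: eight bits, most significant first, as in 00000011.
class Binary implements Natural {
    private final boolean[] bits;
    Binary() { this(new boolean[8]); }
    private Binary(boolean[] bits) { this.bits = bits; }
    public Natural increment() {
        boolean[] next = bits.clone();
        // Flip from the rightmost (least significant) bit; stop after the first zero-to-one flip.
        for (int i = next.length - 1; i >= 0; i--) {
            next[i] = !next[i];
            if (next[i]) break;
        }
        return new Binary(next);
    }
    public int toInt() {
        int value = 0;
        for (boolean bit : bits) value = value * 2 + (bit ? 1 : 0);
        return value;
    }
}
```

Code that holds a Natural cannot tell which class it was given. Incrementing zero three times yields 3 either way.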
The effect of describing behavior
If I limit myself to talking about behavior, then I get to do a few good things. First, I get to say what I mean without a bunch of qualifications. I can say that the business object depends upon the database row without having to describe the cache that sits in between. I can say that history grows indefinitely, even though a program that actually did that would eventually crash.
Second, I get to procrastinate. I can put decisions off until I have enough information to make them. I can implement an object's behavior one way, then measure the system to see if I need to change it.
Third, I get to prove things. There is a reason that mathematicians define integers based on behavior. It's hard to prove things about state. Formal proofs of algorithms are notoriously difficult. But proofs based on behavior are trivial by comparison.
When I explore some techniques for historic modeling in upcoming posts, I'll be talking about the behavior of objects. When getting down to the implementation details, I'll start talking about state. It's the difference between the two that makes these techniques so useful.
Friday, July 13th, 2007
Yesterday I attended a presentation at the Dallas .NET User's Group. Norm Headlam presented the Microsoft Patterns and Practices Web Service Software Factory (or Service Factory for short). You can download this tool from http://msdn2.microsoft.com/practices. Follow the "Web Service Software Factory" link.
Norm demonstrated how the Service Factory generates a solution with a set of well-organized projects. It generates a service layer, a data contract, a business layer, and a data access layer. Each of these assemblies has one and only one role. The service layer defines the web service. The data contract defines the data types that are exchanged between the client and the server. The business layer contains all of your business rules. And the data access layer interfaces with the database.
No longer will you return a dataset from a web service. These layers are separated according to best practices.
Norm showed how the Service Factory offers ongoing guidance. It doesn't just generate this skeleton and leave you to fill it in. Instead, you right-click on your projects to add data objects, data containers, and service endpoints. As you work, the Service Layer keeps a log of your changes. And it gives you links to your most likely next step.
Web services are no longer generated from your code. Instead, your code is generated from your contracts. You can import a database schema as a data object in the data access layer. Or you can import an XML schema to generate an object in the data contract. This is not like NHibernate, where the data layer is generated at runtime from business objects. This actually does it the right way, where the stored procedures that your DBA authors are honored, and all code is generated at design time.
The only problem I have with Microsoft's Web Service Software Factory is the name. This is not a Software Factory.
According to the book (Software Factories: Assembling Applications with Patterns, Models, Frameworks, and Tools), a software factory is a set of tools that generate several products within a software product line. All products in a product line serve the same vertical domain. They are described using domain specific languages.
"Web services" is not a product line. It is a horizontal slice of software technology, not a vertical problem domain. And these tools, useful as they are, are not a domain specific language. I look forward to using Service Factory on a real project in the near future, but I will also continue to create software factories in the real sense of the term.
Tuesday, June 12th, 2007
Several recent events are converging on a common theme.
Windows Presentation Foundation brings new user interface possibilities to Vista. Silverlight delivers WPF to the browser. Microsoft’s Surface computer lets us interact with the computer on a near-tactile level. Steve Jobs announced that the programming model for the iPhone is Ajax in Safari. Apple ported Safari to Windows. And Google Gears allows you to run a web application while not on the web.
All of these events demonstrate that the new programming model is markup and script. This has been the programming model for Web 2.0, but now it is breaking out of the browser.
This programming model has some promise, but there is a dark side. This moves us away from computer science based in proof. The tools that we have chosen to use for the new programming model are usually dynamically typed. With the exception of C# in WPF, the scripting languages that we bolt on to the markup are neither compiled nor statically typed.
Static typing is not just a way to give us intellisense, but it is a way to express intent in the form of a contract. Type checking is one simple way for the compiler to check a contract. But wouldn’t it be great if the compiler could do even more for us? Current research languages can prove that null pointers are not dereferenced. And with existing theorem provers, it is feasible to verify preconditions and postconditions at compile time. Despite the apparent evidence to the contrary, the web model is leading away from a path of increased productivity.
In addition, the web model does not support real-time collaboration. The best we can achieve with a request/response protocol like HTTP is rapid refresh. That solution is not scalable, and not feasible for many kinds of highly collaborative applications. And even when it works, the programming model forces developers to be aware of the fact that they are checking for updates. They can’t just express the relationships among objects and leave it at that.
I see a new programming model on the horizon after the markup/script wave has passed. In this programming model, the code represents the relationships between objects, not the triggers that cause the next domino to fall in a Rube Goldberg machine made of script. The compiler and the runtime can work together to make inter-related objects behave in a provably consistent way.
Update Controls is the first step along this path. With this library, you express the relationships between your user interface components and your data. You say what is to be displayed; the runtime takes care of when. If the data is changed in one window, the library pushes that change to the others automatically.
The next step is a collaborative object model that pushes those changes across the wire. When that step is ready, data can be changed on one computer, and the effect will automatically appear on another. Best of all, application developers won’t need to code that behavior. They simply express the relationships among objects and the runtime makes it happen.
I hope you will be with me for the next few steps in our evolution. The future is rapidly approaching in which we bring great value to our clients. It’s closer than you think.
Tuesday, May 29th, 2007
On the SSG oven software project, Raymond and I used several mathematical truths to prove that the design met the stated requirements. Three of those truths are mapping, scope, and partial order.
A mapping is a function from one set, called the domain, to another set, called the range. The range is not larger than the domain. If the mapping is one-to-one, then a reverse mapping exists and the range is exactly the same size as the domain. A hash is a many-to-one mapping, so the range is smaller than the domain. Therefore, no reverse mapping exists.
A scope is a context in which simpler objects reside. You can abstract the relationships among the enclosed objects as relationships among the scopes in which they live.
A partial order is a transitive relationship between objects in a set. It is not as restrictive as a full order, but allows for some independence. A partial order, because it is transitive, must be acyclic. Replacing a full order with a partial order can relax the constraints on a design, and give you room for optimization.
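One way to see the acyclic property at work is a containment relation that refuses any edge that would close a loop. This is a minimal sketch, not a proof and not the design used on the oven project. ComponentGraph, reaches, and addPart are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical registry of components; an edge means "contains".
class ComponentGraph {
    private final Map<String, Set<String>> parts = new HashMap<>();

    // True if target can be reached from start by following "contains" edges.
    boolean reaches(String start, String target) {
        Deque<String> pending = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        pending.push(start);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (current.equals(target)) return true;
            if (seen.add(current)) {
                pending.addAll(parts.getOrDefault(current, Collections.emptySet()));
            }
        }
        return false;
    }

    // Adds "whole contains part" unless that would make a component contain itself.
    boolean addPart(String whole, String part) {
        if (reaches(part, whole)) return false;   // also covers whole == part
        parts.computeIfAbsent(whole, k -> new HashSet<>()).add(part);
        return true;
    }
}
```

Two components that do not reach each other stay unordered, which is the independence a partial order leaves open. A runtime check like this rejects cycles one edge at a time, but it does not replace a proof that the design cannot express them.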
Your homework is to design an Integrated Circuit designer. Allow the user to build a component out of other components. Prove that your design does not allow the user to create a recursive cycle.
Tuesday, May 22nd, 2007
If you’ve taken a course in analysis of algorithms, then you’ve learned how to prove that an algorithm will have a certain result. You’ve proven that quicksort will terminate, that it will actually sort the list, and that it will complete in n log n time on average. But in the real world, all of the sorting and searching algorithms are already written. We no longer need to prove algorithms. Now we need to prove designs.
Cycle of discovery
Programmers tend to work in a code-test cycle. We’ll write some code, then we’ll see if it does what it should. Then we’ll go back and change the code and test again. The cycle continues until the feature is complete. This is a cycle of discovery. This is how we explore new things. This is the scientific process. This is not how other engineering professions work. Sure, it gets the job done, but it’s not the most efficient way to do it.
Agile development techniques put new labels on the cycle of discovery. Now instead of code-test we have red-green-refactor. But it’s still the same cycle. We don’t know ahead of time that the code we are about to write is correct. We just keep typing until it works. In this world you don’t know how close you are to the finish line until you cross it. How are we ever to estimate our work? Or mitigate risk?
Would you buy a house from someone who says, “let’s just keep nailing boards together until it looks like a building”? Of course not. But most software is more complex than a house. It takes longer to build. It has more moving parts. It’s more expensive. We should be doing at least as much as a home builder to ensure that we will deliver the right product, at high quality, on time, and on budget.
We can do that by proving our design.
Cycle of proof
In my day-to-day work, I follow a different cycle. I’ll learn the requirements of a feature, design the feature, code it once, and then fix bugs. I don’t spin in a tight code-test cycle. Once I know what the design should be, I just sit down and write it all out. Sure, I may have bugs and typos, but they tend to be easy to fix. I may go days between compiles, but in the end I know that it will work. I know because I’ve proven it.
Tuesday, October 17th, 2006
Java's Date and Calendar API is awful. Even when a quick date calculation is required, you have to stop the flow of your coding to go through the GregorianCalendar gyrations. It is impossible to simply express the intent with these defunct classes.
I've decided to use this as a learning exercise. What is it about Date and Calendar that makes it so difficult to use? What can we do to avoid these difficulties when designing our own classes? Fortunately, there is an alternative that gets it right. When comparing java.util to Joda, the problems become quite apparent.
The first problem is that the Date class has no useful methods. In order to use a Date, one must almost always put it into a GregorianCalendar. Most of us writing in Java are working on business applications, and business uses the Gregorian calendar. Different countries have different names for the days of the week and the months of the year, but we all have days, weeks, months, and years. The Date class should expose those concepts directly.
The second problem is that method calls cannot be chained. OK, I've accepted the fact that to work with a Date, I have to construct a GregorianCalendar. So I start to type something like this:
int day = new GregorianCalendar(date).get(Calendar.DAY_OF_MONTH);
But, no, GregorianCalendar has no constructor that takes a Date. I have to construct and initialize it in two separate steps. And then, to make matters worse, the method that initializes the GregorianCalendar is void. So I can't even chain this call with the get. I have to do in three lines what I would like to do in one.
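For reference, the three-step version that java.util does accept looks roughly like this. The wrapper class DayOfMonth is a hypothetical name used only to make the snippet self-contained. The calls inside it are the standard ones.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

class DayOfMonth {
    static int of(Date date) {
        GregorianCalendar calendar = new GregorianCalendar();  // step 1: construct
        calendar.setTime(date);                                // step 2: initialize (returns void)
        return calendar.get(Calendar.DAY_OF_MONTH);            // step 3: read the field
    }
}
```

The result depends on the default time zone, because the no-argument constructor uses it.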
The third problem is that the API provides one generic getter and setter for all fields of a Calendar. A Calendar is a collection of fields identified by constants. But I don't care how you represent a Calendar, I just want to know what day it is.
The fourth problem is that there is no distinction between a date (October 17, 2006), a time of day (10:11 pm), and a moment in time (October 17, 2006 at 10:11 pm). These different concepts need to be treated differently. When working with dates, for example, you don't apply time zones or daylight saving rules. The span between October 29 and October 30 is still 1 day, even though the span between midnight on those two days may be 25 hours (depending upon where you live).
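The October 29 example can be checked in code. This sketch uses the java.time classes that ship with Java 8 and later instead of Joda, so that it runs without an extra library. The type names differ from Joda's, and the choice of the America/Chicago zone is an assumption made for illustration. DateVersusMoment is a hypothetical helper.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;

class DateVersusMoment {
    // Whole calendar days between two dates; no time zone is involved.
    static long daysBetween(LocalDate start, LocalDate end) {
        return ChronoUnit.DAYS.between(start, end);
    }

    // Elapsed hours between midnight on two dates in a given zone.
    static long hoursBetweenMidnights(LocalDate start, LocalDate end, ZoneId zone) {
        ZonedDateTime from = start.atStartOfDay(zone);
        ZonedDateTime to = end.atStartOfDay(zone);
        return Duration.between(from, to).toHours();
    }
}
```

For October 29 and October 30, 2006, daysBetween gives 1 regardless of location, while hoursBetweenMidnights gives 25 in a United States zone, because clocks were set back that night.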
Joda solves all of these problems. It is easy to use right where you need it without breaking stride. Do yourself a favor and start using it now, even if you are right in the middle of a project. You'll never want to be without it.
But the point of this post is not to plug Joda, but to codify four rules for creating good APIs:
- Put the useful methods on the most used classes. Don't force your audience to bring in objects that they don't already need.
- Promote method chaining. This is accomplished in two ways. First, all methods should return something, even if it is "this". And second, provide all necessary conversions. Overload constructors, or better yet have a good set of static methods (GregorianCalendar.fromDate() would have been nice).
- Break down the abstraction the way your audience will use it. Don't design the API based on implementation.
- Create a different type for each useful distinct idea, even if it appears to be a special case of another. Remember, a circle is not an ellipse, no matter what you learned in geometry.
While you are working on the API, remember the golden rule: Necessary and Sufficient. Don't go crazy with the convenience methods or you will only sow confusion.
Thursday, September 14th, 2006
A good interface contains only the methods that it needs. The interface has enough methods to accomplish its task: the set of methods is sufficient. In addition, the interface does not have any methods that it doesn't need: the set of methods is necessary. If you find that an interface has unnecessary methods, you should remove them. And if you find that it is insufficient, you must add to it.
A good contract is also both necessary and sufficient. A contract is more than the set of methods (the interface) that a type supports. It is also the constraints: preconditions, postconditions, and invariants. If it has unnecessary constraints, then there are otherwise valid situations that violate the contract. If it has insufficient constraints, then the contract allows some invalid situations.
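A contract of this kind can be written down as runtime checks. The sketch below is a small stack with preconditions, postconditions, and one invariant. It is an illustration in plain Java and not Meyer's Eiffel version. The names CheckedStack and check are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stack whose contract is checked at run time.
class CheckedStack {
    private final List<Integer> items = new ArrayList<>();

    boolean isEmpty() { return items.isEmpty(); }
    int size() { return items.size(); }

    void push(int value) {
        int before = size();
        items.add(value);
        // Postconditions: one more item, and the new item is on top.
        check(size() == before + 1 && top() == value, "push postcondition");
        invariant();
    }

    int top() {
        // Precondition: the stack is not empty.
        check(!isEmpty(), "top requires a non-empty stack");
        return items.get(items.size() - 1);
    }

    int pop() {
        // Precondition: the stack is not empty.
        check(!isEmpty(), "pop requires a non-empty stack");
        int before = size();
        int value = items.remove(items.size() - 1);
        // Postcondition: exactly one item was removed.
        check(size() == before - 1, "pop postcondition");
        invariant();
        return value;
    }

    // Invariant: isEmpty and size always agree.
    private void invariant() {
        check(isEmpty() == (size() == 0), "invariant");
    }

    private static void check(boolean condition, String message) {
        if (!condition) throw new IllegalStateException(message);
    }
}
```

Dropping the precondition on pop would make the contract insufficient, since it would admit popping an empty stack. Adding a precondition such as "size must be even" would make it unnecessary, since it would reject valid uses.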
A good unit test suite is also both necessary and sufficient. If it has unnecessary tests, then it is harder to maintain than it could be. If it has insufficient tests, then it won't detect all defects.
The balance of necessity and sufficiency is found all throughout mathematics. Software, being a form of applied mathematics, inherits this trait. Something that achieves this balance is beautiful to behold.
Two examples of this balance in literature can be found in my recommended book links. Euclid's Elements is the foundation for modern mathematics. He provides a small number of axioms, and from those he derives a large number of theorems. The axioms are necessary and sufficient to describe all of geometry. Bertrand Meyer's Object-Oriented Software Construction contains example after example of necessity and sufficiency in software contracts. Pay close attention to his Stack type.
Examples of software that do not achieve this balance are abundant. Many parts of the JDK have "convenience methods", which by definition are unnecessary. The SOAP specification is an insufficient contract, which leads to many interoperability problems.
Achieving this balance takes time. Only with experience can we hope to get there. It's something I strive for every day.