Math 101 for developers

My development process is deeply mathematical. It is based on the work of Bertrand Meyer and other computer scientists. Math is severely under represented among today's TDD/YAGNI/Agile/anti-BDUF digerati. I like to think through the problem before putting hands to keyboard, and I don't like being told I'm wrong for doing so. So to fill the void, I will be presenting my own mathematically-based process.

When people hear "math", they get scared. They think about how they struggled through trig class, or how statistics kicked their butts. If your palms sweat when you think about geometry, please calm down and bear with me. The math that we are going to talk about is not hard. You just have to think things through logically.

Null reference exceptions
Even though my process begins away from the keyboard, I'll begin this discussion with code. We can work backwards from there.

Your first exercise is to prove that this code does not throw a null reference exception.

class DogLover
{
	public void BuyAPet(Store store)
	{
		Pet pet = store.BuyDog();

		if (pet == null)
			throw new Exception("No dogs were available.");
		else
			pet.Feed();
	}
}

We don't know anything about the BuyDog() method. It may return null or it may not. We have to check the contract. Not many programming languages have a way to encode a contract (Eiffel and Spec# are the only two that come to mind), so we usually rely on comments if we write a contract at all. Here's the contract for BuyDog():

/// <summary>
/// Buy a dog from the pet store.
/// </summary>
/// <returns>A dog if one is available, or null if they are out of stock.</returns>
public Pet BuyDog()
{
	// ...
}

So the contract tells us that BuyDog() could return null. That is its postcondition: the part of the contract that promises what will be true. Based on this postcondition, we cannot prove that pet is not null right after the call.

But we do have a check in the code. As you enter the "else", we know that the "if" condition is false. Therefore, pet != null. pet.Feed() does not throw a null reference exception.

That is the complexity of proof that I'm talking about. It's really easy to do. Much easier than anything you'll see in Euclid's Elements.

Preconditions
Go back and check the code again. There is a case that we did not consider. We proved that pet.Feed() will not throw, but we did not prove that store.BuyDog() is safe. There are two ways to do so: add a check, or add a precondition.

If we add a check before store.BuyDog(), then the same proof as above applies. This is known as "defensive programming". I personally do not like defensive programming because of all the noise it adds to code. I much prefer preconditions.

A precondition is part of the contract. It is a requirement that certain conditions must be true before you call a method. We add a precondition like so:

/// <summary>
/// Buy a pet from a store.
/// </summary>
/// <param name="store">The store from which to buy the pet.
/// Must not be null.</param>
public void BuyAPet(Store store)
{
	// ...
}

Now the caller knows that he cannot pass null.

But this comment did nothing to change the behavior of the code! How does this solve the problem?

The problem is not what the code actually does. The problem is that you couldn't prove that the code worked. Now you can. The BuyAPet() method lives inside of a larger system. If there ever was a problem, you would know that the author of BuyAPet() proved his code; the caller did not. Therefore the bug is in the calling code.

Most compilers do not do this proof for you. Eiffel verifies contracts at run time, not at compile time. Spec# can do some simple proofs at compile time (including this one), but is not yet adopted for commercial use. Still, if dynamic languages can get away with not checking types until run time, I'm fine with not checking contracts until run time. That's what unit tests are for.

Predicates
Let's modify the code a bit.

/// <summary>
/// Buy a pet from a store.
/// </summary>
/// <param name="store">The store from which to buy the pet.
/// Must not be null.</param>
public void BuyAPet(Store store)
{
	if (store.DogsInStock < 1)
		throw new Exception("No dogs were available.");

	Pet pet = store.BuyDog();
	pet.Feed();
}

We've removed the check for pet == null. Can you prove that this code is safe?

You still can if you define a predicate on Store.

/// <summary>
/// The number of dogs available for sale.
/// </summary>
public int DogsInStock { get; private set; }

/// <summary>
/// Buy a dog from the pet store.
/// </summary>
/// <returns>A dog if DogsInStock > 0, or null otherwise.</returns>
public Pet BuyDog()
{
	// ...
	return null;
}

The postcondition on BuyDog() now references the property DogsInStock. We can use that in our proof. Now the steps are:

  1. If we reached store.BuyDog(), then store.DogsInStock < 1 is false.
  2. store.DogsInStock < 1 is false implies store.DogsInStock >= 1 is true.
  3. store.DogsInStock >= 1 implies store.DogsInStock > 0 (since DogsInStock is an integer).
  4. By the precondition of BuyDog(), DogsInStock > 0 implies that a non-null dog is returned.
  5. pet is not null.
  6. pet.Feed() does not throw.

I would usually not spell out these steps so explicitly, but these are the thoughts that go through my mind as I read and write code.

Concurrency
There is one small detail that I conveniently ignored in the prior proof. If we allow for concurrent access of the Store, then the proof is invalid. We cannot rely on DogsInStock > 0 when we get to step 4, since that might be changed in a different thread.

Concurrency is the devil for proofs. Functional languages lend themselves better to proof specifically because they disallow concurrent modification of an object; all objects are immutable, so there is no way that another thread could modify it. But hope is not lost.

In my code, I explicitly call out the parts that could be used concurrently. If nothing is said, then the code is not assumed to be thread-safe. You are therefore not allowed to call it concurrently. Since BuyAPet() makes no promise of thread safety, it would be incorrect to call it on multiple threads without guaranteeing that no other thread is using the same DogLover or Store. It is an implied precondition, but it is still a precondition.

I limit the amount of multi-threaded code in my applications. By the time I get to business logic, I've isolated the object and all of its parameters. They are either immutable or owned by that thread by the time the method is called. This allows me to use proof even in the face of concurency. And I can do so without adding noise to the business code in the form of locks or explicit preconditions.

Proof is a valuable tool in software development. Take a look at the code you wrote today and see if you can prove that it does not reference null.

Leave a Reply

You must be logged in to post a comment.