Build your own rules engine 2: If-then is backwards

Download the source code and follow along.

This post is part of a series:

  1. The customer is not you.
  2. If-then is backwards.
  3. Condition and outcome templates.
  4. Condition composition.
  5. The rule model.

It seems that every new rules technology is just another way to write an "if" statement. We are used to expressing conditional logic in procedural code with flow control. Most of the rules engines that you could buy apply that same paradigm to rules. This is a mistake. It is better to turn that around.

Take, for example, the Windows Workflow Foundation Rules Engine. Every rule in the rule set is made up of three components:

  • Condition
  • Then actions
  • Else actions

The condition can read any property of the workflow. It determines which of the two actions to execute. Both of the actions can modify any property of the workflow. Reading and writing from the same set of properties causes problems.

Conflicts
Replace One problem occurs when two rules have actions that write to the same property. For example:

  • If A = 1 then C = 3
  • If B = 2 then C = 4

Given this rule set, when A=1 and B=2, what is the correct value of C?

Workflow and other rule engines solve this problem by prioritizing rules. Rules with higher priority take precedence over rules with lower priority. The unfortunate consequence of this is that you have to understand the priorities of existing rules in order to correctly prioritize a new rule. You cannot insert the new rule with confidence that it won't break anything.

Cycle Cycles
Another problem occurs when a cycle arises among conditions and actions:

  • If D > 4 then E = 7
  • If E > 5 then D = 3

If the second rule fires, should we re-run the first? How do we know when to stop?

Many rules engines use a sophisticated pattern-matching strategy called the Rete Algorithm to solve this problem. This algorithm runs subsets of the rules on each iteration based on the effects of the prior iteration. It walks the deltas toward a satisfactory solution.

Unfortunately, this algorithm is complex and can produce unexpected results. A business user defining a rule set cannot easily reason through cycles to determine exactly what the system will do. It is easier to reason about an acyclic system.

Goal directed rules
What works better is to express the rules in terms of the goal, not in terms of the condition. Rather than saying "If the amount is less than $500, the loan is approved", turn it around. "The loan is approved when the amount is less than $500." While this seems like a semantic distinction, it solves the problems of conflicts and cycles quite neatly.

When I say "The loan is approved when the amount is less than $500", I am also saying that there are no other conditions. To add another condition, I would have to modify the statement. For example: "The loan is approved when the amount is less than $500 or the amount is between $500 and $1000 and the credit score is over 700."

Think about the way that Excel spreadsheets work. I don't write a rule that says "Add one to B2 and store the result in C3". I put a formula in C3 that says "=B2+1". I cannot simply add another rule to the system that also affects C3. I would have to modify the formula in C3. The result is unambiguous.

Think also about the way that Linq works. I don't write a loop that modifies my target collection. I write an expression that calculates it. The behavior of the loop is harder to predict; the behavior of the expression is explicit.

A goal directed rule states the value of one property based on the values of other properties. The rule is explicit. Properties are not allowed to change, and rules are complete, so one rule cannot conflict with another.

Stack Find and eliminate cycles
When rules are goal directed, cycles are easy to find. The algorithm is simple. Start with one property that you intend to solve for. Mark it as being evaluated, and then evaluate it. If that rule refers to another property, check to see if it's marked. If it is not, mark it and evaluate it. If it is, you have found a cycle.

Suppose we wanted to add this rule to our point-of-sale system:

  • People who spend more than $500 get a 5% big spender discount.

More formally:

  • Discount  = if Big Spender, 5%, otherwise 0.
  • Big Spender = Order Total > $500.
  • Order Total = sum of Items minus Discount.

We mark Discount and evaluate it. We see that it requires Big Spender, so we mark and evaluate Big Spender. This leads us to Order Total, and then back to Discount. Since Discount is marked, our rules form a cycle.

To break this cycle, we make our rule more explicit:

  • People who spend more than $500 before discounts are applied get a 5% big spender discount.

Now our formal definition looks like this:

  • Discount  = if Big Spender, 5%, otherwise 0.
  • Big Spender = Order Total > $500.
  • Order Total = sum of Items minus Discount.

The cycle has been eliminated, and it is easier to reason about the system. No complex algorithm is required.

Point-of-sale as goal directed rules
Let's rewrite the example rules of the point-of-sale system. We turn the if-then around to make it goal directed. We combine rules that influence the same goal. And we eliminate cycles by being explicit. Here's the result:

  • Free items = 1 sandwich for every 4 on the original check.
  • Discount =
    • 75 cents for every combination of sandwich, chips, and drink on the original check, plus
    • 5 dollars for every 10 prior visits of 5 dollars or more that have not already been rewarded.
  • Coupons = 1 half-off coupon for every ice cream.

These are some rules that we can easily reason about. Coming up in the series, we'll start to put these rules into code with Condition and outcome templates.

Leave a Reply

You must be logged in to post a comment.