Archive for October, 2006

Alphabetic Options

Saturday, October 28th, 2006

In Windows, open a folder and open "Tools: Folder Options...". Click on the View tab. The Advanced Settings is a scrollable list of options.

folderoptions.jpg

If I want to turn on known extensions, do I look under D for "Display", S for "Show", or H for "Hide"?

The same metaphor appears in Word 2003. Go to "Tools: Options..." and click the Compatibility tab.

wordoptions.jpg

These options are grouped by words like "Don't", "Suppress", and "Use". When looking for hanging indent tab stops, I won't be looking for "Don't".

The only explanation I can find for this metaphor is laziness, and not the good kind. Instead of taking the time to group and organize options for us, the developer just throws them into a checked list box. This is also an indicator of feature indecision. If you have so many options that you need to scroll through them, you have failed to decide how your features work.

Java Tuple

Friday, October 20th, 2006

Final Update: The latest code is at http://adventuresinsoftware.com/blog/?p=546. The javatuple.com site is going off-line.

Update: I've created a new version of this Java Tuple library. It can be found at http://javatuple.com. The new version uses generics for type safety. It also has extractor methods. If you use Java 1.4 or earlier, or you just prefer the previous version, please read on. Otherwise, visit javatuple.com.

Thanks.

I don't know how many times I've needed a hash map with a compound key, but it seems to come up often. Every time, I am faced with the same question: do I really want to define a class for this key?

It seems like such a simple thing at first. Just a class with a few members. But then you have to initialize those members in the constructor. And then you need to write equals(). And don't forget hashCode(). Sometimes I'm tempted to create a concatenated string to get around the extra baggage. But that would be a hack.

To solve this problem, I finally decided to create a Tuple class. A tuple is an ordered collection of heterogeneous elements. Compare this to a vector, which is homogeneous. The type of a tuple is defined by the number and type of its elements.

Examples of tuples that naturally occur in software are parameter lists and database rows. Both of these things have a fixed, finite number of elements that must occur in a certain order. All rows in a table have the same number and types of columns, but one column can have a different type from another. Some languages treat tuples as native data types, but not Java.

My Tuple class comes in two pieces. The base class implements all of the comparison behavior. The concrete class lets you build a generic tuple.

package com.adventuresinsoftware.tuple;

import java.util.ArrayList;
import java.util.Iterator;

public abstract class BaseTuple implements Comparable {
    // Ordered collection of elements.
    ArrayList elements = new ArrayList();

    // Strings used to display the tuple.
    String open;
    String separator;
    String close;

    // Initialize the strings for this tuple type.
    protected BaseTuple(String open, String separator, String close) {
        this.open = open;
        this.separator = separator;
        this.close = close;
    }

    // Add elements to the tuple. Supports dot-chaining.
    protected BaseTuple addElement(Object o) {
        elements.add(o);
        return this;
    }

    protected BaseTuple addElement(int i) {
        return addElement(new Integer(i));
    }

    // Compare two tuples. All elements must be equal.
    public boolean equals(Object obj) {
        if (obj == null)
            return false;
        if (!(obj instanceof BaseTuple))
            return false;
        BaseTuple that = (BaseTuple) obj;
        if (that.elements.size() != this.elements.size())
            return false;
        for (int i = 0; i < elements.size(); ++i) {
            if (!that.elements.get(i).equals(this.elements.get(i)))
                return false;
        }
        return true;
    }

    // Calculate a hash code based on the hash of each element.
    public int hashCode() {
        int result = 0;
        Iterator it = elements.iterator();
        while (it.hasNext()) {
            result = result * 37 + it.next().hashCode();
        }
        return result;
    }

    // Display the tuple using the open, separator, and close
    // specified in the constructor.
    public String toString() {
        StringBuffer result = new StringBuffer(open);
        Iterator it = elements.iterator();
        while (it.hasNext()) {
            result.append(it.next());
            if (it.hasNext())
                result.append(separator);
        }
        return result.append(close).toString();
    }

    // Order by the most significant element first.
    // The tuples must agree in size and type.
    public int compareTo(Object o) {
        BaseTuple that = (BaseTuple) o;
        for (int i = 0; i < elements.size(); ++i) {
            int compare = ((Comparable) this.elements.get(i))
                .compareTo((Comparable) that.elements.get(i));
            if (compare != 0)
                return compare;
        }
        return 0;
    }
}

package com.adventuresinsoftware.tuple;

public class Tuple extends BaseTuple {
    // Display a comma-separated list of elements.
    public Tuple() {
        super("(", ", ", ")");
    }

    // Add generic elements to the tuple.
    // Supports dot-chaining.
    public Tuple add(Object o) {
        super.addElement(o);
        return this;
    }

    public Tuple add(int i) {
        super.addElement(i);
        return this;
    }
}

To use a Tuple as a compound HashMap key, just construct one in place using dot-chaining.

    public void testTupleHashMap() {
        Map population = new HashMap();
        population.put(
            new Tuple().add("TX").add("Dallas"),
            new Integer(1213825));
        population.put(
            new Tuple().add("TX").add("Fort Worth"),
            new Integer(624067));
        population.put(
            new Tuple().add("IL").add("Springfield"),
            new Integer(203564));
        population.put(
            new Tuple().add("NM").add("Albuquerque"),
            new Integer(494236));
        assertEquals(new Integer(624067), population.get(
            new Tuple().add("TX").add("Fort Worth")));
        assertNull(population.get(
            new Tuple().add("NM").add("Roswell")));
        assertNull(population.get(
            new Tuple().add("NJ").add("Springfield")));
    }

The Tuple class has a useful toString method.

    public void testTupleToString() {
        Tuple tuple = new Tuple().add("MLB").add(1);
        assertEquals("(MLB, 1)", tuple.toString());
    }

And if you want to create your own strictly typed tuples, you can declare your own derived class.

package com.adventuresinsoftware.tuple;

public class Version extends BaseTuple {
    // A version is a tuple of 4 integers separated by periods.
    public Version(int major, int minor, int release, int qfe) {
        super("", ".", "");
        addElement(major).addElement(minor).addElement(release).addElement(qfe);
    }
}

    public void testVersion() {
        Version v1 = new Version(2,0,2,3425);
        Version v2 = new Version(2,1,0,241);
        assertEquals("2.0.2.3425", v1.toString());
        assertEquals("2.1.0.241", v2.toString());
        assertTrue("Version 2.1.0.241 > 2.0.2.3425", v2.compareTo(v1) > 0);
    }

Enjoy!

Four Rules for APIs

Tuesday, October 17th, 2006

Java's Date and Calendar API is awful. Even when a quick date calculation is required, you have to stop the flow of your coding to go through the GregorianCalendar gyrations. It is impossible to simply express the intent with these defunct classes.

I've decided to use this as a learning exercise. What is it about Date and Calendar that makes them so difficult to use? What can we do to avoid these difficulties when designing our own classes? Fortunately, there is an alternative that gets it right. When comparing java.util to Joda, the problems become quite apparent.

The first problem is that the Date class has no useful methods. In order to use a Date, one must almost always put it into a GregorianCalendar. Most of us writing in Java are working on business applications, and business uses the Gregorian calendar. Different countries have different names for the days of the week and the months of the year, but we all have days, weeks, months, and years. The Date class should expose those concepts directly.

The second problem is that method calls cannot be chained. OK, I've accepted the fact that to work with a Date, I have to construct a GregorianCalendar. So I start to type something like this:

int day = new GregorianCalendar(date).get(Calendar.DAY_OF_MONTH);

But, no, GregorianCalendar has no constructor that takes a Date. I have to construct and initialize it in two separate steps. And then, to make matters worse, the method that initializes the GregorianCalendar is void. So I can't even chain this call with the get. I have to do in three lines what I would like to do in one.
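For reference, the three-step workaround looks roughly like this. It is a minimal sketch using only java.util; the class and method names (DayOfMonthExample, dayOfMonth) are made up for illustration, while Calendar.DAY_OF_MONTH is the real constant for the day field.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

// Hypothetical helper class, named only for this illustration.
class DayOfMonthExample {
    // Extract the day of the month from a java.util.Date the long way.
    static int dayOfMonth(Date date) {
        Calendar calendar = new GregorianCalendar(); // step 1: construct
        calendar.setTime(date);                      // step 2: initialize (setTime returns void)
        return calendar.get(Calendar.DAY_OF_MONTH);  // step 3: read the field
    }

    public static void main(String[] args) {
        // Prints today's day of the month in the default time zone.
        System.out.println(dayOfMonth(new Date()));
    }
}
```

Because setTime returns void, steps 2 and 3 cannot be collapsed into a single expression.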

The third problem is that the API provides one generic getter and setter for all fields of a Calendar. A Calendar is a collection of fields identified by constants. But I don't care how you represent a Calendar, I just want to know what day it is.

The fourth problem is that there is no distinction between a date (October 17, 2006), a time of day (10:11 pm), and a moment in time (October 17, 2006 at 10:11 pm). These different concepts need to be treated differently. When working with dates, for example, you don't apply time zones or daylight savings rules. The span between October 29 and October 30 is still 1 day, even though the span between midnight on those two days may be 25 hours (depending upon where you live).
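The date-versus-moment distinction can be sketched with the java.time package. This is an assumption on my part: java.time arrived in Java 8, well after this post, and is not the Joda API itself, though it separates the same concepts. The class name, method names, and the choice of the America/Chicago zone are illustrative only.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.temporal.ChronoUnit;

// Hypothetical demo class; the names are chosen for this illustration.
class DateVersusMoment {
    // Whole calendar days between two dates. No time zone is involved.
    static long daysBetween(LocalDate start, LocalDate end) {
        return ChronoUnit.DAYS.between(start, end);
    }

    // Elapsed hours between local midnight on two dates in a given zone.
    static long hoursBetweenMidnights(LocalDate start, LocalDate end, ZoneId zone) {
        return Duration.between(start.atStartOfDay(zone), end.atStartOfDay(zone)).toHours();
    }

    public static void main(String[] args) {
        LocalDate oct29 = LocalDate.of(2006, 10, 29);
        LocalDate oct30 = LocalDate.of(2006, 10, 30);
        ZoneId zone = ZoneId.of("America/Chicago"); // assumed zone; US clocks fell back on this date in 2006
        System.out.println(daysBetween(oct29, oct30));                 // 1 calendar day
        System.out.println(hoursBetweenMidnights(oct29, oct30, zone)); // 25 elapsed hours
    }
}
```

A date-only type never consults a time zone, so it cannot get the day count wrong.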

Joda solves all of these problems. It is easy to use right where you need it without breaking stride. Do yourself a favor and start using it now, even if you are right in the middle of a project. You'll never want to be without it.

But the point of this post is not to plug Joda, but to codify four rules for creating good APIs:

  1. Put the useful methods on the most used classes. Don't force your audience to bring in objects that they don't already need.
  2. Promote method chaining. This is accomplished in two ways. First, all methods should return something, even if it is "this". And second, provide all necessary conversions. Overload constructors, or better yet have a good set of static methods (GregorianCalendar.fromDate() would have been nice).
  3. Break down the abstraction the way your audience will use it. Don't design the API based on implementation.
  4. Create a different type for each useful distinct idea, even if it appears to be a special case of another. Remember, a circle is not an ellipse, no matter what you learned in geometry.
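Here is one way the four rules might look in code. SimpleDate is a hypothetical class invented for this sketch; it is neither a Joda class nor part of java.util. It only shows the shape of an API that follows the rules, under the assumption that callers start with a java.util.Date.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

// Rule 4: a distinct type for a calendar date, with no time of day attached.
final class SimpleDate {
    private final int year;
    private final int month; // 1 through 12
    private final int day;

    private SimpleDate(int year, int month, int day) {
        this.year = year;
        this.month = month;
        this.day = day;
    }

    static SimpleDate of(int year, int month, int day) {
        return new SimpleDate(year, month, day);
    }

    // Rule 2: a static conversion from the type callers already hold.
    static SimpleDate fromDate(Date date) {
        Calendar c = new GregorianCalendar();
        c.setTime(date);
        return fromCalendar(c);
    }

    // Rules 1 and 3: named accessors on the class itself, not generic field constants.
    int getYear() { return year; }
    int getMonth() { return month; }
    int getDayOfMonth() { return day; }

    // Rule 2: returns a value, so calls can be chained.
    SimpleDate plusDays(int days) {
        Calendar c = new GregorianCalendar(year, month - 1, day);
        c.add(Calendar.DAY_OF_MONTH, days);
        return fromCalendar(c);
    }

    // Calendar months are zero-based, so shift to 1 through 12.
    private static SimpleDate fromCalendar(Calendar c) {
        return of(c.get(Calendar.YEAR), c.get(Calendar.MONTH) + 1, c.get(Calendar.DAY_OF_MONTH));
    }
}
```

With this shape, the one-liner from earlier becomes possible: SimpleDate.fromDate(date).plusDays(1).getDayOfMonth().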

While you are working on the API, remember the golden rule: Necessary and Sufficient. Don't go crazy with the convenience methods or you will only sow confusion.

Feature Indecision

Monday, October 16th, 2006

Here's a scenario that I've seen played out countless times. I recall a requirements meeting during which an issue came up about a new feature. The business analysts, software architects, and product managers present all had different opinions on the fine points of its behavior. All had valid opinions. The user might actually want the feature to work in any of the proposed ways. Nobody wanted to lose the argument by putting their opinions aside, so instead of choosing one we decided to ask the user what they wanted.

At first, we were going to ask via a modal dialog that appears when the user invokes the feature. Such pop-ups are hostile to the user experience, so it was changed to a checkbox on the screen that invoked the feature. However, this solution cluttered the UI, so we eventually buried it in a separate configuration window.

After the feature shipped, we received feedback that it behaved inconsistently. Sometimes it worked one way, and sometimes another. Apparently one power-user who knew about the configuration window was changing it, and all of the users who didn't were suffering the consequences.

The problem, of course, is that we failed to take responsibility for our product's behavior. As software designers, it is up to us to decide what a feature does. When we shirk that responsibility, we create unnecessary complexity in the software. Complexity leads to confusion.

Now when I see feature indecision, I put a stop to it. Whenever someone recommends that we prompt the user, I ask if there is any way we can decide now. Sure, that means that some people leave the meeting "losers", but the user wins in the end.

More Trouble with Naming

Friday, October 13th, 2006

Previously, I posted a solution that I learned from my wife for the problem of domain name kiting, squatting, and misspelling. In that post, I mentioned that other nations have little control over the way that this international network is administered. The e360 Insight LLC v Spamhaus suit illustrates exactly why we should all be concerned.

Spamhaus maintains a blacklist of spammers. e360 Insight LLC sued Spamhaus after it was blacklisted, and won an $11.7 million ruling. But the US district court that ordered the payment has no authority over the UK-based Spamhaus, so e360 turned its attention to another US company: ICANN.

e360 has asked the court to order ICANN to suspend spamhaus.org until Spamhaus complies with the ruling.

A recent ZDNet article reports that ICANN is taking the high road in stating that it will refuse this order. I applaud their efforts to stay above the fray. But some day they will be drawn down into it, bringing the Internet (to some degree) with them.

Your iPod Software is Too Old

Thursday, October 12th, 2006

You may have heard that Apple has released a major update to iTunes. In fact, they have already released a patch for that update. I've been following the community response to iTunes 7 (now 7.0.1) to learn when I should take the plunge. By all accounts, it is still too early. Since I am still on iTunes 6, I don't know whether the following problem has been corrected.

When you plug your iPod into your PC (I can't attest to the Mac behavior), iTunes will automatically synchronize your music, podcasts, and videos. Very convenient. So convenient, in fact, that I often plug in my iPod and walk away from the machine. Recently, however, I have returned to my machine to find this error message:

iTunes too old

(If the image does not appear, the text reads "Some of the items in the iTunes music library were not copied to the iPod "Mike's iPod" because your iPod software is too old. Go to www.apple.com/ipod/download to get the latest iPod Updater.")

When I press OK, the iPod synchronization commences. When it finally completes, all of my songs, podcasts, and videos are properly synchronized. I don't know which items it was complaining about because they all seem to have been copied to my iPod. This is what Alan Cooper calls "stopping the proceedings with idiocy." There are a number of lessons we can learn from this error message.

  1. Never display a modal error message during a long-running operation. The user is probably away.
  2. If you must display an error message concerning "some of the items," list those items.
  3. Never tell the user that there is a problem when there isn't one.
  4. Never use an error message as an opportunity to promote a software upgrade.

Until I hear that iTunes 7 is solid, or better yet that a fully-integrated alternative exists, I will just have to put up with this idiocy.

Mathematical Where Clause

Tuesday, October 10th, 2006

Here's a trick that I learned from a database guru: instead of using "SELECT COUNT(*) ... WHERE x", consider using "SELECT SUM(IF(x, 1, 0))". He calls this a mathematical where clause, because it simulates a where clause with an aggregate function.

The idea is that a SELECT statement can have only one WHERE clause. If you want to count different subsets of rows using COUNT(*), you have to run a different SELECT statement for each one. That could cost you some performance if your WHERE clauses don't line up with indexes, because each SELECT statement performs another table scan. However, you can combine multiple subsets into one SELECT statement by writing different SUMs. Then, it does all of the counting in just one table scan.

Today I wrote some code to aggregate logs over time. I wanted to capture the total number of requests, as well as the number of failed requests. I solved the problem using a mathematical where clause:

SELECT command, COUNT(*) AS total_requests, SUM(IF(result!=200, 1, 0)) AS failed_requests
  FROM log
  GROUP BY command

And this trick can be combined with other aggregate functions to produce even more interesting results. Instead of just adding up 1s, you can aggregate any value you want. Just put the value on the positive side of an IF. But be careful about what goes on the negative side. Whereas a 0 has no effect on a SUM, it has a devastating effect on functions like MIN or AVG. For those, use a NULL. Aggregate functions ignore NULLs.

For example, I added two more columns to my SELECT statement to return the smallest and largest valid results. I added the following to the statement:

SELECT ... MIN(IF(result=200, response_size, NULL)) AS min_result,
  MAX(IF(result=200, response_size, NULL)) AS max_result ...

A 0 wouldn't mess up the MAX, but I used NULL for consistency with the MIN.
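The difference between 0 and NULL is easy to see outside the database. The Java sketch below only emulates the aggregate semantics over a made-up in-memory list; the Row class, the sample values, and the method names are all hypothetical, and a Java null stands in for SQL NULL.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical emulation of SUM(IF(...)) and MIN(IF(...)) in a single pass over rows.
class MathematicalWhere {
    // Stand-in for one row of the log table.
    static final class Row {
        final int result;
        final int responseSize;
        Row(int result, int responseSize) {
            this.result = result;
            this.responseSize = responseSize;
        }
    }

    // Emulates SUM(IF(result != 200, 1, 0)).
    static int failedRequests(List<Row> rows) {
        int sum = 0;
        for (Row row : rows) {
            sum += (row.result != 200) ? 1 : 0;
        }
        return sum;
    }

    // Emulates MIN(IF(result = 200, response_size, fallback)).
    // A null fallback is skipped, the way aggregate functions ignore NULLs.
    static Integer minSuccessfulSize(List<Row> rows, Integer fallback) {
        Integer min = null;
        for (Row row : rows) {
            Integer value = (row.result == 200) ? Integer.valueOf(row.responseSize) : fallback;
            if (value == null)
                continue;
            if (min == null || value.intValue() < min.intValue())
                min = value;
        }
        return min;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(new Row(200, 512), new Row(500, 40), new Row(200, 2048));
        System.out.println(failedRequests(rows));                        // one failed row
        System.out.println(minSuccessfulSize(rows, Integer.valueOf(0))); // the 0 from the failed row wins
        System.out.println(minSuccessfulSize(rows, null));               // the failed row is ignored
    }
}
```

With a fallback of 0, the failed row drags the minimum down to 0. With null, the minimum is taken over the successful rows only.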

As always, please carefully consider the impact of this technique on your performance. Mathematical where clauses cannot take advantage of indexes. So if your conditions line up with indexes, go ahead and use multiple SELECTs. That way you can avoid the table scan altogether.

Process vs. Goal Orientation

Sunday, October 8th, 2006

When my wife asks me to do something, she first tells me where to go, then what to get, and finally what to do with it. For example, today she told me "In the bathroom, under the sink on my side, there are some disposable wipes. Could you clean off that stain on the landing?" I, of course, had already cleaned off that stain, but with paper towels from the kitchen.

This exchange points out one key difference in our thinking. She is a process-oriented thinker while I am a goal-oriented thinker.

Quite often, she will ask me to help her with software. It might be a program that she has never used before, like when she was learning Picasa (now she is a Picasa guru). Or it might be a program so complicated that it is new to her every time she uses it, like Word. In either case, she tells me what she wants done (the goal), and I figure out how to do it.

When I'm finished, she asks me to repeat the steps for her. After that, she can execute that same process over and over again.

Generally speaking, programmers are goal-oriented. We have to be. The program is a blank canvas. We have nothing to help guide us except the requirements. There are many ways to accomplish the goal, and no one way is the right way.

Many other people are process-oriented. A lot of these people use our software. So when we programmers write software for ourselves, we alienate a large subset of our potential users.

A programmer will generally create a program with many tools that can be combined in uncountable ways to achieve any goal. Examples of these kinds of programs are Word, AutoCAD, and Photoshop. These tend to be very powerful programs that take a long time to learn. And, with the exception of one member of this class, they tend to be designed for specific audiences. AutoCAD is for engineers. Photoshop is for artists. Engineers and artists, like programmers, have to be goal-oriented. Word is the exception that proves the rule: almost everybody uses Word, but they tend to use only the surface features. When they need more from the tool, they turn to someone who has spent some time to learn its nuances.

Process-oriented users need a different style of program. They don't want to see a bunch of tools, they want to be taken step-by-step through their work. Examples of these programs are Picasa, SketchUp, and Quicken. (Two of these, you’ll notice, have been acquired by Google. Hmmm...) These programs have fewer features, but those features are targeted to specific goals. The important thing to notice is that process-oriented people have goals, and they recognize those goals. They just focus their work on the process instead of the goal itself.

The software development team is responsible for identifying the audience for their program. If they are developing for goal-oriented users, they should create a set of small tools that can be combined in multiple ways, and leave the goal up to the user. If they are developing for process-oriented users, they should discover the users' goals ahead of time and design a process for each one.