Archive for May, 2008

Java Query

Monday, May 12th, 2008

This is not LINQ for Java, but it does bear a small resemblance. Download the source code and unit test. [Update: One unit test requires the Java tuple library.]

If you've ever needed to return an Iterable<T> from a Java method, you know that it can be challenging. If you already have exactly the collection that you want to iterate, then all is well. If not, you have two choices.

Option 1: construct a new ArrayList<T>, populate it, and return it. This option uses more memory than it needs to, because every result is stored before the caller sees the first one. But it is by far the easiest way to go. You get to write your iteration code using "for".
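
As a minimal sketch of option 1, here is a method that returns the squares of 0 through count - 1. The class name EagerSquares and the squares example are made up for illustration; they are not part of the downloadable source.

```java
import java.util.ArrayList;
import java.util.List;

class EagerSquares {
    // Option 1: an ordinary "for" loop that stores every result up front.
    static Iterable<Integer> squares(int count) {
        List<Integer> result = new ArrayList<Integer>();
        for (int i = 0; i < count; i++) {
            result.add(i * i);
        }
        return result;
    }
}
```

The whole list exists in memory before the caller reads the first element.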

Option 2: implement Iterable<T> with an anonymous inner class. Instead of using "for", you'll have to manage the state of your iterator yourself. Basically, you have to turn the looping code inside out and put the initialization, testing, and stepping code in separate methods. However, this saves memory, as you don't have to store the results in an intermediate collection.
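
To make "inside out" concrete, here is a rough sketch of option 2 for a loop that produces the squares of 0 through count - 1. The class name LazySquares is hypothetical and the example is mine, not code from the download. Note where the three parts of the "for" statement end up.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class LazySquares {
    // Equivalent loop: for (int i = 0; i < count; i++) { produce i * i }
    static Iterable<Integer> squares(final int count) {
        return new Iterable<Integer>() {
            public Iterator<Integer> iterator() {
                return new Iterator<Integer>() {
                    private int i = 0;                 // initialization

                    public boolean hasNext() {
                        return i < count;              // test
                    }

                    public Integer next() {
                        if (!hasNext()) {
                            throw new NoSuchElementException();
                        }
                        int value = i * i;             // loop body
                        i++;                           // step
                        return value;
                    }

                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }
}
```

Nothing is computed until the caller asks for the next element, and no intermediate collection is ever built.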

On a number of occasions, I've taken option 2. Having rewritten the same ugly code several times, I decided to generalize it.

Inspiration from C#
Having spent some time in the .NET world, I'm familiar with two ways of solving this problem.

First, C# provides the "yield" keyword for turning a loop inside out. You write the code as per option 1, but instead of adding an object to a collection, you "yield return" it. The compiler turns the code inside out and makes it work like option 2. Yael has an implementation of yield return in Java, if you like this approach.

Second, C# now has built-in syntax for querying any data source, including collections of objects. This syntax is called LINQ, or Language Integrated Query. I didn't want to implement all of the LINQ functionality in Java, but I did use it as a source of inspiration.

From
LINQ is like upside-down SQL. The first thing that you write is the "from" clause. This is important for type safety and IntelliSense, because the element type is known before you write the rest of the query. Borrowing from this convention, you start a Java query with:

Query
    .from(collection)

This returns a Query<T>, which gives you the rest of the methods.

Where
A where clause filters the collection on the fly. The iterator that is produced only lands on the items that satisfy the where condition. You specify the condition by anonymously implementing the Predicate interface, as in:

Query
    .from(collection)
    .where(new Predicate<T>() {
        public boolean where(T row) {
            return /* Condition based on row */;
        }
    })
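
For the curious, here is one way a lazy filter like this could work. This is only a sketch under my own assumptions, not the code from the download: WhereSketch is a made-up class, and its nested Predicate is a stand-in that copies just the method name shown above.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class WhereSketch {
    // Stand-in for the Predicate interface used above (assumed shape).
    public interface Predicate<T> {
        boolean where(T row);
    }

    public static <T> Iterable<T> where(final Iterable<T> source,
                                        final Predicate<T> condition) {
        return new Iterable<T>() {
            public Iterator<T> iterator() {
                return new Iterator<T>() {
                    private final Iterator<T> inner = source.iterator();
                    private T buffered;      // next matching item, if found
                    private boolean ready;   // true while 'buffered' is unconsumed

                    public boolean hasNext() {
                        // Skip ahead until an item satisfies the condition.
                        while (!ready && inner.hasNext()) {
                            T candidate = inner.next();
                            if (condition.where(candidate)) {
                                buffered = candidate;
                                ready = true;
                            }
                        }
                        return ready;
                    }

                    public T next() {
                        if (!hasNext()) {
                            throw new NoSuchElementException();
                        }
                        ready = false;
                        T result = buffered;
                        buffered = null;
                        return result;
                    }

                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }
}
```

The look-ahead buffer is needed because hasNext() cannot answer without actually finding the next match.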

Join
A join traverses a nested collection. For each object of the first collection, you obtain an iterator over the second. The result is the same as two nested for loops. You return the inner collection using the Relation interface, as in:

Query
    .from(collection)
    .join(new Relation<PARENT, CHILD> () {
        public Iterable<CHILD> join(PARENT parent) {
            return parent.getChildren();
        }
    })

This produces an iterator of Join<PARENT, CHILD>. Each Join pairs a parent (getFirst) with one of its children (getSecond).
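
Here is a sketch of how the two nested loops might be flattened into a single iterator. Again, this is an assumption about the general technique, not the library's actual code: JoinSketch and its nested Relation and Join types are stand-ins that borrow only the names used in this post.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;

class JoinSketch {
    // Stand-in for the Relation interface used above (assumed shape).
    public interface Relation<P, C> {
        Iterable<C> join(P parent);
    }

    // Stand-in pair type: a parent together with one of its children.
    public static final class Join<P, C> {
        private final P first;
        private final C second;
        public Join(P first, C second) { this.first = first; this.second = second; }
        public P getFirst() { return first; }
        public C getSecond() { return second; }
    }

    public static <P, C> Iterable<Join<P, C>> join(final Iterable<P> parents,
                                                   final Relation<P, C> relation) {
        return new Iterable<Join<P, C>>() {
            public Iterator<Join<P, C>> iterator() {
                return new Iterator<Join<P, C>>() {
                    private final Iterator<P> outer = parents.iterator();
                    private P parent;
                    private Iterator<C> inner = Collections.<C>emptyList().iterator();

                    public boolean hasNext() {
                        // Advance the outer loop until the inner loop has an item left.
                        while (!inner.hasNext() && outer.hasNext()) {
                            parent = outer.next();
                            inner = relation.join(parent).iterator();
                        }
                        return inner.hasNext();
                    }

                    public Join<P, C> next() {
                        if (!hasNext()) {
                            throw new NoSuchElementException();
                        }
                        return new Join<P, C>(parent, inner.next());
                    }

                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }
}
```

Parents with no children are skipped, just as the body of an inner for loop never runs over an empty collection.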

Select
The last clause in a query is the select. Now that you've built up and narrowed down your collection of objects, the select clause gives you a chance to transform them into the data type that you want. You provide an anonymous implementation of the Selector interface:

Query
    .from(collection)
    .select(new Selector<T, RESULT>() {
        public RESULT select(T row) {
            return row.getProperty();
        }
    })

If you want to get the objects themselves, you can simply use ".selectRow()" instead.
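
A select is the simplest clause to sketch, since it only transforms each item as it goes by. The code below is an illustration under my own assumptions, not the downloadable source: SelectSketch is a made-up class, and its nested Selector is a stand-in that copies the method name shown above.

```java
import java.util.Iterator;

class SelectSketch {
    // Stand-in for the Selector interface used above (assumed shape).
    public interface Selector<T, R> {
        R select(T row);
    }

    public static <T, R> Iterable<R> select(final Iterable<T> source,
                                            final Selector<T, R> selector) {
        return new Iterable<R>() {
            public Iterator<R> iterator() {
                final Iterator<T> inner = source.iterator();
                return new Iterator<R>() {
                    public boolean hasNext() {
                        return inner.hasNext();
                    }

                    public R next() {
                        // Transform each row only when the caller asks for it.
                        return selector.select(inner.next());
                    }

                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }
}
```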

Put it all together
You can combine these clauses to create the query you need. It has to start with a "from" and end with a "select", but you can have any combination of "join" and "where" clauses in between. Here's an example from the unit test:

// Iterate through the order items. Pull out order numbers where a widget is ordered.
Iterator<Integer> orderNumbers = Query
    .from(customers)
    .join(new Relation<Customer, Order>() {
        public Iterable<Order> join(Customer parent) {
            return parent.getOrders();
        }
    })
    .join(new Relation<Join<Customer,Order>, Item>() {
        public Iterable<Item> join(Join<Customer, Order> parent) {
            return parent.getSecond().getItems();
        }
    })
    .where(new Predicate<Join<Join<Customer,Order>,Item>>() {
        public boolean where(Join<Join<Customer, Order>, Item> row) {
            return row.getSecond().getPart().getName().equals("widget");
        }
    })
    .select(new Selector<Join<Join<Customer,Order>,Item>, Integer>() {
        public Integer select(Join<Join<Customer, Order>, Item> row) {
            return row.getFirst().getSecond().getNumber();
        }
    })
    .iterator();

Now you can generate iterators on the fly without wasting memory or writing really ugly code. Under the covers, it's just doing what a for loop would do. But with a query, you can return the resulting collection to a caller without turning your code inside out.

Dallas TechFest 2008

Sunday, May 4th, 2008

The first Dallas TechFest was this weekend. There were sessions on .NET, Ruby, REST, Groovy, Flex, AIR, and many other languages, tools, and platforms. It was a chance for members of different communities to get together and cross-train each other. I made a point of stepping outside my comfort zone, rather than sticking with the .NET track.

But the two sessions that I attended within the .NET track were hosted by Carl Franklin and Richard Campbell of .NET Rocks. I couldn't miss the chance to see them. The first was a recording of a .NET Rocks episode on community. The second was Richard on web application performance and scalability. If you ever get a chance to hear Richard speak, you must go.

After his talk, I asked Richard to take a look at Update Controls. He was very gracious and gave me at least half an hour of his time over lunch so that I could show him the demo. He understood the benefits of the library immediately, and gave me some pointers for getting the idea across more effectively. Finally, he told me that I need to complete the database story, which, as it happens, is in the works.

It was a thrill to finally meet Carl and Richard, the guys who inspired me to launch this site. And it was enlightening to see what's been going on in other parts of the industry. Who knows? Maybe I'll get a chance to present next year.