Entity Framework query based on properties of related objects

We're using Entity Framework for data access on an enterprise system. It has been painful, but we've gradually learned how to use the product. This last problem was our fault, but we blamed the tool.

The situation is that we have Orders with OrderLines. We want to find all orders that include a particular item. From those orders, we want to get information from a related entity called OrderSummary.

We tried several approaches until we found a syntax that would compile and give us the right set of orders. What we ended up with was a linq query using two "from" clauses:

OrderContainer container = new OrderContainer();
var source = container.Orders.Include("OrderSummary");

var orders =
	from o in source
	from ol in o.OrderLines
	where ol.ItemId == "32180"
	select o;

This query returns the correct set of orders, but the OrderSummary navigation property is not loaded. We called it a bug in EF and worked around it. The workaround involved multiple queries, and was generally a bad idea.

Fortunately, we took the time to revisit the problem. The first thing I did was to use Reflector to find out what that query was actually doing. Here's the equivalent:

var orders = source
	.SelectMany(o => o.OrderLines, (o, ol) => new { order = o, orderLine = ol })
	.Where(result => result.orderLine.ItemId == "32180")
	.Select(result => result.order);

It's a round-about way of saying it, but it looks OK. It creates a tuple of orders and order lines, filters that tuple based on the item ID, then selects just the order part of the tuple. If you think about it relationally, this is pretty close to an old-fashioned WHERE join.

Keeping with the explicit syntax, I wrote the query exactly as I wanted it. This is the result:

var orders = source
	.Where(o => o.OrderLines
		.Any(ol => ol.ItemId == "32180"));

This reads like the specification. It gets all orders where any order line has the desired item ID.

This version of the code not only selects exactly the orders I want, it also includes the OrderSummary. No longer do I have to do additional queries to fill in the missing details.

Here's my solution
My advice after working through this problem is to skip the linq syntax. Get used to the more explicit extension method syntax. Understand .Where(), .Select(), and .Any(). When you are more explicit with your tools (and I'm not talking profanity here), you have a better chance of getting the behavior you want out of them.

Leave a Reply

You must be logged in to post a comment.