Archive for April, 2009

Avoid mutable variables in linq expressions

Wednesday, April 29th, 2009

Please take a moment to determine whether this unit test (in C#) passes.

var collection = new NamedNumber[]
{
    new NamedNumber() { Name = "One", Number = 1 },
    new NamedNumber() { Name = "Two", Number = 2 },
    new NamedNumber() { Name = "Three", Number = 3 }
};
IEnumerable<string> names = null;
for (int i = 1; i < 3; ++i)
{
    if (i == 1)
        names = from n in collection where n.Number == i select n.Name;
}

string name = names.Single();
Assert.AreEqual("One", name);

We are producing an enumerator over named numbers using linq syntax. Only when i is equal to 1 do we produce this enumerator. You would therefore assume that it would enumerate only the numbers equal to 1.

If that were the case, I wouldn't bother to write this post. Indeed, I wouldn't have even written this unit test. Nope, this test fails:

Assert.AreEqual failed. Expected:<One>. Actual:<Three>.

As you may already know, a linq expression is not evaluated until it is realized. The IEnumerable holds the instructions to perform the query, but does not actually do so until the enumeration is traversed. In this case, we traverse the query with the call to "Single".

By the time we get to "Single", the for loop has terminated. At that point i is equal to 3. (Notice I didn't use <= 3, which would have terminated with i = 4.) Only then is the expression evaluated, so it finds the numbers equal to 3 rather than 1.
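To see deferred execution in isolation, here is a minimal sketch. The class and method names (DeferredExecutionDemo, Deferred, Eager) are mine, chosen only for illustration. It contrasts a query that is traversed after the captured variable changes with one that is forced by ToList() before the change.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; not part of the original unit test.
public static class DeferredExecutionDemo
{
    private static readonly int[] Numbers = { 1, 2, 3 };

    // The query is built while target == 1 but traversed after target == 3.
    public static int Deferred()
    {
        int target = 1;
        IEnumerable<int> query = Numbers.Where(n => n == target);
        target = 3;
        return query.Single(); // evaluated here, so it sees target == 3
    }

    // ToList() traverses the query immediately, while target is still 1.
    public static int Eager()
    {
        int target = 1;
        List<int> results = Numbers.Where(n => n == target).ToList();
        target = 3;
        return results.Single(); // the list was filled when target == 1
    }
}
```

Deferred() returns 3 and Eager() returns 1. Forcing the traversal is one way to pin down the value, although it gives up the laziness of the query.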

Closures
Many functional languages have a concept called a closure. A closure is a block of code (a.k.a. a function) that refers to a variable from the outside. This function can be passed around and executed at any time, but it carries with it that variable. Here's a very simple example in F#:

let x = 1
let xplus y = x + y

The function xplus carries with it the variable x. The word "variable" can be misleading. The "variable" cannot actually vary at all. It is immutable. At least in a functional language.

Java
The closest thing to a closure in Java is an anonymous inner class (at least for now). Here's the same test in Java (using my Java Query library to approximate linq):

List<NamedNumber> numbers = new ArrayList<NamedNumber>();
numbers.add(new NamedNumber("One", 1));
numbers.add(new NamedNumber("Two", 2));
numbers.add(new NamedNumber("Three", 3));

Iterable<String> names = null;
for (int i = 1; i < 3; ++i) {
    final int fi = i;
    if (fi == 1) {
        names = Query
            .from(numbers)
            .where(new Predicate<NamedNumber>() {
                public boolean where(NamedNumber row) {
                    return row.getNumber() == fi;
                }
            })
            .select(new Selector<NamedNumber, String>() {
                public String select(NamedNumber row) {
                    return row.getName();
                }
            });
    }
}
String name = Query.from(names).selectOne();
assertEquals("One", name);

Notice the extra step I had to include. I had to create a "final" variable. A final variable must be initialized, and cannot change. It is immutable. Java requires that any local variable used inside of an anonymous inner class be declared final.

You may be thinking at this point that the value of fi does change during the course of this method. It is assigned a new value for each iteration of the loop. Not true. In fact, a new fi is created each time the loop is entered. Each instance of fi is initialized to a different value. While there is only one i, there are two fi's (1 and 2). The instance of the anonymous inner class is a closure that carries that instance of fi with it wherever it goes.

Here's my solution
Seeing how Java forces you to solve the problem, let's try the same technique in C#.

var collection = new NamedNumber[]
{
    new NamedNumber() { Name = "One", Number = 1 },
    new NamedNumber() { Name = "Two", Number = 2 },
    new NamedNumber() { Name = "Three", Number = 3 }
};
IEnumerable<string> names = null;
for (int i = 1; i < 3; ++i)
{
    int fi = i;
    if (fi == 1)
        names = from n in collection where n.Number == fi select n.Name;
}

string name = names.Single();
Assert.AreEqual("One", name);

Now if you run this unit test, it passes. The linq query captures the instance of fi that was created during that loop (1). It doesn't matter that a new instance of fi (2) was created before the query had a chance to execute. The query carries around the first instance (1), and uses it to select the name.

While C# has its own equivalent of closures (lambdas and linq), it does not force you to use immutable variables inside of them. It's up to you to be careful never to change a variable referenced inside of a linq expression. Initialize a new variable that never changes, and use it exclusively.
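The same rule applies to plain lambdas, not just query syntax. The following is a small sketch under that assumption; LoopCaptureDemo and its method names are hypothetical. One method lets every lambda share the loop variable, and the other copies it into a variable that never changes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class, for illustration only.
public static class LoopCaptureDemo
{
    // Every lambda captures the single loop variable i.
    public static List<int> SharedVariable()
    {
        var getters = new List<Func<int>>();
        for (int i = 1; i < 3; ++i)
            getters.Add(() => i);
        // The loop has ended, so both lambdas read i == 3.
        return getters.ConvertAll(getter => getter());
    }

    // Each lambda captures its own fi, which is never reassigned.
    public static List<int> CopiedVariable()
    {
        var getters = new List<Func<int>>();
        for (int i = 1; i < 3; ++i)
        {
            int fi = i; // a fresh variable for each iteration
            getters.Add(() => fi);
        }
        return getters.ConvertAll(getter => getter());
    }
}
```

SharedVariable() yields 3 and 3, while CopiedVariable() yields 1 and 2.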

.NET to .NET web services without WSDL

Thursday, April 23rd, 2009

We are using web services internally as an RPC mechanism. Our web tier calls our application tier via service interfaces. We are both exposing these interfaces and consuming them. They are not intended for external consumption.

Since we control both ends of the conversation, WSDL is overkill. WSDL is generated by .NET when you publish a web service. Any compliant consumer can import that WSDL and generate a service proxy. When the consumer is another .NET application, you can use either "Import Service Reference" or "svcutil.exe". Publish the service, generate the proxy, and start calling it.

This publish/import workflow causes problems in a homogeneous .NET/.NET environment. The generated proxy classes are all imported into one namespace, even if the published classes come from different namespaces. While this alone is confusing, it can cause real problems if you pass messages through one service to another, or from multiple clients. Every client defines types of exactly the same shape, but since they are in different namespaces and assemblies, they are not the same type.

Another problem that this causes is version inconsistencies between clients and servers. We use TFS automated builds for continuous integration. After the build, the server is published to the development environment. If we generated client proxies from that development environment, we would have an old version of the client proxy checked in at the same time as a newer version of the service.

Here's my solution
Instead of going through WSDL and generating a proxy, the client can use exactly the same data types that the server publishes. If you look at the code that is generated for you, you can see how. All of the magic is in System.ServiceModel.ClientBase<T>. This class generates a proxy on the fly based on the interface you specify. Rather than importing that interface from WSDL, you can include it from the source.

Put all of your web service contracts into a single project. This includes all of the ServiceContract interfaces that define the service methods, and the DataContract classes that define their parameters and return values. This project will need to reference System.Runtime.Serialization and System.ServiceModel. None of the implementation goes in this project. In the spirit of the Dependency Inversion Principle, it is completely abstract. I like to call it <MySolution>.Contracts.
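As a rough sketch of what such a contracts project might contain, consider the following. The IVendorService interface and VendorInfo class are hypothetical names invented for this example; only the attributes come from System.ServiceModel and System.Runtime.Serialization.

```csharp
using System.Runtime.Serialization;
using System.ServiceModel;

namespace MySolution.Contracts
{
    // Hypothetical data contract: a parameter or return value of the service.
    [DataContract]
    public class VendorInfo
    {
        [DataMember]
        public int VendorId { get; set; }

        [DataMember]
        public string Name { get; set; }
    }

    // Hypothetical service contract: methods only, no implementation.
    [ServiceContract]
    public interface IVendorService
    {
        [OperationContract]
        VendorInfo GetVendor(int vendorId);

        [OperationContract]
        void RenameVendor(int vendorId, string newName);
    }
}
```

Both the service project and the client project would reference this one assembly, so VendorInfo is literally the same type on both sides.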

Next, add a reference from your web service project to <MySolution>.Contracts. Implement the service contract interface, and do the work of the web service here.

Finally, add a reference from your client project to <MySolution>.Contracts. Don't add a service reference. Don't run svcutil.exe. Instead, add this little interface/class pair to your arsenal:

public interface IServiceClientFactory<TServiceInterface>
{
	void CallService(Action<TServiceInterface> action);
	TResult CallService<TResult>(Func<TServiceInterface, TResult> function);
}

public class ServiceClientFactory<TServiceInterface> :
	IServiceClientFactory<TServiceInterface>
	where TServiceInterface : class
{
	class Client : System.ServiceModel.ClientBase<TServiceInterface>
	{
		public TServiceInterface Service
		{
			get { return base.Channel; }
		}
	}

	public void CallService(Action<TServiceInterface> action)
	{
		Client client = new Client();

		try
		{
			action(client.Service);
			client.Close();
		}
		catch (Exception)
		{
			client.Abort();
			throw;
		}
	}

	public TResult CallService<TResult>(Func<TServiceInterface, TResult> function)
	{
		Client client = new Client();

		try
		{
			TResult result = function(client.Service);
			client.Close();
			return result;
		}
		catch (Exception)
		{
			client.Abort();
			throw;
		}
	}
}

With this, you can call a web service in one of two ways. Either you can invoke a method that returns nothing:

_serviceClientFactory.CallService(client =>
{
	client.DoAction(parameters);
});

Or, you can invoke a method with a return:

var something = _serviceClientFactory.CallService(client => client.GetSomething(parameters));

The client calls the service without ever importing the WSDL. The ServiceClientFactory does the proper Close() or Abort() pattern on the service reference. And the interface makes it suitable for IoC and unit testing.
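To illustrate the unit testing point, here is one possible test double. It is a sketch, not code from the project: FakeServiceClientFactory, IVendorService, StubVendorService, and VendorPresenter are all hypothetical names, and the factory interface is repeated only so that the example compiles on its own.

```csharp
using System;

// Repeated from the listing above so that this sketch is self-contained.
public interface IServiceClientFactory<TServiceInterface>
{
    void CallService(Action<TServiceInterface> action);
    TResult CallService<TResult>(Func<TServiceInterface, TResult> function);
}

// Hypothetical service contract.
public interface IVendorService
{
    string GetVendorName(int vendorId);
}

// Test double: passes an in-memory implementation instead of a WCF channel.
public class FakeServiceClientFactory<TServiceInterface> :
    IServiceClientFactory<TServiceInterface>
{
    private readonly TServiceInterface _service;

    public FakeServiceClientFactory(TServiceInterface service)
    {
        _service = service;
    }

    public void CallService(Action<TServiceInterface> action)
    {
        action(_service);
    }

    public TResult CallService<TResult>(Func<TServiceInterface, TResult> function)
    {
        return function(_service);
    }
}

// Hypothetical stub standing in for the real web service.
public class StubVendorService : IVendorService
{
    public string GetVendorName(int vendorId)
    {
        return "Vendor " + vendorId;
    }
}

// Hypothetical consumer that depends only on the factory interface.
public class VendorPresenter
{
    private readonly IServiceClientFactory<IVendorService> _serviceClientFactory;

    public VendorPresenter(IServiceClientFactory<IVendorService> serviceClientFactory)
    {
        _serviceClientFactory = serviceClientFactory;
    }

    public string Describe(int vendorId)
    {
        string name = _serviceClientFactory.CallService(
            client => client.GetVendorName(vendorId));
        return name.ToUpperInvariant();
    }
}
```

A test can construct VendorPresenter with a FakeServiceClientFactory wrapping the stub, and no channel is ever opened. In production, an IoC container would supply the real ServiceClientFactory instead.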

AutoMapper misses the point

Sunday, April 19th, 2009

Jimmy Bogard created AutoMapper to generate left-hand-right-hand code. This is the mapping code that often appears when you are translating one object model to another parallel object model. He gives the example of mapping a data model to a view model:

public class OrderToOrderViewModelMapper
{
    public OrderViewModel Map(Order order)
    {
        return new OrderViewModel
            {
                CustomerFullName = order.Customer.GetFullName(),
                Total = order.GetTotal(),
                LineItems = order.GetLineItems()
                    .Select(x => new OrderLineItemViewModel
                        {
                            Price = x.Price,
                            Quantity = x.Quantity,
                            Total = x.GetTotal(),
                            ProductName = x.Product.Name
                        })
                    .ToArray()
            };
    }
}

This misses the point. The problem is not in the tedious mapping code. The problem is that you are doubling the amount of independent data.

Let me start at the beginning. This is a field:

public class Customer
{
    private string _fullName;
}

This field is independent data. It can change independent of anything else in the system.

This is a property:

public class Customer
{
    private string _fullName;

    public string FullName
    {
        get { return _fullName; }
        set { _fullName = value; }
    }
}

This property is dependent data. Its behavior depends entirely on the field.

This is an immutable field:

public class OrderViewModel
{
    private readonly Order _order;

    public OrderViewModel(Order order)
    {
        _order = order;
    }
}

This immutable field is not independent. It cannot be changed. It is immutable.

This is another property:

public class OrderViewModel
{
    private readonly Order _order;

    public OrderViewModel(Order order)
    {
        _order = order;
    }

    public string CustomerFullName
    {
        get { return _order.Customer.FullName; }
    }
}

Adding independent data to a system increases its degrees of freedom. Adding dependent data does not. Adding an immutable field does not.

You want no more degrees of freedom in the system than the problem calls for. Degrees of freedom are moving parts. Moving parts break. Moving parts need to be tested. You should therefore avoid adding independent data when dependent data is sufficient.

The problem with Jimmy Bogard's example wasn't the code to copy data from one object to another. It was that data had to be copied at all. The view model should not have fields (independent data). It should have properties (dependent data).
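To make the difference concrete, here is a small sketch that compares a copied view model with a dependent one. All of the types are simplified, hypothetical stand-ins for the ones in the example above, and DegreesOfFreedomDemo exists only to drive the comparison.

```csharp
// Simplified, hypothetical model classes.
public class Customer
{
    public string FullName { get; set; }
}

public class Order
{
    public Customer Customer { get; set; }
}

// Copying style: the auto-property has its own backing field (independent data).
public class CopiedOrderViewModel
{
    public string CustomerFullName { get; set; }
}

// Dependent style: no data of its own, just an immutable reference to the order.
public class DependentOrderViewModel
{
    private readonly Order _order;

    public DependentOrderViewModel(Order order)
    {
        _order = order;
    }

    public string CustomerFullName
    {
        get { return _order.Customer.FullName; }
    }
}

public static class DegreesOfFreedomDemo
{
    // Returns what each view model reports after the model changes.
    public static string[] Run()
    {
        var order = new Order { Customer = new Customer { FullName = "Ada Lovelace" } };
        var copied = new CopiedOrderViewModel { CustomerFullName = order.Customer.FullName };
        var dependent = new DependentOrderViewModel(order);

        order.Customer.FullName = "Ada King";

        return new[] { copied.CustomerFullName, dependent.CustomerFullName };
    }
}
```

After the change, the copied view model still reports "Ada Lovelace" while the dependent one reports "Ada King". The copy is an extra moving part that has to be kept in sync.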

Update Controls in Silverlight 3 proof-of-concept

Sunday, April 19th, 2009

I have confirmed that Update Controls will work in Silverlight 3. I have wrapped a data object with one that uses DependencyProperties and successfully bound to the wrapper in Silverlight 3. This POC did not work in Silverlight 2 because it did not support DependencyProperty binding. Here's the code:

using System;
using System.Windows;
using UpdateControls;

namespace SL3Test
{
    public class PersonWrapper : DependencyObject
    {
        /////////////////////////////////////////////////////////////////////////////
        // Static section.
        //
        // Set up the dependency properties for all instances.

        // Register dependency properties. XAML can bind to these by name, even though
        // there is no CLR property to be found.
        private static DependencyProperty _firstNameProperty = DependencyProperty.Register(
            "FirstName",
            typeof(string),
            typeof(PersonWrapper),
            new PropertyMetadata(OnFirstNameChanged));
        private static DependencyProperty _fullNameProperty = DependencyProperty.Register(
            "FullName",
            typeof(string),
            typeof(PersonWrapper),
            new PropertyMetadata(""));

        // Called when the user edits the first name. Delegates to the object instance.
        private static void OnFirstNameChanged(DependencyObject obj, DependencyPropertyChangedEventArgs e)
        {
            ((PersonWrapper)obj).OnFirstNameChanged();
        }

        /////////////////////////////////////////////////////////////////////////////
        // Instance section.
        //
        // Coordinate between the dependency properties and the wrapped instance.

        // The data object being wrapped
        private Person _person = new Person();

        // Dependent sentries.
        private Dependent _depFirstName;
        private Dependent _depFullName;

        public PersonWrapper()
        {
            // When the dependent property is out of date, update it from the wrapped object.
            _depFirstName = new Dependent(delegate() { SetValue(_firstNameProperty, _person.FirstName); });
            _depFullName = new Dependent(delegate() { SetValue(_fullNameProperty, _person.FullName); });

            // When the dependent property becomes out of date, trigger an update.
            _depFirstName.Invalidated += new Action(TriggerFirstNameUpdate);
            _depFullName.Invalidated += new Action(TriggerFullNameUpdate);

            // The properties are out of date right now, so trigger the first update.
            TriggerFirstNameUpdate();
            TriggerFullNameUpdate();
        }

        // Called by the dependency property when the user changes the first name.
        private void OnFirstNameChanged()
        {
            // Set the first name in the wrapped object.
            _person.FirstName = (string)GetValue(_firstNameProperty);
        }

        // Called by Update Controls whenever the dependent property becomes out of date.
        private void TriggerFirstNameUpdate()
        {
            // Queue an update for first name. Don't do it right now.
            Dispatcher.BeginInvoke(new Action(delegate() { _depFirstName.OnGet(); }));
        }
        private void TriggerFullNameUpdate()
        {
            // Queue an update for full name. Don't do it right now.
            Dispatcher.BeginInvoke(new Action(delegate() { _depFullName.OnGet(); }));
        }
    }
}

This is the kind of code that Update Controls hides from you, so let me walk you through it.

Set up dependency properties
Silverlight 3 can bind to dependency properties. Verifying that was the whole point of this exercise. A dependency property is a property of a dependency object managed by a pair of classes. The first is the DependencyObject base class, which PersonWrapper inherits. The second is the DependencyProperty class, which PersonWrapper uses. There is one DependencyObject per instance, and one DependencyProperty per property. Values are managed at the intersection of these two: one property of one instance.

Since there is one DependencyProperty per property, rather than per property instance, we create static DependencyProperties. We want the _firstNameProperty to support two-way binding, so we give it a property changed callback. This callback is invoked when the dependency property changes, for example when the user edits it through the two-way binding. We use it to set the FirstName of the wrapped data object.

Set up dependent sentries
Going back the other way, we use the Dependent sentry.

In your typical usage of Update Controls, you see only the Independent class. Independent wraps a single property that can change independent of any other property. It has methods called OnGet and OnSet that you call whenever that property is accessed or modified, respectively.

The other half of the equation is the Dependent class. Dependent wraps a single property that changes only by observing other properties. As such, it has no OnSet; the user cannot set it. Instead, it has an update delegate.

The update delegate is passed into the Dependent's constructor. That delegate is called to calculate the value of the dependent property. You can see in the POC code that it calculates the dependent property by calling the wrapped data object.

The trick is that Dependent will sit idle until someone observes the dependent property. It will happily remain out-of-date if no one needs it to be updated. You tell it that you are interested by calling OnGet.

OnGet will call the update delegate only if it is out-of-date. When it does, it will record everything that the update delegate does so that it knows when to become out-of-date again. If Independent.OnGet is called by the update delegate, then the Dependent knows that it depends upon that Independent.
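The bookkeeping described above can be sketched in a few dozen lines. To be clear, this is not the Update Controls implementation. MiniIndependent, MiniDependent, and the Person class are hypothetical, simplified stand-ins that assume a single thread and ignore dependents that depend on other dependents.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Independent sentry.
public class MiniIndependent
{
    private readonly List<MiniDependent> _dependents = new List<MiniDependent>();

    // Call when the wrapped property is read.
    public void OnGet()
    {
        MiniDependent current = MiniDependent.Current;
        if (current != null && !_dependents.Contains(current))
            _dependents.Add(current); // the updating dependent depends on us
    }

    // Call when the wrapped property is written.
    public void OnSet()
    {
        var stale = new List<MiniDependent>(_dependents);
        _dependents.Clear();
        foreach (MiniDependent dependent in stale)
            dependent.MakeOutOfDate();
    }
}

// Hypothetical stand-in for the Dependent sentry.
public class MiniDependent
{
    [ThreadStatic]
    private static MiniDependent _current;

    private readonly Action _update;
    private bool _upToDate; // starts out-of-date

    public event Action Invalidated;

    public MiniDependent(Action update)
    {
        _update = update;
    }

    internal static MiniDependent Current
    {
        get { return _current; }
    }

    public bool IsUpToDate
    {
        get { return _upToDate; }
    }

    // Runs the update delegate only if out-of-date, recording what it reads.
    public void OnGet()
    {
        if (_upToDate)
            return;
        MiniDependent previous = _current;
        _current = this;
        try { _update(); }
        finally { _current = previous; }
        _upToDate = true;
    }

    internal void MakeOutOfDate()
    {
        if (!_upToDate)
            return; // already out-of-date, so nothing to announce
        _upToDate = false;
        Action handler = Invalidated;
        if (handler != null)
            handler();
    }
}

// Hypothetical data object with one independent property.
public class Person
{
    private readonly MiniIndependent _indFirstName = new MiniIndependent();
    private string _firstName = "";

    public string FirstName
    {
        get { _indFirstName.OnGet(); return _firstName; }
        set { _indFirstName.OnSet(); _firstName = value; }
    }
}
```

A MiniDependent whose update delegate reads person.FirstName does nothing until its OnGet is called. After that first update, setting person.FirstName marks it out-of-date and raises Invalidated once, and the next OnGet recalculates it.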

Update asynchronously
When Independent.OnSet is called, it makes all of the Dependents that have accessed it out-of-date. When this happens, the Invalidated event is fired. The POC code handles this event by calling Dispatcher.BeginInvoke. This queues up an action to be performed later. The action is performed on the UI thread after all current UI operations are complete. So what action do we queue up? A call to Dependent.OnGet.

The properties begin their lives as out-of-date. The Invalidated event won't fire this first time. So we have to manually queue the first calls to Dependent.OnGet. That's the reason for the two calls directly to the Trigger... methods.

The round trip
So think about the ebb and flow. The user enters data into the first name text box. Through two-way binding, that sets a dependency property. The wrapper responds to that change by setting the FirstName property on a wrapped data object. That calls Independent.OnSet, which causes the Dependent sentries (_depFirstName and _depFullName) to become out-of-date. The Invalidated event is fired, causing two Dependent.OnGet calls to be queued.

Silverlight finishes with handling the text box, then invokes the two queued actions. Dependent.OnGet is called on both _depFirstName and _depFullName (the order doesn't matter), and they update themselves by getting their respective values from the wrapped data object. In the process, they cause Independent.OnGet to be called, reestablishing the fact that they both depend upon the FirstName property. They then set their respective DependencyProperties, and Silverlight updates the UI.

This proves that Update Controls will work in Silverlight 3. The next step is to make it easy.

Returning the autoincrement ID of the last row inserted

Friday, April 17th, 2009

It's a common scenario. You've defined an autoincrement primary key on a table. After you insert a row into this table, you need the ID. Maybe you need to insert related rows into a child table. Maybe you need to redirect the user to a page displaying the new data. It's easy to imagine reasons for needing this ID. In fact, it's hard to imagine not needing it.

Why, then, is it so hard to get it?

Well, now it's easy. Add this class to your project:

using System;
using System.Data;

namespace AdventuresInSoftware.Data
{
    /// <summary>
    /// Create a ConnectionScope inside of a using to open and close a
    /// database connection. Also offers a convenient LastId method.
    /// </summary>
    public class ConnectionScope : IDisposable
    {
        private IDbConnection _connection;

        /// <summary>
        /// Wrap a database connection inside of a ConnectionScope in
        /// a using statement.
        /// </summary>
        /// <param name="connection">The connection to open and close.</param>
        public ConnectionScope(IDbConnection connection)
        {
            _connection = connection;
            connection.Open();
        }

        /// <summary>
        /// Get the autoincrement key generated by the last insert.
        /// </summary>
        /// <returns>The ID of the last row inserted.</returns>
        public int LastId()
        {
            using (IDbCommand command = _connection.CreateCommand())
            {
                command.CommandText = "SELECT @@IDENTITY";
                command.CommandType = CommandType.Text;
                return (int)(decimal)command.ExecuteScalar();
            }
        }

        /// <summary>
        /// Closes the connection. Intended to be called automatically
        /// by the using statement.
        /// </summary>
        public void Dispose()
        {
            _connection.Close();
        }
    }
}

Then write code like this:

public int SaveVendor(string vendorName)
{
    VendorTableAdapter vendorTableAdapter = new VendorTableAdapter();
    using (var scope = new ConnectionScope(vendorTableAdapter.Connection))
    {
        // Insert the vendor and return the new ID.
        vendorTableAdapter.Insert(vendorName);
        return scope.LastId();
    }
}

As you can see, the above code uses a typed TableAdapter. This is a convenient class generated by ADO.NET to give you strongly typed objects and methods for accessing tables. TableAdapters have been largely obsoleted by ORMs and Entity Framework, but they are still handy for smaller client-side projects. This code is from a smart-client project built on SQL CE.

A typed TableAdapter has a method called Insert which returns an integer. Oh boy! It must be returning the ID of the new row! After all, what else could I possibly want out of an Insert?

No, sorry. Insert returns a row count. That's right. The Insert method, which inserts a row, returns a row count.

Let me say that again. This method whose only reason for existing is to insert one row returns the number of rows it inserted. Did you get that? It always returns the number 1! By design!

Inane, yes, I know. But there it is. Enjoy.

Coming soon: Update Controls in Silverlight

Monday, April 13th, 2009

When I was porting Update Controls to WPF, I tried to hit Silverlight 2 simultaneously. Unfortunately, a couple of things got in my way. One of those problems may be fixed in Silverlight 3. And I've just learned about a new technique that might work in Silverlight 2.

Custom markup extensions
The first problem is that Silverlight 2 does not support custom markup extensions. A markup extension is a class that inherits MarkupExtension. It is available to XAML properties using the bracket syntax. "{Binding}" is a markup extension. Update Controls for WPF defines a custom markup extension called "{Update}".

Silverlight 2 does not allow us to create our own markup extensions. It supports only the "{Binding}", "{StaticResource}", and "{TemplateBinding}" built-in markup extensions. The library doesn't even include the MarkupExtension base class. The three supported classes are well-known to the Silverlight 2 runtime. From the CTP of Silverlight 3, it doesn't look like it does any better in this area.

Dependency properties
The second problem I ran into is that Silverlight 2 does not bind to dependency properties. A dependency property is managed by the framework. You inherit DependencyObject, then create a static instance of DependencyProperty for each property. The DependencyObject base class manages all of the properties for that instance; DependencyProperty is the key that identifies one property within a dependency object.

You typically define a CLR property to call GetValue and SetValue for each dependency property. But this is not strictly necessary. WPF binds to every DependencyProperty registered with your type when it sees the DependencyObject base class. It doesn't even see your CLR property. It's only there for the convenience of other code.

One thing I tried doing was to reflect over a .NET object and create a DependencyProperty for each of its CLR properties. Sort of turn the pattern inside-out. I got this working, and WPF was able to data-bind to my wrapper object. Unfortunately, Silverlight 2 again stood in my way. Silverlight 2 does not bind to dependency properties.

Silverlight 3 fix coming soon
Silverlight 3 now offers support for binding from one XAML property to another. While I generally consider this a bad idea, it does offer a glimmer of hope. Properties of other controls tend to be dependency properties. This means that maybe Silverlight 3 will bind to a dependency property of any DependencyObject, just the way that WPF does.

I was unable to locate my old dependency property wrapper code. Apparently I was too frustrated to even check it into source control. So I'll recreate it and see if it works with Silverlight 3. If so, there will be one annoying step that you'll have to take to wrap your Presentation Model (a.k.a. View Model), your Navigation Model, and your Data Model before Silverlight can bind to them. But I think the advantages outweigh the inconvenience.

And possibly Silverlight 2?
In WPF: A Beginner's Guide, Sacha Barber tells me that markup extensions and the "{}" syntax are just a shorthand. There is a way to set a property to an object with more verbose syntax. I don't know if this verbose syntax will give me the hooks that ProvideValue offers, but it's worth a try. This would be more convenient than the wrapper described above, so I'll favor this approach if it works.

Update Controls on Polymorphic Podcast

Friday, April 10th, 2009

Many thanks to Craig Shoemaker, host of Pixel8 Radio and the Polymorphic Podcast, and overall benefactor to the .NET community. I was honored to have a chance to speak with him about Update Controls.

We talked about the benefits of Update Controls over INotifyPropertyChanged and Binding for WPF. We talked about how to get started using it. And hopefully my explanations of how it works were somewhat understandable.

Have a listen. And if I don't sound like a complete idiot, it's only because of Craig's editing skills.

Ignore whitespace in TFS compare

Friday, April 10th, 2009

Thanks to JB Brown, James Manning, and Brian Harry for this information. Unfortunately, nobody I could find puts it all in one place, step-by-step. So here it is.

  • In Visual Studio, select "Tools", "Options", "Source Control", "Visual Studio Team Foundation Server".
  • Click "Configure User Tools".
  • Click "Add".
  • Extension: .cs
  • Operation: Compare
  • Command: C:\Program Files\Microsoft Visual Studio 9.0\Common7\IDE\diffmerge.exe
  • Arguments: %1 %2 %6 %7 %5 /ignorespace
  • OK
  • OK
  • OK

If you use VB, C++, XML, or any other file extension, go back and add those extensions, too.

And please, Microsoft. Why is this not the default?

Excel as Specification

Thursday, April 9th, 2009

I used to work with engineers. These are people who, like me, think in equations. Until moving on to other domains, I never realized just how lucky I was to work with these people.

A common practice among the engineers was to try out new analysis methods in Excel. It was really easy to put in some data, graph it, write equations against it, and iterate over the problem. If you needed to find a root, "Goal Seek" was right there. If you needed something more powerful, you could drop down into VBA.

On more than one occasion, I received an Excel worksheet along with the instruction to "make the program do this." Those were some of the easiest specifications to follow. First, the math was right there; it just had to be translated from Excel to C++. Second, test data could be plugged into both systems, and their results compared. And third, if there was a problem, I could always go back to the engineer and ask questions while pointing to the worksheet.

Fast forward to today. I'm working for business people, not engineers. They use Excel, but not in the same way. They think in numbers, not equations. Their specifications are written in a different Office product, are less precise, and are not executable.

Turning it around
Since my customer is not going to give me an Excel worksheet, I decided to give one to them. I received a Word document describing some complex interactions among several pieces of data. Some were coming from our ERP system (Baan), some from a site administrator, and some from the user. Many of the fields had different names, depending upon who was looking at it. Some of the rules were contradictory, others were ambiguous, and a few were missing. After several mutually frustrating conversations with the business trying to understand this spec, I took matters into my own hands.

I created an Excel worksheet that has all of the inputs and outputs. These are organized according to the UI screens on which they appear, and are named according to the spec. Database and code names are not used.

Using this worksheet, the business can try out different scenarios and determine whether it does what they expect. If not, they can point to the worksheet and tell me what it should do. It may be asking too much for this group to fix the spreadsheet themselves, but at least we have a precise and executable communication tool.

Historic Modeling site launched

Tuesday, April 7th, 2009

I've started a new site specifically about Historic Modeling. The historic modeling content on AiS will remain here, but new content will be posted there. Adventures in Software will continue to be about lessons learned on the road to quality computing. Historic Modeling will be dedicated to the theory and practice of the technique, and the tools and processes I'm developing to support it.

The first series of posts is complete. It takes you step-by-step through the creation of a historic model. This example is drawn from real-life experience, and includes just enough complexity to demonstrate how the technique applies to real-world problems. Please enjoy.