Archive for August, 2006

Building Bridges

Wednesday, August 30th, 2006

What is an architect? That's easy. It's someone who builds bridges. My 10-year-old can tell you that.

When it comes to software architecture, the industry has had a hard time nailing down a definition of the role. But even in software, it amounts to the same thing. An architect is someone who bridges the gap between business needs and technological solutions.

Some of my friends at Radiant considered an architect to be a lot like the conductor of an orchestra. He's the guy who stands at the front of the room with the baton (or dry-erase marker), waving his arms, and not actually producing any music (or code). Then, when the performance (project) is done, he's the guy who turns to the audience and takes a bow.

Being a good architect, I would occasionally oblige them by waving my arms in big circles as I made some grand pronouncement about how our systems should be continuously integrated, service-oriented, test-driven masterpieces. But all the while, I believe we all understood the role that I played as architect.

As an architect, I was principally responsible to the business for technological decisions, even those I did not make myself. It was my duty to ensure that the business needs were met while preserving the integrity of the solution. I was tasked with weighing the cost of a technology against the benefits of its solution. At the end of the day, it was my assessment that helped the business analysts decide the best course.

My responsibility was also to the engineers. I had to recommend the patterns, practices, and tools suited to the tasks that the development engineers had to perform. I had to ensure that operations had the visibility and control they needed to run the system. I had to oversee a smooth transition of a product from development through production and into maintenance.

So my definition of a software architect is someone who builds bridges. Between business and technology. Between analysis and implementation. Between development and operations. You have to wave your arms pretty wide to reach all those places at once.

Cygwin Makes it Easy

Tuesday, August 29th, 2006

To be effective at my new job, I have had to learn my way around a Linux box. For my entire career so far, I've lived in an exclusively Windows environment. But now I've been introduced to a whole new world. I confess that it has been quite exhilarating.

The transition from Windows to Linux was a bit daunting at first, but one product helped me bridge the gap. Cygwin is a Unix-like environment that runs within Windows. You don't have to dual-boot. You don't have to install a new OS. You don't have to install a virtual machine. It just presents a bash shell as if it were a DOS command window. (Other shells are available, too.)

What really makes Cygwin a winner is the ease with which you can install and upgrade it. Just go to cygwin.com, find a mirror near you, and run setup.exe. It will present you with a set of packages that you can install. Then it downloads only what you choose, so you don't waste time or bandwidth on features you won't use.

Don't worry about the sheer number of packages available. If you don't know what you need, just take the defaults. When you discover that you need a new package, you can always run setup.exe again. It will remember which mirror you chose last time, and will only download and install the new packages that you select. It couldn't be any easier.

Here are the packages that I use:

OpenSSH - secure shell and file transfer. I use this to access Linux servers running in the datacenter.

Nano - a simple text editor.

gcc - a C/C++ compiler. We use Java at work, but I expect to start a Linux C++ project at home.

Some friends from Radiant tried to convince me to use Cygwin before. I wish I had listened then. Sorry, John and Daniel!

The Self-Important Application

Sunday, August 27th, 2006

I love Cucusoft's iPod video converter. I can download movies from Channel 9 as WMV and convert them to MP4 for my video iPod. And for the most part their UI is fantastic. They provide a simple drag-and-drop mechanism for getting my videos into the pipeline.

But when they bundled two of their converters together into one suite, they made a big UI mistake in the process. Now when you close the main window, the following prompt appears:

Are you sure you want to quit?

This is a wonderful application, but it is not so wonderful that I would never want to close it. There is no loss of information when the program shuts down, so this prompt is not protecting me from an irreversible operation. It's just an annoyance.
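
To make the mistake concrete, here is a minimal WinForms sketch of what such an application is effectively doing (the form and its handler are hypothetical, not Cucusoft's actual code):

    using System.Windows.Forms;

    public class ConverterForm : Form
    {
        public ConverterForm()
        {
            // The anti-pattern: intercept every close and demand
            // confirmation, even though closing loses no data.
            FormClosing += delegate(object sender, FormClosingEventArgs e)
            {
                DialogResult answer = MessageBox.Show(
                    "Are you sure you want to quit?", "Confirm",
                    MessageBoxButtons.YesNo);
                if (answer == DialogResult.No)
                    e.Cancel = true;
            };
        }
    }

The better design is to leave the close box alone, and prompt only when an operation is in progress or unsaved work would actually be lost.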

Cucusoft's is not the first app that I've seen make this mistake. My wife still uses B's Recorder Gold to back up her machine, and it has the same problem. I also remember this prompt from several early Visual Basic apps, but the community seems to have learned since then. I just wish everyone would take the lesson to heart.

Problems with Naming

Thursday, August 24th, 2006

As you probably know, DNS is the way that computers turn names (like mallardsoft.com) into addresses (like 216.154.223.70). The DNS system was invented as a convenience to TCP/IP network users so that they wouldn't have to remember numbers or maintain their own hosts file. It has since become something much bigger.

The assignment of names and numbers is controlled by ICANN, the Internet Corporation for Assigned Names and Numbers. As the name implies, ICANN is a corporation, so it is governed by US law. Other nations have very little influence over the way it conducts business. The Internet may have been born in the US, but it is now very much an international platform. Considering how much business is done online, I can understand the concern.

What is most troublesome is the way that the business of naming has evolved. ICANN controls the central registry of names. Somebody has to, since the name mallardsoft.com must always resolve to the same address no matter who types it in. DNS itself is a distributed protocol that replicates the registry, but the problem of naming demands that a single organization resolve conflicts.

ICANN outsources the actual registration process to other corporations, the registrars. These include some big names like Namesecure, AOL, NetworkSolutions, and GoDaddy. But many smaller corporations have also gotten into the registration business. (For an entertaining look inside the business of registration, please listen to Bob Parsons, CEO of GoDaddy). Not all of these registrars provide the same level of service. Some don't service customers at all.

Names have become hot property. Common words are especially valuable. According to one registrar, "Business.com" sold for $8 million. That's a lot to pay for a part of the English language. If you check out the site, you will find a search engine. This is the unfortunate reality of naming: the owners of these names do not need to add value to them. Someone who has a legitimate desire to provide value online has to pay one of these squatters for the name they want, or make up some obscure name that isn't yet taken.

I have been concerned about this problem for a while. When ICANN started to charge for names, I thought that it would become less of an issue. As long as the resource is not free, you have to add value in order to recoup your investment, right? No. In his latest podcast episode, Bob Parsons talks about kiting, or tasting, domain names. A registrar can obtain a domain name from ICANN, put up a search page, and then return the name to ICANN after a few days for a full refund. If the name draws enough traffic, the registrar keeps it. If not, they haven't lost anything.

Another disturbing behavior is to register common misspellings of popular sites. This usually brings you to a page that tries to make money through some other means, like paid advertisements or sales of spyware blockers. Occasionally, though, the fake site will mimic the real one that you tried to visit, and may entice you to reveal passwords or personal data.

I am a capitalist at heart. I don't begrudge anyone doing anything legal and ethical to make money. These practices -- while robbing value-providing entrepreneurs of the resource of domain names -- are usually neither illegal nor unethical. However, any system that rewards this kind of behavior is broken. The cornerstone of capitalism is that the market rewards businesses according to the value that they provide. If you can make money without providing value, then the system you are using is ultimately not market-driven.

So here's my solution:

This is one I learned from my wife. I Google everything, even known domain names. I never type names into the address bar of my browser. Google has very effective page-ranking algorithms and an incredibly large user base from which to glean data. Theirs is not a centrally managed naming system. Theirs is a community ranking system. If I'm looking for youtube.com and I accidentally search for "utube", the first link in Google will take me to the right place. If I enter the same thing into the address bar, I end up at Universal Tube and Rollform Equipment Corporation. Obviously, they are providing value with their site, but that isn't where I wanted to go.

The idea that a corporation owns a domain name is fundamentally flawed, especially in today's international marketplace. Naming implies registration, which implies that someone controls the names. How is this central registrar supposed to decide who provides value with a name and who is just squatting? ICANN can't do it, and doesn't even try. But the market can. Search engines like Google that use community data enable us, the market, to decide who is providing value. If DNS were to stop working tomorrow, searching would still work. Only the squatters and kiters would lose out.

What kind of combo box is it?

Monday, August 21st, 2006

Does anyone else remember Windows 3.1? So much has changed since then, and most of it for the better. But there is one thing that has bugged me ever since Windows 95. Microsoft messed up the combo box control.

A combo box control is a combination edit control and list box. It can be used in one of three ways. A simple combo box displays both the edit control and the list box at all times, and the text is editable. A drop-down combo box displays the list only when you click the button, and the text is still editable. A drop-down list displays the list when you click the button, but the text is not editable: you can only pick something from the list.
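
For what it's worth, these three flavors map directly onto the ComboBoxStyle enumeration in .NET WinForms; a quick sketch:

    using System.Windows.Forms;

    class ComboBoxFlavors
    {
        static void Configure(ComboBox combo)
        {
            // Simple: the edit control and the list are both always
            // visible, and the text is editable.
            combo.DropDownStyle = ComboBoxStyle.Simple;

            // DropDown: the list appears only when you click the button;
            // the text is still editable.
            combo.DropDownStyle = ComboBoxStyle.DropDown;

            // DropDownList: the list appears on demand, but the text is
            // not editable -- you can only pick from the list.
            combo.DropDownStyle = ComboBoxStyle.DropDownList;
        }
    }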

Yes, I agree that those three names are poorly chosen. It has taken me 12 years to remember which one is a drop-down and which is a drop-down list.

But what's worse than that is that a user cannot see the difference. Both a drop-down and a drop-down list appear as a box containing text with a button off to the right. The user has to hover over the control to see if the arrow turns into an I-beam, or click on the control to see if a blinking cursor appears. A user should never have to interact with a control to discover its function.

Do you remember what the combo box looked like in Windows 3.1? Yes, it was flat instead of being chiseled into the gunmetal of the dialog box, but there was something else. Give up?

In Windows 3.1, a drop-down combo box (which allows editing of the text) had a gap between the edit control and the button. A drop-down list (which does not) had the two stuck together as they are today. This made it immediately obvious which kind of combo box control it was. If the text box stood alone, then it was independently editable. If the text box was stuck to a button, then you were stuck with the choices in the list. Simple.

I wish Microsoft would correct their error and bring back the visually discoverable combo box. I could create my own custom control to fix the problem, but since it is no longer standard I would just end up causing more confusion.

Scroll within a scroll

Friday, August 18th, 2006

As I've said before, I'm a big fan of WordPress. However, they did make one big user interface mistake. They have the dreaded scroll bar on a scrollable page. As I am typing this post, I can scroll down to see a preview of my article. This preview is itself scrollable.

As programmers we are used to thinking recursively. We often have to write nested loops or call recursive functions. So when we see one thing nested within another, we don't think twice about it. But users don't think like us. Nested scroll bars confuse most people.

But even if the UI is not confusing, there are a couple of practical usability issues that generally arise when you nest scroll bars. First is the horizontal size. If the inner scroll region is wider than the outer scroll region, the user has to scroll right just to see the inner scroll bar. While the inner scroll bar is off the screen, the user doesn't even realize that there is a problem. They might scroll down to the bottom of the outer region and see that their data is cut off. If they could see the inner scroll bar, they would know that more data appears below, off screen.

Another common usability issue with the nested scroll bar is the behavior of the scroll wheel. The wheel will usually scroll through the outer region until the inner region happens to move up under the mouse. At that point, the scroll wheel will operate on the inner region. When that one gets to the end of the document, it again operates on the outer region. This is quite disconcerting.

The nested scroll bar is hard to avoid in web-based applications. The best practice, I believe, is to define a frameset and place each scrollable region within its own frame. This allows the user to resize and scroll each frame individually. Unfortunately, this is difficult to code for, but your users will appreciate the effort.

When a rich client has nested scroll bars, the application developer has no excuse. No matter what the development environment, split frames are easy on the desktop. Take, for example, JMeter (http://jakarta.apache.org/jmeter/). This is an extremely useful program that I rely upon day after day, yet it has an infuriating nested scroll bar in the View Results Tree. I have to maximize the window in order to see the scroll bar for the results of my HTTP request.

So please do your users a favor and avoid the nested scroll bar.

The first tenet of object-oriented programming

Wednesday, August 16th, 2006

What are the tenets of object-oriented programming? This was a popular interview question back in the early '90s. And the expected answer was: encapsulation, inheritance, and polymorphism.

But go back and read Rumbaugh's Object-Oriented Modeling and Design. In chapter 3, where he begins to build the Object Modeling Technique, he starts with the idea of identity. Identity is the inherent uniqueness of objects, independent of any attributes. Two objects may look the same, act the same, and have the same state, but an action performed on one does not affect the other. They are two distinct objects.

This fundamental idea is the keystone of object-oriented analysis, design, and programming. I argue that it is the most important of the four (not three) tenets of OO. And yet it is the one that people often neglect.

If you program in any object-oriented language, I challenge you to really think about how identity manifests itself in your code and designs. Consider how you use it, and what its consequences are. For example, if you are a C++ programmer, consider why you would use a pointer rather than a value. If you pass a pointer to an object into a function, that function can operate on that particular object. If you pass the object itself (i.e., by value), then the function can only work on a copy of the object. But if that object itself contains a pointer to another object, then the pointer is copied, and the function again has access to that other object. So you use pointers when you want identity.

If you work in Java, then pointer is a bad word. Some people like to say that there are no pointers in Java, because of the bad reputation pointers got in C++ (memory leaks, buffer overruns, protection faults). Others like to say that everything is a pointer in Java, because Java has a reference type system. Only the primitive types are values. So when you pass an object to a method in Java, you pass its identity to that method. The method can then do anything it wants to that object. The language does not make a copy, as C++ does, to protect its identity.

This usually isn't a problem, because you are used to the way the language works. You might make objects immutable to avoid the possibility that the method will change them, or pass in an interface that exposes just the read-only methods. In extreme cases, you might even make an explicit copy yourself. But you know that identity is being shared, and so you code for it.

If you code in C#, then you have options. You can define types that want identity as classes, and types that don't want identity as structs. Classes are reference types (i.e. have identity), and structs are value types (i.e. don't). So when you pass an object to a method, you share its identity. When you pass a structure, you don't. But C# gives you one more choice. You can declare your value-typed parameter using "ref". This suddenly gives the structure identity so that the function can modify it. You probably won't need to do this very often, but when you do you will see identity at work in your app.
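
Here is a minimal sketch of identity at work in C# (the type and method names are mine):

    using System;

    struct Point                 // value type: no identity
    {
        public int X;
    }

    class Node                   // reference type: has identity
    {
        public int Value;
    }

    class IdentityDemo
    {
        static void Touch(Point p) { p.X = 42; }      // operates on a copy
        static void Touch(ref Point p) { p.X = 42; }  // "ref" lends identity
        static void Touch(Node n) { n.Value = 42; }   // identity is shared

        static void Main()
        {
            Point point = new Point();
            Touch(point);        // point.X is still 0
            Touch(ref point);    // point.X is now 42

            Node node = new Node();
            Touch(node);         // node.Value is now 42

            Console.WriteLine("{0} {1}", point.X, node.Value);  // prints 42 42
        }
    }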

Identity is also apparent in design. Most of the Gang of Four design patterns are heavily dependent upon identity. A visitor is given the identity of an object to visit. A state machine knows the identity of its current state. Each strategy has identity so that it can be distinguished from other strategies that implement the same interface. Patterns that do not take advantage of identity have consequences that derive from that fact. Flyweights must be immutable (since their identity is shared and must not be exploited). Singletons have identity, but the identity of the instance is synonymous with that of the type, and must therefore be protected (against multi-threaded access, or duplicate creation).
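
That last point is easiest to see in code. A minimal C# sketch of the usual idiom for protecting a singleton's identity (the class name is hypothetical):

    public sealed class Registry
    {
        // The runtime initializes this field exactly once, in a
        // thread-safe way, protecting the one instance against
        // duplicate creation.
        private static readonly Registry _instance = new Registry();

        private Registry() { }   // no one else can construct another

        public static Registry Instance
        {
            get { return _instance; }
        }
    }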

If you draw your designs on a white board or a piece of paper, objects with identity will be visible as boxes or circles. They will be independent yet related to other objects in the drawing. If you draw an arrow from one object to another, you are representing the fact that one object knows the identity of the other. You will probably construct a graph of objects that know each other's identities, and live for some time within the system.

Types that don't want identity will appear as members or parameters of those objects. They will not have a life of their own outside of their identifiable hosts.

If you engage in object-oriented analysis, design, or programming, you use identity. But most people fail to recognize it. You may find yourself wondering whether to declare that new C# type as a struct or a class. You may wonder whether you should make that method static or not. You may consider reusing an object instead of creating a new one each time. If you put the idea of identity at the front of your thinking, then the answers to all of these questions become apparent.

Order-sensitive unit testing

Saturday, August 12th, 2006

I've spoken briefly in the post TDD Test Drive about using NMock2 to test a transaction pump. For this test, I mocked the transaction queue, the web service, and the dialup connection manager. I did this so that I could make sure, for example, that the transaction pump dialed the phone before it invoked the web service.

I set up expectations in my code as follows:

    [TestFixture]
    public class FirstTest
    {
        private Mockery _mockery;
        private DialupConnectionManager _dialupConnectionManager;
        private TransactionService _transactionService;
        private TransactionPump _transactionPump;

        [SetUp]
        public void SetUp()
        {
            _mockery = new Mockery();
            _dialupConnectionManager =
                _mockery.NewMock<DialupConnectionManager>();
            _transactionService = _mockery.NewMock<TransactionService>();
            _transactionPump = new TransactionPump(_dialupConnectionManager,
                _transactionService);
        }

        [TearDown]
        public void TearDown()
        {
            _mockery.VerifyAllExpectationsHaveBeenMet();
        }

        [Test]
        public void TestPump()
        {
            Expect.Once.On(_dialupConnectionManager).Method("Dial");
            Expect.Once.On(_transactionService).Method("SendTransaction").
                With("Hello");
            Expect.Once.On(_dialupConnectionManager).Method("HangUp");
            _transactionPump.Run();
        }
    }

The tests failed, I made them pass, then I moved on to the next. Unfortunately, they passed for the wrong reasons.

My transaction pump had a bug in it that caused the dialup connection to dial at the wrong times (it was much more complex than the example given above). By default, NMock2 verifies that all expectations are met, but not that they are met in the order you specified. You have to take one additional step to get there:

        [Test]
        public void TestPump()
        {
            using (_mockery.Ordered)
            {
                Expect.Once.On(_dialupConnectionManager).Method("Dial");
                Expect.Once.On(_transactionService).Method("SendTransaction").
                    With("Hello");
                Expect.Once.On(_dialupConnectionManager).Method("HangUp");
            }
            _transactionPump.Run();
        }

I had already written 40 tests with the assumption that NMock was validating order, so I didn't want to add this using statement to every test. Instead, I added it to the SetUp and TearDown methods:

    [TestFixture]
    public class FirstTest
    {
        private Mockery _mockery;
        private DialupConnectionManager _dialupConnectionManager;
        private TransactionService _transactionService;
        private TransactionPump _transactionPump;

        private IDisposable _ordered;

        [SetUp]
        public void SetUp()
        {
            _mockery = new Mockery();
            _dialupConnectionManager =
                _mockery.NewMock<DialupConnectionManager>();
            _transactionService = _mockery.NewMock<TransactionService>();
            _transactionPump = new TransactionPump(_dialupConnectionManager,
                _transactionService);

            _ordered = _mockery.Ordered;
        }

        [TearDown]
        public void TearDown()
        {
            _ordered.Dispose();

            _mockery.VerifyAllExpectationsHaveBeenMet();
        }

        [Test]
        public void TestPump()
        {
            Expect.Once.On(_dialupConnectionManager).Method("Dial");
            Expect.Once.On(_transactionService).Method("SendTransaction").
                With("Hello");
            Expect.Once.On(_dialupConnectionManager).Method("HangUp");
            _transactionPump.Run();
        }
    }

This caused all unit tests to become order-sensitive. Not surprisingly, this revealed some errors in other tests that I thought were working. Unit testing with NMock2 is incredibly effective, but you must be aware of this caveat. If order matters (and it usually does), then tell NMock2 to pay attention to it.

Where Does the UI Begin?

Monday, August 7th, 2006

Today was my first day at Handmark. Though I did some important work and met some incredible people at Radiant Systems, I needed a change of pace. The upshot for you, dear reader, is that this content will be slightly biased toward Java, Linux, and MySQL. Whereas Radiant was part Java and part .NET, Handmark uses no Microsoft technologies on the server.

From this new (for me) and interesting landscape comes a challenging puzzle. Handmark creates software for mobile devices. The server component runs at our datacenter, while the client component runs on the cell phone or PDA. My team is working on the server side while another team writes the client.

The way you would typically create a client/server application is to split the system right around the business logic. For a thin client, only the UI is on the device. For a rich client, the UI and almost all of the business logic are on the client. For a smart client, it's somewhere in between. But the general theme is that the UI is on the client.

Applying this line of reasoning to Handmark, you would expect my team to be working on business logic and data access, and the device team to be working on UI (and some business logic, since this is a smart client). To support multiple devices, you would adopt a common standard for the server, and have the client team write different software for each device. They would all conform to the one server standard, so the server itself would be device-agnostic.

But here we have a problem. The devices we are talking about have very limited memory and processing power. Certain UI functions, like localization and parsing, are extremely taxing to these micro machines. They don't have the space for translation tables, nor the speed for snappy string manipulation. These functions are really easy for the server, so the client team asks that we preprocess the data for their device.

Once we have preprocessed the data for one device, it is no longer suitable for another. So our server has to know about the device that is going to render the data. Based on the client device capabilities, the server will provide data in different formats, with images of different resolutions, localized according to the user's preferences. This places UI functionality squarely on the server.

But isn't that incorrect? Shouldn't the client alone be responsible for the UI? After all, it is the user-facing device.

Actually, no, this is not incorrect. The confusion comes from the difference between logical tiers and physical tiers. We still have a three-tier logical structure, but where we choose to physically host those tiers is controlled by practical constraints. The decision to break the application at the business logic for the client/server boundary was arbitrary. Any other decision could be justified.

Take a look at the most popular client/server application of all: the World Wide Web. Most web sites that have a three-tier logical structure locate all three tiers at the server. The client is just a dumb browser. The user interface is written in PHP, JSP, ASPX, or some other language that the browser does not understand. This code is executed on the server to produce HTML. And due to browser incompatibility, this server-side UI often has to be device-aware.

The right solution depends upon the problem. In Handmark's case, the solution is to put some of the UI on the server. With the right infrastructure, this is fairly easy to do. Sometimes thinking outside the box means keeping more features inside the box. So don't put arbitrary demands on your clients. If the problem demands it, put some or all of the UI on the server.

A Memento’s a Memento

Thursday, August 3rd, 2006

A developer on my team just solved an insidious problem today. He found that an object that was correctly filled in on the client was incorrectly deserialized on the server. This object contained discrete amounts as separate properties. Two of these amounts are called Product and Tip. He found that on the server side Product was the sum of Product and Tip on the client side. Tip was still correct.

After a moment of insight, he realized the problem. This class defines a calculated property called Total that returns Product + Tip. Because this class is serializable, Total cannot be read-only. Therefore, the setter assigns the value to Product. The server must have been deserializing Total after Product, and therefore overwriting the value.
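
In outline, the class looked something like this sketch (the class name is mine; Product, Tip, and Total are his):

    public class Transaction
    {
        public decimal Product;
        public decimal Tip;

        // The calculated property. It was given a setter so that it
        // could be serialized, and that setter is the bug: when the
        // serializer happens to write Total after Product, Product
        // becomes the old Product + Tip.
        public decimal Total
        {
            get { return Product + Tip; }
            set { Product = value; }
        }
    }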

When described in so many words, it appears obvious. The setter for Total was wrong. But what should the setter for Total look like? My first instinct was that Total should set Product to the value minus the current Tip. That way, no matter what the object currently looks like, you can set the Total, get the Total and see exactly the result you expected.

But this solution is wrong. The decision to change Product and not Tip is arbitrary. Sure, if Tip were deserialized first, then this would produce the correct result. But not if deserialization occurs in the order Product, Total, Tip. In this case, Product would initially be set correctly, then overwritten by Total minus the current Tip (which is still 0). Finally, Tip would be set correctly. You would again be left with a Product that is too high.

So how about this solution: instead of making Total a property, make GetTotal() a method. That way the runtime doesn't try to serialize it. This, in fact, is what my coworker did, and it worked. However, even this solution is wrong. Consider what he had to do in the first place, and you'll see why.
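
Here is the fix, sketched against the same hypothetical class:

    public class Transaction
    {
        public decimal Product;
        public decimal Tip;

        // A method instead of a property: the serializer ignores it,
        // so deserialization can no longer overwrite Product.
        public decimal GetTotal()
        {
            return Product + Tip;
        }
    }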

.NET web services do some interesting magic for you. First, you can write a class, decorate it with the appropriate attributes, and build. .NET will generate a WSDL descriptor for the data type. Cool. Then, you can import that WSDL into another project and it will reconstruct that class. Double cool. Now you can pass objects of that class to and from the web service.
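
In ASMX terms, that round trip starts from something like this sketch (the service and method names are hypothetical):

    using System.Web.Services;

    [WebService(Namespace = "http://tempuri.org/")]
    public class TransactionGateway : WebService
    {
        // Building this generates WSDL that describes Transaction as a
        // bare data shape. Importing that WSDL on the client recreates
        // the properties, but not methods like GetTotal().
        [WebMethod]
        public void Submit(Transaction transaction)
        {
        }
    }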

But wait! That's not the same class! It has the same name and lives in the same namespace, but it is different. .NET generated that type from WSDL, which doesn't describe things like methods (outside of web service methods), inheritance, and interfaces. .NET couldn't generate exactly the same class.

If you add a method to the class, it doesn't make it to the other side. If you create a derived class, the web service slices your objects down to the base class. If you declare interfaces, just forget all about them. So how did my colleague create a class with a calculated property? Isn't that a method?

Well, he defined that class on the server side. When .NET did its round trip from class to WSDL and back to class, it lost that calculation. So he just edited the generated code so that it used the original class defined on the server side. This allows the client and server to share code, and code reuse is good, right?

Well, no, not always. This was generated code. It was already reusing the description, even if not the actual code, of the server side. If a programmer needs to make a change to this class, the programmer still needs to change only one source file. It doesn't matter how much work the compiler needs to do.

But the core of the problem is this: .NET web services were designed to exchange mementos. The documentation doesn't come right out and say that, and the compiler doesn't complain if the class has more than just properties. But the fact is that things just don't work out right unless the types used in web services are just mementos.

This is true for any serialization, persistence, or network communication strategy. You are not exchanging objects, you are exchanging data. We are so used to object-oriented programming that we fail to recognize when something is not an object. We want to send an object of a derived class to a web service method that expects the base. We want the web service to act polymorphically on the object. We want the object to handle its own behavior. But it just can't work that way. Persistable objects are just mementos, and all behavior needs to be implemented elsewhere.