Archive for April, 2007

Constrained dot-chaining

Thursday, April 26th, 2007

For my recent XML parsing library -- DeclarativeXMLParser -- I used a coding technique that I call constrained dot-chaining. This technique is useful for creating structures in code that more concisely express their intent.

Simple dot-chaining is accomplished by declaring methods that end in "return this". For example, StringBuffer in Java and StringBuilder in C# both facilitate dot-chaining by returning "this" from the "append" (or "Append" in C#) method. Thus you can chain a series of method calls as in:

String greeting = new StringBuffer()
  .append("Hello, ")
  .append(name)
  .append("!")
  .toString();

Notice the use of the dot at the beginning of the line. Compared to the more typical dot at the end of the line, this lends more symmetry to dot-chained code and works better with IntelliSense.
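To make the "return this" idea concrete, here is a minimal sketch of a class that supports simple dot-chaining. GreetingBuilder and its method names are hypothetical, invented only for this illustration; they are not part of any library mentioned here.

```java
// Hypothetical example class: every configuring method ends in "return this",
// so calls can be chained with a leading dot.
final class GreetingBuilder {
    private final StringBuilder text = new StringBuilder();

    GreetingBuilder salutation(String word) {
        text.append(word).append(", ");
        return this; // returning this is what enables the chain
    }

    GreetingBuilder name(String name) {
        text.append(name);
        return this;
    }

    GreetingBuilder punctuation(char mark) {
        text.append(mark);
        return this;
    }

    // Terminal method: leaves the chain and produces the result.
    String build() {
        return text.toString();
    }
}

class GreetingBuilderDemo {
    public static void main(String[] args) {
        String greeting = new GreetingBuilder()
            .salutation("Hello")
            .name("World")
            .punctuation('!')
            .build();
        System.out.println(greeting);
    }
}
```

Nothing in this sketch restricts the order of the calls, which is the gap that constrained dot-chaining is meant to close.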

Dot-chaining is especially powerful when you pass the resulting object as a parameter. The object can be created and initialized in-line with no semicolon. No variable needs to be declared.

throw new MyException(new StringBuffer()
  .append("Failed to open file ")
  .append(fileName)
  .append(": ")
  .append(e.getMessage())
  .toString(), e);

Constrained dot-chaining is useful when the order of operations should be restricted. For example, in the DeclarativeXMLParser, each element will be initialized before the attributes are matched. The attributes will be matched before the Begin delegate is called. Begin will precede the sub-elements, and End will follow all. So you want the code to look like this:

new Element()
  .Init(delegate() {...})
  .RequiredAttribute("id", idAttribute)
  .IgnoreOtherAttributes()
  .Begin(delegate() {...})
  .One("subElement", new Element()
    .RequiredAttribute("name", nameAttribute)
  )
  .End(delegate() {...});

Unconstrained dot-chaining would allow these operations to be declared in any order. While that wouldn't confuse the parser engine, it would most certainly confuse anyone reading the code. Constraints are required to force this code to be more readable.

Here's my solution
The Element class is only the leaf of a long line of inherited classes. It directly supplies the Init method, which returns Element. Its base class is ElementWithAttributes, which supplies all of the attribute-related methods. ElementWithAttributes in turn inherits from ElementWithFilter, which supplies the Filter method, and so on up the hierarchy until you reach ElementWithEnd.

The idea is simple. Once you go up the inheritance hierarchy, you can't go back down. This forces you to call the methods in the prescribed order. Inheritance allows you to skip levels if you want to, but you can't backtrack.

Each level of the hierarchy returns its own type for methods that can be called more than once, and its base class type for methods that can be called only once. So, for example, once you call Begin, you are now at the ElementWithContent level. You cannot call Begin again because you are too high up in the tree.
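The return-type rule can be sketched in Java with a much shorter hierarchy. This is not the DeclarativeXMLParser source: the Node classes and their methods are hypothetical stand-ins, and the sketch only records the calls it receives so that the ordering is visible.

```java
import java.util.ArrayList;
import java.util.List;

// Top of the hypothetical hierarchy: once a chain reaches this type,
// nothing further can be declared on it.
class NodeWithEnd {
    protected final List<String> calls = new ArrayList<>();

    List<String> calls() {
        return calls;
    }
}

// Content level. child() may be repeated, so it returns its own type.
// end() may be called only once, so it returns the base type.
class NodeWithContent extends NodeWithEnd {
    NodeWithContent child(String name) {
        calls.add("child " + name);
        return this;
    }

    NodeWithEnd end() {
        calls.add("end");
        return this;
    }
}

// Attribute level. attribute() may be repeated; begin() moves the chain
// up to the content level, where attribute() is no longer visible.
class NodeWithAttributes extends NodeWithContent {
    NodeWithAttributes attribute(String name) {
        calls.add("attribute " + name);
        return this;
    }

    NodeWithContent begin() {
        calls.add("begin");
        return this;
    }
}

// Leaf class that callers construct; it inherits every level above it.
class Node extends NodeWithAttributes {
}

class ConstrainedChainDemo {
    public static void main(String[] args) {
        NodeWithEnd node = new Node()
            .attribute("id")
            .attribute("name")
            .begin()
            .child("subElement")
            .end();
        System.out.println(node.calls());

        // The following would not compile, because begin() returns
        // NodeWithContent, which has no attribute() method:
        // new Node().begin().attribute("id");
    }
}
```

The object is the same instance throughout the chain. Only its static type changes, so the constraint is enforced by the compiler and costs nothing at run time.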

Download the code for more details. I hope you find this technique as useful as I have.

Protocol Monopolies are Dangerous

Monday, April 23rd, 2007

The recent blackout of BlackBerry service has taught us a few lessons. Paul McNamara points out the importance of transparency. Merlin Mann warns us about technology addiction. But there is another lesson that I'd like us to learn from this experience: protocol monopolies are dangerous.

Almost 3 years ago, NTP sued RIM for infringing its push-email patent. The case dragged on, and threatened to terminate the service that many consumers had come to rely upon. RIM eventually settled with NTP for $612 million, ensuring that the service would continue.

Until this recent interruption, that is.

If not for the government-sponsored monopoly that we call US patent law, competing services would have been available. RIM's customers would still have suffered, but its competitors would still have been operational. BlackBerry users would have been able to hold their service provider accountable, since they could switch to a competing platform. NTP once held its monopoly power over RIM's head, and now RIM holds it over us.

The Internet, with all of its flaws, is built on open standards. The protocols are well documented and not encumbered by patents. Any service provider is able to implement these protocols, and thereby (theoretically) interoperate with other service providers. For the most part, this works. And when it doesn't, we have the opportunity to hold our service providers accountable until they make it work. The Internet has no single point of failure (except perhaps DNS, but that's a discussion for another day).

When a service provider is given a legal monopoly over a protocol, the entire Internet suffers. Inventions and business processes can be patented, but protocols should be excluded. Until the law is changed, I encourage you to trade in your BlackBerry for a Windows Mobile device, or some other device that can check email via the open protocols.

AiS 23: XML Inside Out

Sunday, April 22nd, 2007

Listen Now

XML is not a religion. XML is a tool. A religion tells you what is right and what is wrong. A tool is only concerned with what is useful.

At the 2007 Dallas Code Camp, I presented 4 different XML parsing strategies currently available in .NET. XmlDocument is easy to use and supports input as well as output, but it loads the entire document into memory. XPathDocument is just as easy and slightly faster, but it is read-only. XmlSerializer is the easiest of all since it works with the .NET type system, but it tightly couples your data types with the schema and doesn't support all schemas. And XmlReader is the fastest and most powerful, but incredibly difficult to use.

I added a fifth strategy called DeclarativeXmlParser. This library wraps XmlReader with a dot-chained schema declaration that makes XmlReader dirt simple to use. This library and the demonstration code used in the presentation are available at http://dirtsimplexml.com.

This presentation was inspired by work that I do at Handmark, and data provided by OAG. Additional information was found at Softartisans.

Dirt Simple XML Parser in C#

Friday, April 20th, 2007

On March 30, I posted a Dirt Simple XML Parser in Java. The C# version of this parser is now available at DirtSimpleXML.com. I created this page in preparation for tomorrow's code camp, where I'll be comparing 5 different XML parsing strategies currently available in .NET 2.0.

I don't have any documentation for it yet, but that is in the pipeline. Please check it out and post your comments here.

I Fixed Security Now!

Wednesday, April 18th, 2007

I have edited the offending comment attached to Security Now! Episode 86. Even though the last half of the comment was hidden from view inside of a <script> tag, I discovered that I could extract the "edit" link from the page source.

For the full story, go back a couple of posts or listen to the Adventures in Software podcast, episode 22.

Again, my apologies to Steve and Leo for accidentally hacking their site. Fortunately, no permanent damage was done.

AiS 22: Website Attacks

Tuesday, April 17th, 2007

Listen Now

Cross-site scripting and SQL injection attacks both take advantage of vulnerabilities in websites. In the case of a cross-site scripting vulnerability, data can be rendered to the browser, causing it to run unintended code. In the case of a SQL injection attack, data is executed by the database as SQL code. In both cases, input that a malicious user provides is interpreted in a way that the designer did not intend.

Both of these vulnerabilities are software defects. In both cases, data in a natural language is rendered as if it were a programming language. Web developers and designers need to take extra precautions to prevent this from happening. But what are the right precautions?

One solution is to validate all inputs to ensure that no script or SQL code is included. Unfortunately, this approach is nearly impossible to implement correctly. Any inconsistencies between the way the validator parses the input and the way the browser or database interprets it can leave a hole for a hacker to slip through. Furthermore, it may be perfectly valid for the data to contain HTML or SQL code. What if you are running a blog about software? Shouldn't you be able to post code to the blog?

A better solution is to escape the data. If you are converting a natural language string into SQL, you have to be sure to escape the quotes. In HTML, you must escape the angle brackets. Several tools are already at your disposal to perform the escaping function. Please take full advantage of them.
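As an illustration of the HTML half of this advice, here is a minimal hand-rolled sketch of output escaping in Java. The class and method names are hypothetical, and in practice an existing, well-tested library routine is a safer choice than writing your own. The sketch only shows what such a routine does to the characters that HTML treats as markup.

```java
// Hypothetical helper showing what HTML output escaping does.
final class HtmlEscapeSketch {

    // Replaces the characters that HTML treats as markup with character
    // references, so the browser displays them as text instead of parsing them.
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A user who claims to be named "<script>" is greeted as text, not as code.
        System.out.println("Hello, " + escapeHtml("<script>") + "!");
        // prints: Hello, &lt;script&gt;!
    }
}
```

The escaping happens at the point of output, so the stored data stays exactly as the user typed it and can still be escaped differently for other destinations.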

http://www.bitlbee.org
http://toc2rta.com
http://jakarta.apache.org/commons/lang
http://jakarta.apache.org/commons/codec
http://www.unixwiz.net/techtips/sql-injection.html

Oops! I broke Security Now!

Friday, April 13th, 2007

I listen to Leo Laporte and Steve Gibson talk about Internet security on the weekly podcast Security Now! I just posted a comment on the show about cross-site scripting attacks. I didn't realize that I was attacking the site myself!

My comment was this:

Validating user input, while a good idea, is not the fix for CSS attacks. As Steve pointed out, it is nearly impossible to accurately detect script in the input.

The fact is, the CSS vulnerability is a defect in the output of a system, not the input. If I tell a web site that my name is "<script>", it should reply "Hello, <script>!". The way you say this in HTML is "Hello, &lt;script&gt;!".

If web developers simply escaped their output, the problem would be solved.

When the site served this comment back, it failed to escape the word "<script>". As a result, the remainder of this comment and all comments that followed were hidden.

I posted the comment a second time, this time escaping it myself on the way in. Hopefully someone can go back into the database and delete the first one.

Sorry Steve!

AiS 21: Getting Started

Tuesday, April 10th, 2007

Listen Now

Each member of the panel got started in software because of something inside them. Some of us had a love of mathematics, some of us understood data, and some felt compelled in mysterious ways. Whatever drew us to software, we knew we had to follow.

Having found our calling, we've each taught ourselves the skills that we need day-to-day. We recognized that neither school nor employer is responsible for our training. We cracked books, installed compilers, and searched the web for the information we needed.

This path is a common one. It is still the way that most people find their way into computer programming. Formal training in software is not necessary. Some of the best programmers have been trained in accounting, electrical engineering, music, or mathematics. Many programmers get into software via other jobs, such as data entry, media, or support.

However you get here, it's not about the money. It's about the love.

Shameless plugs:
http://updatecontrols.net
http://d20universe.com
http://orbit1.com
http://beforeyouaregone.com
http://mallardsoft.com
http://mobilescannersofamerica.com
http://yahsaves.org
http://webcudgel.com

Cover Flow in JavaScript

Monday, April 9th, 2007


My wife is a scrapbook consultant. She was looking for a way to showcase her artwork on her web page. Instead of the typical list of thumbnails, I thought it would be more interesting to present her scrapbook pages in an iTunes-style Cover Flow. So after a weekend of messing with JavaScript, this is the result:

http://myscraproom.net/pageflow.html

It doesn't have the oblique angles, but I think I know how to achieve that effect. There is also a Safari compatibility issue, and a performance improvement I can make. More to come.

Update
Finn Rudolph has added many enhancements to the original script. I highly recommend using his Image Flow.

Speaking at the Dallas Code Camp (4/21)

Tuesday, April 3rd, 2007

If you are in the Dallas/Ft. Worth area, please come by the Microsoft campus in Las Colinas on Saturday, April 21. I will be presenting "XML Inside Out", a discussion of the various ways of parsing XML in .NET. When would you choose DOM, SAX2, or XLinq? Is there another option? (Hint: see the March 30th posting "Dirt Simple XML Parser" for a clue.)

RSVP at the Dallas Code Camp web site. And post a comment if you plan to attend.

Hope to see you there!