Archive for the ‘XML’ Category

XmlSerializer Correction

Saturday, May 12th, 2007

In episode 23, XML Inside Out, I said that there was no attribute in .NET that would flatten a collection. Raymond and I stumbled upon just such an attribute by accident.

The issue was that, in the schema we were matching, all legs of a flight were direct child elements of the "Flight" element. By default, serializing an array puts all of those legs under an intermediate "Legs" element. However, if you mark the array with the attribute [XmlElement("Leg")] instead of [XmlArray("Legs")], you get a flat collection.

The point is still valid, however, that you will find some schemas that you cannot match with XmlSerializer. It was designed to be really easy, not really flexible. These two goals are often at odds with one another.

AiS 23: XML Inside Out

Sunday, April 22nd, 2007

XML is not a religion. XML is a tool. A religion tells you what is right and what is wrong. A tool is only concerned with what is useful.

At the 2007 Dallas Code Camp, I presented 4 different XML parsing strategies currently available in .NET. XmlDocument is easy to use and supports input as well as output, but it loads the entire document into memory. XPathDocument is just as easy and slightly faster, but it is read-only. XmlSerializer is the easiest of all since it works with the .NET type system, but it tightly couples your data types with the schema and doesn't support all schemas. And XmlReader is the fastest and most powerful, but incredibly difficult to use.

I added a fifth strategy called DeclarativeXmlParser. This library wraps XmlReader with a dot-chained schema declaration that makes XmlReader dirt simple to use. This library and the demonstration code used in the presentation are available at http://dirtsimplexml.com.

This presentation was inspired by work that I do at Handmark, and data provided by OAG. Additional information was found at Softartisans.

Dirt Simple XML Parser in C#

Friday, April 20th, 2007

On March 30, I posted a Dirt Simple XML Parser in Java. The C# version of this parser is now available at DirtSimpleXML.com. I created this page in preparation for tomorrow's code camp, where I'll be comparing 5 different XML parsing strategies currently available in .NET 2.0.

I don't have any documentation for it yet, but that is in the pipeline. Please check it out and post your comments here.

Dirt Simple XML Parser

Friday, March 30th, 2007

I know there are already far too many ways to parse XML. Softartisans has an entire page devoted to choosing which XML parser is best for different situations. But still I find myself facing the same trade-off each time I need to consume an XML document: do I do something simple, or do I do something efficient? DOM is easy, but very costly. SAX is the most efficient, but very difficult to use. Other technologies fall between those two on the spectrum.
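To make the "easy but costly" end of that spectrum concrete, here is a minimal sketch of the DOM approach using only the JDK's built-in parser. The class name DomPeople and the people/person sample document are assumptions chosen for illustration. The whole document is built in memory before any data can be read, which is where the cost comes from.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library mentioned in this post.
class DomPeople {

    // Returns the lastName attribute of every <person> element in the document.
    static List<String> lastNames(String xml) {
        try {
            // The entire document is parsed into a tree before we look at it.
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            NodeList persons = doc.getElementsByTagName("person");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < persons.getLength(); i++) {
                Element person = (Element) persons.item(i);
                result.add(person.getAttribute("lastName"));
            }
            return result;
        } catch (Exception e) {
            // Wrapped so that callers of this sketch need no checked exceptions.
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }
}
```

The code is short and reads top to bottom, but memory use grows with the size of the document rather than with the amount of data you actually want.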

I've finally solved that problem once and for all. I created an XML library that makes SAX dirt simple to use. You express an XML document declaratively, putting in hooks to handle elements that interest you. This declaration turns into a DefaultHandler for SAX to invoke. The source code is available in two parts: mallardsoft.xml depends upon mallardsoft.util.

To build a parser, create a new NestedHandler and initialize it with a new Document. Using in-line dot-chaining, set up sub elements of the Document with new Element objects. Here's a quick example:

final StringBuffer firstName = new StringBuffer();
final StringBuffer lastName = new StringBuffer();

DefaultHandler handler = new NestedHandler(new Document()
  .one("people", new Element()
    .zeroToMany("person", new Element()
      .init(firstName)
      .init(lastName)
      .optionalAttribute("firstName", firstName)
      .requiredAttribute("lastName", lastName)
      .end(new InvokeHandler() {
        public void handle() throws SAXException {
          newPerson(firstName.toString(), lastName.toString());
        }
      })
    )
  )
);

Pass this to a SAX parser and it will extract all of the person elements, grab their first and last names, and call the newPerson() method for each one. You can nest elements as deeply as you need to. And if you need a recursive declaration, use an ElementSpec object. Give it a try with a more complex XML file and see how well it scales up.
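For comparison, here is a rough sketch of the bookkeeping a hand-written DefaultHandler needs to do the same extraction with nothing but the JDK's SAX API. It is not the library's implementation. The class name PlainSaxPeople, the path-tracking scheme, and the choice to collect names into a list instead of calling newPerson() are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class: a hand-rolled equivalent of the declarative example.
class PlainSaxPeople {

    // Returns "firstName lastName" for each /people/person element.
    static List<String> parse(String xml) {
        final List<String> people = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            private String path = "";   // e.g. "/people/person"
            private String firstName;
            private String lastName;

            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attrs) throws SAXException {
                path = path + "/" + qName;
                if (path.equals("/people/person")) {
                    // firstName is optional, lastName is required.
                    firstName = attrs.getValue("firstName");
                    if (firstName == null) {
                        firstName = "";
                    }
                    lastName = attrs.getValue("lastName");
                    if (lastName == null) {
                        throw new SAXException("person is missing required attribute lastName");
                    }
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (path.equals("/people/person")) {
                    people.add((firstName + " " + lastName).trim());
                }
                // Pop the element we are leaving off the path.
                path = path.substring(0, path.lastIndexOf('/'));
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Wrapped so that callers of this sketch need no checked exceptions.
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return people;
    }
}
```

Even for two attributes on one element, the handler has to track where it is in the document and validate by hand. That state tracking is the part a declarative wrapper is meant to take over.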

Update
I've changed the names to better reflect true XML nomenclature. The proper names are not "root" and "node", but "document" and "element".