A SAX Parser in .NET
With SAX, the client application receives any data the parser is designed to push and its only method of filtering the data is to discard any unwanted information later. The application has to build fairly sophisticated code to isolate the pieces of information it really needs (the nodes of interest) and, more importantly, to add these pieces of data to a custom data structure which represents the state.
Applications interact with a SAX parser by writing and registering their own handlers. For example, a classic Visual Basic client (note the Set assignment) would do something like this:
Set saxParser.contentHandler = myCntHandler
' *** set other handlers
saxParser.parseURL(file)
The pseudocode here shows the structure of a Visual Basic .NET class mimicking a SAX parser:
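Imports System.Xml

Public Class SaxParser
    ' User-defined object that processes each node as it is found.
    Public ContentHandler As SaxContentHandler

    Public Sub Parse(ByVal file As String)
        Dim reader As XmlTextReader = New XmlTextReader(file)
        Try
            ' Push every node to the registered content handler.
            While reader.Read()
                ContentHandler.Process(reader.Name, _
                    reader.Value, reader.NodeType)
            End While
        Finally
            ' Release the underlying file even if the handler throws.
            reader.Close()
        End Try
    End Sub
End Class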
The property called ContentHandler refers to a user-defined object in charge of processing the nodes that are found. The Parse method parses the content of the XML document using a reader and calls the content handler whenever a new node is found. The content handler class has a fixed interface represented by the following abstract class:
Public MustInherit Class SaxContentHandler
    Public MustOverride Sub Process( _
        ByVal name As String, _
        ByVal value As String, _
        ByVal type As XmlNodeType)
End Class
Once the two classes have been compiled into an assembly, a client SAX application can just reference and instantiate the parser and the content handler class. The SAX application initializes the parser as follows:
Dim saxParser As New SaxParser()
Dim myHandler As New MyContentHandler()
saxParser.ContentHandler = myHandler
saxParser.Parse(file)
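The MyContentHandler class used above is not listed. As a minimal sketch of what such a handler might look like, the following class assumes that the application only cares about element names. The ElementNames field and the filtering rule are illustrative assumptions, and the sketch relies on the SaxContentHandler class shown earlier being available in a referenced assembly.

```vbnet
Imports System.Collections.Generic
Imports System.Xml

' Hypothetical handler: keeps element names and discards every other node.
Public Class MyContentHandler
    Inherits SaxContentHandler

    ' Custom state built up from the pushed nodes (illustrative only).
    Public ReadOnly ElementNames As New List(Of String)()

    Public Overrides Sub Process( _
        ByVal name As String, _
        ByVal value As String, _
        ByVal type As XmlNodeType)

        ' The parser pushes every node; filtering happens here, after the fact.
        If type = XmlNodeType.Element Then
            ElementNames.Add(name)
        End If
    End Sub
End Class
```

After saxParser.Parse(file) returns, the client can read myHandler.ElementNames to see what was collected. This shows how, in the push model, both the filtering and the state management end up in application code.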
Of course, the parser discussed here is fairly simplistic, but the design guidelines are concrete and effective. In the client application, the content handler class and the form are different classes, which makes updating the user interface from the content handler class a bit complicated.
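One possible way to bridge the handler and the form, offered only as a sketch and not necessarily the approach intended here, is to have the content handler raise an event that the form subscribes to. The class name NotifyingContentHandler, the ElementFound event, and the OnElementFound method are all hypothetical names introduced for illustration. The sketch again assumes the SaxContentHandler class shown earlier.

```vbnet
Imports System.Xml

' Hypothetical handler that reports elements through an event
' instead of touching the user interface directly.
Public Class NotifyingContentHandler
    Inherits SaxContentHandler

    ' Raised once for each element node pushed by the parser.
    Public Event ElementFound(ByVal name As String)

    Public Overrides Sub Process( _
        ByVal name As String, _
        ByVal value As String, _
        ByVal type As XmlNodeType)

        If type = XmlNodeType.Element Then
            RaiseEvent ElementFound(name)
        End If
    End Sub
End Class

' Inside the form class (assumed), before calling Parse:
'     Dim handler As New NotifyingContentHandler()
'     AddHandler handler.ElementFound, AddressOf OnElementFound
' where OnElementFound(ByVal name As String) is a form method
' that updates a control.
```

Because Parse runs synchronously, the event is raised on the thread that called Parse. A form that starts parsing from its own code can therefore update its controls directly in the event handler.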