XML with SAX

The klyn.io.xml.sax package provides an event-driven XML parser. SAX is the right model when you want to process large XML documents without materializing a whole tree.

Imports
import klyn.collections
import klyn.io
import klyn.io.xml.sax
import org.xml.sax

Define a Handler

SAX calls handler methods as it reads the document. Keep handler state explicit: booleans for the current context, counters for statistics, and builders for text, because the parser may deliver one element's text across several calls to characters.

import klyn.collections
import klyn.io
import klyn.io.xml.sax
import org.xml.sax

class TitleHandler extends DefaultHandler:

    public titles as ArrayList<String>
    private _insideTitle as Boolean = false
    private _text as StringBuilder

    public TitleHandler():
        this.titles = ArrayList<String>()
        this._text = StringBuilder()

    public override startElement(uri as String, localName as String, qName as String, atts as Attributes) as Void throws SAXException:
        if qName == "title":
            this._insideTitle = true
            this._text = StringBuilder()

    public override characters(ch as Array<Char>, start as Int, length as Int) as Void throws SAXException:
        if not this._insideTitle:
            return
        i as Int = 0
        while i < length:
            this._text.append(ch[start + i])
            i += 1

    public override endElement(uri as String, localName as String, qName as String) as Void throws SAXException:
        if qName == "title":
            this.titles.add(this._text.toString().trim())
            this._insideTitle = false
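
The builder matters because a SAX parser is free to split one element's text across several characters calls. As a cross-check outside Klyn, the sketch below reproduces the same handler pattern with Python's standard xml.sax module. This is a stand-in chosen for convenience, not the klyn.io.xml.sax API, and the class name TitleCollector and the chunk boundaries are illustrative. It feeds the document in two pieces that cut the title text in half and still recovers the whole title.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Collects the text of every <title> element (Python analogue of TitleHandler)."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._inside_title = False
        self._chunks = []

    def startElement(self, name, attrs):
        if name == "title":
            self._inside_title = True
            self._chunks = []  # start a fresh buffer for this element

    def characters(self, content):
        # May be called more than once per text node, so append rather than assign.
        if self._inside_title:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._chunks).strip())
            self._inside_title = False


collector = TitleCollector()
parser = xml.sax.make_parser()
parser.setContentHandler(collector)
# Feed the document in two pieces; the split falls inside the title text.
parser.feed("<books><book><title>Kl")
parser.feed("yn</title></book></books>")
parser.close()
print(collector.titles)
```

Assigning the text in characters instead of appending it could lose part of the title whenever the parser splits the text.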

Parse a File

Create a SAXParser, then call parse with a file path and handler. Parser errors raise SAXException or a more specific subclass such as SAXParseException.

handler = TitleHandler()
parser = SAXParser()
parser.parse("books.xml", handler)

for title in handler.titles:
    print(title)
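
Error handling follows the usual SAX shape: catch the specific exception when you want position information, or the base class for everything else. The sketch below uses Python's standard xml.sax module as a stand-in, so the accessor names getLineNumber, getColumnNumber, and getMessage belong to Python and are not confirmed for Klyn. The helper try_parse is hypothetical.

```python
import xml.sax


def try_parse(xml_text):
    """Parse xml_text; return None on success or the SAXParseException on failure."""
    try:
        xml.sax.parseString(xml_text.encode("utf-8"), xml.sax.ContentHandler())
        return None
    except xml.sax.SAXParseException as err:
        # SAXParseException is a subclass of SAXException that carries a position.
        return err


error = try_parse("<books><book></books>")  # <book> is still open when </books> arrives
if error is not None:
    print(f"line {error.getLineNumber()}, column {error.getColumnNumber()}: {error.getMessage()}")
```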

Parse a String

For generated XML or tests, wrap the text in a StringReader and pass that to an InputSource.

xml = """<books><book><title>Klyn</title></book></books>"""
handler = TitleHandler()
SAXParser().parse(InputSource(StringReader(xml)), handler)
print(handler.titles[0])

When to Use SAX

  • Use SAX for large XML files, import pipelines, logs, and one-pass transformations.
  • Use DOM when you need random access, tree edits, or repeated queries over the same document.
  • Do not store every event in a SAX handler unless you actually need a tree; that is DOM's job.
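
To illustrate the first and last points, a one-pass job can keep only the running totals it needs. This sketch again uses Python's standard xml.sax module rather than the Klyn API, and the ElementCounter class is illustrative. It tallies element names and never retains the events themselves, so memory use depends on the number of distinct names and not on document size.

```python
import xml.sax
from collections import Counter


class ElementCounter(xml.sax.ContentHandler):
    """Counts elements by name without keeping any part of the document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def startElement(self, name, attrs):
        self.counts[name] += 1  # constant work per event, nothing stored


counter = ElementCounter()
xml.sax.parseString(
    b"<books><book><title>A</title></book><book><title>B</title></book></books>",
    counter,
)
print(dict(counter.counts))
```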