XMLSCAN: a simple XML scanning library
Sometimes, you just want to scan a tagged document to extract
the structured data that's in there. You don't need schema validation,
and you don't need to worry about XML entities (or, at least, you can
translate them yourself, after scanning). In a case like that, you
want a simple, fast, low-impact XML scanning and parsing library that
builds the simplest possible DOM representation. XMLSCAN implements
such a bare-bones scanner, using the text you pass it for storage, so
it doesn't have to call malloc() to copy strings around.
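
To illustrate the idea of using the caller's text as storage, here is a minimal sketch. It is not XMLSCAN's actual code or API: the names Token, scan_in_place and decode_entities_in_place are made up for this example, and attributes, comments, CDATA and self-closing tags are ignored. The scanner terminates tag names and text runs by overwriting delimiters in the buffer with NUL, so no string is ever copied. The second function shows one way to translate the five predefined entities yourself after scanning, also in place.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical token type for illustration only; not XMLSCAN's API.
struct Token {
    enum Kind { Open, Close, Text } kind;
    char *str;  // points into the caller's buffer, NUL-terminated in place
};

// Forward-scan a mutable, NUL-terminated buffer. Names and text runs are
// terminated by writing '\0' over '<', '>' or ' ' in the buffer itself.
// Simplifications: attributes are skipped, "<a/>" is reported as Open only.
inline std::vector<Token> scan_in_place(char *buf) {
    std::vector<Token> out;
    char *p = buf;
    while (*p) {
        if (*p == '<') {
            *p++ = '\0';  // also terminates any preceding text run
            Token::Kind kind = Token::Open;
            if (*p == '/') { kind = Token::Close; ++p; }
            char *name = p;
            while (*p && *p != '>' && *p != ' ' && *p != '/') ++p;
            char *name_end = p;
            while (*p && *p != '>') ++p;  // skip attributes, if any
            if (*p) ++p;                  // step past '>'
            *name_end = '\0';
            out.push_back({kind, name});
        } else {
            out.push_back({Token::Text, p});
            while (*p && *p != '<') ++p;  // the next '<' gets overwritten above
        }
    }
    return out;
}

// Translate the five predefined XML entities in place. The decoded text is
// never longer than the input, so the same buffer can be reused.
inline void decode_entities_in_place(char *s) {
    static const struct { const char *name; char ch; } table[] = {
        {"&lt;", '<'}, {"&gt;", '>'}, {"&amp;", '&'},
        {"&quot;", '"'}, {"&apos;", '\''}};
    char *w = s;
    for (char *r = s; *r;) {
        bool matched = false;
        if (*r == '&') {
            for (const auto &e : table) {
                std::size_t n = std::strlen(e.name);
                if (std::strncmp(r, e.name, n) == 0) {
                    *w++ = e.ch;
                    r += n;
                    matched = true;
                    break;
                }
            }
        }
        if (!matched) *w++ = *r++;
    }
    *w = '\0';
}
```

The only allocation in this sketch is the token vector; every string the caller sees is a pointer into the original buffer, which therefore has to be writable and has to outlive the tokens.
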
XMLSCAN is the fastest XML scanning library I know of, in some cases
by several orders of magnitude (!).
Please download it here (8 kB) if you
think you might have a use for it. It is open source under the MIT license. It will need
very minor adjustments to build in your environment: you need to provide
logging and a Vector3 class (or just cut out those parts, as
done in the performance test archive below). Be happy!
To test the performance, there's a simple scanning and parsing app
and a 40 MB XML test file that you can get here
(460 kB).
You'll still have to build it (with optimization) to test it. Run it with
the file to parse as its argument, or with "-dom filetoparse" to build a DOM
out of the file. On my laptop (Core 2 Duo P9600), scanning the
40 MB file (forward scanner) takes 4 microseconds (!) and building
a full DOM-style tree takes about 520 milliseconds.
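
If you want to take a similar measurement in your own code, a minimal timing sketch could look like the following. It is not the test app from the archive: time_microseconds and count_tags are hypothetical names for this example, and count_tags is only a stand-in for a real forward scan. It uses std::chrono::steady_clock, and it should be built with optimization for the numbers to mean anything.

```cpp
#include <chrono>
#include <cstddef>

// Run `work` once and return the elapsed wall-clock time in microseconds.
template <typename F>
long long time_microseconds(F &&work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start)
        .count();
}

// Hypothetical stand-in for a forward scan: counts '<' characters.
inline std::size_t count_tags(const char *buf, std::size_t len) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (buf[i] == '<') ++n;
    }
    return n;
}
```

Store the scan's result somewhere the compiler cannot discard it (print it, for example); otherwise an optimizing build may remove the work being timed.
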