CompSci497
Advanced XML Programming with XmlPL
Lecture Notes
Friday, January 26th, 2007
Contents
- An Introduction to SAX, DOM, XSLT, & XQuery.
  - SAX
  - DOM
  - XSLT
  - XQuery
- A justification for XmlPL
  - SAX
  - DOM
  - XSLT
  - XQuery
  - XmlPL
    - Comparisons
- XmlPL Architecture
  - Compiler
    - xmlplcc: command line
  - Runtime
  - Standard Lib
1) An Introduction to SAX, DOM, XSLT, & XQuery.
1.1) SAX
- Simple API for XML
- Event driven
- Defines a set of callback functions
- startElement, endElement, startDocument, endDocument,
characters, processingInstruction, etc.
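To make the event-driven model concrete, here is a minimal sketch using Python's standard xml.sax module (chosen only for illustration; it is not part of XmlPL). The EventLogger class and the sample document are assumptions made up for this example. Note that SAX delivers element text through the characters callback.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Hypothetical handler that records each SAX callback as it fires."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def endDocument(self):
        self.events.append("endDocument")

    def startElement(self, name, attrs):
        self.events.append("startElement " + name)

    def endElement(self, name):
        self.events.append("endElement " + name)

    def characters(self, content):
        # The parser may deliver text in several chunks; skip whitespace-only ones.
        if content.strip():
            self.events.append("characters " + content.strip())

    def processingInstruction(self, target, data):
        self.events.append("processingInstruction " + target)


handler = EventLogger()
xml.sax.parseString(b"<note><?render fast?><to>Ann</to></note>", handler)
for event in handler.events:
    print(event)
```

The parser drives the program: the handler never asks for the next item, it only reacts to whatever the parser reports.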
1.2) DOM
- Document Object Model
- Language independent
- Defines XML as a tree of objects
- Defines valid operations (methods) on those objects
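A short sketch of the tree-of-objects idea, using Python's standard xml.dom.minidom (one DOM binding among many, picked only for illustration). The sample document and variable names are assumptions.

```python
from xml.dom import minidom

# Parse the whole document into an in-memory tree of node objects.
doc = minidom.parseString(
    "<order><item qty='2'>pen</item><item qty='1'>ink</item></order>"
)
root = doc.documentElement

# Navigate the tree with DOM methods.
items = root.getElementsByTagName("item")
names = [item.firstChild.data for item in items]
print(names)

# Modify the tree with DOM methods: build a new element and attach it.
new_item = doc.createElement("item")
new_item.setAttribute("qty", "5")
new_item.appendChild(doc.createTextNode("paper"))
root.appendChild(new_item)
print(len(root.getElementsByTagName("item")))
```

Every element, attribute, and text run is a separate object, which is what makes random access and in-place updates possible.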
1.3) XSLT
- Extensible Stylesheet Language Transformations
- Template processor
- Designed to transform XML documents
- Successor to: Document Style Semantics and Specification
Language (DSSSL)
- http://en.wikipedia.org/wiki/XSLT for examples
1.4) XQuery
- XML Query Language
- Semantically similar to SQL
- FLWOR: FOR, LET, WHERE, ORDER BY, RETURN
- http://en.wikipedia.org/wiki/XQuery for examples
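As a rough analogy only (this is Python, not XQuery), the five FLWOR clauses map onto an ordinary loop over a parsed document. The catalog data and the expensive_titles function are assumptions invented for this sketch.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <book><title>Beta</title><price>30</price></book>
  <book><title>Alpha</title><price>45</price></book>
  <book><title>Gamma</title><price>12</price></book>
</catalog>"""

root = ET.fromstring(CATALOG)


def expensive_titles(root, minimum):
    """Hypothetical query: titles of books priced above `minimum`, sorted by title."""
    rows = []
    for book in root.iter("book"):               # FOR: iterate over the sequence
        price = float(book.findtext("price"))    # LET: bind a variable
        if price > minimum:                      # WHERE: filter
            rows.append((book.findtext("title"), price))
    rows.sort()                                  # ORDER BY: sort by title
    return [title for title, _ in rows]          # RETURN: build the result


print(expensive_titles(root, 20))
```

In XQuery the same steps are written as one declarative expression, which is where the resemblance to an SQL SELECT comes from.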
2) A justification for XmlPL
2.1) SAX
- Fast and efficient for small tasks.
- Does not allow linear code execution.
- Difficult to relate different parts of an XML document
- Must maintain stacks of context information
- No output processing
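The context-stack point can be sketched as follows, again with Python's xml.sax for illustration. The PathTracker class and sample input are assumptions. Because each callback sees only one event, the handler must keep its own stack to know where in the document a piece of text belongs.

```python
import xml.sax


class PathTracker(xml.sax.ContentHandler):
    """Hypothetical handler that maintains an explicit stack of open elements."""

    def __init__(self):
        super().__init__()
        self.stack = []          # names of the elements currently open
        self.text_by_path = {}   # element path -> accumulated text

    def startElement(self, name, attrs):
        self.stack.append(name)

    def endElement(self, name):
        self.stack.pop()

    def characters(self, content):
        # Without the stack, this callback could not tell the two <b> elements apart.
        if content.strip():
            path = "/" + "/".join(self.stack)
            self.text_by_path[path] = self.text_by_path.get(path, "") + content


tracker = PathTracker()
xml.sax.parseString(b"<a><b>x</b><c><b>y</b></c></a>", tracker)
print(tracker.text_by_path)
```

The program's logic ends up spread across callbacks and shared state, which is the "no linear code execution" complaint in practice.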
2.2) DOM
- Resource hog.
- XML DOM is often 6 or more times larger in memory
than on disk
- APIs generally don't allow for partial loading
- DOM APIs are clumsy
- No output processing
2.3) XSLT
- Verbose: XML is not good at describing computation
- Slow
- Designed mainly for document transformation
- Unable to update XML trees
2.4) XQuery
- Designed for querying XML documents as databases
- Messy SQL-like syntax
- Unable to update XML trees
- Not widely supported
2.5) XmlPL
- Efficient single pass processing
- Linear code execution
- Allows user to control DOM memory usage
- Internal DOM representation uses less memory
- Output processing guarantees well-formed XML output.
- Clean C-like syntax
- Allows in-memory updating as well as on-the-fly processing
- Designed for general-purpose XML processing
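To illustrate what an output layer that enforces well-formedness might look like, here is a minimal Python sketch. It is not XmlPL's actual mechanism, which these notes do not describe. The XmlWriter class is hypothetical, and it checks only tag balance and character escaping, not element names or the single-root rule.

```python
from xml.sax.saxutils import escape, quoteattr


class XmlWriter:
    """Hypothetical writer that refuses to produce unbalanced tags."""

    def __init__(self):
        self.parts = []
        self.open_tags = []   # stack of elements opened but not yet closed

    def start(self, name, **attrs):
        attr_text = "".join(
            " %s=%s" % (key, quoteattr(value)) for key, value in attrs.items()
        )
        self.parts.append("<%s%s>" % (name, attr_text))
        self.open_tags.append(name)

    def text(self, data):
        # Escape &, < and > so text can never be mistaken for markup.
        self.parts.append(escape(data))

    def end(self, name):
        if not self.open_tags or self.open_tags[-1] != name:
            raise ValueError("mismatched end tag: " + name)
        self.open_tags.pop()
        self.parts.append("</%s>" % name)

    def result(self):
        if self.open_tags:
            raise ValueError("unclosed elements: " + ", ".join(self.open_tags))
        return "".join(self.parts)


writer = XmlWriter()
writer.start("msg", lang="en")
writer.text("1 < 2 & 3")
writer.end("msg")
print(writer.result())
```

Plain string concatenation offers no such checks, which is why SAX and DOM code that prints XML by hand can silently emit broken documents.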
2.5.1) Comparisons
3) XmlPL Architecture
3.1) Compiler
- Written in C++
- Antlr parser generator
- AST: Abstract Syntax Tree
- Environment
- Library importer
- C++ code generator
- Should either report an error and fail or produce compilable
C++ code
- Known bugs exist (and likely unknown ones as well)
3.1.1) xmlplcc: command line
- Parses code, generates C++, calls g++ to build
- Able to build standalone or library
- Can dump a stack trace on errors (-t option)
- Able to dump an XML parse tree
3.2) Runtime
- Also written in C++
- Boehm-Demers-Weiser conservative garbage collector
- libxml2 parser and output processing
- Custom low-memory DOM
- Pull parser on top of SAX
- Efficient Sequence implementation
- libcurl network access
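The idea of a pull parser layered on SAX, combined with a DOM that is built only where needed, can be sketched with Python's standard xml.dom.pulldom. This is an analogy to the design listed above, not the XmlPL runtime itself. The feed document and variable names are assumptions.

```python
from xml.dom import pulldom

FEED = (
    "<feed>"
    "<entry id='1'><title>One</title></entry>"
    "<entry id='2'><title>Two</title></entry>"
    "</feed>"
)

titles = []
events = pulldom.parseString(FEED)
for event, node in events:
    # The program pulls events in a normal loop instead of being called back.
    if event == pulldom.START_ELEMENT and node.tagName == "entry":
        # Build a DOM subtree for this one entry only; the rest of the
        # document is never held in memory as a tree.
        events.expandNode(node)
        title = node.getElementsByTagName("title")[0]
        titles.append((node.getAttribute("id"), title.firstChild.data))

print(titles)
```

Control stays in the caller's loop, so the code reads linearly, and memory use is bounded by the largest subtree the program chooses to expand.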
3.3) Standard Lib
- Currently fairly limited
- curl, gen, math, process, stdio, stdlib, string, unistd, xml
- Easy to expand