XPath is the expression sublanguage of both XSLT and XPointer.
PyXPath is laid on top of PyDOM, the DOM implementation of Python's XML special interest group.
For parser construction, PyXPath uses
and Scott Hassan's
pyxpath.tgz is necessary to use the software.
The archive contains two python packages
Unpack it in Python's
dmutil contains code
developed by me,
PyBison is a small (and slightly patched) part of
pyxpath_s.tgz contains additional sources.
You need this archive only if you want to change (or view)
the bison grammar for XPath.
To change the grammar, you will need
a C development system and the
PyBison must be patched with the patch file
provided in the archive.
The patch allows parser object specific customization.
dmutil.xsl.xpath. It contains the parser factory
makeParserand two evalation context classes
an XPath parser. Such a parser parses
XPath expression strings and constructs corresponding
XPath objects. An XPath object can be evaluated with
a node, a nodelist and an
to obtain a value.
makeParser accepts two optional parameters,
ParseContext instance context,
BaseFactory instance factory.
context defines the namespaces and function
library available for XPath parsing. factory
is the factory object used to contruct XPath objects.
A XPath parser can be applied to a XPath expression string.
It accepts as optional argument a
instance context. The expression is
parsed with the namespaces defined in context
and the context parameter given during parser
construction and with the functions defined by either
the context argument, the context argument
given for parser construction or
the factory. The context parameters
None, the factory to
With this default setting, the constructed XPath object does not
recognize namespaces and can use the functions defined in
the XPath core library; it can be evaluated with
three parameters, a PyDom node, a PyDom nodelist containing
the node and an
Env instance specifying the
available variables and their values.
from dmutil.xsl.xpath import makeParser, Env domtree=.... # create a PyDom document P= makeParser(); E= Env() # make a parser and a variable environment E.setVariable('x','Hallo') # binds x to 'Hallo' links= P('//A[@HREF]').eval(domtree,[domtree],E) # selects all links in a HTML document anchors= P('//A[@NAME]').eval(domtree,[domtree],E) # selects all anchors in a HTML document
You find more XPath examples in the
test case file.
XPath knows 5 data types (extendible):
Return Tree Fragment.
PyXPath maps these to the Python data types
You must use one of these types for values of variables.
Python can be downloaded from the Python homepage, the XML package from the XML-SIG repository.
Open Sourcelicense at your own risk. Please see the copyright notice at the beginning of
dmutil/xsl/xpath.py, for details.
namespaceaxis is not yet recognized.
idattributes. In order to support
idreferences, PyXPath requires each
node._documentto have an attribute
_idMap. This attribute must be a dictionary mapping elements to their
idattribute. If such an attribute does not exist,
idreferences are not found.
DomFactorymodule contains the class
IdDecl. Its contructor has a dictionary mapping element names to
idattributes as idMap parameter. If an
IdDeclinstance is applied to a document, it installs its idMap as
_idMapattribute of the document.
Notationnodes, which can occur in DOM trees. This incompatibility is not yet handled correctly.
EntityReferenceinto normal text nodes, at least if not explicitely told otherwise. However, in this process, the XSL requirement is violated that no two text nodes are adjacent. The function
normalizecan be applied to a document to merge adjacent text nodes.