DTM is an XML document model expressed as a table rather than an object tree. It attempts to provide an interface to a parse tree that has very little object creation. (DTM implementations may also support incremental construction of the model, but that's hidden from the DTM API.)
Nodes in the DTM are identified by integer "handles". A handle must be unique within a process, and carries both node identification and document identification. It must be possible to compare two handles (and thus their nodes) for identity with "==".
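One way to picture such a handle is sketched below: a document identifier and a node identifier are packed into a single int, so two handles can be compared with "==". The class name HandleSketch and the 16/16 bit split are assumptions made for illustration only; the DTM API does not prescribe them.

```java
/** Hypothetical sketch: pack a document ID and a node ID into one int handle. */
final class HandleSketch {
    // Assumed split: high 16 bits identify the document, low 16 bits the node.
    static final int NODE_BITS = 16;
    static final int NODE_MASK = (1 << NODE_BITS) - 1;

    /** Combines the two IDs; both must fit in 16 bits in this sketch. */
    static int makeHandle(int documentId, int nodeId) {
        if (documentId < 0 || documentId > NODE_MASK || nodeId < 0 || nodeId > NODE_MASK) {
            throw new IllegalArgumentException("ID out of range for this sketch");
        }
        return (documentId << NODE_BITS) | nodeId;
    }

    /** Recovers the document part (unsigned shift, so the top bit is safe). */
    static int documentIdOf(int handle) { return handle >>> NODE_BITS; }

    /** Recovers the node part. */
    static int nodeIdOf(int handle) { return handle & NODE_MASK; }

    public static void main(String[] args) {
        int a = makeHandle(2, 7);
        int b = makeHandle(2, 7);
        int c = makeHandle(3, 7);
        System.out.println(a == b); // prints "true": same document, same node
        System.out.println(a == c); // prints "false": same node ID, different document
        System.out.println(documentIdOf(c) + ":" + nodeIdOf(c)); // prints "3:7"
    }
}
```

Because identity is carried entirely by the int value, no wrapper objects are needed to test whether two handles name the same node.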
Namespace URLs, local-names, and expanded-names can all be represented by and tested as integer ID values. An expanded name represents (and may or may not directly contain) a combination of the URL ID and the local-name ID. Note that the namespace URL ID can be 0, which should have the meaning that the namespace is null. For consistency, zero should not be used for a local-name index.
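The ID scheme described here can be sketched as a small interning table. Everything in the block (the class ExpandedNameSketch, its method names, and the choice to hand out IDs starting at 1) is a hypothetical illustration and not the table Xalan actually uses; it only shows how ID 0 can stand for the null namespace while local names never receive 0.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: intern namespace URIs and local names as integer IDs. */
final class ExpandedNameSketch {
    static final int NULL_NAMESPACE = 0; // ID 0 means "the namespace is null"

    private final Map<String, Integer> namespaceIds = new HashMap<>();
    private final Map<String, Integer> localNameIds = new HashMap<>();
    private final Map<Long, Integer> expandedNameIds = new HashMap<>();

    /** Returns the existing ID for key, or assigns the next one; never returns 0. */
    private static <K> int intern(Map<K, Integer> table, K key) {
        Integer id = table.get(key);
        if (id == null) {
            id = table.size() + 1;
            table.put(key, id);
        }
        return id;
    }

    int namespaceId(String uri) {
        return (uri == null) ? NULL_NAMESPACE : intern(namespaceIds, uri);
    }

    int localNameId(String localName) {
        return intern(localNameIds, localName);
    }

    /** Combines the two component IDs and interns the pair as one expanded-name ID. */
    int expandedNameId(String namespaceUri, String localName) {
        long pair = ((long) namespaceId(namespaceUri) << 32) | localNameId(localName);
        return intern(expandedNameIds, Long.valueOf(pair));
    }
}
```

With such a table, testing whether two nodes share an expanded name reduces to one integer comparison instead of two string comparisons.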
Text content of a node is represented by an index and length, permitting efficient storage such as a shared FastStringBuffer.
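A minimal sketch of this storage idea follows. It uses a plain StringBuilder as a stand-in for a shared FastStringBuffer; the class SharedTextSketch and its methods are hypothetical names introduced for the example.

```java
import java.util.Arrays;

/** Hypothetical sketch: node text stored as (offset, length) into one shared buffer. */
final class SharedTextSketch {
    private final StringBuilder buffer = new StringBuilder(); // stand-in for a FastStringBuffer
    private int[] offsets = new int[8];
    private int[] lengths = new int[8];
    private int count = 0;

    /** Appends text to the shared buffer and returns the slot that records where it lives. */
    int addText(String text) {
        if (count == offsets.length) {
            // Grow both parallel arrays together.
            offsets = Arrays.copyOf(offsets, count * 2);
            lengths = Arrays.copyOf(lengths, count * 2);
        }
        offsets[count] = buffer.length();
        lengths[count] = text.length();
        buffer.append(text);
        return count++;
    }

    /** Materializes a String only when a caller actually asks for one. */
    String textOf(int slot) {
        int start = offsets[slot];
        return buffer.substring(start, start + lengths[slot]);
    }

    public static void main(String[] args) {
        SharedTextSketch texts = new SharedTextSketch();
        texts.addText("Hello");
        int second = texts.addText(", world");
        System.out.println(texts.textOf(second)); // prints ", world"
    }
}
```

Each text node then costs two ints rather than one String object, which is the kind of saving the table-based model is after.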
The model of the tree, as well as the general navigation model, is that of XPath 1.0, for the moment. The model will eventually be adapted to match the XPath 2.0 data model, XML Schema, and InfoSet.
DTM does _not_ directly support the W3C's Document Object Model. However, it attempts to come close enough that an implementation of DTM can be created that wraps a DOM and vice versa.
Please Note: The DTM API is still Subject To Change. This wouldn't affect most users, but might require updating some extensions.
The largest change being contemplated is a reconsideration of the Node Handle representation. We are still not entirely sure that an integer packed with two numeric subfields is really the best solution. It has been suggested that we move up to a Long, to permit more nodes per document without having to reduce the number of slots in the DTMManager. There's even been a proposal that we replace these integers with "cursor" objects containing the internal node id and a pointer to the actual DTM object; this might reduce the need to continuously consult the DTMManager to retrieve the latter, and might provide a useful "hook" back into normal Java heap management. But changing this datatype would have huge impact on Xalan's internals -- especially given Java's lack of C-style typedefs -- so we won't cut over unless we're convinced the new solution really would be an improvement!
Field Summary | |
static short | ATTRIBUTE_NODE | The node is an Attr.
static short | CDATA_SECTION_NODE | The node is a CDATASection.
static short | COMMENT_NODE | The node is a Comment.
static short | DOCUMENT_FRAGMENT_NODE | The node is a DocumentFragment.
static short | DOCUMENT_NODE | The node is a Document.
static short | DOCUMENT_TYPE_NODE | The node is a DocumentType.
static short | ELEMENT_NODE | The node is an Element.
static short | ENTITY_NODE | The node is an Entity.
static short | ENTITY_REFERENCE_NODE | The node is an EntityReference.
static short | NAMESPACE_NODE | The node is a namespace node.
static short | NOTATION_NODE | The node is a Notation.
static short | NTYPES | The number of valid nodetypes.
static int | NULL | Null node handles are represented by this value.
static short | PROCESSING_INSTRUCTION_NODE | The node is a ProcessingInstruction.
static short | ROOT_NODE | The node is a Root.
static short | TEXT_NODE | The node is a Text node.
Method Summary | |
void | appendChild(int newChild, boolean clone, boolean cloneDepth) | Append a child to "the end of the document".
void | appendTextChild(java.lang.String str) | Append a text node child that will be constructed from a string, to the end of the document.
void | dispatchCharactersEvents(int nodeHandle, ContentHandler ch, boolean normalize) | Directly call the characters method on the passed ContentHandler for the string-value of the given node (see http://www.w3.org/TR/xpath#data-model for the definition of a node's string-value).
void | dispatchToEvents(int nodeHandle, ContentHandler ch) | Directly create SAX parser events representing the XML content of a DTM subtree.
void | documentRegistration() | As the DTM is registered with the DTMManager, this method will be called.
void | documentRelease() | As documents are released from the DTMManager, the DTM implementation will be notified of the event.
int | getAttributeNode(int elementHandle, java.lang.String namespaceURI, java.lang.String name) | Retrieves an attribute node by local name and namespace URI. %TBD% Note that we currently have no way to support the DOM's old getAttribute() call, which accesses only the qname.
DTMAxisIterator | getAxisIterator(int axis) | This is a shortcut to the iterators that implement XPath axes.
DTMAxisTraverser | getAxisTraverser(int axis) | This returns a stateless "traverser" that can navigate over an XPath axis, though not in document order.
ContentHandler | getContentHandler() | Return this DTM's content handler, if it has one.
DeclHandler | getDeclHandler() | Return this DTM's DeclHandler, if it has one.
int | getDocument() | Given a DTM which contains only a single document, find the Node Handle of the Document node.
boolean | getDocumentAllDeclarationsProcessed() | Return an indication of whether the processor has read the complete DTD.
java.lang.String | getDocumentBaseURI() | Return the base URI of the document entity.
java.lang.String | getDocumentEncoding(int nodeHandle) | Return the name of the character encoding scheme in which the document entity is expressed.
int | getDocumentRoot(int nodeHandle) | Given a node handle, find the owning document node.
java.lang.String | getDocumentStandalone(int nodeHandle) | Return an indication of the standalone status of the document, either "yes" or "no".
java.lang.String | getDocumentSystemIdentifier(int nodeHandle) | Return the system identifier of the document entity.
java.lang.String | getDocumentTypeDeclarationPublicIdentifier() | Return the public identifier of the external subset, normalized as described in 4.2.2 External Entities [XML].
java.lang.String | getDocumentTypeDeclarationSystemIdentifier() | Return the [system identifier] property (property 1) of the document type declaration information item.
java.lang.String | getDocumentVersion(int documentHandle) | Return a string representing the XML version of the document.
DTDHandler | getDTDHandler() | Return this DTM's DTDHandler, if it has one.
int | getElementById(java.lang.String elementId) | Returns the Element whose ID is given by elementId.
EntityResolver | getEntityResolver() | Return this DTM's EntityResolver, if it has one.
ErrorHandler | getErrorHandler() | Return this DTM's ErrorHandler, if it has one.
int | getExpandedTypeID(int nodeHandle) | Given a node handle, return an ID that represents the node's expanded name.
int | getExpandedTypeID(java.lang.String namespace, java.lang.String localName, int type) | Given an expanded name, return an ID.
int | getFirstAttribute(int nodeHandle) | Given a node handle, get the index of the node's first attribute.
int | getFirstChild(int nodeHandle) | Given a node handle, get the handle of the node's first child.
int | getFirstNamespaceNode(int nodeHandle, boolean inScope) | Given a node handle, get the index of the node's first namespace node.
int | getLastChild(int nodeHandle) | Given a node handle, get the handle of the node's last child.
short | getLevel(int nodeHandle) | **For internal use only** Get the depth level of this node in the tree (equals 1 for a parentless node).
LexicalHandler | getLexicalHandler() | Return this DTM's lexical handler, if it has one.
java.lang.String | getLocalName(int nodeHandle) | Given a node handle, return its DOM-style localname.
java.lang.String | getLocalNameFromExpandedNameID(int ExpandedNameID) | Given an expanded-name ID, return the local name part.
java.lang.String | getNamespaceFromExpandedNameID(int ExpandedNameID) | Given an expanded-name ID, return the namespace URI part.
java.lang.String | getNamespaceURI(int nodeHandle) | Given a node handle, return its DOM-style namespace URI. (As defined in Namespaces, this is the declared URI which this node's prefix -- or default in lieu thereof -- was mapped to.)
int | getNextAttribute(int nodeHandle) | Given a node handle, advance to the next attribute.
int | getNextNamespaceNode(int baseHandle, int namespaceHandle, boolean inScope) | Given a namespace handle, advance to the next namespace in the same scope (local or local-plus-inherited, as selected by getFirstNamespaceNode).
int | getNextSibling(int nodeHandle) | Given a node handle, advance to its next sibling.
Node | getNode(int nodeHandle) | Return a DOM node for the given node.
java.lang.String | getNodeName(int nodeHandle) | Given a node handle, return its DOM-style node name.
java.lang.String | getNodeNameX(int nodeHandle) | Given a node handle, return the XPath node name.
short | getNodeType(int nodeHandle) | Given a node handle, return its DOM-style node type.
java.lang.String | getNodeValue(int nodeHandle) | Given a node handle, return its node value.
int | getOwnerDocument(int nodeHandle) | Given a node handle, find the owning document node.
int | getParent(int nodeHandle) | Given a node handle, find its parent node.
java.lang.String | getPrefix(int nodeHandle) | Given a namespace handle, return the prefix that the namespace decl is mapping.
int | getPreviousSibling(int nodeHandle) | Given a node handle, find its preceding sibling.
SourceLocator | getSourceLocatorFor(int node) | Get the location of a node in the source document.
XMLString | getStringValue(int nodeHandle) | Get the string-value of a node as a String object (see http://www.w3.org/TR/xpath#data-model for the definition of a node's string-value).
char[] | getStringValueChunk(int nodeHandle, int chunkIndex, int[] startAndLen) | Get a character array chunk in the string-value of a node.
int | getStringValueChunkCount(int nodeHandle) | Get number of character array chunks in the string-value of a node.
DTMAxisIterator | getTypedAxisIterator(int axis, int type) | Get an iterator that can navigate over an XPath Axis, predicated by the extended type ID.
java.lang.String | getUnparsedEntityURI(java.lang.String name) | The getUnparsedEntityURI function returns the URI of the unparsed entity with the specified name in the same document as the context node (see [3.3 Unparsed Entities]).
boolean | hasChildNodes(int nodeHandle) | Given a node handle, test if it has child nodes.
boolean | isAttributeSpecified(int attributeHandle) | Returns true if the attribute was specified; false if it was defaulted or the handle doesn't refer to an attribute node.
boolean | isCharacterElementContentWhitespace(int nodeHandle) | Returns true if the node definitely represents whitespace in element content; false otherwise.
boolean | isDocumentAllDeclarationsProcessed(int documentHandle) | Returns true if all declarations were processed; false otherwise.
boolean | isNodeAfter(int firstNodeHandle, int secondNodeHandle) | Figure out whether secondNodeHandle should be considered as being later in the document than firstNodeHandle, in Document Order as defined by the XPath model.
boolean | isSupported(java.lang.String feature, java.lang.String version) | Tests whether DTM DOM implementation implements a specific feature and that feature is supported by this node.
boolean | needsTwoThreads() |
void | setDocumentBaseURI(java.lang.String baseURI) | Set the base URI of the document entity.
void | setFeature(java.lang.String featureId, boolean state) | Set an implementation dependent feature.
void | setProperty(java.lang.String property, java.lang.Object value) | Set a run time property for this DTM instance.
boolean | supportsPreStripping() | Return true if the xsl:strip-space or xsl:preserve-space was processed during construction of the document contained in this DTM.
Field Detail |
public static final int NULL
Null node handles are represented by this value.

public static final short ROOT_NODE
The node is a Root.

public static final short ELEMENT_NODE
The node is an Element.

public static final short ATTRIBUTE_NODE
The node is an Attr.

public static final short TEXT_NODE
The node is a Text node.

public static final short CDATA_SECTION_NODE
The node is a CDATASection.

public static final short ENTITY_REFERENCE_NODE
The node is an EntityReference.

public static final short ENTITY_NODE
The node is an Entity.

public static final short PROCESSING_INSTRUCTION_NODE
The node is a ProcessingInstruction.

public static final short COMMENT_NODE
The node is a Comment.

public static final short DOCUMENT_NODE
The node is a Document.

public static final short DOCUMENT_TYPE_NODE
The node is a DocumentType.

public static final short DOCUMENT_FRAGMENT_NODE
The node is a DocumentFragment.

public static final short NOTATION_NODE
The node is a Notation.

public static final short NAMESPACE_NODE
The node is a namespace node. Note that this is not currently a node type defined by the DOM API.

public static final short NTYPES
The number of valid nodetypes.
Method Detail |
public void setFeature(java.lang.String featureId, boolean state)
%REVIEW% Do we really expect to set features on DTMs?
featureId - A feature URL.
state - true if this feature should be on, false otherwise.

public void setProperty(java.lang.String property, java.lang.Object value)
property - a String value
value - an Object value

public DTMAxisTraverser getAxisTraverser(int axis)
axis - One of Axes.ANCESTORORSELF, etc.

public DTMAxisIterator getAxisIterator(int axis)
axis - One of Axes.ANCESTORORSELF, etc.

public DTMAxisIterator getTypedAxisIterator(int axis, int type)
axis -
type - An extended type ID.

public boolean hasChildNodes(int nodeHandle)
%REVIEW% This is obviously useful at the DOM layer, where it would permit testing this without having to create a proxy node. It's less useful in the DTM API, where (dtm.getFirstChild(nodeHandle)!=DTM.NULL) is just as fast and almost as self-evident. But it's a convenience, and eases porting of DOM code to DTM.
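The equivalence mentioned in this note, and the sibling walk it enables, can be sketched against a toy table-based tree. The class ChildWalkSketch, its arrays, and the use of -1 as the null handle are assumptions made for the example; only the method names getFirstChild, getNextSibling and hasChildNodes mirror the DTM API.

```java
/** Hypothetical sketch of a table-based tree; not the real DTM implementation. */
final class ChildWalkSketch {
    static final int NULL = -1; // assumed sentinel, standing in for DTM.NULL

    // Node 0 is the root with children 1, 2 and 3; node 2 has a single child, 4.
    private final int[] firstChild  = { 1, NULL, 4, NULL, NULL };
    private final int[] nextSibling = { NULL, 2, 3, NULL, NULL };

    int getFirstChild(int nodeHandle)  { return firstChild[nodeHandle]; }
    int getNextSibling(int nodeHandle) { return nextSibling[nodeHandle]; }

    /** The convenience form: same answer as comparing getFirstChild with NULL. */
    boolean hasChildNodes(int nodeHandle) {
        return getFirstChild(nodeHandle) != NULL;
    }

    /** Walks the child list using only int handles; no node objects are created. */
    int countChildren(int nodeHandle) {
        int n = 0;
        for (int child = getFirstChild(nodeHandle); child != NULL; child = getNextSibling(child)) {
            n++;
        }
        return n;
    }
}
```

In this sketch countChildren(0) walks handles 1, 2 and 3, and hasChildNodes(1) is false because node 1 has no first child.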
nodeHandle - int Handle of the node.

public int getFirstChild(int nodeHandle)
nodeHandle - int Handle of the node.

public int getLastChild(int nodeHandle)
nodeHandle - int Handle of the node.

public int getAttributeNode(int elementHandle, java.lang.String namespaceURI, java.lang.String name)
elementHandle - Handle of the node upon which to look up this attribute.
namespaceURI - The namespace URI of the attribute to retrieve, or null.
name - The local name of the attribute to retrieve.
Returns the attribute node handle with the specified name (nodeName), or DTM.NULL if there is no such attribute.

public int getFirstAttribute(int nodeHandle)
nodeHandle - int Handle of the node.

public int getFirstNamespaceNode(int nodeHandle, boolean inScope)
nodeHandle - handle to node, which should probably be an element node, but need not be.
inScope - true if all namespaces in scope should be returned, false if only the node's own namespace declarations should be returned.

public int getNextSibling(int nodeHandle)
nodeHandle - int Handle of the node.

public int getPreviousSibling(int nodeHandle)
nodeHandle - the id of the node.

public int getNextAttribute(int nodeHandle)
nodeHandle - int Handle of the node.

public int getNextNamespaceNode(int baseHandle, int namespaceHandle, boolean inScope)
baseHandle - handle to original node from where the first child was relative to (needed to return nodes in document order).
namespaceHandle - handle to node which must be of type NAMESPACE_NODE.
NEEDSDOC @param inScope

public int getParent(int nodeHandle)
nodeHandle - the id of the node.

public int getDocument()

public int getOwnerDocument(int nodeHandle)
nodeHandle - the id of the node.
See also: getDocumentRoot(int nodeHandle)

public int getDocumentRoot(int nodeHandle)
nodeHandle - the id of the node.
See also: getOwnerDocument(int nodeHandle)

public XMLString getStringValue(int nodeHandle)
nodeHandle - The node ID.

public int getStringValueChunkCount(int nodeHandle)
nodeHandle - The node ID.

public char[] getStringValueChunk(int nodeHandle, int chunkIndex, int[] startAndLen)
nodeHandle - The node ID.
chunkIndex - Which chunk to get.
startAndLen - A two-integer array which, upon return, WILL BE FILLED with values representing the chunk's start position within the returned character buffer and the length of the chunk.

public int getExpandedTypeID(int nodeHandle)
nodeHandle - The handle to the node in question.

public int getExpandedTypeID(java.lang.String namespace, java.lang.String localName, int type)
NEEDSDOC @param namespace
NEEDSDOC @param localName
NEEDSDOC @param type

public java.lang.String getLocalNameFromExpandedNameID(int ExpandedNameID)
ExpandedNameID - an ID that represents an expanded-name.

public java.lang.String getNamespaceFromExpandedNameID(int ExpandedNameID)
ExpandedNameID - an ID that represents an expanded-name.

public java.lang.String getNodeName(int nodeHandle)
nodeHandle - the id of the node.

public java.lang.String getNodeNameX(int nodeHandle)
nodeHandle - the id of the node.

public java.lang.String getLocalName(int nodeHandle)
nodeHandle - the id of the node.

public java.lang.String getPrefix(int nodeHandle)
%REVIEW% Are you sure you want "" for no prefix?
nodeHandle - the id of the node.

public java.lang.String getNamespaceURI(int nodeHandle)
nodeHandle - the id of the node.

public java.lang.String getNodeValue(int nodeHandle)
nodeHandle - The node id.

public short getNodeType(int nodeHandle)
%REVIEW% Generally, returning short is false economy. Return int?
nodeHandle - The node id.

public short getLevel(int nodeHandle)
nodeHandle - The node id.

public boolean isSupported(java.lang.String feature, java.lang.String version)
feature - The name of the feature to test.
version - This is the version number of the feature to test. If the version is not specified, supporting any version of the feature will cause the method to return true.
Returns true if the specified feature is supported on this node, false otherwise.

public java.lang.String getDocumentBaseURI()

public void setDocumentBaseURI(java.lang.String baseURI)
baseURI - the document base URI String object or null if unknown.

public java.lang.String getDocumentSystemIdentifier(int nodeHandle)
nodeHandle - The node id, which can be any valid node handle.

public java.lang.String getDocumentEncoding(int nodeHandle)
nodeHandle - The node id, which can be any valid node handle.

public java.lang.String getDocumentStandalone(int nodeHandle)
nodeHandle - The node id, which can be any valid node handle.

public java.lang.String getDocumentVersion(int documentHandle)
documentHandle - the document handle

public boolean getDocumentAllDeclarationsProcessed()
Returns true if all declarations were processed; false otherwise.

public java.lang.String getDocumentTypeDeclarationSystemIdentifier()

public java.lang.String getDocumentTypeDeclarationPublicIdentifier()

public int getElementById(java.lang.String elementId)
Returns the Element whose ID is given by elementId. If no such element exists, returns DTM.NULL. Behavior is not defined if more than one element has this ID. Attributes (including those with the name "ID") are not of type ID unless so defined by DTD/Schema information available to the DTM implementation. Implementations that do not know whether attributes are of type ID or not are expected to return DTM.NULL.
%REVIEW% Presumably IDs are still scoped to a single document, and this operation searches only within a single document, right? Wouldn't want collisions between DTMs in the same process.
elementId - The unique id value for an element.

public java.lang.String getUnparsedEntityURI(java.lang.String name)
XML processors may choose to use the System Identifier (if one is provided) to resolve the entity, rather than the URI in the Public Identifier. The details are dependent on the processor, and we would have to support some form of plug-in resolver to handle this properly. Currently, we simply return the System Identifier if present, and hope that it is a usable URI or that our caller can map it to one. %REVIEW% Resolve Public Identifiers... or consider changing function name.
If we find a relative URI reference, XML expects it to be resolved in terms of the base URI of the document. The DOM doesn't do that for us, and it isn't entirely clear whether that should be done here; currently that's pushed up to a higher level of our application. (Note that DOM Level 1 didn't store the document's base URI.) %REVIEW% Consider resolving Relative URIs.
(The DOM's statement that "An XML processor may choose to completely expand entities before the structure model is passed to the DOM" refers only to parsed entities, not unparsed, and hence doesn't affect this function.)
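Since resolution of relative references is left to the caller, a caller-side helper might look like the sketch below. It relies only on java.net.URI from the JDK; the class EntityUriSketch and the method resolveSystemId are hypothetical names, and the choice to pass the identifier through unchanged when no base URI is known is an assumption of this example.

```java
import java.net.URI;

/** Hypothetical caller-side helper: resolve a system identifier against a document base URI. */
final class EntityUriSketch {
    /**
     * Returns systemId resolved against documentBaseUri. An absolute systemId
     * comes back unchanged; a null base leaves the identifier as it was.
     * URI.create throws IllegalArgumentException if either string is not a valid URI.
     */
    static String resolveSystemId(String documentBaseUri, String systemId) {
        if (systemId == null) {
            return null;
        }
        if (documentBaseUri == null) {
            return systemId;
        }
        return URI.create(documentBaseUri).resolve(systemId).toString();
    }
}
```

For example, resolving "images/cover.png" against "http://example.org/docs/book.xml" yields "http://example.org/docs/images/cover.png".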
name - A string containing the Entity Name of the unparsed entity.

public boolean supportsPreStripping()

public boolean isNodeAfter(int firstNodeHandle, int secondNodeHandle)
There are some cases where ordering isn't defined, and neither are the results of this function -- though we'll generally return true.
%REVIEW% Make sure this does the right thing with attribute nodes!!!
%REVIEW% Consider renaming for clarity. Perhaps isDocumentOrder(a,b)?
firstNodeHandle - DOM Node to perform position comparison on.
secondNodeHandle - DOM Node to perform position comparison on.
You can think of the result as (firstNode.documentOrderPosition <= secondNode.documentOrderPosition).

public boolean isCharacterElementContentWhitespace(int nodeHandle)
If there is no declaration for the containing element, an XML processor must assume that the whitespace could be meaningful and return false. If no declaration has been read, but the [all declarations processed] property of the document information item is false (so there may be an unread declaration), then the value of this property is indeterminate for white space characters and should probably be reported as false. It is always false for text nodes that contain anything other than (or in addition to) white space.
Note too that it always returns false for non-Text nodes.
%REVIEW% Joe wants to rename this isWhitespaceInElementContent() for clarity
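The rules above can be read as a small decision function, sketched here under stated assumptions: the enum ContentModel and the helper isElementContentWhitespace are hypothetical, and the three-way classification of what is known about the containing element's declaration is a simplification made for the example.

```java
/** Hypothetical sketch of the "whitespace in element content" rules. */
final class ElementContentWhitespaceSketch {
    /** What is known about the containing element's declaration (assumed classification). */
    enum ContentModel { ELEMENT_ONLY, MIXED_OR_ANY, UNKNOWN }

    static boolean isElementContentWhitespace(boolean isTextNode, CharSequence text, ContentModel parentModel) {
        if (!isTextNode) {
            return false; // always false for non-Text nodes
        }
        if (parentModel != ContentModel.ELEMENT_ONLY) {
            return false; // no declaration, or one under which whitespace could be meaningful
        }
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // XML whitespace is space, tab, carriage return and line feed.
            if (c != ' ' && c != '\t' && c != '\r' && c != '\n') {
                return false; // anything other than white space makes it false
            }
        }
        return true;
    }
}
```

The indeterminate case (no declaration read, declarations not all processed) maps to UNKNOWN here and is reported as false, matching the "should probably be reported as false" wording.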
nodeHandle - the node ID.
Returns true if the node definitely represents whitespace in element content; false otherwise.

public boolean isDocumentAllDeclarationsProcessed(int documentHandle)
documentHandle - A node handle that must identify a document.
Returns true if all declarations were processed; false otherwise.

public boolean isAttributeSpecified(int attributeHandle)
attributeHandle - the attribute handle
Returns true if the attribute was specified; false if it was defaulted or the handle doesn't refer to an attribute node.

public void dispatchCharactersEvents(int nodeHandle, ContentHandler ch, boolean normalize) throws SAXException
nodeHandle - The node ID.
ch - A non-null reference to a ContentHandler.
normalize - true if the content should be normalized according to the rules for the XPath normalize-space function.

public void dispatchToEvents(int nodeHandle, ContentHandler ch) throws SAXException
nodeHandle - The node ID.
ch - A non-null reference to a ContentHandler.

public Node getNode(int nodeHandle)
nodeHandle - The node ID.

public boolean needsTwoThreads()

public ContentHandler getContentHandler()

public LexicalHandler getLexicalHandler()

public EntityResolver getEntityResolver()

public DTDHandler getDTDHandler()

public ErrorHandler getErrorHandler()

public DeclHandler getDeclHandler()

public void appendChild(int newChild, boolean clone, boolean cloneDepth)
%REVIEW% DTM maintains an insertion cursor which performs a depth-first tree walk as nodes come in, and this operation is really equivalent to: insertionCursor.appendChild(document.importNode(newChild)) where the insert point is the last element that was appended (or the last one popped back to by an end-element operation).
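For readers more familiar with the DOM, the equivalent operation named in this note can be sketched with the standard org.w3c.dom and javax.xml.parsers classes from the JDK. The class DomAppendSketch and its helpers are hypothetical, and treating the deep flag of importNode as the counterpart of cloneDepth is this example's reading, not something the DTM documentation states.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Hypothetical DOM-side illustration of "import, then append at the insertion point". */
final class DomAppendSketch {
    /** Creates an empty DOM document, wrapping the checked configuration exception. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM parser available", e);
        }
    }

    /** Copies foreignNode into target and appends the copy under insertionPoint. */
    static Node importAndAppend(Document target, Node insertionPoint, Node foreignNode, boolean deep) {
        Node copy = target.importNode(foreignNode, deep); // the source node is left untouched
        return insertionPoint.appendChild(copy);
    }

    public static void main(String[] args) {
        Document source = newDocument();
        Element item = source.createElement("item");
        item.appendChild(source.createTextNode("text"));
        source.appendChild(item);

        Document target = newDocument();
        Element root = target.createElement("root");
        target.appendChild(root);

        Node appended = importAndAppend(target, root, item, true);
        System.out.println(appended.getOwnerDocument() == target); // prints "true"
        System.out.println(root.getLastChild().getTextContent());  // prints "text"
    }
}
```

With deep set to false only the element itself is copied, so the appended node has no children.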
newChild - Must be a valid new node handle.
clone - true if the child should be cloned into the document.
cloneDepth - if the clone argument is true, specifies that the clone should include all its children.

public void appendTextChild(java.lang.String str)
str - Non-null reference to a string.

public SourceLocator getSourceLocatorFor(int node)
node - an int value
Returns a SourceLocator value or null if no location is available.

public void documentRegistration()

public void documentRelease()