Class XmlParser
java.lang.Object
com.pnfsoftware.jeb.util.encoding.xml.XmlParser
A thread-safe, limited, simple, lenient, fast, read-only (except for user-data on nodes) XML
parser. It can be used as a back-up when the JDK implementation (typically, Apache Xerces) fails
on documents deviating from the XML specifications.
Features and limitations:
- XML version must be 1.x; encoding must be UTF-8.
- All Unicode characters are accepted as long as there is no parsing ambiguity.
- Multiple root elements are allowed.
- Android-style backslash escapes in attribute values can be supported (see setHandleBackslashAxmlStyle).
- This parser returns XML documents implementing the read-only parts of the standard org.w3c.dom API (refer to the X... classes in this package).
- The following XML node types are NOT supported: DocumentFragment, Entity, EntityReference, Notation, ProcessingInstruction.
- Calls to unsupported methods will raise UnsupportedOperationException.
- Unclosed tags can be allowed (such as the HTML <br>).
- Unquoted attribute values are allowed.
- The resulting Document tree is thread-safe, including for setUserData calls.
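The back-up role described above can be sketched as a "strict first, lenient second" helper. This is only an illustration built on the JDK's own javax.xml.parsers API: the class LenientFallbackSketch, its method names, and the lenientParser parameter are hypothetical and not part of JEB. The lenient parser is passed in as a function so the sketch compiles without JEB on the classpath; a wrapper around XmlParser.parse could be supplied there.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.function.Function;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of JEB: try the strict JDK parser first,
// then hand the same input to a caller-supplied lenient parser.
final class LenientFallbackSketch {

    // Parse with the JDK's DocumentBuilder; throws on any well-formedness error.
    static Document strictParse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors and avoids the default stderr output.
        builder.setErrorHandler(new DefaultHandler());
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // lenientParser is an assumption of this sketch: any function that turns
    // the raw string into a Document (for example, one backed by a lenient parser).
    static Document parseWithFallback(String xml, Function<String, Document> lenientParser) {
        try {
            return strictParse(xml);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // The strict parser rejected the input; delegate to the lenient one.
            return lenientParser.apply(xml);
        }
    }
}
```

With this shape, well-formed input never reaches the lenient parser, while input such as <a b=1/> (unquoted attribute value) or a document with several root elements is rejected by the JDK parser and handed to the fallback.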
-
Constructor Summary
Constructor
XmlParser()
-
Method Summary
Modifier and Type    Method    Description
boolean     isAllowMismatchedTags()
boolean     isAllowNoXmlDeclaration()
boolean     isAllowUnclosedTags()
boolean     isAssignParentNodes()
boolean     isHandleBackslashAxmlStyle()
boolean     isSortAttributes()
Document    parse(byte[] bytes)    Parse the provided XML data and return an XML Document object.
Document    parse(String str)    Parse the provided XML string and return an XML Document object.
void        setAllowMismatchedTags(boolean allowMismatchedTags)
void        setAllowNoXmlDeclaration(boolean allowNoXmlDeclaration)
void        setAllowUnclosedTags(boolean allowUnclosedTags)
void        setAssignParentNodes(boolean assignParentNodes)
void        setHandleBackslashAxmlStyle(boolean handleBackslashAxmlStyle)
void        setSortAttributes(boolean sortAttributes)
-
Constructor Details
-
XmlParser
public XmlParser()
-
-
Method Details
-
setAssignParentNodes
public void setAssignParentNodes(boolean assignParentNodes)
Parameters:
assignParentNodes - if true, the internal parent and/or owner fields are set, allowing the use of methods like Node.getParentNode(), Node.getOwnerDocument(), Attr.getOwnerElement(), etc.
-
isAssignParentNodes
public boolean isAssignParentNodes()
Returns:
the current setting; the default is false
-
setSortAttributes
public void setSortAttributes(boolean sortAttributes)
Parameters:
sortAttributes - if true, the element attributes are sorted alphabetically by name; if false, the original order is maintained (note: Xerces sorts attributes alphabetically)
-
isSortAttributes
public boolean isSortAttributes()
Returns:
the current setting; the default is false
-
setHandleBackslashAxmlStyle
public void setHandleBackslashAxmlStyle(boolean handleBackslashAxmlStyle)
Parameters:
handleBackslashAxmlStyle - if true, \n and \t escapes are allowed in attribute values. Other escapes (\x) will result in the character "x". If false, the normal XML behavior applies: \ is a regular character, meaning \x is literally "\x".
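The escape rules described for handleBackslashAxmlStyle can be illustrated with a small standalone sketch. This is not the parser's actual implementation: the class AxmlBackslashSketch and its decode method are hypothetical, and the handling of a lone trailing backslash (kept as-is) is an assumption, since the documentation does not cover that case.

```java
// Hypothetical illustration of the documented escape rules; not JEB code.
final class AxmlBackslashSketch {

    // Decode an attribute value according to the rules stated in the documentation.
    static String decode(String raw, boolean handleBackslashAxmlStyle) {
        if (!handleBackslashAxmlStyle) {
            // Normal XML behavior: the backslash is a regular character.
            return raw;
        }
        StringBuilder out = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '\\' && i + 1 < raw.length()) {
                char next = raw.charAt(++i);
                switch (next) {
                    case 'n': out.append('\n'); break; // \n becomes a newline
                    case 't': out.append('\t'); break; // \t becomes a tab
                    default:  out.append(next);        // \x becomes "x"
                }
            } else {
                // Regular character, or a lone trailing backslash (assumption: kept as-is).
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Under these rules, the raw value a\xb decodes to axb when the option is enabled and stays a\xb when it is disabled.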
-
isHandleBackslashAxmlStyle
public boolean isHandleBackslashAxmlStyle()
Returns:
the current setting; the default is false
-
setAllowUnclosedTags
public void setAllowUnclosedTags(boolean allowUnclosedTags)
Parameters:
allowUnclosedTags - if true, the parser will consider unclosed tags as part of XText
-
isAllowUnclosedTags
public boolean isAllowUnclosedTags()
Returns:
the current setting; the default is false
-
setAllowMismatchedTags
public void setAllowMismatchedTags(boolean allowMismatchedTags)
Parameters:
allowMismatchedTags - if true, the parser will match the tags case-insensitively
-
isAllowMismatchedTags
public boolean isAllowMismatchedTags()
Returns:
the current setting; the default is false
-
isAllowNoXmlDeclaration
public boolean isAllowNoXmlDeclaration()
Returns:
the current setting; the default is false
-
setAllowNoXmlDeclaration
public void setAllowNoXmlDeclaration(boolean allowNoXmlDeclaration)
Parameters:
allowNoXmlDeclaration - if true, the parser will process the input even if no XML declaration (<?xml...) is found
-
parse
Parse the provided XML string and return an XML Document object. This method will throw if an error occurs.
Parameters:
str - XML string (the encoding attribute is disregarded)
Returns:
a document object, never null (the method throws on error)
Throws:
ParseException
-
parse
Parse the provided XML data and return an XML Document object. This method will throw if an error occurs.
Parameters:
bytes - XML data
Returns:
a document object, never null (the method throws on error)
Throws:
ParseException
-