public class XmlParser extends Object

java.lang.Object
   ↳ com.pnfsoftware.jeb.util.encoding.xml.XmlParser

Class Overview

A limited, simple, lenient, fast, and read-only XML parser. It can be used as a fallback when the JDK implementation (typically Apache Xerces) fails on documents that deviate from the XML specification.

Features and limitations:

  • XML version must be 1.x, encoding must be UTF-8.
  • All Unicode characters are accepted as long as there is no parsing ambiguity.
  • Multiple root elements are allowed.
  • Android-style backslash escapes in attribute values can be supported (see setHandleBackslashAxmlStyle).
  • This parser returns XML documents implementing the read-only parts of the standard org.w3c.dom API (refer to the X... classes in this package).
  • The following XML node types are NOT supported: DocumentFragment, Entity, EntityReference, Notation, ProcessingInstruction.
  • Calls to unsupported methods will raise UnsupportedOperationException.
  • Unclosed tags (such as the HTML <br>) can be allowed (see setAllowUnclosedTags).
  • Unquoted attribute values are allowed.
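
To make the fallback scenario concrete, the following sketch checks which inputs the JDK parser rejects. StrictXmlCheck and acceptedByJdkParser are hypothetical names used for illustration and are not part of this package; only standard JDK classes are used.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper, not part of JEB: reports whether the JDK parser accepts a document. */
class StrictXmlCheck {
    /** Returns true if the JDK DOM parser parses the string without a fatal error. */
    static boolean acceptedByJdkParser(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler throws on fatal errors and ignores warnings and recoverable errors
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            // Malformed input surfaces as a SAXException; any other failure is treated the same
            return false;
        }
    }
}
```

With this helper, a document with two root elements, an unclosed <br>, or an unquoted attribute value is reported as rejected by the JDK parser. These are the kinds of input that the list above describes as accepted, or optionally accepted, by XmlParser.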

Summary

Public Constructors
XmlParser()
Public Methods
boolean isAllowMismatchedTags()
boolean isAllowUnclosedTags()
boolean isAssignParentNodes()
boolean isHandleBackslashAxmlStyle()
boolean isSortAttributes()
XDocument parse(String str)
Parse the provided XML string and return an XML Document object.
XDocument parse(byte[] bytes)
Parse the provided XML data and return an XML Document object.
void setAllowMismatchedTags(boolean allowMismatchedTags)
void setAllowUnclosedTags(boolean allowUnclosedTags)
void setAssignParentNodes(boolean assignParentNodes)
void setHandleBackslashAxmlStyle(boolean handleBackslashAxmlStyle)
void setSortAttributes(boolean sortAttributes)
Inherited Methods
From class java.lang.Object

Public Constructors

public XmlParser ()

Public Methods

public boolean isAllowMismatchedTags ()

Returns
  • true if mismatched tags are allowed; the default is false

public boolean isAllowUnclosedTags ()

Returns
  • true if unclosed tags are allowed; the default is false

public boolean isAssignParentNodes ()

Returns
  • true if parent and owner nodes are assigned; the default is false

public boolean isHandleBackslashAxmlStyle ()

Returns
  • true if Android-style backslash escapes are handled in attribute values; the default is false

public boolean isSortAttributes ()

Returns
  • true if element attributes are sorted by name; the default is false

public XDocument parse (String str)

Parse the provided XML string and return an XML Document object. This method will throw if an error occurs.

Parameters
str XML string (the encoding attribute is disregarded)
Returns
  • a document object, never null (the method throws on error)
Throws
ParseException

public XDocument parse (byte[] bytes)

Parse the provided XML data and return an XML Document object. This method will throw if an error occurs.

Parameters
bytes XML data (UTF-8 encoded, as required by the class overview)
Returns
  • a document object, never null (the method throws on error)
Throws
ParseException
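
Because both parse methods throw on error, a caller that uses this class as a fallback needs a try-then-retry wrapper. The sketch below shows one way to write it. FallbackParsing and firstSuccessful are hypothetical names, not part of this package, and the wrapper is generic so that it makes no assumption about the document types involved.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper, not part of JEB: tries a strict parser first, then a lenient one. */
class FallbackParsing {
    /**
     * Returns the result of the strict step if it succeeds, otherwise the result of
     * the lenient step. If both fail, throws an unchecked exception carrying both causes.
     */
    static <T> T firstSuccessful(Callable<T> strict, Callable<T> lenient) {
        try {
            return strict.call();
        } catch (Exception strictFailure) {
            try {
                return lenient.call();
            } catch (Exception lenientFailure) {
                IllegalStateException failure =
                        new IllegalStateException("both parsers failed", lenientFailure);
                // Keep the first failure attached for diagnostics
                failure.addSuppressed(strictFailure);
                throw failure;
            }
        }
    }
}
```

With the JEB library on the classpath, the lenient step could be a lambda such as () -> new XmlParser().parse(xml), and the strict step a call into the JDK parser. That wiring is an assumption about how a caller might combine the two; it is not something this class provides.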

public void setAllowMismatchedTags (boolean allowMismatchedTags)

Parameters
allowMismatchedTags if true, the parser will match the tags case-insensitively

public void setAllowUnclosedTags (boolean allowUnclosedTags)

Parameters
allowUnclosedTags if true, the parser will consider unclosed tags as part of XText

public void setAssignParentNodes (boolean assignParentNodes)

Parameters
assignParentNodes if true, the internal parent and/or owner fields are set, allowing the use of methods like getParentNode(), getOwnerDocument(), getOwnerElement(), etc.

public void setHandleBackslashAxmlStyle (boolean handleBackslashAxmlStyle)

Parameters
handleBackslashAxmlStyle if true, \n and \t escapes are allowed in attribute values, and any other escape (\x) results in the character "x". If false, the normal XML behavior applies: \ is a regular character, meaning \x is literally "\x".
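
The escape rules described for this option can be illustrated with a small standalone function. This is a sketch of the documented behavior only, not the parser's actual implementation. AxmlBackslash and unescape are hypothetical names, and the treatment of a trailing lone backslash (kept unchanged) is an assumption, since the description does not cover that case.

```java
/** Hypothetical illustration, not part of JEB: applies the escape rules described above. */
class AxmlBackslash {
    static String unescape(String value) {
        StringBuilder out = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c != '\\' || i + 1 == value.length()) {
                // Regular character, or a trailing lone backslash (kept as-is: an assumption)
                out.append(c);
                continue;
            }
            char next = value.charAt(++i); // consume the escaped character
            switch (next) {
                case 'n': out.append('\n'); break;  // \n becomes a newline
                case 't': out.append('\t'); break;  // \t becomes a tab
                default:  out.append(next); break;  // any other \x becomes "x"
            }
        }
        return out.toString();
    }
}
```

With the option disabled, the equivalent behavior is the identity function: the attribute value is returned unchanged, backslashes included.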

public void setSortAttributes (boolean sortAttributes)

Parameters
sortAttributes if true, the element attributes are sorted alphabetically by name; if false, the original order is maintained (note: Xerces sorts attributes alphabetically)
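
The difference between the two settings can be pictured with ordinary JDK maps. The sketch below is an analogy only and does not show how the parser stores attributes. AttributeOrder and names are hypothetical, and TreeMap's natural String ordering is assumed to stand in for "alphabetically".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration, not part of JEB: attribute names in original or sorted order. */
class AttributeOrder {
    /**
     * Returns the attribute names either in the map's own iteration order
     * (document order, if the caller passes a LinkedHashMap) or sorted by name.
     */
    static List<String> names(Map<String, String> attributes, boolean sortAttributes) {
        Map<String, String> view = sortAttributes ? new TreeMap<>(attributes) : attributes;
        return new ArrayList<>(view.keySet());
    }
}
```

For attributes inserted as name, id, class into a LinkedHashMap, the unsorted result keeps that order, while the sorted result is class, id, name.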