Class AbstractUnitIdentifier

java.lang.Object
com.pnfsoftware.jeb.core.AbstractPlugin
com.pnfsoftware.jeb.core.units.AbstractUnitIdentifier
All Implemented Interfaces:
IPlugin, IUnitIdentifier, IUnitPlugin

public abstract class AbstractUnitIdentifier extends AbstractPlugin implements IUnitIdentifier
Skeleton implementation for an IUnitIdentifier class (also known as a parser). It is recommended that parsers extend this class (or one of its many subclasses) instead of implementing IUnitIdentifier directly.
  • Constructor Details

    • AbstractUnitIdentifier

      protected AbstractUnitIdentifier(String type, double priority)
      Create a new unit identifier.
      Parameters:
      type - mandatory type, used as the sub-region type when the identifier is initialized. The type should be a valid region name; see PropertyDefinitionManager.
      priority - mandatory priority
  • Method Details

    • initialize

      public void initialize(IPropertyDefinitionManager parentPdm)
      The default implementation creates a basic property definition manager (PDM) attached to the provided parent.
      Specified by:
      initialize in interface IUnitPlugin
      Parameters:
      parentPdm - the parent PDM
    • createPDM

      protected final void createPDM(IPropertyDefinitionManager parentPdm, String pdmDescription, int pdmFlags)
      Create and attach the property definition manager.
      Parameters:
      parentPdm - parent property definition manager
      pdmDescription - optional PDM description
      pdmFlags - PDM flags
    • getPropertyDefinitionManager

      public IPropertyDefinitionManager getPropertyDefinitionManager()
      Description copied from interface: IUnitPlugin
      Retrieve the property definition manager (PDM) used by this plugin.
      Specified by:
      getPropertyDefinitionManager in interface IUnitPlugin
      Returns:
      the PDM
    • getFormatType

      public String getFormatType()
      Description copied from interface: IUnitPlugin
      Retrieve the identity of the unit type. The type should be unique across all units.
      Specified by:
      getFormatType in interface IUnitPlugin
      Returns:
      the non-null type
    • getPriority

      public double getPriority()
      Description copied from interface: IUnitPlugin
      Get the identifier priority. Priority can be negative. Priorities are used by identifier managers (eg, Processors) to ensure that identifiers are called in a (partially) predictable order, if need be. The higher the number, the higher the priority.

      Recommendation: the typical priority is zero. Rely on priority for proper identification only as a last resort.

      Specified by:
      getPriority in interface IUnitPlugin
      Returns:
      the identifier priority
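
      To illustrate the ordering rule above ("the higher the number, the higher the priority"), here is a minimal standalone sketch of how a manager might order identifiers before calling them. It does not use the JEB API: the Entry class, the byDescendingPriority method, and the type names are hypothetical stand-ins, and the actual ordering logic used by JEB processors is not documented here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class PriorityOrderSketch {
    // Hypothetical stand-in for an identifier: only a type name and a priority.
    static final class Entry {
        final String type;
        final double priority;

        Entry(String type, double priority) {
            this.type = type;
            this.priority = priority;
        }
    }

    // Returns a sorted copy in which higher priorities come first.
    // List.sort is stable, so entries with equal priority keep their relative order.
    static List<Entry> byDescendingPriority(List<Entry> entries) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparingDouble((Entry e) -> e.priority).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up type names; negative priorities are allowed.
        List<Entry> entries = Arrays.asList(
                new Entry("fallback", -10.0),
                new Entry("typical", 0.0),
                new Entry("urgent", 5.0));
        for (Entry e : byDescendingPriority(entries)) {
            System.out.println(e.type + " " + e.priority);
        }
    }
}
```

      Entries that share a priority have no guaranteed order relative to each other, which is one reason the documentation calls the order only partially predictable.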
    • getTypeIdProvider

      public ITypeIdProvider getTypeIdProvider()
      Description copied from interface: IUnitPlugin
      The type-id provider for this type of unit. Currently, this method is not intended for use by third-party clients.
      Specified by:
      getTypeIdProvider in interface IUnitPlugin
      Returns:
      a type-id provider, null if none
    • checkBytes

      public static boolean checkBytes(byte[] data, int offset, byte... marker)
      Check for expected bytes at a specific offset within the provided data buffer.
      Parameters:
      data - the input data buffer
      offset - the offset
      marker - the bytes to check for
      Returns:
      true if the bytes at data[offset] match the provided bytes
    • checkBytes

      public static boolean checkBytes(byte[] data, int offset, boolean caseSensitive, byte... marker)
      Convenience method used to check for expected bytes at a specific offset within the provided data buffer.
      Parameters:
      data - the input data buffer
      offset - the offset
      caseSensitive - indicates whether the letters [a-z] are compared case-sensitively. For case-insensitive checks, marker must contain lower-case characters. The default value is true.
      marker - the bytes to check for
      Returns:
      true if the bytes at data[offset] are equal to the provided bytes
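
      The marker check described above can be pictured with a small standalone sketch. This is not the JEB implementation: the matchesAt helper is hypothetical, and treating an out-of-range offset as a mismatch is an assumption. The case-insensitive branch follows the documented rule that the marker must be given in lower case.

```java
import java.nio.charset.StandardCharsets;

final class MarkerCheckSketch {
    // Hypothetical helper, not part of JEB.
    static boolean matchesAt(byte[] data, int offset, boolean caseSensitive, byte... marker) {
        // Assumption: requests that do not fit inside the buffer are a mismatch.
        if (data == null || offset < 0 || marker.length > data.length - offset) {
            return false;
        }
        for (int i = 0; i < marker.length; i++) {
            byte b = data[offset + i];
            // In case-insensitive mode, fold ASCII upper case to lower case
            // before comparing against the (lower-case) marker.
            if (!caseSensitive && b >= 'A' && b <= 'Z') {
                b = (byte) (b + ('a' - 'A'));
            }
            if (b != marker[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] data = "<!DOCTYPE html>".getBytes(StandardCharsets.US_ASCII);
        byte[] marker = "<!doctype".getBytes(StandardCharsets.US_ASCII);
        System.out.println(matchesAt(data, 0, false, marker)); // true
        System.out.println(matchesAt(data, 0, true, marker));  // false
    }
}
```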
    • checkBytes

      public static boolean checkBytes(byte[] data, int offset, int... marker)
      Check for expected bytes at a specific offset within the provided data buffer.
      Parameters:
      data - the input data buffer
      offset - the offset
      marker - the byte values to check for
      Returns:
      true if the bytes at data[offset] match the provided byte values
    • checkBytes

      public static boolean checkBytes(byte[] data, int offset, boolean caseSensitive, int... marker)
      Check for expected bytes at a specific offset within the provided data buffer.
      Parameters:
      data - the input data buffer
      offset - the offset
      caseSensitive - true for case-sensitive checks
      marker - the byte values to check for
      Returns:
      true if the bytes at data[offset] match the provided byte values
    • checkBytes

      public static boolean checkBytes(IInput input, int offset, byte... marker)
      Check for expected bytes at a specific offset within the provided input.
      Parameters:
      input - input data
      offset - the offset
      marker - the bytes to check for
      Returns:
      true if the bytes at the offset match the provided bytes
    • checkBytes

      public static boolean checkBytes(IInput input, int offset, int... marker)
      Check for expected bytes at a specific offset within the provided input.
      Parameters:
      input - input data
      offset - the offset
      marker - the byte values to check for
      Returns:
      true if the bytes at the offset match the provided byte values
    • checkBytes

      public static boolean checkBytes(IInput input, int offset, String header)
      Check for expected UTF-8 text at a specific offset within the provided input.
      Parameters:
      input - input data
      offset - the offset
      header - converted to byte[] using Strings.encodeUTF8(String)
      Returns:
      true if the bytes at the offset match the provided text
    • readHeaderByte

      public static int readHeaderByte(IInput input, int offset)
      Read a single header byte.
      Parameters:
      input - input data
      offset - offset within the header returned by IInput.getHeader()
      Returns:
      the byte value, -1 on error (eg, the offset is outside the header)
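
      An int return type with -1 reserved for errors suggests a contract similar to java.io.InputStream.read(). The sketch below shows that pattern on a plain byte array. The readByte helper is hypothetical, and returning the byte as an unsigned value in the range 0..255 is an assumption rather than documented JEB behavior.

```java
final class HeaderByteSketch {
    // Hypothetical helper working on a plain byte[] instead of an IInput header.
    static int readByte(byte[] header, int offset) {
        if (header == null || offset < 0 || offset >= header.length) {
            return -1; // error: nothing to read at this offset
        }
        // Assumption: mask to 0..255 so a valid byte never collides with -1.
        return header[offset] & 0xFF;
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xFF, 0x7F};
        System.out.println(readByte(header, 0)); // 255
        System.out.println(readByte(header, 2)); // -1
    }
}
```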
    • getNonWhitespaceHeader

      public static byte[] getNonWhitespaceHeader(IInput input, int size, char... extraWhitespaceCharacters)
      Returns the header with at least size significant bytes after the initial whitespace. Intended for text documents that may contain a lot of whitespace before the text. May return trimmed data (when the header is not large enough) or non-trimmed data (when the header is large enough).
      Parameters:
      input - input data
      size - minimum data size required
      extraWhitespaceCharacters - additional characters considered as whitespace
      Returns:
      a byte array, possibly smaller than requested, or null on error
    • getNonWhitespaceHeader

      public static byte[] getNonWhitespaceHeader(IInput input, int size, boolean trim, char... extraWhitespaceCharacters)
      Returns the header with at least size significant bytes after the initial whitespace. Intended for text documents that may contain a lot of whitespace before the text.
      Parameters:
      input - IInput
      size - minimum data size required
      trim - when true, any BOM + any whitespace will be skipped
      extraWhitespaceCharacters - additional characters considered as whitespace
      Returns:
      an array of at least the minimum size. If not enough bytes are available, the array will be smaller. May return null on error or when the file contains only whitespace.
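
      The trimming behavior (skip any BOM, skip leading whitespace, return up to size bytes, null when only whitespace remains) can be sketched on a plain byte array as follows. This is an illustration under stated assumptions, not the JEB code: the significantBytes helper is hypothetical, only the UTF-8 BOM is recognized, and whitespace is limited to space, tab, CR and LF.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class TrimmedHeaderSketch {
    // Hypothetical helper; header is a plain byte[] rather than an IInput.
    static byte[] significantBytes(byte[] header, int size) {
        if (header == null || size < 0) {
            return null; // treated as an error in this sketch
        }
        int i = 0;
        // Assumption: only the UTF-8 byte order mark (EF BB BF) is skipped.
        if (header.length >= 3 && (header[0] & 0xFF) == 0xEF
                && (header[1] & 0xFF) == 0xBB && (header[2] & 0xFF) == 0xBF) {
            i = 3;
        }
        // Assumption: space, tab, CR and LF make up the whitespace set.
        while (i < header.length
                && (header[i] == ' ' || header[i] == '\t' || header[i] == '\r' || header[i] == '\n')) {
            i++;
        }
        if (i == header.length) {
            return null; // nothing but BOM and whitespace
        }
        // The result is shorter than size when the header runs out.
        int end = (int) Math.min((long) header.length, (long) i + size);
        return Arrays.copyOfRange(header, i, end);
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, ' ', '\n', '<', '?', 'x', 'm', 'l'};
        byte[] out = significantBytes(header, 5);
        System.out.println(new String(out, StandardCharsets.US_ASCII)); // <?xml
    }
}
```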
    • getTextFirstIndex

      protected static int getTextFirstIndex(byte[] data, int i, char... extraWhitespaceCharacters)
      Retrieve the index of the first non-whitespace text byte.
      Parameters:
      data - input bytes
      i - start index
      extraWhitespaceCharacters - additional characters considered as whitespace
      Returns:
      the first non-whitespace byte index, or data.length
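
      As a rough picture of the scan described above, the following standalone sketch walks a byte array from a start index and stops at the first byte that is neither built-in whitespace nor one of the extra characters. The firstTextIndex helper is hypothetical, and the built-in whitespace set (space, tab, CR, LF) is an assumption. Returning data.length when no text byte is found follows the documented contract.

```java
import java.nio.charset.StandardCharsets;

final class TextIndexSketch {
    // Hypothetical helper, not the JEB implementation.
    static int firstTextIndex(byte[] data, int start, char... extraWhitespace) {
        for (int i = Math.max(start, 0); i < data.length; i++) {
            if (!isWhitespace(data[i], extraWhitespace)) {
                return i;
            }
        }
        return data.length; // no text byte found
    }

    private static boolean isWhitespace(byte b, char... extra) {
        // Assumption: this is the built-in whitespace set.
        if (b == ' ' || b == '\t' || b == '\r' || b == '\n') {
            return true;
        }
        for (char c : extra) {
            if ((b & 0xFF) == c) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] data = "  \n\t;;{".getBytes(StandardCharsets.US_ASCII);
        System.out.println(firstTextIndex(data, 0));      // 4
        System.out.println(firstTextIndex(data, 0, ';')); // 6
    }
}
```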
    • acceptAnyInputBytes

      public boolean acceptAnyInputBytes()
      The default implementation returns false.
      Specified by:
      acceptAnyInputBytes in interface IUnitIdentifier
      Returns:
      true if the input data may not be readily identifiable as being of a specific format