Class CharacterParser


public class CharacterParser extends SinglePhase
Realizes the fine-grained d2d character-level parsing. Usage: call the constructor, then parse(eu.bandm.tools.d2d2.base.MemScanner<java.lang.String>,eu.bandm.tools.d2d2.model.CharsRegExp,eu.bandm.tools.d2d2.rt.ResultingStructure). The parsing is nondeterministic (implemented as a parallel search) and delivers one of the resulting interpretations that consume the most input characters.
Character based parsing is conceptually AFTER the elimination of comments, and AFTER the translation of closing parenthesis characters into close tags.
It is NOT AFTER the translation of numeric input character entities.
The standard command character "#" in a parser definition is matched to the actual current command character in the input text.
FIXME: comments are skipped BUT NOT removed from the results.
For these translation purposes there are dedicated consumer methods, most of them with the word "_filtered_" in their name: MemScanner.accept_greedy_filtered_chars(CharSet), MemScanner.accept_one_filtered_char(CharSet), MemScanner.accept_blanks_filtered(), MemScanner.accept_string_w_o_lineswitch(String).
EXCEPTION ONE: String constants (given by a parser definition like
... ~ "ABC" ~ ...
) are only accepted outside of comments, i.e. the characters of a String constant parser may not be interrupted by comment insertions.
EXCEPTION TWO: Character sets with a star or plus operator are interpreted in a GREEDY way, not nondeterministically: ('a'..'z')~* ~ ('a'..'z')
will never match anything!
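The effect of EXCEPTION TWO can be illustrated with an analogy from java.util.regex, which is not part of d2d: a possessive quantifier also consumes all matching characters and never gives any back. The class name and helper methods below are hypothetical and serve only as a minimal sketch of the behaviour, not as the d2d implementation.

```java
import java.util.regex.Pattern;

class GreedyCharSetDemo {
    // Possessive quantifier "*+": takes every letter and never backtracks,
    // so the trailing [a-z] finds nothing left to match.
    static final Pattern GREEDY = Pattern.compile("[a-z]*+[a-z]");

    // Ordinary backtracking quantifier: gives one letter back to the trailing [a-z].
    static final Pattern BACKTRACKING = Pattern.compile("[a-z]*[a-z]");

    static boolean greedyMatches(String input) {
        return GREEDY.matcher(input).matches();
    }

    static boolean backtrackingMatches(String input) {
        return BACKTRACKING.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(greedyMatches("abc"));       // false
        System.out.println(backtrackingMatches("abc")); // true
    }
}
```

In a d2d definition the same reasoning suggests separating the repeated character set from whatever follows it, so that the two sets do not overlap.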
  • Method Details

    • parse

      @Opt public ResultingStructure parse(MemScanner<String> scanner, CharsRegExp parser, ResultingStructure result)
      Central executive method. It gets a particular CharsRegExp definition and tries to match the character input (given by the MemScanner) by a non-deterministic, parallel breadth-first execution.
      Maintains a set of CharacterParser.ParseResult objects combining parser state and the ResultingStructure, as constructed so far.
      In case of success, it returns one (randomly chosen) of the matches that consume the most input characters and adjusts "scanner" accordingly. (So "scanner" is both an input and an output argument.)
      In case of failure, it returns null and does not advance the input pointer in "scanner".
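The contract described above (carry a set of hypotheses, pick a longest match, advance the scanner only on success) can be sketched with a small stand-alone model. Everything in it is an assumption made for illustration: the Scanner and Expr types, the str/alt/seq helpers and the boolean return value are hypothetical and do not reflect the actual MemScanner, CharsRegExp or ResultingStructure classes.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

class ParallelMatchSketch {
    /** Hypothetical stand-in for a scanner: the input text plus a mutable position. */
    static final class Scanner {
        final String text;
        int pos = 0;
        Scanner(String text) { this.text = text; }
    }

    /** Hypothetical expression: maps a start position to ALL possible end positions. */
    interface Expr {
        Set<Integer> ends(String text, int start);
    }

    /** Matches one literal string. */
    static Expr str(String s) {
        return (text, start) -> {
            Set<Integer> out = new TreeSet<>();
            if (text.startsWith(s, start)) out.add(start + s.length());
            return out;
        };
    }

    /** Alternative: the union of all hypotheses of all options. */
    static Expr alt(Expr... options) {
        return (text, start) -> {
            Set<Integer> out = new TreeSet<>();
            for (Expr e : options) out.addAll(e.ends(text, start));
            return out;
        };
    }

    /** Sequence: every hypothesis of "first" is continued by "second". */
    static Expr seq(Expr first, Expr second) {
        return (text, start) -> {
            Set<Integer> out = new TreeSet<>();
            for (int mid : first.ends(text, start)) out.addAll(second.ends(text, mid));
            return out;
        };
    }

    /** On success advances the scanner to a longest match; on failure leaves it untouched. */
    static boolean parse(Scanner scanner, Expr expr) {
        Set<Integer> ends = expr.ends(scanner.text, scanner.pos);
        if (ends.isEmpty()) return false;
        scanner.pos = Collections.max(ends);
        return true;
    }

    public static void main(String[] args) {
        Expr expr = seq(alt(str("a"), str("ab")), alt(str("b"), str("bc")));
        Scanner ok = new Scanner("abc");
        System.out.println(parse(ok, expr) + " " + ok.pos);     // true 3
        Scanner bad = new Scanner("xyz");
        System.out.println(parse(bad, expr) + " " + bad.pos);   // false 0
    }
}
```

The sketch tracks only end positions; the real method additionally keeps the result structure built so far for each hypothesis.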
    • copyContentsFromTo

      protected void copyContentsFromTo(CharacterParser.ParseResult from, MemScanner<String> startpos, ResultingStructure to)
      If "from" has structured contents, then add them to "to", as a sequence and per assoc. Otherwise copy the characters from "startpos" up to the current (= accepting) input pointer position in "from".
    • typingError

      protected void typingError(Location<XMLDocumentIdentifier> loc)
    • trace

      protected void trace(String s)
    • trace

      protected void trace(@Opt Location<XMLDocumentIdentifier> loc, String s)
    • action

      public void action(Insertion insertion)
      A special case of insertion, namely the reference to an enumeration, can survive the rewriting process: (@ ref). It means a flattened acceptance of one of the enumeration's string values.
      INFINITE CYCLES of insertions of CharsRegExp may also survive. FIXME: this is NOT YET SUPPORTED.
      Overrides:
      action in class SinglePhase
    • action

      public void action(Perm permutation)
      Overrides:
      action in class SinglePhase
    • action

      public void action(Reference ref)
      Overrides:
      action in class SinglePhase
    • flattened_consumption_of_enum

      protected void flattened_consumption_of_enum(Enumeration enumeration)
      When called from an insertion: treats an enumeration as a mere collection of string constants, i.e. consumes one of them and does NOT create any corresponding element.
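A minimal sketch of this flattened consumption, assuming the enumeration is reduced to a plain list of strings: each value that matches at the current position yields one hypothesis (an end position), and no result element is built. The class and method names are hypothetical and not taken from the d2d code base.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class FlattenedEnumSketch {
    /** Returns the end position of every enumeration value that matches at "start". */
    static Set<Integer> consumeOneOf(List<String> values, String text, int start) {
        Set<Integer> ends = new TreeSet<>();
        for (String value : values) {
            if (text.startsWith(value, start)) {
                ends.add(start + value.length());
            }
        }
        return ends;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("red", "re", "blue");
        System.out.println(consumeOneOf(values, "redish", 0)); // [2, 3]
        System.out.println(consumeOneOf(values, "green", 0));  // []
    }
}
```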
    • action

      public void action(Enumeration enumeration)
      Treats the enumeration as a parser and creates a special resulting structure.
      Overrides:
      action in class SinglePhase
    • subElement

      protected void subElement(Definition def, Expression rule)
      Called when reaching a ParseParticle, a CharsRegExp or an Enumeration, which all lead to wrapping the parsing result into a Result object with this definition as its tag. All cases can lead to character data only; the first two cases can also lead to sub-result-objects (structured contents) instead. All results must be wrapped explicitly.
    • action

      public void action(CharsRegExp def)
      Overrides:
      action in class SinglePhase
    • action

      public void action(ParseParticle pp)
      Overrides:
      action in class SinglePhase
    • action

      public void action(TagsRegExp def)
      Overrides:
      action in class SinglePhase
    • action

      public void action(ImportItem def)
      Overrides:
      action in class SinglePhase
    • acceptCharRep

      protected void acceptCharRep(CharSet cset, boolean isstar, boolean istight)
      Provides special treatment (implemented directly in the scanner), including different semantics (greedy, not nondeterministic!) for character set expressions.
    • acceptRep

      protected void acceptRep(GrUnary expr, boolean isstar)
    • action

      public void action(Star expr)
      Overrides:
      action in class SinglePhase
    • action

      public void action(Plus expr)
      Overrides:
      action in class SinglePhase
    • action

      public void action(Greedy greedy)
      Only the longest matches for each incoming hypothesis are recognized.
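One way to read this rule is as a filter applied per incoming hypothesis: the inner expression may end at several positions, and only the furthest one is kept for each start. The sketch below assumes hypotheses are plain input positions; the names greedy and runOfA are hypothetical, chosen for illustration only.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.IntFunction;

class GreedySketch {
    /** Sample inner expression: all end positions of one or more 'a' starting at "start". */
    static Set<Integer> runOfA(String text, int start) {
        Set<Integer> ends = new TreeSet<>();
        for (int i = start; i < text.length() && text.charAt(i) == 'a'; i++) {
            ends.add(i + 1);
        }
        return ends;
    }

    /** For every incoming start position, keeps only the longest match of "inner". */
    static Set<Integer> greedy(IntFunction<Set<Integer>> inner, Set<Integer> incoming) {
        Set<Integer> out = new TreeSet<>();
        for (int start : incoming) {
            Set<Integer> ends = inner.apply(start);
            if (!ends.isEmpty()) {
                out.add(Collections.max(ends));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "aab";
        Set<Integer> incoming = new TreeSet<>();
        incoming.add(0);
        System.out.println(runOfA(text, 0));                        // [1, 2]
        System.out.println(greedy(s -> runOfA(text, s), incoming)); // [2]
    }
}
```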
      Overrides:
      action in class SinglePhase
    • action

      public void action(Opt expr)
      Overrides:
      action in class SinglePhase
    • action

      public void action(Alt expr)
      Overrides:
      action in class SinglePhase
    • action

      public void action(Seq expr)
      Overrides:
      action in class SinglePhase
    • action

      public void action(CharBinary expr)
      Overrides:
      action in class SinglePhase
    • action

      public void action(CharExpr expr)
      Overrides:
      action in class SinglePhase
    • action

      public void action(StringConst expr)
      Overrides:
      action in class SinglePhase