Package eu.bandm.tools.d2d2.base
Class CharacterParser
java.lang.Object
eu.bandm.tools.d2d2.model.MATCH_ONLY_00
eu.bandm.tools.d2d2.model.SinglePhase
eu.bandm.tools.d2d2.base.CharacterParser
Realizes the fine-grained d2d character-level parsing.
Usage: call constructor, then
parse(eu.bandm.tools.d2d2.base.MemScanner<java.lang.String>,eu.bandm.tools.d2d2.model.CharsRegExp,eu.bandm.tools.d2d2.rt.ResultingStructure).
The parsing is nondeterministic (implemented as a parallel search) and delivers one of the resulting interpretations that consume the most input characters.
Character-based parsing conceptually takes place AFTER the elimination of comments and AFTER the translation of closing parenthesis characters into close tags.
It does NOT take place AFTER the translation of numeric input character entities.
The standard command character "#" in a parser definition is matched to the actual current command character in the input text.
FIXME Comments are skipped BUT NOT removed from the results.
For these translation purposes there are dedicated consumer methods, most of them with the word "_filtered_" in their names:
MemScanner.accept_greedy_filtered_chars(CharSet),
MemScanner.accept_one_filtered_char(CharSet),
MemScanner.accept_blanks_filtered(),
MemScanner.accept_string_w_o_lineswitch(String).
EXCEPTION ONE: String constants, as given by a parser definition like
... ~ "ABC" ~ ...
are only accepted outside of comments, i.e. the characters of a string constant parser may not be interrupted by comment insertions.
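EXCEPTION TWO: Character sets with a star or plus operator are interpreted in a GREEDY way, not non-deterministically:
('a'..'z')~* ~ ('a'..'z')
will never match anything, because the starred set already consumes every available letter and leaves none for the final set!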
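The effect of the greedy interpretation can be illustrated with an analogy from java.util.regex. This is only a sketch: it does not use the d2d API, and the class name GreedyCharSetDemo is hypothetical. A possessive quantifier ("*+") behaves like the greedy character set repetition described above, while an ordinary backtracking quantifier behaves like the non-deterministic reading.

```java
import java.util.regex.Pattern;

// Analogy only: plain java.util.regex, not the d2d parser.
final class GreedyCharSetDemo {
    // Possessive "*+" takes all letters and never gives one back,
    // mirroring the greedy reading of ('a'..'z')~* ~ ('a'..'z').
    static final Pattern GREEDY = Pattern.compile("[a-z]*+[a-z]");
    // Backtracking "*" may give letters back, i.e. the non-deterministic reading.
    static final Pattern BACKTRACKING = Pattern.compile("[a-z]*[a-z]");

    static boolean matchesGreedy(String s) {
        return GREEDY.matcher(s).matches();
    }

    static boolean matchesBacktracking(String s) {
        return BACKTRACKING.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesGreedy("abc"));       // false
        System.out.println(matchesBacktracking("abc")); // true
    }
}
```

In a d2d parser definition, the practical consequence is to avoid following a starred or plussed character set with an expression that needs one of the same characters.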
-
Nested Class Summary
Nested Classes
Modifier and Type, Class, Description
protected static class CharacterParser.ParseResult A simple wrapper for one (of the many parallel) parsing situations; wraps MemScanner as the next input situation and ResultingStructures as collected so far. -
Field Summary
Fields
Modifier and Type, Field, Description
protected final Navigate.CharSetCalc charSetCalc
protected final boolean doTrace
protected Set<CharacterParser.ParseResult> hypotheses Set of alive parsing situations: is input before visiting a grammar sub-expression and output after parsing.
protected final MessageReceiver<SimpleMessage<XMLDocumentIdentifier>> msg
Fields inherited from class eu.bandm.tools.d2d2.model.MATCH_ONLY_00
_visitor_debug_stream, partial -
Constructor Summary
Constructors
Constructor, Description
CharacterParser(Navigate.CharSetCalc charSetCalc, MessageReceiver<SimpleMessage<XMLDocumentIdentifier>> msg, boolean doTrace) Only constructor. -
Method Summary
Modifier and Type, Method, Description
protected void acceptCharRep(CharSet cset, boolean isstar, boolean istight) Provides special treatment (implemented directly in the scanner), including different semantics (greedy, not non-deterministic!) for character set expressions.
protected void acceptRep(…)
void action(…)
void action(CharBinary expr)
void action(…)
void action(CharsRegExp def)
void action(Enumeration enumeration) Treat enumeration as parser and create a special resulting structure.
void action(…) Only the longest matches for each incoming hypothesis are recognized.
void action(ImportItem def)
void action(…) Special case of insertion, namely the reference to an enumeration, can survive the rewriting process: (@ ref).
void action(…)
void action(ParseParticle pp)
void action(…)
void action(…)
void action(…)
void action(…)
void action(…)
void action(StringConst expr)
void action(TagsRegExp def)
protected void copyContentsFromTo(CharacterParser.ParseResult from, MemScanner<String> startpos, ResultingStructure to) If "from" has structured contents, then add this into "to", as a sequence and per assoc.
protected void flattened_consumption_of_enum(Enumeration enumeration) When called from an insertion: treat an enumeration as mere collection of string constants, i.e. consume one of them and do NOT create any corresponding element.
@Opt ResultingStructure parse(MemScanner<String> scanner, CharsRegExp parser, ResultingStructure result) Central executive method.
protected void subElement(Definition def, Expression rule) Called when reaching a ParseParticle, a CharsRegExp or an Enumeration, which all lead to wrapping the parsing result into a Result object with this definition as its tag.
protected void trace(@Opt Location<XMLDocumentIdentifier> loc, String s)
protected void trace(…)
protected void typingError(…)
Methods inherited from class eu.bandm.tools.d2d2.model.SinglePhase
action, action, action, action, action, action, action, action, action, action, action, action, action, action, action, action, action, action, action, action, action, action, follow_definitions, follow_defInstances, follow_docu, follow_firsts, follow_globalSubsts, follow_imports, follow_itemDocu, follow_localdefs, follow_localSubsts, follow_modules, follow_namespaces, follow_obligates, follow_on, follow_rawModules, follow_text, follow_weakfirsts, follow_xattributes
Methods inherited from class eu.bandm.tools.d2d2.model.MATCH_ONLY_00
_visitor_trace, action, action, action, action, action, compile, followAll_definitions, followAll_defInstances, followAll_docu, followAll_firsts, followAll_globalSubsts, followAll_imports, followAll_itemDocu, followAll_localdefs, followAll_localSubsts, followAll_modules, followAll_namespaces, followAll_obligates, followAll_on, followAll_rawModules, followAll_text, followAll_weakfirsts, followAll_xattributes, foreignObject, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, match, nomatch
-
Field Details
-
msg
-
charSetCalc
-
doTrace
protected final boolean doTrace -
hypotheses
Set of live parsing situations: it serves as input before visiting a grammar sub-expression and as output after parsing it.
-
-
Constructor Details
-
Method Details
-
parse
@Opt public @Opt ResultingStructure parse(MemScanner<String> scanner, CharsRegExp parser, ResultingStructure result) Central executive method. It gets a particular CharsRegExp definition and tries to match the character input (given by the MemScanner) by a non-deterministic, parallel breadth-first execution.
Maintains a set of CharacterParser.ParseResult objects, each combining parser state and the ResultingStructure as constructed so far.
In case of success, it returns one (randomly chosen) of the matches that consume the most input characters and adjusts "scanner" accordingly. (So "scanner" is both an input and an output argument.)
In case of failure, it returns null and does not advance the input pointer in "scanner". -
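The parallel search with longest-match selection can be sketched in a few lines of plain Java. This is a simplified illustration, not the actual implementation: the names ParallelLongestMatchDemo, Expr, lit, alt and seq are hypothetical, a hypothesis is reduced to a bare input position (the real ParseResult also carries the ResultingStructure), and failure is signalled by -1 instead of null.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Simplified sketch; all names here are hypothetical, not part of d2d.
final class ParallelLongestMatchDemo {
    /** Grammar node: maps one start position to all end positions it can reach. */
    interface Expr {
        Set<Integer> step(String in, int pos);
    }

    /** Matches the literal string s at the given position. */
    static Expr lit(String s) {
        return (in, pos) -> {
            Set<Integer> r = new TreeSet<>();
            if (in.startsWith(s, pos)) r.add(pos + s.length());
            return r;
        };
    }

    /** Alternative: both branches stay alive as parallel hypotheses. */
    static Expr alt(Expr a, Expr b) {
        return (in, pos) -> {
            Set<Integer> r = new TreeSet<>(a.step(in, pos));
            r.addAll(b.step(in, pos));
            return r;
        };
    }

    /** Sequence: every hypothesis left by a is continued with b. */
    static Expr seq(Expr a, Expr b) {
        return (in, pos) -> {
            Set<Integer> r = new TreeSet<>();
            for (int mid : a.step(in, pos)) r.addAll(b.step(in, mid));
            return r;
        };
    }

    /** Returns the end position of a longest match, or -1 if nothing matches. */
    static int parse(String in, int start, Expr e) {
        Set<Integer> hypotheses = e.step(in, start);
        return hypotheses.isEmpty() ? -1 : Collections.max(hypotheses);
    }

    public static void main(String[] args) {
        Expr g = alt(lit("ab"), seq(lit("ab"), lit("c")));
        System.out.println(parse("abcd", 0, g)); // 3
        System.out.println(parse("xyz", 0, g));  // -1
    }
}
```

Because only positions are tracked here, the longest match is unique; in the real parser several different result structures may consume the same number of characters, which is why one of them has to be chosen.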
copyContentsFromTo
protected void copyContentsFromTo(CharacterParser.ParseResult from, MemScanner<String> startpos, ResultingStructure to) If "from" has structured contents, then add this into "to", as a sequence and per assoc. Otherwise copy the characters from "startpos" up to the current (= accepting) input pointer position in "from". *** -
typingError
-
trace
-
trace
-
action
Special case of insertion, namely the reference to an enumeration, can survive the rewriting process: (@ ref). It means a flattened acceptance of one of the enumeration's string values.
Also INFINITE CYCLES of insertions of CharsRegExp may survive. NOT YET SUPPORTED FIXME.
- Overrides:
action in class SinglePhase
-
action
- Overrides:
action in class SinglePhase
-
action
- Overrides:
action in class SinglePhase
-
flattened_consumption_of_enum
When called from an insertion: treat an enumeration as a mere collection of string constants, i.e. consume one of them and do NOT create any corresponding element. -
action
Treat enumeration as parser and create a special resulting structure.
- Overrides:
action in class SinglePhase
-
subElement
Called when reaching a ParseParticle, a CharsRegExp or an Enumeration, which all lead to wrapping the parsing result into a Result object with this definition as its tag. All cases can lead to character data only; the first two cases can also lead to sub-result-objects (structured contents) instead. All results must be wrapped explicitly. -
action
- Overrides:
action in class SinglePhase
-
action
- Overrides:
action in class SinglePhase
-
action
- Overrides:
action in class SinglePhase
-
action
- Overrides:
action in class SinglePhase
-
acceptCharRep
Provides special treatment (implemented directly in the scanner), including different semantics (greedy, not non-deterministic!) for character set expressions. -
acceptRep
-
action
- Overrides:
action in class SinglePhase
-
action
- Overrides:
action in class SinglePhase
-
action
Only the longest matches for each incoming hypothesis are recognized.
- Overrides:
action in class SinglePhase
-
action
- Overrides:
action in class SinglePhase
-
action
- Overrides:
action in class SinglePhase
-
action
- Overrides:
action in class SinglePhase
-
action
- Overrides:
action in class SinglePhase
-
action
- Overrides:
action in class SinglePhase
-
action
- Overrides:
action in class SinglePhase
-