Package eu.bandm.tools.d2d2.base
Class CharParserPrepare
java.lang.Object
eu.bandm.tools.d2d2.base.CharParserPrepare
Collects all parser particles from character parsers and
joins them to a content model which is DTD compatible.
There are two situations: without and with an explicit element definition.
In both cases the definitions of the parse particles (e.g. "p") must be
collected from the different character parser definitions,
which may be recursive.
All variants must be joined into one disjunction, and this must be
normalized into a content model for later DTD generation.
(For mere write-out = SAX event generation, this is not necessary.)
Case without element definition:
    chars a = ..... [p ..... [q..][q..] [p ..... ] .... ]
    chars b = ..... [p ..... ] ...
In this case NO definition in the module may have the name "p".
Case with element definition:
    chars a = ..... [p ..... [q..][q..] [p ..... ] .... ]
    chars b = ..... [p ..... ] ...
    chars p = #distributed with xmlrep el = "x:y", postprocessor "a.b"
In this case a char parser is defined as "#distributed" and is assigned the collected content model of the parse particles in the same module that have the same name. Additional features can then be specified, such as XML tag and namespace, a Java class as postprocessor, etc.
    chars p = #distributed with xmlrep data ...
The keyword "data" means that the sequential order of the input is not significant and will be re-ordered. The top-level expression will be a permutation "..&.." of the contained elements (coming from character parsers or parse particles in the collected regexps).
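The two ways of joining the collected variants can be illustrated with a small, self-contained sketch. It is not the code of this class: the class name ContentModelJoinSketch, the representation of a variant as a plain list of element names, and the rule that derives "?", "*" and "+" from the minimum and maximum number of occurrences are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; each variant is the sequence of element names
// found in one occurrence of the parse particle.
class ContentModelJoinSketch {

    // Default case: join all variants into one alternation.
    static String disjunction(List<List<String>> variants) {
        List<String> alts = new ArrayList<>();
        for (List<String> v : variants) {
            alts.add("(" + String.join(", ", v) + ")");
        }
        return String.join(" | ", alts);
    }

    // "data" case: order is ignored, only cardinalities are kept.
    // Assumed rule: min/max occurrence counts over all variants
    // select the suffix "", "?", "+" or "*".
    static String permutation(List<List<String>> variants) {
        Map<String, int[]> minMax = new LinkedHashMap<>();
        for (List<String> v : variants) {
            for (String name : v) {
                minMax.putIfAbsent(name, new int[] {Integer.MAX_VALUE, 0});
            }
        }
        for (List<String> v : variants) {
            for (Map.Entry<String, int[]> e : minMax.entrySet()) {
                int count = Collections.frequency(v, e.getKey());
                e.getValue()[0] = Math.min(e.getValue()[0], count);
                e.getValue()[1] = Math.max(e.getValue()[1], count);
            }
        }
        List<String> parts = new ArrayList<>();
        for (Map.Entry<String, int[]> e : minMax.entrySet()) {
            int min = e.getValue()[0];
            int max = e.getValue()[1];
            String suffix = max > 1 ? (min >= 1 ? "+" : "*")
                                    : (min >= 1 ? "" : "?");
            parts.add(e.getKey() + suffix);
        }
        return String.join(" & ", parts);
    }
}
```

For the two variants [q, q, p] and [r], disjunction yields "(q, q, p) | (r)" and permutation yields "q* & p? & r?".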
Usage: create one instance (anew for each instantiated module) and call resolve(eu.bandm.tools.d2d2.model.ResolvedModule).
(Please note that there is a general problem with more than one
instantiation of the same module: it results in more than one
parser definition bound to the same XML tag!)
-
Nested Class Summary
Modifier and Type    Class    Description
static class
static class
(package private) static class
Represents a RegExp as a collection of segments: g = non-recursive termination case, p = prefix before the first recursive call, s = suffix after the last recursive call, i = in between two recursive calls.
-
Field Summary
Modifier and Type    Field    Description
(package private) final Set<CharsRegExp>
(package private) final Multimap<String,ParseParticle>
(package private) static final Alt
(package private) static final Seq
protected final MessageReceiver<SimpleMessage<XMLDocumentIdentifier>>
(package private) static final Location<XMLDocumentIdentifier>
(package private) ResolvedModule
-
Constructor Summary
Constructor    Description
-
Method Summary
Modifier and Type    Method    Description
(package private) static Alt    ALT(Expression... exp)
protected void
(package private) static boolean    isEmptyAlt(Expression exp)
static void
(package private) static Expression    PLUS(Expression exp)
protected Definition    rawDef(Definition def)
protected Module    rawMod(Definition def)
(package private) final String    representingKey(DefInstance defI, Module rawMod)    ATTENTION ParseParticles are unified w.r.t.
void    resolve(ResolvedModule resolvedModule)    MAIN OPERATIVE ENTRY METHOD: Collect all ParseParticles in resolvedModule and link together all those from the same raw module, with the same name.
protected void    resolveOnePpName(String ppName)    ppName is built from one (arbitrarily chosen) module expansion of the raw original module, plus the name of a ParseParticle.
(package private) static Expression    SEQ(Expression... exp)
static void
static void
static Expression
static Expression    toRegExp(String s, Expression exp)
-
Field Details
-
msg
-
NOLOC
-
EPS
-
EMPTY
-
resolvedModule
ResolvedModule resolvedModule
-
allCharsRegExp
-
allParseParticles
-
ppName2rawModule
-
reprKey
-
-
Constructor Details
-
CharParserPrepare
-
-
Method Details
-
error
-
SEQ
-
ALT
-
isEmptyAlt
-
PLUS
-
rawDef
-
rawMod
-
representingKey
ATTENTION ParseParticles are unified w.r.t. the un-instantiated, raw modules. Therefore one instantiated module is chosen arbitrarily, and the "collector" is instantiated/created with that import prefix in its name.
-
resolve
MAIN OPERATIVE ENTRY METHOD:
- Collect all ParseParticles in resolvedModule and link together all those from the same raw module, with the same name.
- Link these to a "Collector", either a fresh CharsRegExp, or one in the same module with the same name, declared as "#distributed".
- Collect the "linear content models" (as an Expression) of all CharParsers and all ParseParticles, thereby reducing "flattened recursion" = "x=..@x.." to repetition.
- Construct the unified expr (for DTD generation and writing-out) either as a conjunction of the distributed expressions (may crash due to violation of 1-unambiguity) or as a permutation expression, reflecting cardinalities only, iff the Collector has been declared as xmlrep data.
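The reduction of "flattened recursion" to repetition in the third step can be pictured with the segments g, p and s named in the nested class summary. The following is only a sketch under strong assumptions: it covers a rule with exactly one recursive call, x = p @x s | g, treats the segments as literal strings, and uses java.util.regex instead of the Expression classes of this package. The class name FlattenRecursionSketch and both method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: the rule  x = p @x s | g  derives p^n g s^n.
// Dropping the nesting gives the repetition  p* g s*, which accepts
// every derivable string (and some more, since the counts of p and s
// are no longer tied together).
class FlattenRecursionSketch {

    // Expands the recursive rule to the given depth.
    static String derive(String p, String g, String s, int depth) {
        return depth == 0 ? g : p + derive(p, g, s, depth - 1) + s;
    }

    // Builds the flattened, purely repetitive form p* g s*.
    static Pattern flatten(String p, String g, String s) {
        return Pattern.compile(
            "(?:" + Pattern.quote(p) + ")*"
            + Pattern.quote(g)
            + "(?:" + Pattern.quote(s) + ")*");
    }
}
```

With p = "a", g = "c", s = "b", the flattened pattern matches "aaacbbb" (a genuine derivation) and also "aacb" (an over-approximation), but not "ab", which lacks the non-recursive termination case.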
-
resolveOnePpName
ppName is built from one (arbitrarily chosen) module expansion of the raw original module, plus the name of a ParseParticle. This call triggers the unification of all ParseParticles with this name in ALL instantiated definitions coming from this raw module, as already stored in allParseParticles.
-
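The grouping that precedes this unification can be sketched as follows. This is an illustration only: the class ParticleGroupingSketch, its Occurrence type and the key format "rawModule#particleName" are hypothetical, and a plain Map of Lists stands in for the Multimap used by this class. In particular, the sketch keys directly on the raw module, whereas the real key is built from one arbitrarily chosen instantiation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: occurrences of parse particles from several
// instantiations of the same raw module end up under one key.
class ParticleGroupingSketch {

    static final class Occurrence {
        final String rawModule;     // name of the un-instantiated module
        final String importPrefix;  // identifies one instantiation
        final String particleName;  // e.g. "p"

        Occurrence(String rawModule, String importPrefix, String particleName) {
            this.rawModule = rawModule;
            this.importPrefix = importPrefix;
            this.particleName = particleName;
        }
    }

    // Assumed key format; the import prefix is deliberately left out.
    static String key(Occurrence o) {
        return o.rawModule + "#" + o.particleName;
    }

    // Groups all occurrences by key, keeping first-seen order.
    static Map<String, List<Occurrence>> group(List<Occurrence> all) {
        Map<String, List<Occurrence>> byKey = new LinkedHashMap<>();
        for (Occurrence o : all) {
            byKey.computeIfAbsent(key(o), k -> new ArrayList<>()).add(o);
        }
        return byKey;
    }
}
```

Two occurrences of "p" reached through different import prefixes of the same raw module land in the same group, while a particle "q" from that module gets a group of its own.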
testParse
-
toRegExp
-
testE
-
testF
-
main
-