Package eu.bandm.tools.dtd
Class EntityClassifier
java.lang.Object
eu.bandm.tools.rdparser.TunableParser<D>
eu.bandm.tools.rdparser.TunableParserForXml<XMLDocumentIdentifier>
eu.bandm.tools.dtd.TunedDTDParser
eu.bandm.tools.dtd.EntityClassifier
Classifies DTD entities with respect to EntityRole by attempting to parse their content against different start symbols, using a TunedDTDParser.
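The idea of classifying by trial parsing can be sketched as follows. This is a minimal illustration, not the code of EntityClassifier: the class TrialParseSketch, its Role enum and the regular expressions are hypothetical stand-ins for the real start symbols of TunedDTDParser, and only a few of the roles are modelled. Each "start symbol" is tried in order, from the most specific to the least specific, and the first one that accepts the whole replacement text determines the role.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical sketch: classify entity replacement text by trial parsing. */
final class TrialParseSketch {

    /** Assumed subset of roles; the names only loosely follow EntityRole. */
    enum Role { NTV, TV, T, V, CRUDE }

    // Simplified lexical shapes (assumptions, much coarser than the XML grammar).
    private static final String NAME = "[A-Za-z_:][A-Za-z0-9._:-]*";
    private static final String TYPE =
        "(?:CDATA|IDREFS?|ID|ENTITY|ENTITIES|NMTOKENS?|\\([^()]*\\))";
    private static final String VALUE =
        "(?:#REQUIRED|#IMPLIED|(?:#FIXED\\s+)?(?:\"[^\"]*\"|'[^']*'))";

    // Insertion order matters: the most specific start symbol is tried first.
    private static final Map<Role, Pattern> START_SYMBOLS = new LinkedHashMap<>();
    static {
        START_SYMBOLS.put(Role.NTV, Pattern.compile(NAME + "\\s+" + TYPE + "\\s+" + VALUE));
        START_SYMBOLS.put(Role.TV, Pattern.compile(TYPE + "\\s+" + VALUE));
        START_SYMBOLS.put(Role.T, Pattern.compile(TYPE));
        START_SYMBOLS.put(Role.V, Pattern.compile(VALUE));
    }

    /** Returns the role of the first start symbol that accepts the whole text. */
    static Role classify(String replacementText) {
        String text = replacementText.trim();
        for (Map.Entry<Role, Pattern> candidate : START_SYMBOLS.entrySet()) {
            if (candidate.getValue().matcher(text).matches()) {
                return candidate.getKey();
            }
        }
        return Role.CRUDE; // none of the start symbols matched
    }
}
```

In this sketch a failed match simply moves on to the next candidate; a recursive-descent parser would typically signal the failed attempt differently, for example by an exception, before the next start symbol is tried.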
We proposed an extended Dtd.AttDef structure where the
usage of classified entities could be encoded as follows:
ent()   means full contents of entity
ent(0)  means contents at leaf nr. 0 of parse tree of entity
ent,0   means 2-tuple of reference to entity and this index number

                      +name
                      |       +nameabbrev
                      |       |       +type
                      |       |       |       +typeabbrev
                      |       |       |       |       +value
                      |       |       |       |       |       +valueabbrev
                      |       |       |       |       |       |
%ent; (a|b) #IMPLIED  ent()   ent     (a|b)   null    #IMP    null
ab %ent; #IMPLIED     ab      null    ent()   ent     #IMP    null
ab %ent;              ab      null    ent(0)  ent,0   ent(1)  ent,1
one attdef in ent:
%ent;                 ent(0)  ent,0   ent(1)  ent,1   ent(2)  ent,2
more attdefs in ent:
%ent;                 null    ent,-1  null    null    null    null
                              V V
FIXME
  insert "?" in DTD.umod : AttDef FORMAT "$tabular{0>name,20>type?,40>value?}"
  insert case "only nameabbrev" in attdef
  ?? print "nameabbrev{asAtts}
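The two notations used in the table, ent(0) for the contents and ent,0 for the abbreviation, can be illustrated with a small value class. This is only a sketch of the proposed encoding: AttDefEncodingSketch and EntityUse are hypothetical names and are not part of Dtd.AttDef. It assumes that a missing index stands for the whole entity and that the index -1 marks an entity holding more than one attdef, as in the last table row.

```java
/** Hypothetical sketch of the entity-usage encoding from the table above. */
final class AttDefEncodingSketch {

    /** A reference to an entity, optionally narrowed to one leaf of its parse tree. */
    static final class EntityUse {
        final String entity;
        final Integer leaf; // null = whole entity; -1 = entity holds several attdefs

        EntityUse(String entity, Integer leaf) {
            this.entity = entity;
            this.leaf = leaf;
        }

        /** Contents notation: "ent()" for the full contents, "ent(0)" for leaf 0. */
        String asContent() {
            return leaf == null ? entity + "()" : entity + "(" + leaf + ")";
        }

        /** Abbreviation notation: "ent" for the whole entity, "ent,0" for leaf 0. */
        String asAbbrev() {
            return leaf == null ? entity : entity + "," + leaf;
        }
    }
}
```

Under these assumptions the row "ab %ent;" would carry new EntityUse("ent", 0) for the type columns and new EntityUse("ent", 1) for the value columns, while the "more attdefs" row would carry only new EntityUse("ent", -1) in the nameabbrev column.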
-
Nested Class Summary
Modifier and Type    Class    Description
class                         Classify entity replacement text w.r.t. usability in content models.
Nested classes/interfaces inherited from class eu.bandm.tools.dtd.TunedDTDParser
TunedDTDParser.ParsingFailed
Nested classes/interfaces inherited from class eu.bandm.tools.rdparser.TunableParser
TunableParser.CharSet, TunableParser.ExtensionalCharSet
-
Field Summary
Fields inherited from class eu.bandm.tools.dtd.TunedDTDParser
currentElementName, entityUsage, errorOnExpand, generalEntities, ignoreErrors, LAZY_ENTITY_ERROR, nicePE, parameterEntities, UNPARSED_CONTENTS
Fields inherited from class eu.bandm.tools.rdparser.TunableParserForXml
asciiLetterSet, decDigitSet, encNameSet, hexDigitSet, initialSet, nameSet, prefix_GE, prefix_PE, pubidCharSet, sNoPESet, sSet, stringconstant_IGNORE, stringconstant_INCLUDE, versionNumSet
Fields inherited from class eu.bandm.tools.rdparser.TunableParser
base, in, messageReceiver, topleveldocumentid
-
Constructor Summary
-
Method Summary
Modifier and Type    Method      Description
                     classify    "classify()" only called for entities in an AttList context.
Methods inherited from class eu.bandm.tools.dtd.TunedDTDParser
attDef, attlistDecl, attType, attValue, AUX_convert, children, comment, conditionalSection, constructReplacementText, content, cp, declareGeneralEntity, declareParameterEntity, defaultDecl, dtd, elementDecl, encName, encodingDecl, entityDecl, entityValue, enumerated, eq, eRef, externalId, extSubset, fatalError, ignore, markupDecl, mixed, modifierOpt, name, niceEntityValue, nmtoken, notationDecl, parse, parse, parseId, parselocal, peRef, pi, pubidLiteral, resolve, retrieveGeneralEntity, retrieveParameterEntity, retrieveReplacementText, s, sNoPE, sOpt, sOptNoPE, storeEntityUsage, sWsOpt, systemLiteral, versionInfo, versionNum, xmlDecl
Methods inherited from class eu.bandm.tools.rdparser.TunableParserForXml
lookahead_pe, prefixedEntityName
Methods inherited from class eu.bandm.tools.rdparser.TunableParser
consume, eof, error, failure, getMessageReceiver, lookahead, lookahead, lookahead, lookahead, lookahead, lookahead_eof, match, match, match, match, matchahead, matchahead, matchahead, matchUpto, matchUpto, readExternal, setBase, setMessageReceiver, skipUpto, warning, word
-
Constructor Details
-
EntityClassifier
public EntityClassifier()
-
-
Method Details
-
classify
"classify()" is only called for entities in an AttList context. Entities in a content-definition context are handled in TunedDTDParser by "nicePE()", as in the original bt code.
The roles are coded in EntityRole:
N_V_ContentModel,         = ident (attribute N-ame or V-alue or tag in content model)
N_V_ContentModel_IncIgn,  = as above, or "IGNORE" "INCLUDE"
NT,                       = attribute name AND type
NTV,                      = attribute name AND type AND value
NTVs,                     = idem, more than one of them
T,                        = type of an attribute
T_or_ContentModel,        = part of a disjunction or attribute enumeration type
TV,                       = attribute type and value
V,                        = attribute value (with "#FIXED" or "#IMPLIED" or with quotes)
ContentModel,             = must be content model
IncIgn,                   = USED at an inc/ign place
CrUdE                     = none of the above
-