Package eu.bandm.tools.dtd
Class EntityClassifier
java.lang.Object
eu.bandm.tools.rdparser.TunableParser<XMLDocumentIdentifier>
eu.bandm.tools.rdparser.TunableParserForXml<XMLDocumentIdentifier>
eu.bandm.tools.dtd.TunedDTDParser
eu.bandm.tools.dtd.EntityClassifier
Classifies DTD entities with respect to EntityRole by attempting to parse their content against different start symbols, using a TunedDTDParser.
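The idea of classifying by trial parsing can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: the class `TrialParseSketch`, the enum `SketchRole` (a subset of the role names listed under `classify`), and the regular expressions, which stand in for the start symbols of the real recursive-descent parser. It does not use or reproduce the actual TunedDTDParser API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class TrialParseSketch {
    /** Hypothetical subset of role names; the real ones are defined in EntityRole. */
    enum SketchRole { NTV, NT, TV, T, V, CrUdE }

    // Deliberately simplified stand-ins for the grammar's start symbols.
    static final String NAME = "[A-Za-z_:][A-Za-z0-9._:-]*";
    static final String TYPE =
        "(?:CDATA|IDREF|ID|NMTOKEN|\\(\\s*" + NAME + "(?:\\s*\\|\\s*" + NAME + ")*\\s*\\))";
    static final String VALUE =
        "(?:#REQUIRED|#IMPLIED|(?:#FIXED\\s+)?\"[^\"]*\")";

    // Insertion order matters: the most specific "start symbol" is tried first.
    static final Map<SketchRole, Pattern> START_SYMBOLS = new LinkedHashMap<>();
    static {
        START_SYMBOLS.put(SketchRole.NTV, Pattern.compile(NAME + "\\s+" + TYPE + "\\s+" + VALUE));
        START_SYMBOLS.put(SketchRole.NT, Pattern.compile(NAME + "\\s+" + TYPE));
        START_SYMBOLS.put(SketchRole.TV, Pattern.compile(TYPE + "\\s+" + VALUE));
        START_SYMBOLS.put(SketchRole.T, Pattern.compile(TYPE));
        START_SYMBOLS.put(SketchRole.V, Pattern.compile(VALUE));
    }

    /** Returns the first role whose pattern accepts the whole replacement text. */
    static SketchRole classify(String replacementText) {
        String text = replacementText.trim();
        for (Map.Entry<SketchRole, Pattern> e : START_SYMBOLS.entrySet()) {
            if (e.getValue().matcher(text).matches()) {
                return e.getKey();
            }
        }
        return SketchRole.CrUdE; // none of the trial parses succeeded
    }

    public static void main(String[] args) {
        System.out.println(classify("id ID #REQUIRED"));  // NTV
        System.out.println(classify("CDATA #IMPLIED"));   // TV
        System.out.println(classify("(a|b)"));            // T
        System.out.println(classify("<!ELEMENT x ANY>")); // CrUdE
    }
}
```

Trying the candidates in a fixed order and falling back to a catch-all role mirrors the description above; a real parser would of course report where and why each attempt failed instead of a plain boolean match.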
We proposed an extended Dtd.AttDef structure where the
usage of classified entities could be encoded as follows:
Notation:
  ent()    means full contents of entity
  ent(0)   means contents at leaf nr. 0 of parse tree of entity
  ent,0    means 2-tuple of reference to entity and this index number

Source in DTD            name     nameabbrev  type     typeabbrev  value    valueabbrev
%ent; (a|b) #IMPLIED     ent()    ent         (a|b)    null        #IMP     null
ab %ent; #IMPLIED        ab       null        ent()    ent         #IMP     null
ab %ent;                 ab       null        ent(0)   ent,0       ent(1)   ent,1
one attdef in ent:
%ent;                    ent(0)   ent,0       ent(1)   ent,1       ent(2)   ent,2
more attdefs in ent:
%ent;                    null     ent,-1      null     null        null     null
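To make the proposed encoding more concrete, here is a minimal sketch of how such an extended attribute definition might be modelled. The classes `EntityRef` and `ExtAttDef` are hypothetical and are not part of eu.bandm.tools.dtd; the sketch assumes that each of the three slots (name, type, value) carries an optional abbrev that is either null, a whole-entity reference (`ent`), or an entity/leaf pair (`ent,0`). The strings `ent(0)` and `ent(1)` are placeholders for the expanded leaf contents, as in the notation above.

```java
public class ExtendedAttDefSketch {
    /** Hypothetical reference to an entity, optionally narrowed to one parse-tree leaf. */
    static final class EntityRef {
        final String entity;
        final Integer leaf; // null = whole entity ("ent"); otherwise a leaf index ("ent,0")

        EntityRef(String entity, Integer leaf) {
            this.entity = entity;
            this.leaf = leaf;
        }

        @Override public String toString() {
            return leaf == null ? entity : entity + "," + leaf;
        }
    }

    /** Hypothetical extended attribute definition: three slots, each with an abbrev. */
    static final class ExtAttDef {
        final String name, type, value;
        final EntityRef nameAbbrev, typeAbbrev, valueAbbrev;

        ExtAttDef(String name, EntityRef nameAbbrev,
                  String type, EntityRef typeAbbrev,
                  String value, EntityRef valueAbbrev) {
            this.name = name;   this.nameAbbrev = nameAbbrev;
            this.type = type;   this.typeAbbrev = typeAbbrev;
            this.value = value; this.valueAbbrev = valueAbbrev;
        }

        /** Prints the six slots in the same column order as the notation above. */
        @Override public String toString() {
            return name + " " + nameAbbrev + " " + type + " " + typeAbbrev
                 + " " + value + " " + valueAbbrev;
        }
    }

    /** Case "ab %ent;": the entity supplies the type (leaf 0) and the value (leaf 1). */
    static ExtAttDef typeAndValueFromEntity(String attName, String ent) {
        return new ExtAttDef(attName, null,
                             ent + "(0)", new EntityRef(ent, 0),
                             ent + "(1)", new EntityRef(ent, 1));
    }

    /** Case "more attdefs in ent": only the name abbrev is set, with index -1. */
    static ExtAttDef severalAttDefsFromEntity(String ent) {
        return new ExtAttDef(null, new EntityRef(ent, -1), null, null, null, null);
    }

    public static void main(String[] args) {
        System.out.println(typeAndValueFromEntity("ab", "ent"));
        System.out.println(severalAttDefsFromEntity("ent"));
    }
}
```

Keeping the abbrev next to each expanded slot would let a printer choose between emitting the expanded text and re-emitting the original `%ent;` reference; whether the real Dtd.AttDef structure is organised this way is not stated here.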
FIXME:
  insert "?" in DTD.umod : AttDef FORMAT "$tabular{0>name,20>type?,40>value?}"
  insert case "only nameabbrev" in attdef ?? print "nameabbrev{asAtts}
Nested Class Summary
Modifier and Type: class
Description: Classify entity replacement text w.r.t. usability in content models.

Nested classes/interfaces inherited from class eu.bandm.tools.dtd.TunedDTDParser
TunedDTDParser.ParsingFailed
Nested classes/interfaces inherited from class eu.bandm.tools.rdparser.TunableParser
TunableParser.CharSet, TunableParser.ExtensionalCharSet
Field Summary
Fields inherited from class eu.bandm.tools.dtd.TunedDTDParser
currentElementName, entityUsage, errorOnExpand, generalEntities, ignoreErrors, LAZY_ENTITY_ERROR, nicePE, parameterEntities, parsingFailed, UNPARSED_CONTENTS
Fields inherited from class eu.bandm.tools.rdparser.TunableParserForXml
asciiLetterSet, decDigitSet, encNameSet, hexDigitSet, initialSet, nameSet, PREFIX_GE, PREFIX_PE, pubidCharSet, sNoPESet, sSet, STRINGCONSTANT_IGNORE, STRINGCONSTANT_INCLUDE, versionNumSet
Fields inherited from class eu.bandm.tools.rdparser.TunableParser
base, in, messageReceiver, topleveldocumentid
Constructor Summary
Method Summary
Method: classify
Description: "classify()" is only called for entities in an AttList context.

Methods inherited from class eu.bandm.tools.dtd.TunedDTDParser
attDef, attlistDecl, attType, attValue, AUX_convert, children, comment, conditionalSection, constructReplacementText, content, cp, declareGeneralEntity, declareParameterEntity, defaultDecl, dtd, elementDecl, encName, encodingDecl, entityDecl, entityValue, enumerated, eq, eRef, externalId, extSubset, fatalError, ignore, markupDecl, mixed, modifierOpt, name, niceEntityValue, nmtoken, notationDecl, parse, parse, parseId, parselocal, peRef, pi, pubidLiteral, resolve, retrieveGeneralEntity, retrieveParameterEntity, retrieveReplacementText, s, sNoPE, sOpt, sOptNoPE, storeEntityUsage, sWsOpt, systemLiteral, versionInfo, versionNum, xmlDecl
Methods inherited from class eu.bandm.tools.rdparser.TunableParserForXml
lookaheadPe, prefixedEntityName
Methods inherited from class eu.bandm.tools.rdparser.TunableParser
consume, eof, error, failure, getMessageReceiver, lookahead, lookahead, lookahead, lookahead, lookahead, lookaheadEOF, match, match, match, match, matchahead, matchahead, matchahead, matchUpto, matchUpto, readExternal, setBase, setMessageReceiver, skipUpto, warning, word
Constructor Details
EntityClassifier
public EntityClassifier()
Method Details
classify
"classify()" is only called for entities in an AttList context. Entities in a content-definition context are handled in TunedDTDParser by "nicePE()", as in the original bt code.
The roles are coded in EntityRole:
  N_V_ContentModel         = ident (attribute N-ame or V-alue, or tag in content model)
  N_V_ContentModel_IncIgn  = as above, or "IGNORE" / "INCLUDE"
  NT                       = attribute name AND type
  NTV                      = attribute name AND type AND value
  NTVs                     = idem, more than one of them
  T                        = type of an attribute
  T_or_ContentModel        = part of a disjunction or attribute enumeration type
  TV                       = attribute type and value
  V                        = attribute value (with "#FIXED" or "#IMPLIED" or with quotes)
  ContentModel             = must be content model
  IncIgn                   = used at an inc/ign place
  CrUdE                    = none of the above