public class EntityClassifier extends TunedDTDParser
Classifies entities into an EntityRole by attempting to parse their content against different start symbols using a TunedDTDParser.
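The idea of classifying by trial parsing can be illustrated with a small, self-contained sketch. It is not the EntityClassifier implementation: the names `SketchRole` and `TrialClassifierSketch` are hypothetical, the role constants are invented for illustration (the actual EntityRole values are not listed on this page), and each start symbol is approximated by a regular expression instead of a TunedDTDParser rule.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical role names; the actual EntityRole constants are not shown here.
enum SketchRole { ATT_TYPE, DEFAULT_DECL, ATT_DEFS, UNKNOWN }

final class TrialClassifierSketch {
    // Simplified stand-ins for the grammar's start symbols (assumption:
    // ASCII names only, enumerations restricted to names).
    private static final String NAME = "[A-Za-z_:][A-Za-z0-9._:-]*";
    private static final String ENUM =
        "\\(\\s*" + NAME + "(\\s*\\|\\s*" + NAME + ")*\\s*\\)";
    private static final String TYPE =
        "(CDATA|IDREFS|IDREF|ID|ENTITY|ENTITIES|NMTOKENS|NMTOKEN|" + ENUM + ")";
    private static final String DEFAULT =
        "(#REQUIRED|#IMPLIED|(#FIXED\\s+)?(\"[^\"<&]*\"|'[^'<&]*'))";
    private static final String ATT_DEF = NAME + "\\s+" + TYPE + "\\s+" + DEFAULT;

    // Start symbols are tried in insertion order; the first success wins.
    private static final Map<SketchRole, Pattern> ATTEMPTS = new LinkedHashMap<>();
    static {
        ATTEMPTS.put(SketchRole.ATT_TYPE, Pattern.compile(TYPE));
        ATTEMPTS.put(SketchRole.DEFAULT_DECL, Pattern.compile(DEFAULT));
        ATTEMPTS.put(SketchRole.ATT_DEFS,
            Pattern.compile(ATT_DEF + "(\\s+" + ATT_DEF + ")*"));
    }

    // Try to "parse" the whole replacement text against each start symbol.
    static SketchRole classify(String text) {
        String trimmed = text.trim();
        for (Map.Entry<SketchRole, Pattern> attempt : ATTEMPTS.entrySet()) {
            if (attempt.getValue().matcher(trimmed).matches()) {
                return attempt.getKey();
            }
        }
        return SketchRole.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(classify("(a|b)"));           // ATT_TYPE
        System.out.println(classify("#IMPLIED"));        // DEFAULT_DECL
        System.out.println(classify("id ID #REQUIRED")); // ATT_DEFS
    }
}
```

The order of attempts matters whenever a text is accepted by more than one start symbol. The sketch simply takes the first match, which is an assumption and may differ from how the real classifier resolves such cases.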
We proposed an extended Dtd.AttDef structure where the
usage of classified entities could be encoded as follows:
Notation | Meaning |
---|---|
ent() | full contents of entity ent |
ent(0) | contents at leaf no. 0 of the parse tree of entity ent |
ent,0 | 2-tuple of a reference to the entity and this index number |

Attribute definition text | name | nameabbrev | type | typeabbrev | value | valueabbrev |
---|---|---|---|---|---|---|
%ent; (a\|b) #IMPLIED | ent() | ent | (a\|b) | null | #IMP | null |
ab %ent; #IMPLIED | ab | null | ent() | ent | #IMP | null |
ab %ent; | ab | null | ent(0) | ent,0 | ent(1) | ent,1 |
%ent; (one attdef in ent) | ent(0) | ent,0 | ent(1) | ent,1 | ent(2) | ent,2 |
%ent; (more attdefs in ent) | null | ent,-1 | null | null | null | null |
FIXME: insert "?" in DTD.umod: AttDef FORMAT "$tabular{0>name,20>type?,40>value?}"

FIXME: insert case "only nameabbrev" in attdef ?? print "nameabbrev{asAtts}
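As a rough illustration of the notation used above, the following sketch models one field/abbreviation pair of the proposed structure. The class `EntityUseSketch` and its factory methods are hypothetical and not part of Dtd.AttDef. The sketch assumes that a whole-entity reference is abbreviated by the bare entity name and that index -1 marks an entity containing several attribute definitions, as the rows above suggest.

```java
// Hypothetical model of one "field / fieldabbrev" pair from the proposal.
final class EntityUseSketch {
    private static final int WHOLE = Integer.MIN_VALUE; // sentinel: full contents
    private final String entity;
    private final int leaf;

    private EntityUseSketch(String entity, int leaf) {
        this.entity = entity;
        this.leaf = leaf;
    }

    // The field is filled by the full contents of the entity: "ent()" / "ent".
    static EntityUseSketch whole(String entity) {
        return new EntityUseSketch(entity, WHOLE);
    }

    // The field is filled by leaf number index of the entity's parse tree.
    static EntityUseSketch leaf(String entity, int index) {
        if (index < 0) {
            throw new IllegalArgumentException("leaf index must be >= 0");
        }
        return new EntityUseSketch(entity, index);
    }

    // The entity holds several attdefs: no expanded form, abbreviation "ent,-1".
    static EntityUseSketch severalAttDefs(String entity) {
        return new EntityUseSketch(entity, -1);
    }

    // Value of the plain column (name, type or value); null if not applicable.
    String expanded() {
        if (leaf == WHOLE) return entity + "()";
        if (leaf < 0) return null;
        return entity + "(" + leaf + ")";
    }

    // Value of the abbreviation column (nameabbrev, typeabbrev or valueabbrev).
    String abbrev() {
        return leaf == WHOLE ? entity : entity + "," + leaf;
    }

    public static void main(String[] args) {
        EntityUseSketch type = leaf("ent", 0);
        System.out.println(type.expanded() + " / " + type.abbrev()); // ent(0) / ent,0
    }
}
```

Keeping the entity reference and the leaf index together lets a printer choose between the expanded text and the original entity reference per column, which appears to be the purpose of the abbreviation columns.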
Modifier and Type | Class and Description |
---|---|
class | EntityClassifier.CPEntityType: Classifies entity replacement text w.r.t. usability in content models. |
TunedDTDParser.ParsingFailed
TunableParser.CharSet, TunableParser.ExtensionalCharSet
currentElementName, entityUsage, errorOnExpand, generalEntities, ignoreErrors, LAZY_ENTITY_ERROR, nicePE, parameterEntities, UNPARSED_CONTENTS
asciiLetterSet, decDigitSet, encNameSet, hexDigitSet, initialSet, nameSet, prefix_GE, prefix_PE, pubidCharSet, sNoPESet, sSet, stringconstant_IGNORE, stringconstant_INCLUDE, versionNumSet
base, in, messageGenerator, topleveldocumentid
Constructor and Description |
---|
EntityClassifier() |
Modifier and Type | Method and Description |
---|---|
EntityRole | classify(String text): classify() is only called for entities in an AttList context. |
attDef, attlistDecl, attType, attValue, AUX_convert, children, comment, conditionalSection, constructReplacementText, content, cp, declareGeneralEntity, declareParameterEntity, defaultDecl, dtd, elementDecl, encName, encodingDecl, entityDecl, entityValue, enumerated, eq, eRef, externalId, extSubset, fatalError, ignore, markupDecl, mixed, modifierOpt, name, niceEntityValue, nmtoken, notationDecl, parse, parse, parseId, parselocal, peRef, pi, pubidLiteral, resolve, retrieveGeneralEntity, retrieveParameterEntity, retrieveReplacementText, s, sNoPE, sOpt, sOptNoPE, storeEntityUsage, sWsOpt, systemLiteral, textDecl, versionInfo, versionNum
lookahead_pe, prefixedEntityName
consume, eof, error, failure, getMessageReceiver, lookahead_eof, lookahead, lookahead, lookahead, lookahead, lookahead, match, match, match, match, matchahead, matchahead, matchahead, matchUpto, matchUpto, readExternal, setBase, setMessageReceiver, skipUpto, warning, word
public EntityRole classify(String text)
See also the complete user documentation.