public class EntityClassifier extends TunedDTDParser
Classifies entities by EntityRole, by attempting to parse their content
against different start symbols using a TunedDTDParser.
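The trial-parsing idea can be sketched as follows. This is a minimal, self-contained illustration and not the code of this class: the `Role` values are hypothetical (the values of the real `EntityRole` are not listed on this page), and simplified regular expressions stand in for the grammar start symbols that a TunedDTDParser would be started on. The text is tried against each start symbol in turn, and the first one that accepts the whole text determines the role.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Toy illustration of classification by trial parsing; not the library's code. */
final class TrialParseSketch {

    /** Hypothetical roles; the real EntityRole values are an assumption here. */
    enum Role { ATT_DEF, DEFAULT_DECL, ATT_TYPE, NAME, UNKNOWN }

    // Simplified regular expressions standing in for grammar start symbols.
    private static final String NAME = "[A-Za-z_:][A-Za-z0-9._:-]*";
    private static final String NMTOKEN = "[A-Za-z0-9._:-]+";
    private static final String ENUMERATION =
        "\\(\\s*" + NMTOKEN + "(?:\\s*\\|\\s*" + NMTOKEN + ")*\\s*\\)";
    private static final String ATT_TYPE =
        "(?:CDATA|ID|IDREF|IDREFS|ENTITY|ENTITIES|NMTOKEN|NMTOKENS|" + ENUMERATION + ")";
    private static final String DEFAULT_DECL =
        "(?:#REQUIRED|#IMPLIED|(?:#FIXED\\s+)?(?:\"[^<&\"]*\"|'[^<&']*'))";

    // Insertion order is the trial order: the most specific start symbol comes first.
    private static final Map<Role, Pattern> START_SYMBOLS = new LinkedHashMap<>();
    static {
        START_SYMBOLS.put(Role.ATT_DEF,
            Pattern.compile(NAME + "\\s+" + ATT_TYPE + "\\s+" + DEFAULT_DECL));
        START_SYMBOLS.put(Role.DEFAULT_DECL, Pattern.compile(DEFAULT_DECL));
        START_SYMBOLS.put(Role.ATT_TYPE, Pattern.compile(ATT_TYPE));
        START_SYMBOLS.put(Role.NAME, Pattern.compile(NAME));
    }

    /** Returns the role of the first start symbol that accepts the whole text. */
    static Role classify(String text) {
        String trimmed = text.trim();
        for (Map.Entry<Role, Pattern> candidate : START_SYMBOLS.entrySet()) {
            if (candidate.getValue().matcher(trimmed).matches()) {
                return candidate.getKey();
            }
        }
        return Role.UNKNOWN;
    }

    public static void main(String[] args) {
        for (String text : new String[] {"ab CDATA #IMPLIED", "(a|b)", "#IMPLIED", "ab"}) {
            System.out.println(text + " -> " + classify(text));
        }
    }
}
```

The trial order matters in this sketch because a keyword such as `CDATA` is also a well-formed name; trying the attribute-type symbol before the name symbol resolves that ambiguity.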
We proposed an extended Dtd.AttDef structure where the
usage of classified entities could be encoded as follows:
ent() means full contents of entity
ent(0) means contents at leaf nr. 0 of parse tree of entity
ent,0 means 2-tuple of reference to entity and this index number
| Declaration | name | nameabbrev | type | typeabbrev | value | valueabbrev |
|---|---|---|---|---|---|---|
| %ent; (a\|b) #IMPLIED | ent() | ent | (a\|b) | null | #IMP | null |
| ab %ent; #IMPLIED | ab | null | ent() | ent | #IMP | null |
| ab %ent; | ab | null | ent(0) | ent,0 | ent(1) | ent,1 |
| %ent; (one attdef in ent) | ent(0) | ent,0 | ent(1) | ent,1 | ent(2) | ent,2 |
| %ent; (more attdefs in ent) | null | ent,-1 | null | null | null | null |
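To make the notation above concrete, here is a small sketch of how one such reference could be modelled. The class `EntityUse` and its method names are hypothetical and are not part of Dtd.AttDef; the sketch also assumes that the leaves of an entity's parse tree are available as a list of strings, and that joining them with single spaces approximates the full contents.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of one entity reference inside an extended AttDef field. */
final class EntityUse {
    private final String entity;  // entity name, e.g. "ent"
    private final Integer leaf;   // leaf index, or null for the full contents

    EntityUse(String entity, Integer leaf) {
        this.entity = entity;
        this.leaf = leaf;
    }

    /** Notation of the expanded field: "ent()" or "ent(0)". */
    String expanded() {
        return leaf == null ? entity + "()" : entity + "(" + leaf + ")";
    }

    /** Notation of the abbreviation field: "ent" or the 2-tuple "ent,0". */
    String abbrev() {
        return leaf == null ? entity : entity + "," + leaf;
    }

    /** Looks up the referenced text, given the leaves of the entity's parse tree. */
    String resolve(List<String> leaves) {
        return leaf == null ? String.join(" ", leaves) : leaves.get(leaf);
    }

    public static void main(String[] args) {
        // Assumed replacement text of %ent; for the row "ab %ent;": type, then value.
        List<String> leaves = Arrays.asList("CDATA", "#IMPLIED");
        EntityUse type = new EntityUse("ent", 0);
        EntityUse value = new EntityUse("ent", 1);
        System.out.println(type.expanded() + " = " + type.resolve(leaves));
        System.out.println(value.abbrev() + " = " + value.resolve(leaves));
    }
}
```

A nullable index is used for "full contents" rather than -1, because the table already uses `ent,-1` for the case of several attdefs in one entity.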
FIXME insert "?" in DTD.umod : AttDef
FORMAT "$tabular{0>name,20>type?,40>value?}"
insert case "only nameabbrev" in attdef ??
print "nameabbrev{asAtts}
| Modifier and Type | Class and Description |
|---|---|
| class | EntityClassifier.CPEntityType: Classify entity replacement text w.r.t. usability in content models. |
Inherited nested classes:

TunedDTDParser.ParsingFailed

TunableParser.CharSet, TunableParser.ExtensionalCharSet

Inherited fields:

currentElementName, entityUsage, errorOnExpand, generalEntities, ignoreErrors, LAZY_ENTITY_ERROR, nicePE, parameterEntities, UNPARSED_CONTENTS

asciiLetterSet, decDigitSet, encNameSet, hexDigitSet, initialSet, nameSet, prefix_GE, prefix_PE, pubidCharSet, sNoPESet, sSet, stringconstant_IGNORE, stringconstant_INCLUDE, versionNumSet

base, in, messageGenerator, topleveldocumentid

| Constructor and Description |
|---|
| EntityClassifier() |
| Modifier and Type | Method and Description |
|---|---|
| EntityRole | classify(String text): "classify()" only called for entities in an AttList context. |
Inherited methods:

attDef, attlistDecl, attType, attValue, AUX_convert, children, comment, conditionalSection, constructReplacementText, content, cp, declareGeneralEntity, declareParameterEntity, defaultDecl, dtd, elementDecl, encName, encodingDecl, entityDecl, entityValue, enumerated, eq, eRef, externalId, extSubset, fatalError, ignore, markupDecl, mixed, modifierOpt, name, niceEntityValue, nmtoken, notationDecl, parse, parse, parseId, parselocal, peRef, pi, pubidLiteral, resolve, retrieveGeneralEntity, retrieveParameterEntity, retrieveReplacementText, s, sNoPE, sOpt, sOptNoPE, storeEntityUsage, sWsOpt, systemLiteral, textDecl, versionInfo, versionNum

lookahead_pe, prefixedEntityName

consume, eof, error, failure, getMessageReceiver, lookahead_eof, lookahead, lookahead, lookahead, lookahead, lookahead, match, match, match, match, matchahead, matchahead, matchahead, matchUpto, matchUpto, readExternal, setBase, setMessageReceiver, skipUpto, warning, word

public EntityRole classify(String text)
See also the complete user documentation.