Class TunedDTDParser

Direct Known Subclasses:
EntityClassifier, HtmlRenderer

@CyclicDependency public class TunedDTDParser extends TunableParserForXml<XMLDocumentIdentifier>
Parser implementation, optimized for handling the idiosyncrasies of the W3C XML DTD specification.
Cf. the W3C XML specification.
20121103 ML FIXME: In an INTERNAL subset, "s()", "sOpt()" and "cp()" must be defined differently, because parameter entity references are only allowed BETWEEN MarkupDecl.
("WFC: PEs in Internal Subset") Possible approach: a variant of s() and a global flag.
20121103 ML FIXME: In an INTERNAL subset, conditional sections are not allowed.
Attention: The base classes follow this convention w.r.t. errors: they use "MessageGenerator", provide "error()", "warning()" etc., and require "fatalError()" to be provided here!
  • Field Details

    • errorOnExpand

      protected boolean errorOnExpand
      Set by the corresponding parameter. Iff true, an unreachable (or non-existing) external parsed (parameter or general) entity does not generate an error message until it is actually used/expanded (which may never happen!).
    • parameterEntities

      protected HashMap<String,DTD.Entity> parameterEntities
      Catalog of all parameter entities defined in this dtd model
    • nicePE

      protected HashMap<String,DTD.CP> nicePE
      Catalog of all those parameter entities which can be used as content model.
    • generalEntities

      protected HashMap<String,DTD.Entity> generalEntities
      Catalog of all general entities defined in this dtd model
    • ignoreErrors

      protected final MessageDisposer ignoreErrors
    • currentElementName

      protected String currentElementName
      Is != null iff we are currently in an element's content definition. The use of PEs will be recorded in DTD.Dtd.entityUsage.
    • entityUsage

      protected CheckedMultimap_RD<String,String> entityUsage
    • LAZY_ENTITY_ERROR

      public static final String LAZY_ENTITY_ERROR
    • UNPARSED_CONTENTS

      public static final String UNPARSED_CONTENTS
  • Constructor Details

  • Method Details

    • AUX_convert

      @Deprecated @Opt protected static @Opt URL AUX_convert(@Opt @Opt File f)
      Deprecated.
    • fatalError

      protected void fatalError(String msg)
      Generates a TunedDTDParser.ParsingFailed exception as a "semantic signal", indicating that a sub-parser failed. It is called ONLY by "match()", which is only called if the match is syntactically necessary. This exception is caught by the "speculative" parsing of entity contents. During this parsing the error receiver is deactivated (replaced by ignoreErrors).
      Specified by:
      fatalError in class TunableParser<XMLDocumentIdentifier>
    • declareGeneralEntity

      public void declareGeneralEntity(String name, DTD.Entity entity)
    • declareParameterEntity

      public void declareParameterEntity(String name, DTD.Entity entity)
      Called by the parser, as soon as a parameter entity declaration is recognized.
      First it is stored to parameterEntities.
      Then a "speculative parsing" is started with the start symbol niceEntityValue() and on success the result of this function is (additionally) stored to nicePE.
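The two-step registration described above can be sketched roughly as follows. This is an illustration only, not the class's actual code: `SpeculativePeRegistry`, `parseAsContentParticle` and the local `ParsingFailed` are hypothetical stand-ins, the maps hold plain strings instead of DTD.Entity and DTD.CP, and the "content particle" grammar is reduced to a single name with an optional occurrence modifier.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: store every parameter entity, then try it as a content particle. */
class SpeculativePeRegistry {
    /** Stand-in for TunedDTDParser.ParsingFailed: a signal, not a user-visible error. */
    static class ParsingFailed extends RuntimeException {
        ParsingFailed(String msg) { super(msg); }
    }

    final Map<String, String> parameterEntities = new HashMap<>();
    final Map<String, String> nicePE = new HashMap<>();

    /** Toy stand-in for niceEntityValue(): accepts one name with an optional ?, * or +. */
    static String parseAsContentParticle(String text) {
        String t = text.trim();
        if (!t.matches("[A-Za-z_:][A-Za-z0-9._:-]*[?*+]?")) {
            throw new ParsingFailed("not a content particle: " + text);
        }
        return t;
    }

    void declareParameterEntity(String name, String replacementText) {
        // Step 1: the declaration is always recorded.
        parameterEntities.put(name, replacementText);
        // Step 2: speculative parse; only a successful result is recorded as "nice".
        try {
            nicePE.put(name, parseAsContentParticle(replacementText));
        } catch (ParsingFailed e) {
            // Failure only means the entity is not usable as a content model.
        }
    }

    public static void main(String[] args) {
        SpeculativePeRegistry r = new SpeculativePeRegistry();
        r.declareParameterEntity("inline", " em* ");
        r.declareParameterEntity("attrs", "id ID #IMPLIED");
        System.out.println("all: " + r.parameterEntities.keySet());
        System.out.println("nice: " + r.nicePE.keySet());
    }
}
```

Using an exception as the failure signal keeps the speculative run from emitting messages; in the real parser the error receiver is additionally replaced by ignoreErrors for the duration of the attempt.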
    • retrieveParameterEntity

      protected DTD.Entity retrieveParameterEntity(String name)
      Retrieve a parameter entity (internal or external) by its name. Creates an error message if it is undefined.
      Parameters:
      name - the name of the entity.
    • retrieveGeneralEntity

      protected DTD.Entity retrieveGeneralEntity(String name)
      Retrieve a general entity (internal or external) by its name. Creates an error message if it is undefined.
      Parameters:
      name - the name of the entity.
    • retrieveReplacementText

      protected String retrieveReplacementText(DTD.Entity entity)
      Retrieve the replacement text of an entity (parameter or general, internal or external). In "errorOnExpand" mode, creates an error message if an external entity could not be read when it was declared.
      Parameters:
      entity - the entity for which the replacement text is retrieved
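The deferred error reporting can be pictured with the sketch below. It follows the encoding documented for entityDecl() (replacement==null together with definition==LAZY_ENTITY_ERROR, switched to replacement=="" after the first report), but everything else is assumed: the `Entity` class, the `errors` list standing in for the message receiver, and the value of the sentinel string are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of "report the read error on first expansion, and only once". */
class LazyEntityErrorSketch {
    // Placeholder value; the real constant's content is not documented here.
    static final String LAZY_ENTITY_ERROR = "<lazy entity error>";

    /** Minimal stand-in for DTD.Entity with just the two fields that matter here. */
    static class Entity {
        final String name;
        String definition;
        String replacement;
        Entity(String name, String definition, String replacement) {
            this.name = name;
            this.definition = definition;
            this.replacement = replacement;
        }
    }

    /** Stand-in for the message receiver. */
    final List<String> errors = new ArrayList<>();

    String retrieveReplacementText(Entity e) {
        // Identity comparison on purpose: the constant is used as a flag, not as text.
        if (e.replacement == null && e.definition == LAZY_ENTITY_ERROR) {
            errors.add("external entity could not be read: " + e.name);
            e.replacement = ""; // flips the combination, so no second report
        }
        return e.replacement;
    }

    public static void main(String[] args) {
        LazyEntityErrorSketch s = new LazyEntityErrorSketch();
        Entity e = new Entity("chap1", LAZY_ENTITY_ERROR, null);
        s.retrieveReplacementText(e);
        s.retrieveReplacementText(e);
        System.out.println("reported errors: " + s.errors.size());
    }
}
```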
    • resolve

      public void resolve(String name)
      Insert the replacement text of a parameter entity (internal or external) into the input stream, framed by whitespace. Is only called by cp() for content models, and by s() for ubiquitous whitespace.
      Parameters:
      name - the name of the entity.
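A minimal sketch of "insert into the input stream, framed by whitespace" is given below. The buffer class `PushbackInputSketch` and its methods are hypothetical and say nothing about how the parser's real input is organized; the one-space framing on each side mirrors the XML rule for parameter entities included in the DTD.

```java
/** Hypothetical input buffer that lets replacement text be spliced in at the read position. */
class PushbackInputSketch {
    private final StringBuilder buffer;
    private int pos = 0;

    PushbackInputSketch(String source) {
        buffer = new StringBuilder(source);
    }

    /** Returns the next character, or -1 at end of input. */
    int read() {
        return pos < buffer.length() ? buffer.charAt(pos++) : -1;
    }

    /** Splices the text in at the current read position, with one space before and after. */
    void insertFramed(String replacementText) {
        buffer.insert(pos, " " + replacementText + " ");
    }

    /** Everything that has not been read yet. */
    String remaining() {
        return buffer.substring(pos);
    }

    public static void main(String[] args) {
        PushbackInputSketch in = new PushbackInputSketch("(%p;)");
        for (int i = 0; i < 4; i++) {
            in.read(); // consume "(" and the reference "%p;"
        }
        in.insertFramed("a|b");
        System.out.println("[" + in.remaining() + "]");
    }
}
```

The framing spaces guarantee that the replacement text cannot fuse with neighbouring tokens, which is why the caller can treat the reference as part of whitespace handling.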
    • constructReplacementText

      public String constructReplacementText(String currentlyDefined, String definition, boolean normalizeSpace)
      Normalizes whitespace and expands PE references and character references in the literal definition value of an INTERNAL (parameter or general) entity. Called by entityDecl() for the whole text and, for attValue(), only for explicitly extracted character references. FIXME: whitespace coming out of entities/character references is NOT normalized.
      Parameters:
      definition - the text to expand
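The expansion step can be illustrated as below. This is a simplified sketch under several assumptions: `ReplacementTextSketch.expand` is a hypothetical helper, parameter entities are looked up in a plain map of already-computed replacement texts, whitespace normalization is left out, and names are restricted to ASCII. General entity references such as `&amp;` are deliberately left untouched, as the XML rules for entity values require.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: expand character references and PE references in an entity literal. */
class ReplacementTextSketch {
    // group 1: hex character reference, group 2: decimal character reference, group 3: PE name
    private static final Pattern REF = Pattern.compile(
        "&#x([0-9A-Fa-f]+);|&#([0-9]+);|%([A-Za-z_:][A-Za-z0-9._:-]*);");

    static String expand(String literal, Map<String, String> parameterEntities) {
        StringBuilder out = new StringBuilder();
        Matcher m = REF.matcher(literal);
        int last = 0;
        while (m.find()) {
            out.append(literal, last, m.start()); // copy text before the reference
            if (m.group(1) != null) {
                out.appendCodePoint(Integer.parseInt(m.group(1), 16));
            } else if (m.group(2) != null) {
                out.appendCodePoint(Integer.parseInt(m.group(2)));
            } else {
                String value = parameterEntities.get(m.group(3));
                if (value == null) {
                    throw new IllegalArgumentException("undefined parameter entity: " + m.group(3));
                }
                out.append(value);
            }
            last = m.end();
        }
        out.append(literal, last, literal.length()); // copy the tail
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> pes = new HashMap<>();
        pes.put("pub", "X ");
        System.out.println(expand("%pub;&#65;&#x42; &amp; co", pes));
    }
}
```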
    • parse

      public static DTD.Dtd parse(Reader in, XMLDocumentIdentifier id, URL base, boolean errorOnExpand, MessageReceiver<? super SimpleMessage<XMLDocumentIdentifier>> msg)
      Main service access point for parsing a dtd into an internal model.
      Parameters:
      in - the source of the dtd to parse
      id - the document id, for tracing etc.
      base - from where relative includes shall be resolved
      errorOnExpand - whether a "read error" for an external parsed entity is reported only when the entity is expanded (which may never happen!)
      msg - receiver for errors and warnings
    • parse

      @Deprecated public static DTD.Dtd parse(Reader in, XMLDocumentIdentifier id, @Opt @Opt File base, boolean errorOnExpand, MessageReceiver<? super SimpleMessage<XMLDocumentIdentifier>> msg)
    • parselocal

      public static DTD.Dtd parselocal(String in, MessageReceiver<? super SimpleMessage<XMLDocumentIdentifier>> msg)
      Parse a local (or temporarily generated) dtd declaration. (Currently only called by TypedDomGenerator and /bajama/mmod/core/MMod2Dtd.java)
      Parameters:
      in - source
      msg - error channel
      Returns:
      null if parsing is not possible, i.e. an error has occurred
    • parseId

      public static XMLDocumentIdentifier parseId(Reader in)
      Parse an XMLDocumentIdentifier. (Currently only called from TypedDomGenerator).
      Parameters:
      in - source
      Returns:
      null if parsing is not possible
    • dtd

      protected DTD.Dtd dtd()
      Parsing function.
    • textDecl

      protected DTD.TextDecl textDecl()
      Parsing function.
    • versionInfo

      protected String versionInfo()
      Parsing function.
    • eq

      protected final void eq()
      Parsing function for an equal sign with optional space before and after.
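For orientation, the XML production behind this function is `Eq ::= S? '=' S?`. The sketch below shows one way such a parsing function can look in a hand-written recursive-descent style; `EqSketch`, its string-based input and the exception type are assumptions for illustration and do not reflect the base class's actual matching primitives.

```java
/** Hypothetical sketch of a parsing function for Eq ::= S? '=' S? */
class EqSketch {
    private final String input;
    private int pos = 0;

    EqSketch(String input) {
        this.input = input;
    }

    /** Skips XML white space (space, tab, CR, LF); may skip nothing. */
    private void skipSpace() {
        while (pos < input.length() && " \t\r\n".indexOf(input.charAt(pos)) >= 0) {
            pos++;
        }
    }

    /** Consumes optional space, a mandatory '=', and optional space again. */
    void eq() {
        skipSpace();
        if (pos >= input.length() || input.charAt(pos) != '=') {
            throw new IllegalStateException("'=' expected at offset " + pos);
        }
        pos++;
        skipSpace();
    }

    int position() {
        return pos;
    }

    public static void main(String[] args) {
        EqSketch p = new EqSketch("  =  \"1.0\"");
        p.eq();
        System.out.println("next offset: " + p.position());
    }
}
```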
    • versionNum

      protected String versionNum()
      Parsing function.
    • encodingDecl

      protected String encodingDecl()
      Parsing function.
    • encName

      protected String encName()
      Parsing function.
    • storeEntityUsage

      protected void storeEntityUsage(String def, String refersTo)
    • s

      protected void s()
      Parsing function for required space, while expanding parameter entities.
    • sOpt

      protected void sOpt()
      Parsing function for optional space, while expanding parameter entities.
    • sWsOpt

      protected void sWsOpt()
      Parsing function for space and PEs expanding to space. Used for optional space immediately preceding the closing ">" of a declaration or the closing "]" of a conditional section; cf. the XML standard: "VC: Proper Conditional Section/PE Nesting", "VC: Proper Declaration/PE Nesting" and "VC: Proper Group/PE Nesting".
    • sNoPE

      protected void sNoPE()
      Parsing function for required space, while NOT expanding parameter entities.
    • sOptNoPE

      protected void sOptNoPE()
      Parsing function for optional space, while NOT expanding parameter entities.
    • extSubset

      protected void extSubset(DTD.Dtd dtd)
      Toplevel parsing function, called by dtd() and conditionalSection(DTD.Dtd).
    • conditionalSection

      protected void conditionalSection(DTD.Dtd dtd)
      Parsing function.
    • ignore

      protected void ignore()
      Parsing function. Does not look for any PERef. This is possible because of "VC: Proper Conditional Section/PE Nesting"
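Skipping an ignored section without looking at parameter entity references amounts to counting nested `<![` openers and `]]>` closers, as in the XML production for ignoreSectContents. The following is a hedged sketch of that scan; `IgnoreSectionSketch.skipIgnored` is a hypothetical helper working on a plain string, not the parser's actual method.

```java
/** Hypothetical sketch: skip the contents of an IGNORE section, honouring nesting. */
class IgnoreSectionSketch {
    /**
     * Scans from 'from' (the first character after the section's opening '[')
     * and returns the offset just past the "]]>" that closes this section.
     */
    static int skipIgnored(String input, int from) {
        int depth = 1;
        int i = from;
        while (i < input.length()) {
            if (input.startsWith("<![", i)) {
                depth++;          // nested conditional section opens
                i += 3;
            } else if (input.startsWith("]]>", i)) {
                depth--;          // one section closes
                i += 3;
                if (depth == 0) {
                    return i;
                }
            } else {
                i++;              // anything else, including "%pe;", is skipped verbatim
            }
        }
        throw new IllegalStateException("unterminated ignored section");
    }

    public static void main(String[] args) {
        String input = "a %pe; <![INCLUDE[ b ]]> c ]]>tail";
        System.out.println(input.substring(skipIgnored(input, 0)));
    }
}
```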
    • pi

      protected DTD.PI pi()
      Parsing function.
    • name

      protected String name()
      Parsing function. Accepts a non-empty name according to production "Name" from XML-specs.
    • nmtoken

      protected String nmtoken()
      Parsing function. Only called from enumerated()
    • markupDecl

      protected DTD.MarkupDecl markupDecl()
      Parsing function.
    • comment

      protected DTD.Comment comment()
      Parsing function.
    • elementDecl

      protected DTD.Element elementDecl()
      Parsing function.
    • content

      protected DTD.ContentModel content()
      Parsing function.
    • mixed

      protected DTD.Mixed mixed()
      Parsing function.
    • children

      protected DTD.CP children()
      Parsing function. Assume OPENING parenthesis just consumed. Leaves closing parenthesis un-consumed.
    • cp

      protected DTD.CP cp()
      Parsing function.
    • modifierOpt

      protected DTD.CP modifierOpt(DTD.CP cp)
      Parsing function.
    • entityDecl

      protected DTD.Entity entityDecl()
      Parsing function. Entities are encoded with the following combinations:
         parameter    id(=fileposition)   notation
         ---------------------------------------------------------------------
         true         null                (null)          internal PE
         true         !=null              (null)          external PE
         false        null                (null)          internal general entity
         false        !=null              ==null          external parsed gen-ent
         false        !=null              !=null          external UNparsed gen-ent
         
      Attention: if errorOnExpand==true, then a non-existing (non-loadable) external (parsed/unparsed) entity does NOT raise an error here, when being declared. Instead, this fact is encoded by the combination replacement==null && definition==LAZY_ENTITY_ERROR. The error message is sent LATER, iff (and when) the first attempt to expand the entity happens. At that point the combination is changed to replacement=="", to prevent further error generation.
      In all non-error cases, the values of the fields definition and replacement are
                           definition                replacement
         ---------------------------------------------------------------------
         internal PE       verbatim source text      processed/expanded source text
                                                       (expand char refs and PE refs)
         external PE       read file contents        ==definition (no expansion)
         internal gen.Ent. verbatim source text      processed/expanded source text
                                                       (expand char refs and PE refs)
         external gen.Ent. read file contents        ==definition (no expansion)
      
         unparsed (general external) entity
                           dedicated string constant (used as flag)
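The first table above (parameter / id / notation) can be read as a small decision procedure. The sketch below spells it out; the `Kind` enum and `classify` are hypothetical names, and `Object` is used for `id` and `notation` only because nothing but their null-ness matters here.

```java
/** Hypothetical sketch: classify an entity from the (parameter, id, notation) encoding. */
class EntityKindSketch {
    enum Kind {
        INTERNAL_PE,
        EXTERNAL_PE,
        INTERNAL_GENERAL,
        EXTERNAL_PARSED_GENERAL,
        EXTERNAL_UNPARSED_GENERAL
    }

    static Kind classify(boolean parameter, Object id, Object notation) {
        if (parameter) {
            // Parameter entities never carry a notation.
            return id == null ? Kind.INTERNAL_PE : Kind.EXTERNAL_PE;
        }
        if (id == null) {
            return Kind.INTERNAL_GENERAL;
        }
        return notation == null
            ? Kind.EXTERNAL_PARSED_GENERAL
            : Kind.EXTERNAL_UNPARSED_GENERAL;
    }

    public static void main(String[] args) {
        System.out.println(classify(true, null, null));
        System.out.println(classify(false, "pic.gif", "gif"));
    }
}
```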
         
    • externalId

      protected XMLDocumentIdentifier externalId(boolean notation)
      Parsing function. Normal XML document identifiers (via the nonterminal "ExternalId") take the form
       SYSTEM  systemLiteral
       PUBLIC  pubidLiteral systemLiteral
        
      The production "NotationDecl" from the XML standard allows additionally (via the nonterminal "PublicId")
       PUBLIC  pubidLiteral
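How the `notation` flag changes what is accepted can be sketched as follows. This is an assumption-laden illustration: `ExternalIdSketch` and its `Id` class are hypothetical, literals are taken as anything between matching quotes (the character restrictions of pubidLiteral are not checked), and parameter entity expansion in white space is ignored.

```java
/** Hypothetical sketch: SYSTEM lit | PUBLIC lit lit | (notation only) PUBLIC lit */
class ExternalIdSketch {
    /** Stand-in for XMLDocumentIdentifier. */
    static final class Id {
        final String publicId;
        final String systemId;
        Id(String publicId, String systemId) {
            this.publicId = publicId;
            this.systemId = systemId;
        }
    }

    private final String in;
    private int pos = 0;

    ExternalIdSketch(String in) {
        this.in = in;
    }

    /** Skips white space; returns true if at least one character was skipped. */
    private boolean skipSpace() {
        int start = pos;
        while (pos < in.length() && Character.isWhitespace(in.charAt(pos))) pos++;
        return pos > start;
    }

    private boolean atQuote() {
        return pos < in.length() && (in.charAt(pos) == '"' || in.charAt(pos) == '\'');
    }

    private String quoted() {
        if (!atQuote()) throw new IllegalStateException("quote expected at offset " + pos);
        char q = in.charAt(pos);
        int end = in.indexOf(q, pos + 1);
        if (end < 0) throw new IllegalStateException("unterminated literal");
        String s = in.substring(pos + 1, end);
        pos = end + 1;
        return s;
    }

    Id externalId(boolean notation) {
        boolean system = in.startsWith("SYSTEM", pos);
        if (!system && !in.startsWith("PUBLIC", pos)) {
            throw new IllegalStateException("SYSTEM or PUBLIC expected at offset " + pos);
        }
        pos += 6;
        if (!skipSpace()) throw new IllegalStateException("space expected at offset " + pos);
        if (system) return new Id(null, quoted());
        String pub = quoted();
        int mark = pos;
        if (skipSpace() && atQuote()) return new Id(pub, quoted());
        if (notation) {
            pos = mark; // leave trailing space for the caller
            return new Id(pub, null);
        }
        throw new IllegalStateException("system literal expected at offset " + pos);
    }

    public static void main(String[] args) {
        Id id = new ExternalIdSketch("PUBLIC '-//X//Y' 'y.dtd'").externalId(false);
        System.out.println(id.publicId + " / " + id.systemId);
    }
}
```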
        
    • systemLiteral

      protected String systemLiteral()
      Parsing function.
    • pubidLiteral

      protected String pubidLiteral()
      Parsing function.
    • entityValue

      protected String entityValue()
      Parsing function. Returns the verbatim text input, w/o any expansion.
    • eRef

      protected String eRef()
      Parsing function. Consumes the lead-in character and the semicolon, but delivers only the characters in between. FIXME: word(hexDigitSet) includes the EMPTY WORD!?
    • peRef

      protected String peRef()
      Parsing function. Consumes the lead-in character and the semicolon, but delivers only the characters in between.
    • notationDecl

      protected DTD.Notation notationDecl()
      Parsing function.
    • attlistDecl

      protected DTD.Attlist attlistDecl()
      Parsing function.
    • attDef

      protected DTD.AttDef attDef()
      Parsing function. DOES consume trailing sOpt(). Differs from the XML specification, where AttDef starts with a mandatory S. FIXME: this mandatory S is not reflected in this parser.
    • attType

      protected DTD.AttType attType()
      Parsing function. Does NOT consume trailing sOpt().
    • enumerated

      protected DTD.Enumerated enumerated()
      Parsing function.
    • defaultDecl

      protected DTD.DefaultDecl defaultDecl()
      Parsing function. Does NOT consume trailing space.
    • attValue

      protected String attValue()
      Parsing function.
    • niceEntityValue

      protected DTD.CP niceEntityValue()
      Auxiliary parsing function to test whether the replacement text of an entity is a well-formed content model. A "test run" of the parser must be set up before this method is called in a try/catch block, and the main parser restored afterwards. It calls the parser function cp(), which returns a DTD.CP in case of success, or throws a TunedDTDParser.ParsingFailed in case of error. Only called from declareParameterEntity(String,DTD.Entity).