java.lang.Object

eu.bandm.tools.d2d2.base.Text2Udom

public class Text2Udom extends Object

Parse a d2d text input into xml nodes.

Nested Class Summary

Nested Classes

Modifier and Type

Class

Description

protected class

Text2Udom.CannotContinue

Thrown to signal to an upper processing loop that parsing cannot be continued.

(package private) class

Text2Udom.ErrorLimitReachedException

Dedicated exception to signal this fact to the upper execution loop.

static class

Text2Udom.ErrorStrategy

Configuration object for the different ways of reacting to input errors.

(package private) static enum

Text2Udom.modes

Kind of the text to parse.

static enum

Text2Udom.parsingState

Realizes the parsing state on the lower (= character) level.

protected class

Text2Udom.PrematureEndOfFile

Class to signal an attempt to read beyond the limit of the input data to the upper processing loop.
Field Summary

Fields

Modifier and Type

Field

Description

protected static final Map<Enumeration,String[]>

allSortedKeys

protected Location<String>

bicLoc

protected final Navigate.CharSetCalc

charSetCalc

Instance to evaluate character expressions.

protected Text2Udom.parsingState

currentState

Current micro-state, values from Text2Udom.parsingState.

static final Text2Udom.ErrorStrategy

default_ErrorStrategy

Evident.

protected final Text2Udom.ErrorStrategy

errorStrategy

Applied error strategy, different for interactive and programmatic use, etc.

static final Text2Udom.ErrorStrategy

interactive_ErrorStrategy

Evident. Allows partial documents and prints the stack context of a parsing error to the console.

protected Text2Udom.parsingState

interruptedState

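Needed to treat setcommand, setcomment and numeric input (i.e. pseudo-tags, which look like a tag, but do not behave so) like whitespace, that is, by resuming the interrupted state.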

protected boolean

lastCloseWasXslt

Whether we come from an xslt element.

protected List<ResultingChars>

leading_ws

Accumulator for ws, not yet clear whether to deliver or to ignore.

static final String

MEMSTRING_ID_SYNTHETIC

protected final MessageCounter

messageCounter

Target of most messages.

protected Definition

meta_assumedXsltOutput

protected Definition

meta_expected

protected Definition

meta_kind

protected Definition

meta_location

protected Definition

meta_messageText

protected Definition

meta_parsingError

protected Definition

meta_skipped

protected Definition

meta_tag

protected Definition

meta_warning

(package private) Text2Udom.modes

mode

Kind of the text to parse.

protected final MessageReceiver<SimpleMessage<String>>

msg

Target of most messages.

static final Text2Udom.ErrorStrategy

non_interactive_ErrorStrategy

Used for situations when basic diagnosis data is required by the "programmer" of some using code, not by an interactive user.

protected MemScanner<String>

scanner

Single source of text input.

protected ResultingStructure

skipContainer

For reporting skipped input

protected State

state

The currently growing stackframe, see State.

static final Location<String>

synthLocation

(package private) final TextFileHeader

textFileHeader

The result of parsing the input header.

protected ResultingStructure

top_result

Toplevel result of parsing the input text.

protected final int

traceLevel

Determines the verbosity: ==0 stands for complete silence; ==1 stands for minimal output (few loggings and not all warnings); ==2, 3 stands for more output (loggings and warnings); ==10 shows some synthesized source texts; ==20 stands for full debugging.

(package private) final Map<Integer,ResultingChars>

unicodeResults

protected int

verbatimSuppress

How many command characters in verbatim mode will not generate a warning.

protected final MessageReceiver<SimpleMessage<XMLDocumentIdentifier>>

xml_msg

Target of messages for some called classes which generate XML-file-locations.

protected @Opt Expression

xslt_alt_ubiquituous

Contains all xslt tags which can appear anywhere in the target elements.

protected Expression

xslt_alt_ubiquituous_repeated

Contains all xslt tags which can appear anywhere in the target elements.

protected Module

xslt_module

The loaded xslt module.
Constructor Summary

Constructors

Constructor

Description

Text2Udom(MessageReceiver<SimpleMessage<String>> msg, Text2Udom.ErrorStrategy errorStrategy, int traceLevel)

Only constructor.
Method Summary

Modifier and Type

Method

Description

protected boolean

accept_superfluous_end_tag(String tag, boolean defIsEmpty, Definition def, Location<String> startLoc)

Lets the scanner accept and discard an end tag, either normal or forced or by a parenthesis character.

protected void

addRestPerm(State_perm permstate, CheckedList<Expression> misslist)

Add obligate members of a "perm" expression to the term which describes the missing input.

protected void

bicERROR(String text, boolean isCommand)

protected boolean

builtInMetaCommands(String tag)

protected static @Opt ResultingStructure

consume_enumeration(MemScanner<String> scan, Enumeration etype)

Used by tag parsers AND character parsers (with multiple parallel scanners!)

protected void

deliver(ResultingChars chars)

protected void

deliver(ResultingStructure res)

Append the argument to the top-most resulting structure.

protected void

deliver(Udom res)

protected void

deliver_last()

Deliver the last scanned character data, and set the "is whitespace" flag according to the token type returned by scanner.

protected void

deliver_numeric(Location<String> loc, int val)

Append the unicode character to the top-most resulting structure.

protected void

deliver_spontanuous(ResultingStructure res)

Append the argument to the top-most resulting structure which is not encoded as xml attribute.

protected void

deliver_to_singletonstate(ResultingStructure res, State_singleton tss)

protected void

digest()

Main loop, consumes input data according to current "micro"-state currentState.

protected void

digest_consume_characters()

When an element may receive character data, and some non-ws character data has been recognized at the reading position.

protected boolean

digest_look_for_tag()

After a command char has been read.

protected void

digest_nothing_open()

Initial (micro-)state, or after an explicit close.

protected void

digest_skip_for_command()

Micro-state for error recovery: skip until command char and then goto tag-reading mode again.

protected void

digest_verbatim()

Parses "verbatim input" mode, which (a) requires an explicit #/tag end tag, and (b) accepts only tags of sub-elements immediately contained in its definition.

protected void

error(Location<String> loc, String text, Object... args)

Emits an error message and further context information depending on the values in errorStrategy.

protected void

error(String text, Object... args)

Calls error(Location<String>, String, Object...) with no location.

protected void

failure(Location<?> loc, String text, Object... args)

Throws a corresponding message exception.

protected State_singleton

find_top_singleton()

Called from return_to_upper_input_mode().

protected State_singleton

find_top_singleton(boolean mustBeNonAtt)

@Opt ResultingStructure

fromFile(File f, ModuleRegistry moduleRegistry)

Parse the contents of the given file.

@Opt ResultingStructure

fromFile(String s, ModuleRegistry moduleRegistry)

Parse the contents of the file found at the given location.

@Opt ResultingStructure

fromFile(String locationText, File f, ModuleRegistry moduleRegistry)

Parse the contents of the given file, using the given location text for all error messages.

@Opt ResultingStructure

fromMemString(String locationText, MemString<String> text, ModuleRegistry moduleRegistry)

Parse the contents of the given MemString object, which includes a text type declaration header.

@Opt ResultingStructure

fromMemString(String locationText, MemString<String> text, ModuleRegistry moduleRegistry, @Opt XRegExp toplevelXRegExp, Text2Udom.modes mode)

Parse the contents of the given MemString object, containing just the text body.

@Opt ResultingStructure

fromReader(String locationText, Reader r, ModuleRegistry moduleRegistry)

Parse the contents delivered by the given reader, using the given location text for all error messages.

Map<String,String>

getXsltInputNamespaces()

Get all xslt input name spaces, as defined in the header of the input file.

protected void

hint(Location<String> loc, String text, Object... args)

Emits hint iff traceLevel > 0.

protected void

hint_xml(Location<XMLDocumentIdentifier> loc, String text, Object... args)

Emits message iff traceLevel > 0.

protected void

ignore_superfluous_end_tag(String tag, boolean defIsEmpty, Definition def, Location<String> startLoc)

Called after an empty tag, a parser or an enum have been consumed.

static void

insertPlainChars(ResultingStructure host, Definition tag, String chars)

Aux function to insert synthesized character data in the output, which is NOT contained in the source, e.g. error messages.

protected boolean

isXsltDef(Definition def)

Whether a definition is an xslt element.

(package private) boolean

isXsltMode()

Whether to parse an XSLT source.

protected Definition

loadMetaDefinition(Map<String,Definition> defs, String name)

Find the definition from the loaded metamodule.

protected void

loadMetaModule(ModuleRegistry moduleRegistry)

Load the general d2d meta module and find the required definitions.

protected void

logEnd(String text, Object... args)

Emits message iff traceLevel > 0.

protected void

logStart(String text, Object... args)

Emits message iff traceLevel > 0.

protected void

makeGlobalSkipContainer(ResultingStructure errormsg)

static void

P(String s)

/

protected boolean

parseVerbatimSuppres()

protected void

process_char_parser_error(String tag, MemString<String> charsStart)

Is called after a non-acceptance by a char parser or by an enumeration has been detected.

protected void

process_close_char()

Called from digest_nothing_open/look_for_tag/consume_chars, in case that closing parenthesis character has been recognized.

protected void

process_close_tag(String tag, boolean force)

Called by digest_look_for_tag, process_open_tag (in case of empty element declarations), process_close_char.

protected void

process_close_tag_inner(@Opt String tag, boolean force, Location<String> closeTagDefLoc)

Search the state space for "nexttag" as possible close tag.

protected boolean

process_open_tag(String tag, boolean isCharData)

Only method for processing an "open" tag.

protected void

report_assumed_xslt_output(CheckedList<Expression> misslist)

Is called when a target-language open/close tag has been found after an xslt template call, to report the necessary expansion.

protected void

report_missing_elements(boolean isOpen, String tag, boolean frameFound, CheckedList<Expression> misslist, List<SimpleMessage<String>> messlist, Location<String> closeTagDefLoc)

This procedure can be called from process_open_tag(String, boolean) or process_close_tag(String, boolean).
Basically, there are two classes of error recovery:
frameFound = true ==> the tag is KNOWN, but tags before it are missing; then continue with a shrunk stack state.
frameFound = false ==> the tag is NOT KNOWN; then discard all input up to the next tag and try again.

protected void

return_to_upper_input_mode()

Called to re-enter the input mode (verbatim or nothing_open) after an explicit close (tags had been opened and closed) or an implicit close (enum or character parser have consumed enough) has happened.
Assume the stack frame of the parser/enum/closed tag is already popped from the stack.
Assume that "inputmode verbatim" is only allowed with "content = (#PCDATA|..)*".

protected static String[]

sortKeys(Enumeration e)

protected void

warning(Location<String> loc, String text, Object... args)

Emits warning iff traceLevel > 0.

protected void

warning(String text, Object... args)

Emits warning iff traceLevel > 0.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- default_ErrorStrategy
  
  public static final Text2Udom.ErrorStrategy default_ErrorStrategy
  
  Evident. See the default field values of this class.
- interactive_ErrorStrategy
  
  public static final Text2Udom.ErrorStrategy interactive_ErrorStrategy
  
  Evident. Allows partial documents and prints the stack context of a parsing error to the console.
- non_interactive_ErrorStrategy
  
  public static final Text2Udom.ErrorStrategy non_interactive_ErrorStrategy
  
  Used for situations when basic diagnosis data is required by the "programmer" of some using code, not by an interactive user.
- scanner
  
  protected MemScanner<String> scanner
  
  Single source of text input.
- msg
  
  protected final MessageReceiver<SimpleMessage<String>> msg
  
  Target of most messages.
- messageCounter
  
  protected final MessageCounter messageCounter
  
  Target of most messages.
- xml_msg
  
  protected final MessageReceiver<SimpleMessage<XMLDocumentIdentifier>> xml_msg
  
  Target of messages for some called classes which generate XML-file-locations.
- errorStrategy
  
  protected final Text2Udom.ErrorStrategy errorStrategy
  
  Applied error strategy, different for interactive and programmatic use, etc.
- charSetCalc
  
  protected final Navigate.CharSetCalc charSetCalc
  
  Instance to evaluate character expressions. Maintains a cache.
- traceLevel
  
  protected final int traceLevel
  
  Determines the verbosity:
  ==0 stands for complete silence
  ==1 stands for minimal output: few loggings and not all warnings
  ==2, 3 stands for more output: loggings and warnings
  ==10 shows some synthesized source texts
  ==20 stands for full debugging.
- meta_warning
  
  protected Definition meta_warning
- meta_location
  
  protected Definition meta_location
- meta_messageText
  
  protected Definition meta_messageText
- meta_kind
  
  protected Definition meta_kind
- meta_tag
  
  protected Definition meta_tag
- meta_parsingError
  
  protected Definition meta_parsingError
- meta_expected
  
  protected Definition meta_expected
- meta_skipped
  
  protected Definition meta_skipped
- meta_assumedXsltOutput
  
  protected Definition meta_assumedXsltOutput
- textFileHeader
  
  final TextFileHeader textFileHeader
  
  The result of parsing the input header. Globally needed only for getXsltInputNamespaces() and "getXsltMode()".
- mode
  
  Text2Udom.modes mode
  
  Kind of the text to parse.
- top_result
  
  protected ResultingStructure top_result
  
  Toplevel result of parsing the input text. Set once when opening the State for the first tag. Globally needed for error handling.
- xslt_alt_ubiquituous
  
  @Opt protected @Opt Expression xslt_alt_ubiquituous
  
  Contains all xslt tags which can appear anywhere in the target elements. This field is != null iff xslt mode.
- xslt_alt_ubiquituous_repeated
  
  protected Expression xslt_alt_ubiquituous_repeated
  
  Contains all xslt tags which can appear anywhere in the target elements. This field is != null iff xslt mode.
- xslt_module
  
  protected Module xslt_module
  
  The loaded xslt module. Only needed for isXsltDef(Definition). This field is != null iff xslt mode.
- state
  
  protected State state
  
  The currently growing stackframe, see State.
- leading_ws
  
  protected List<ResultingChars> leading_ws
  
  Accumulator for ws, not yet clear whether to deliver or to ignore.
- currentState
  
  protected Text2Udom.parsingState currentState
  
  Current micro-state, values from Text2Udom.parsingState.
- verbatimSuppress
  
  protected int verbatimSuppress
  
  How many command characters in verbatim mode will not generate a warning.
- interruptedState
  
  protected Text2Udom.parsingState interruptedState
  
  Needed to treat setcommand, setcomment and numeric input (i.e. pseudo-tags, which look like a tag, but do not behave so) like whitespace, that is, by resuming the interrupted state.
- skipContainer
  
  protected ResultingStructure skipContainer
  
  For reporting skipped input
- lastCloseWasXslt
  
  protected boolean lastCloseWasXslt
  
  Whether we come from an xslt element. Any explicitly closed xslt element is no longer in the stack, but a subsequent target-language element must nevertheless be treated as coming from "inxslt".
- allSortedKeys
  
  protected static final Map<Enumeration,String[]> allSortedKeys
- MEMSTRING_ID_SYNTHETIC
  
  public static final String MEMSTRING_ID_SYNTHETIC
  See Also:
  
  Constant Field Values
- synthLocation
  
  public static final Location<String> synthLocation
- bicLoc
  
  protected Location<String> bicLoc
- unicodeResults
  
  final Map<Integer,ResultingChars> unicodeResults
Constructor Details
- Text2Udom
  
  public Text2Udom(MessageReceiver<SimpleMessage<String>> msg, Text2Udom.ErrorStrategy errorStrategy, int traceLevel)
  
  Only constructor.
Method Details
- P
  
  public static void P(String s)
  
  /
- getXsltInputNamespaces
  
  public Map<String,String> getXsltInputNamespaces()
  
  Get all xslt input name spaces, as defined in the header of the input file. Valid result can be obtained only after successful parsing.
- isXsltMode
  
  boolean isXsltMode()
  
  Whether to parse an XSLT source.
- isXsltDef
  
  protected boolean isXsltDef(Definition def)
  
  Whether a definition is an xslt element.
- fromFile
  
  @Opt public @Opt ResultingStructure fromFile(String s, ModuleRegistry moduleRegistry)
  
  Parse the contents of the file found at the given location. The calling graph of the main scanner functions is:
  fromFile(String filename,..)
  -> fromFile(File,..)
  -> fromMemString(locationText,MemString,..)
  -> scanForHeader
  -> Starter.install
  -> digest()
  // Starting in "nothing open" mode; no verbatim or charparser
  // is supported as top-level element.
- fromFile
  
  @Opt public @Opt ResultingStructure fromFile(File f, ModuleRegistry moduleRegistry)
  
  Parse the contents of the given file.
- fromFile
  
  @Opt public @Opt ResultingStructure fromFile(String locationText, File f, ModuleRegistry moduleRegistry)
  
  Parse the contents of the given file, using the given location text for all error messages.
- fromReader
  
  @Opt public @Opt ResultingStructure fromReader(String locationText, Reader r, ModuleRegistry moduleRegistry)
  
  Parse the contents delivered by the given reader, using the given location text for all error messages.
- fromMemString
  
  @Opt public @Opt ResultingStructure fromMemString(String locationText, MemString<String> text, ModuleRegistry moduleRegistry)
  
  Parse the contents of the given MemString object, which includes a text type declaration header.
  
  Parameters:
  
  locationText - use this in error messages.
- fromMemString
  
  @Opt public @Opt ResultingStructure fromMemString(String locationText, MemString<String> text, ModuleRegistry moduleRegistry, @Opt @Opt XRegExp toplevelXRegExp, Text2Udom.modes mode)
  
  Parse the contents of the given MemString object, containing just the text body. The text format etc. are already known and fixed, and are given as argument values.
  
  Parameters:
  
  locationText - use this in error messages.
  
  text - to parse
  
  moduleRegistry - used to load the xslt code and the meta module.
  
  toplevelXRegExp - the top regexp of the target, either directly or as target indication for the xslt source to parse. (== null iff mode == xsltText)
  
  mode - whether to parse XML or XSLT or XSLT for mere text
- warning
  
  protected void warning(Location<String> loc, String text, Object... args)
  
  Emits warning iff traceLevel > 0.
- warning
  
  protected void warning(String text, Object... args)
  
  Emits warning iff traceLevel > 0.
- hint
  
  protected void hint(Location<String> loc, String text, Object... args)
  
  Emits hint iff traceLevel > 0.
- logStart
  
  protected void logStart(String text, Object... args)
  
  Emits message iff traceLevel > 0.
- logEnd
  
  protected void logEnd(String text, Object... args)
  
  Emits message iff traceLevel > 0.
- hint_xml
  
  protected void hint_xml(Location<XMLDocumentIdentifier> loc, String text, Object... args)
  
  Emits message iff traceLevel > 0.
- error
  
  protected void error(Location<String> loc, String text, Object... args)
  
  Emits an error message and further context information depending on the values in errorStrategy.
- error
  
  protected void error(String text, Object... args)
  
  Calls error(Location<String>, String, Object...) with no location.
- failure
  
  protected void failure(Location<?> loc, String text, Object... args)
  
  Throws a corresponding message exception.
- loadMetaDefinition
  
  protected Definition loadMetaDefinition(Map<String,Definition> defs, String name)
  
  Find the definition from the loaded metamodule.
- loadMetaModule
  
  protected void loadMetaModule(ModuleRegistry moduleRegistry)
  
  Load the general d2d meta module and find the required definitions. Store them into the global constant fields "meta_warning", "meta_location", etc., for later use. Is required for xslt mode and when "partially correct documents" are allowed.
- digest
  
  protected void digest()
  
  Main loop, consumes input data according to the current "micro"-state currentState. Only called once, from fromMemString(String,MemString,ModuleRegistry). Calls the digest_<> delegate functions according to currentState. These are only called from here and could be in-lined. They mostly do not change the micro-state themselves, but call process_open_tag(String,boolean), process_close_tag(String,boolean) or process_close_char(). These can change the micro-state, which has consequences when re-entering this "digest" loop.
  Each loop pass (and each delegate method) ends with a call to scanner.accept(). Therefore, when each loop pass (and each delegate method) starts, the next-to-consume token is already lexically recognized and ready for consumption.
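
The dispatch pattern described above can be sketched as follows. This is a minimal, self-contained illustration and not the Text2Udom code: the class DigestLoopSketch, the enum MicroState, the '#' command character and the per-character "accept" are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for a micro-state dispatch loop; not the Text2Udom code. */
final class DigestLoopSketch {

    /** Hypothetical micro-states, named after the digest_* delegates. */
    enum MicroState { NOTHING_OPEN, LOOK_FOR_TAG, CONSUME_CHARACTERS, DONE }

    /** Assumed command character for this sketch. */
    static final char COMMAND_CHAR = '#';

    /** Runs the loop and records which micro-state handled each character. */
    static List<String> run(String input) {
        List<String> trace = new ArrayList<>();
        MicroState state = MicroState.NOTHING_OPEN;
        int pos = 0;
        while (state != MicroState.DONE) {
            if (pos >= input.length()) {
                state = MicroState.DONE;   // end of input ends the loop
                continue;
            }
            char c = input.charAt(pos);
            trace.add(state + ":" + c);
            switch (state) {
                case NOTHING_OPEN:
                    if (c == COMMAND_CHAR) state = MicroState.LOOK_FOR_TAG;
                    else if (!Character.isWhitespace(c)) state = MicroState.CONSUME_CHARACTERS;
                    break;
                case LOOK_FOR_TAG:
                    if (Character.isWhitespace(c)) state = MicroState.NOTHING_OPEN;
                    break;
                case CONSUME_CHARACTERS:
                    if (c == COMMAND_CHAR) state = MicroState.LOOK_FOR_TAG;
                    break;
                default:
                    break;
            }
            pos++;   // the "accept": the next character is ready for the next pass
        }
        return trace;
    }
}
```

Each pass ends by advancing the read position, so the following pass always starts with its input already available, which mirrors the invariant stated for scanner.accept().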
- digest_nothing_open
  
  protected void digest_nothing_open()
  
  Initial (micro-)state, or after an explicit close. Switches currentState on command-char and on non-whitespace character data, which implies the invisible "char-data-tag". If the current top expression does NOT accept characters, then non-whitespace is implicitly tagged as such, and that tag is searched upward, as usual. Whitespace however is accumulated and delayed for possible delivery, until this happens. If the current top expression DOES accept characters, then whitespace is immediately treated as valid character data and delivered.
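
The delayed handling of whitespace can be pictured with a small sketch. Everything here is hypothetical (the class LeadingWhitespaceSketch and its methods are not part of Text2Udom), and the decision rule is an assumption: pending whitespace is delivered when character data follows and dropped when a tag follows.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of "accumulate whitespace, decide later"; not the Text2Udom code. */
final class LeadingWhitespaceSketch {
    private final boolean acceptsCharacters;
    private final List<String> pendingWs = new ArrayList<>(); // plays the role of leading_ws
    private final List<String> delivered = new ArrayList<>();

    LeadingWhitespaceSketch(boolean acceptsCharacters) {
        this.acceptsCharacters = acceptsCharacters;
    }

    /** Whitespace is delivered at once only if the current content accepts characters. */
    void onWhitespace(String ws) {
        if (acceptsCharacters) delivered.add(ws);
        else pendingWs.add(ws);
    }

    /** Assumed rule: character data makes the pending whitespace significant. */
    void onText(String text) {
        delivered.addAll(pendingWs);
        pendingWs.clear();
        delivered.add(text);
    }

    /** Assumed rule: whitespace directly before a tag is ignored. */
    void onTag(String tag) {
        pendingWs.clear();
        delivered.add("<" + tag + ">");
    }

    List<String> delivered() {
        return delivered;
    }
}
```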
- digest_look_for_tag
  
  protected boolean digest_look_for_tag()
  
  After a command char has been read. An open tag or (two kinds of) close tag may follow. Whitespace, comments and further command chars are discarded, other chars lead to an error.
  
  Returns:
  
  whether end-of-input token (=pseudo tag) has been accepted.
- process_open_tag
  
  protected boolean process_open_tag(String tag, boolean isCharData)
  
  Only method for processing an "open" tag. Is called from digest_nothing_open() for character data, from digest_look_for_tag() for an explicit tag, and from digest_verbatim().
  ATTENTION: must be called in an "un-accepted" state of MemScanner, because next may come an "open tag modifier", which has to be decoded explicitly, not by MemScanner.accept(). This may be followed by characters which are consumed by a "chars parser" or an "enum" definition, again not by the MemScanner! In these cases the outer state machine is not altered, but returned to.
  In case of a non-empty "tags parser", the state stack is extended accordingly by one level (or two levels in case of "#implicit").
  In case of xslt-mode, the state machine is
  
    startInXslt=startXslt         // whether current frame is an xslt
        |
        | ascend into non-xslt, and NO xslt components are missing
        |    // if xslt itself is incomplete (CANNOT HAPPEN currently!),
        |    // then treat ALL missings as ERROR.
        V
    found_weakMode=true           // use "weak_firsts" modified director map
        |                         // interpret all missing as "assumed"
        |
        | cross open element boundary
        V
    all false, normal parsing     // report "assumed" and collect all FURTHER
                                  // missings as errors
- process_char_parser_error
  
  protected void process_char_parser_error(String tag, MemString<String> charsStart)
  
  Is called after a non-acceptance by a char parser or by an enumeration has been detected. If no "partial doc output" is enabled: throw an exception and abort parsing. Otherwise add diagnostic outputs with tags from "d2d-meta" into the result udom, set up the skip container and switch the parser state to Text2Udom.parsingState.skip_for_command.
- sortKeys
  
  protected static String[] sortKeys(Enumeration e)
- consume_enumeration
  
  @Opt protected static @Opt ResultingStructure consume_enumeration(MemScanner<String> scan, Enumeration etype)
  
  Used by tag parsers AND character parsers (with multiple parallel scanners!)
- process_close_char
  
  protected void process_close_char()
  
  Called from digest_nothing_open/look_for_tag/consume_chars, in case a closing parenthesis character has been recognized. Resets the scanner and simulates the reaction to a "complete" close tag.
- process_close_tag
  
  protected void process_close_tag(String tag, boolean force)
  
  Called by digest_look_for_tag, process_open_tag (in case of empty element declarations), process_close_char. The tag may be null for an unspecified "close lowest element".
- process_close_tag_inner
  
  protected void process_close_tag_inner(@Opt @Opt String tag, boolean force, Location<String> closeTagDefLoc)
  
  Search the state space for "nexttag" as possible close tag. Then adjust the stack accordingly. Whenever obligate entries are missing in between, then store them to "found_missing" and execute the error diagnosis.
  
  Parameters:
  
  tag - the tag of the element to close, or null to close the last opened element.
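
A minimal sketch of such a stack search, under stated assumptions: the class CloseTagSketch, its Frame type and the list of still-required children are hypothetical simplifications, and returning null stands in for the case that the tag is not open at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical model of closing a tag on a state stack; not the Text2Udom code. */
final class CloseTagSketch {

    /** An open element plus the obligatory children it still lacks. */
    static final class Frame {
        final String tag;
        final List<String> stillRequired;

        Frame(String tag, String... stillRequired) {
            this.tag = tag;
            this.stillRequired = Arrays.asList(stillRequired);
        }
    }

    /**
     * Closes the element named tag, or the topmost element if tag is null.
     * Returns the obligatory entries missing in all popped frames, or null
     * (stack untouched) if no frame with that tag is open.
     */
    static List<String> close(Deque<Frame> stack, String tag) {
        if (stack.isEmpty()) return null;
        if (tag != null) {
            boolean found = false;
            for (Frame f : stack) {            // iterates from the top of the stack
                if (f.tag.equals(tag)) { found = true; break; }
            }
            if (!found) return null;
        }
        List<String> missing = new ArrayList<>();
        while (true) {
            Frame f = stack.pop();
            missing.addAll(f.stillRequired);   // collected for the error diagnosis
            if (tag == null || f.tag.equals(tag)) return missing;
        }
    }
}
```

A non-empty result corresponds to the situation where the tag is known but obligatory content is missing; a null result corresponds to an unknown tag, for which the stack is left as it is.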
- return_to_upper_input_mode
  
  protected void return_to_upper_input_mode()
  
  Called to re-enter the input mode (verbatim or nothing_open) after an explicit close (tags had been opened and closed) or an implicit close (enum or character parser have consumed enough) has happened.
  Assume the stack frame of the parser/enum/closed tag is already popped from the stack.
  Assume that "inputmode verbatim" is only allowed with "content = (#PCDATA|..)*". FIXME: type checker missing.
- insertPlainChars
  
  public static void insertPlainChars(ResultingStructure host, Definition tag, String chars)
  
  Aux function to insert synthesized character data in the output, which is NOT contained in the source, e.g. error messages.
  FIXME DISLOC (better: a general solution like "IncludingCharBuffer" etc.)
  
  Parameters:
  
  host - the toplevel element
  
  tag - the definition for the content
  
  chars - the content
- makeGlobalSkipContainer
  
  protected void makeGlobalSkipContainer(ResultingStructure errormsg)
- report_assumed_xslt_output
  
  protected void report_assumed_xslt_output(CheckedList<Expression> misslist)
  
  Is called when a target-language open/close tag has been found after an xslt template call, to report the necessary expansion. Or as soon as the possible range of the template's output (one content model) is left, so that all further missing elements will be treated as normal "missing errors".
- report_missing_elements
  
  protected void report_missing_elements(boolean isOpen, String tag, boolean frameFound, CheckedList<Expression> misslist, List<SimpleMessage<String>> messlist, Location<String> closeTagDefLoc)
  
  This procedure can be called from process_open_tag(String, boolean) or process_close_tag(String, boolean).
  Basically, there are two classes of error recovery:
  
  frameFound = true ==> the tag is KNOWN, but tags before it are missing; then continue with a shrunk stack state.
  
  frameFound = false ==> the tag is NOT KNOWN; then discard all input up to the next tag and try again.
  
  General issue: these "Errors" behave like "Warnings" in case of "partialDocs==true".
  
  FIXME: DocumentError should have TYPED fields like expression, etc.
  FIXME: a parameter for controlling the kind of reaction is MISSING.
- digest_consume_characters
  
  protected void digest_consume_characters()
  
  When an element may receive character data, and some non-ws character data has been recognized at the reading position. Discards comments, delivers text, whitespace and other chars, and switches state on a command char.
- ignore_superfluous_end_tag
  
  protected void ignore_superfluous_end_tag(String tag, boolean defIsEmpty, Definition def, Location<String> startLoc)
  
  Called after an empty tag, a parser or an enum have been consumed. Called with the scanner in the "un-accepted" state, but leaves after calling "accept()" (so the next token to digest == scanner.current).
  After a character parser there MAY appear an explicit closing tag:
  #tag 12[345]!!! #/tag or
  #tag 12[345]!!! #///tag or
  #tag 12[345]!!! #/ <-ws! or
  #tag$ 12[345]!!! $
  All these are to be IGNORED, since the application of the character parser must be finished (= it must have reached a final state OR an error state!) anyhow.
  For empty content the situation is similar:
  #tag continue text           do NOT consume anything.
  #tag#/tag continue text      DO consume the end tag, but nothing more (esp. not the following blank!)
  #tag# /tag continue text     this is ok, and treated the same way.
  #tag #/tag continue text     this is NOT OK, and should be rejected.
  #tag-continue word           do NOT consume anything.
  #tag!! continue              DO consume the parentheses (has been done by lexer).
  #tag#///tag                  should possibly NOT be supported. If it is, then treat it like a simple end tag!
  
  Operation starts with an accept(), meaning "decode at current point of reading".
  On return, an accept() has also been performed, meaning "the next lexer token to digest by the top level "digest" loop (= what follows after the ended element) is now reflected by the scanner output fields".
- accept_superfluous_end_tag
  
  protected boolean accept_superfluous_end_tag(String tag, boolean defIsEmpty, Definition def, Location<String> startLoc)
  
  Lets the scanner accept and discard an end tag, either normal or forced or by a parenthesis character. Only caller is ignore_superfluous_end_tag(String,boolean,Definition,Location).
- digest_verbatim
  
  protected void digest_verbatim()
  
  Parses "verbatim input" mode, which (a) requires an explicit #/tag end tag, and (b) accepts only tags of sub-elements immediately contained in its definition.
- digest_skip_for_command
  
  protected void digest_skip_for_command()
  
  Micro-state for error recovery: skip until command char and then goto tag-reading mode again. Store the skipped input chars into skipContainer for meta-error-elements.
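
The skipping step can be sketched in a few lines. This is only an illustration: the class SkipForCommandSketch, the method skipped and the '#' command character are assumptions, and the returned string merely stands in for what would be stored in skipContainer.

```java
/** Hypothetical model of "skip until the next command character"; not the Text2Udom code. */
final class SkipForCommandSketch {

    /** Assumed command character for this sketch. */
    static final char COMMAND_CHAR = '#';

    /**
     * Returns the input that is skipped when recovery starts at index from:
     * everything up to, but not including, the next command character,
     * or the rest of the input if no command character follows.
     */
    static String skipped(String input, int from) {
        int next = input.indexOf(COMMAND_CHAR, from);
        return next < 0 ? input.substring(from) : input.substring(from, next);
    }
}
```

For example, skipped("junk text #p ok", 0) yields "junk text ", after which tag reading would resume at the '#'.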
- bicERROR
  
  protected void bicERROR(String text, boolean isCommand)
- builtInMetaCommands
  
  protected boolean builtInMetaCommands(String tag)
- parseVerbatimSuppres
  
  protected boolean parseVerbatimSuppres()
- deliver_to_singletonstate
  
  protected void deliver_to_singletonstate(ResultingStructure res, State_singleton tss)
- deliver_numeric
  
  protected void deliver_numeric(Location<String> loc, int val)
  
  Append the unicode character to the top-most resulting structure. To represent a code point > 0xFFFF, a Java string must contain two (2) "code units", i.e. 16-bit "chars" (int = 32 bit, char = 16 bit).
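
The code-unit arithmetic can be checked with the standard library alone. The following sketch uses a hypothetical helper class SupplementaryCharSketch; only Character.toChars and the String methods are real Java API, and nothing here is taken from the Text2Udom implementation.

```java
/** Hypothetical helper showing how a numeric code point becomes Java chars. */
final class SupplementaryCharSketch {

    /**
     * Converts a Unicode code point to a String. Code points up to 0xFFFF
     * need one char; larger ones need a surrogate pair of two chars.
     * Throws IllegalArgumentException for values that are not valid code points.
     */
    static String toJavaString(int codePoint) {
        return new String(Character.toChars(codePoint));
    }
}
```

Calling toJavaString(0x41) gives a string of length 1, while toJavaString(0x1F600) gives a string of length 2 that still counts as a single code point.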
- deliver
  
  protected void deliver(ResultingStructure res)
  
  Append the argument to the top-most resulting structure.
- deliver_spontanuous
  
  protected void deliver_spontanuous(ResultingStructure res)
  
  Append the argument to the top-most resulting structure which is not encoded as xml attribute.
- deliver
  
  protected void deliver(ResultingChars chars)
- deliver
  
  protected void deliver(Udom res)
- deliver_last
  
  protected void deliver_last()
  
  Deliver the last scanned character data, and set the "is whitespace" flag according to the token type returned by scanner.
- addRestPerm
  
  protected void addRestPerm(State_perm permstate, CheckedList<Expression> misslist)
  
  Add obligate members of a "perm" expression to the term which describes the missing input.
- find_top_singleton
  
  protected State_singleton find_top_singleton()
  
  Called from return_to_upper_input_mode() and .... DELIVER()??? AND OPEN XSLTMODE ??
- find_top_singleton
  
  protected State_singleton find_top_singleton(boolean mustBeNonAtt)
