public class Text2Udom extends Object
Modifier and Type | Class and Description |
---|---|
protected class |
Text2Udom.CannotContinue |
protected class |
Text2Udom.CollectAllElements
Collects all elements which can appear "spontaneously" in an xslt template,
at the top level of the template content.
|
static class |
Text2Udom.ErrorStrategy
Configuration object for the different ways of reacting to input errors.
|
static class |
Text2Udom.parsingState
Realizes the parsing state on lower/character level.
|
protected class |
Text2Udom.PrematureEndOfFile
An explicit end of the input sources is modelled as follows:
the pseudo-token
Chars.STRING_TAGNAME_eof
is defined as "eof", and
this is tested for specially in process_open_tag(String, boolean). |
protected class |
Text2Udom.Starter
Opens all "State" frames needed for later continuations,
directed by an Expression and the recognized tag. |
Modifier and Type | Field and Description |
---|---|
protected static Map<Enumeration,String[]> |
allSortedKeys |
protected Location<String> |
bicLoc |
protected Navigate.CharSetCalc |
charSetCalc |
protected static Function<Definition,Reference> |
co_resolve |
protected Text2Udom.parsingState |
currentState
Current micro-state, values from
Text2Udom.parsingState. |
static Text2Udom.ErrorStrategy |
default_ErrorStrategy |
protected boolean |
doTrace |
protected Text2Udom.ErrorStrategy |
errorStrategy |
protected boolean |
found_eof_parsed |
static Function<SourceItem,String> |
getFullPath |
static Text2Udom.ErrorStrategy |
interactive_ErrorStrategy |
protected Text2Udom.parsingState |
interruptedState
Needed to treat
setcommand, setcomment and numeric input
(i.e. pseudo-tags, which look like a tag but do not behave like one) like whitespace. |
protected boolean |
lastCloseWasXslt
An EXPLICITLY closed xslt element is no longer in the stack,
but a subsequent target-language element must nevertheless be treated
as coming from "inxslt"!
|
protected List<ResultingChars> |
leading_ws
Accumulator for whitespace for which it is not yet clear whether to deliver or to ignore it.
|
static Location<XMLDocumentIdentifier> |
locnull |
protected static Function<String,Reference> |
makeRef |
static String |
MEMSTRING_ID_SYNTHETIC
Aux function to insert synthesized character data in the output, which is
NOT contained in the source, e.g.
|
protected Definition |
meta_assumedXsltOutput |
protected Definition |
meta_expected |
protected Definition |
meta_kind |
protected Definition |
meta_location |
protected Definition |
meta_messageText |
protected Module |
meta_module
Attention, only initialized and used iff xslt or "partial" documents are created.
|
protected Definition |
meta_parsingError |
protected Definition |
meta_skipped |
protected Definition |
meta_tag |
protected Definition |
meta_warning |
protected String |
modulename
May be null iff xsltmode.
|
protected ModuleRegistry |
moduleRegistry |
protected MessageReceiver<SimpleMessage<String>> |
msg |
protected MemScanner |
scanner |
protected ResultingStructure |
skipContainer
For reporting skipped input
|
protected static boolean |
sloppyHeaderSyntax |
protected State |
state
The currently growing stack frame, see
State. |
static Location<String> |
synthLocation |
protected ResultingStructure |
top_result
Toplevel result of parsing the input text.
|
protected String |
toplevel_fileLocation |
protected @Opt String |
toplevelelementname
May be null iff xsltmode.
|
protected Module |
toplevelModule |
protected XRegExp |
toplevelXRegExp |
protected int |
verbatimSuppress |
protected boolean |
version_2_0_selected |
protected MessageReceiver<SimpleMessage<XMLDocumentIdentifier>> |
xml_msg |
protected Expression |
xslt_alt_ubiquituous |
protected Expression |
xslt_alt_ubiquituous_repeated |
static String |
xslt_generic_tag_for_backend_elements |
protected Module |
xslt_module |
static String |
xslt_modulename |
static String[] |
xslt_prefices |
static String |
xslt_tag_toplevel |
static String |
xslt_tag_ubiquituous |
protected Map<String,String> |
xsltInputNamespaces
Is a map from URI TO prefix.
|
protected boolean |
xsltmode |
Constructor and Description |
---|
Text2Udom(MessageReceiver<SimpleMessage<String>> msg,
ModuleRegistry moduleRegistry,
Text2Udom.ErrorStrategy errorStrategy) |
Text2Udom(MessageReceiver<SimpleMessage<String>> msg,
ModuleRegistry moduleRegistry,
Text2Udom.ErrorStrategy errorStrategy,
boolean doTrace) |
Modifier and Type | Method and Description |
---|---|
protected boolean |
accept_superfluous_end_tag(String tag,
boolean defIsEmpty,
Definition def,
Location<String> startLoc)
Lets the scanner accept and discard an end tag, either normal
or forced or by a parenthesis character.
|
protected void |
addRestPerm(State_perm permstate,
CheckedList<Expression> misslist)
Add obligate members of a "perm" expression to the term which
describes the missing input.
|
protected void |
bicERROR(String text,
boolean isCommand) |
protected boolean |
builtInMetaCommands(String tag) |
protected static @Opt ResultingStructure |
consume_enumeration(MemScanner scan,
Enumeration etype)
Used by tag parsers AND character parsers (with multiple parallel scanners !)
|
protected void |
deliver_last()
Deliver the last scanned character data, and set the "is whitespace" flag
according to the token type returned by scanner.
|
protected void |
deliver_numeric(int val)
Append the Unicode character to the top-most resulting structure.
|
protected void |
deliver_spontanuous(ResultingStructure res)
Append the argument to the top-most resulting structure which is
not encoded as an xml attribute.
|
protected void |
deliver_to_singletonstate(ResultingStructure res,
State_singleton tss) |
protected void |
deliver(ResultingChars chars) |
protected void |
deliver(ResultingStructure res)
Append the argument to the top-most resulting structure.
|
protected void |
deliver(Udom res) |
protected void |
digest_consume_characters()
Handles the case that an element may receive character data, and some
non-whitespace character data has been recognized at the reading position.
|
protected void |
digest_look_for_tag()
After a command char has been read.
|
protected void |
digest_nothing_open()
Initial (micro-)state, or after an explicit close.
|
protected void |
digest_skip_for_command()
Micro-state for error recovery: skip input up to the next command char and then
return to tag-reading mode.
|
protected void |
digest_verbatim()
Parses "verbatim input" mode, which (a) requires an explicit
#/tag end tag, and (b) accepts only tags of sub-elements
immediately contained in its definition. |
protected void |
digest()
Main loop, consumes input data according to the current "micro"-state
currentState. |
protected void |
error(Location<String> loc,
String text) |
protected void |
error(Location<String> loc,
String text,
Object... args) |
protected void |
error(String text) |
protected void |
errorcheck_xsltprepare(MessageCounter<SimpleMessage<XMLDocumentIdentifier>> cnt) |
protected void |
failure(Location<String> loc,
String text) |
protected void |
failureDEF(Location<XMLDocumentIdentifier> loc,
String text) |
protected State_singleton |
find_top_singleton()
Called from
return_to_upper_input_mode(). |
protected State_singleton |
find_top_singleton(boolean mustBeNonAtt) |
@Opt ResultingStructure |
fromFile(File f) |
@Opt ResultingStructure |
fromFile(String s)
The calling graph of the main scanner functions is :
|
protected @Opt ResultingStructure |
fromMemString(MemString text) |
@Opt ResultingStructure |
fromMemString(String locationText,
MemString text) |
Map<String,String> |
getXsltInputNamespaces() |
protected void |
hint_xml(Location<XMLDocumentIdentifier> loc,
String text) |
protected void |
hint(Location<String> loc,
String text) |
protected void |
ignore_superfluous_end_tag(String tag,
boolean defIsEmpty,
Definition def,
Location<String> startLoc)
Called after an empty tag, a parser or an enum has been consumed.
|
static void |
insertPlainChars(ResultingStructure host,
Definition tag,
String chars) |
protected boolean |
isXsltDef(Definition def) |
protected Definition |
loadMetaDefinition(Map<String,Definition> defs,
String name) |
protected void |
loadMetaModule() |
protected void |
logEnd(String text) |
protected void |
logStart(String text) |
protected Alt |
makealt(CheckedList<Expression> subs) |
protected Alt |
makealt(Collection<Reference> subs) |
protected void |
makeGlobalSkipContainer(ResultingStructure errormsg) |
static Text2Udom.ErrorStrategy |
non_interactive_ErrorStrategy()
Used for situations when basic data is required by the "programmer" and
not by a user.
|
static void |
P(String s) |
protected boolean |
parseVerbatimSuppres() |
protected void |
prepareXsltMode_text()
In text mode the xslt module can be used directly; no rewriting is necessary
except for eliminating "xslt:RESULT_ELEMENTS".
|
protected void |
prepareXsltMode()
The insertion of the target language classes into the content-containing
xslt elements is done by modifying the (copy of the) xslt declaration once.
|
protected void |
process_char_parser_error(String tag,
MemString charsStart)
Called after a non-acceptance by a char parser or by an enumeration has been detected.
|
protected void |
process_close_char()
Called from digest_nothing_open/look_for_tag/consume_chars,
in case a closing parenthesis character has been recognized.
|
protected void |
process_close_tag_inner(@Opt String tag,
boolean force,
Location<String> closeTagDefLoc)
Search the state space for "nexttag" as a possible close tag.
|
protected void |
process_close_tag(String tag,
boolean force)
Called by digest_look_for_tag,
process_open_tag (in case of empty element declarations), process_close_char.
|
protected void |
process_open_tag(String tag,
boolean isCharData)
Only method for processing an "open" tag.
|
protected void |
report_assumed_xslt_output(State_singleton ss,
CheckedList<Expression> misslist)
Called when a target-language open/close tag has been found after
an xslt template call, to report the necessary expansion.
|
protected void |
report_missing_elements(boolean isOpen,
String tag,
boolean frameFound,
CheckedList<Expression> misslist,
List<SimpleMessage<String>> messlist,
Location<String> closeTagDefLoc)
This method can be called from
process_open_tag(String, boolean) or
process_close_tag(String, boolean). Basically, there are two classes of error recovery. If frameFound == false, the tag is NOT KNOWN; all input up to the next tag is then discarded and parsing is tried again. |
protected void |
return_to_upper_input_mode()
Called to re-enter the input mode (verbatim or nothing_open) after
an explicit close (tags have been opened and closed) or an implicit
close (enum or character parser has consumed enough) has happened.
Assumes that the stack frame of the parser/enum/closed tag has already been popped from the stack. Assumes that "inputmode verbatim" is only allowed with "content = (#PCDATA|..)*". |
protected void |
scanForHeader()
Sets modname==null and toplevelelementname==null => xsltmode=true
Generates warnings/errors
|
protected static String[] |
sortKeys(Enumeration e) |
protected void |
warning(Location<String> loc,
String text) |
protected void |
warning(String text) |
protected void |
warnOrError_incompleteHeader(Location<String> loc,
String missing) |
public static final Text2Udom.ErrorStrategy default_ErrorStrategy
public static final Text2Udom.ErrorStrategy interactive_ErrorStrategy
protected final ModuleRegistry moduleRegistry
protected MemScanner scanner
protected final MessageReceiver<SimpleMessage<String>> msg
protected final MessageReceiver<SimpleMessage<XMLDocumentIdentifier>> xml_msg
protected final Text2Udom.ErrorStrategy errorStrategy
protected final Navigate.CharSetCalc charSetCalc
protected final boolean doTrace
protected Module meta_module
protected Definition meta_warning
protected Definition meta_location
protected Definition meta_messageText
protected Definition meta_kind
protected Definition meta_tag
protected Definition meta_parsingError
protected Definition meta_expected
protected Definition meta_skipped
protected Definition meta_assumedXsltOutput
protected String toplevel_fileLocation
protected String modulename
protected Module toplevelModule
protected XRegExp toplevelXRegExp
protected boolean version_2_0_selected
protected Map<String,String> xsltInputNamespaces
protected ResultingStructure top_result
protected boolean xsltmode
protected Expression xslt_alt_ubiquituous
protected Expression xslt_alt_ubiquituous_repeated
protected Module xslt_module
protected boolean found_eof_parsed
protected List<ResultingChars> leading_ws
protected Text2Udom.parsingState currentState
Current micro-state, values from Text2Udom.parsingState.
protected int verbatimSuppress
protected Text2Udom.parsingState interruptedState
Needed to treat setcommand, setcomment and numeric input (i.e. pseudo-tags, which look like a tag but do not behave like one) like whitespace.
protected ResultingStructure skipContainer
protected boolean lastCloseWasXslt
protected static final boolean sloppyHeaderSyntax
public static final String xslt_modulename
public static final String xslt_generic_tag_for_backend_elements
public static final String xslt_tag_toplevel
public static final String xslt_tag_ubiquituous
public static final String[] xslt_prefices
public static final Location<XMLDocumentIdentifier> locnull
protected static final Function<Definition,Reference> co_resolve
public static final Function<SourceItem,String> getFullPath
protected static final Map<Enumeration,String[]> allSortedKeys
public static final String MEMSTRING_ID_SYNTHETIC
public Text2Udom(MessageReceiver<SimpleMessage<String>> msg, ModuleRegistry moduleRegistry, Text2Udom.ErrorStrategy errorStrategy)
public Text2Udom(MessageReceiver<SimpleMessage<String>> msg, ModuleRegistry moduleRegistry, Text2Udom.ErrorStrategy errorStrategy, boolean doTrace)
public static final Text2Udom.ErrorStrategy non_interactive_ErrorStrategy()
public static void P(String s)
@Opt public ResultingStructure fromFile(String s)
The calling graph of the main scanner functions is:
fromFile(String filename) -> fromFile(File) -> fromMemString(MemString) -> scanForHeader -> Starter.install -> digest() // Starting in "nothing open" mode, no verbatim/charparser top-level supported FIXME
@Opt public ResultingStructure fromFile(File f)
@Opt public ResultingStructure fromMemString(String locationText, MemString text)
@Opt protected ResultingStructure fromMemString(MemString text)
protected void warning(String text)
protected void logStart(String text)
protected void logEnd(String text)
protected void hint_xml(Location<XMLDocumentIdentifier> loc, String text)
protected void error(String text)
protected void failureDEF(Location<XMLDocumentIdentifier> loc, String text)
protected Definition loadMetaDefinition(Map<String,Definition> defs, String name)
protected void loadMetaModule()
protected void warnOrError_incompleteHeader(Location<String> loc, String missing)
protected void scanForHeader()
protected boolean isXsltDef(Definition def)
protected Alt makealt(CheckedList<Expression> subs)
protected Alt makealt(Collection<Reference> subs)
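protected void prepareXsltMode()
The insertion of the target language classes into the content-containing xslt elements is done by modifying the (copy of the) xslt declaration once. See process_open_tag(String, boolean).
protected void prepareXsltMode_text()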
protected void errorcheck_xsltprepare(MessageCounter<SimpleMessage<XMLDocumentIdentifier>> cnt)
protected void digest()
Main loop, consumes input data according to the current "micro"-state currentState.
Only called once, from fromMemString(MemString).
Calls the digest_<> delegate functions according to currentState. These are only called from here and could be inlined. They mostly do not change the micro-state themselves, but call process_open_tag(String, boolean), process_close_tag(String, boolean) or process_close_char().
scanner.accept().
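The dispatch on currentState described for digest() can be sketched as follows. This is a minimal illustration, not the actual implementation: the class DigestLoopSketch, the enum MicroState, the command character '#' (taken from the "#tag" examples on this page) and the event list are assumptions. The sketch also simplifies in one respect: its handlers set the micro-state directly, whereas the real delegates mostly leave that to process_open_tag, process_close_tag or process_close_char.

```java
import java.util.ArrayList;
import java.util.List;

public class DigestLoopSketch {
    /** Hypothetical micro-states, loosely named after the digest_<> methods. */
    public enum MicroState { NOTHING_OPEN, LOOK_FOR_TAG, CONSUME_CHARACTERS, DONE }

    /** Assumed command character, taken from the "#tag" examples. */
    static final char COMMAND_CHAR = '#';

    private final String input;
    private int pos = 0;
    private MicroState currentState = MicroState.NOTHING_OPEN;
    /** Records what the handlers recognized, for demonstration only. */
    public final List<String> events = new ArrayList<>();

    public DigestLoopSketch(String input) {
        this.input = input;
    }

    /** Main loop: calls one handler per iteration, selected by currentState. */
    public void digest() {
        while (currentState != MicroState.DONE) {
            switch (currentState) {
                case NOTHING_OPEN:       digestNothingOpen();       break;
                case LOOK_FOR_TAG:       digestLookForTag();        break;
                case CONSUME_CHARACTERS: digestConsumeCharacters(); break;
                default: throw new IllegalStateException("unexpected state " + currentState);
            }
        }
    }

    private void digestNothingOpen() {
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) {
            pos++; // this sketch simply skips whitespace
        }
        if (pos >= input.length()) {
            currentState = MicroState.DONE;
        } else if (input.charAt(pos) == COMMAND_CHAR) {
            pos++; // consume the command char, the tag name follows
            currentState = MicroState.LOOK_FOR_TAG;
        } else {
            currentState = MicroState.CONSUME_CHARACTERS;
        }
    }

    private void digestLookForTag() {
        int start = pos;
        while (pos < input.length() && Character.isLetterOrDigit(input.charAt(pos))) {
            pos++;
        }
        events.add("tag:" + input.substring(start, pos));
        currentState = MicroState.NOTHING_OPEN;
    }

    private void digestConsumeCharacters() {
        int start = pos;
        while (pos < input.length() && input.charAt(pos) != COMMAND_CHAR) {
            pos++;
        }
        events.add("chars:" + input.substring(start, pos).trim());
        currentState = MicroState.NOTHING_OPEN;
    }

    public static void main(String[] args) {
        DigestLoopSketch sketch = new DigestLoopSketch("#a hello #b");
        sketch.digest();
        System.out.println(sketch.events);
    }
}
```

Keeping one handler per micro-state makes each handler small, at the cost of spreading the state transitions over several methods.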
protected void digest_nothing_open()
Initial (micro-)state, or after an explicit close.
currentState changes on a command char and on non-whitespace character data, which implies the invisible "char-data-tag".
If the current top expression does NOT accept characters, then non-whitespace is implicitly tagged as such, and that tag is searched for upward, as usual. Whitespace, however, is accumulated and its possible delivery is delayed until this happens.
If the current top expression DOES accept characters, then whitespace is immediately treated as valid character data and delivered.
protected void digest_look_for_tag()
After a command char has been read.
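For digest_nothing_open(), the delayed handling of whitespace can be sketched as follows. This is an illustration under assumptions: the class WhitespaceDelaySketch and its method names are hypothetical, and the decision whether buffered whitespace is finally delivered or ignored is passed in by the caller, because this page only states that whitespace is accumulated (see leading_ws) until that is clear.

```java
import java.util.ArrayList;
import java.util.List;

public class WhitespaceDelaySketch {
    /** Whitespace not yet known to be deliverable (compare the field leading_ws). */
    public final List<String> leadingWs = new ArrayList<>();
    /** Stand-in for the top-most resulting structure. */
    public final List<String> delivered = new ArrayList<>();

    /**
     * Handles one run of whitespace.
     * If the current top expression accepts characters, the whitespace is
     * valid character data and is delivered at once; otherwise it is buffered.
     */
    public void onWhitespace(String ws, boolean topAcceptsChars) {
        if (topAcceptsChars) {
            delivered.add(ws);
        } else {
            leadingWs.add(ws);
        }
    }

    /**
     * Resolves the buffered whitespace once the surrounding parse knows
     * whether it counts as content. The flag is an assumption of this sketch.
     */
    public void resolve(boolean deliver) {
        if (deliver) {
            delivered.addAll(leadingWs);
        }
        leadingWs.clear();
    }

    public static void main(String[] args) {
        WhitespaceDelaySketch sketch = new WhitespaceDelaySketch();
        sketch.onWhitespace(" ", false);  // buffered
        sketch.onWhitespace("\n", false); // buffered
        sketch.resolve(true);             // now delivered in original order
        System.out.println(sketch.delivered.size());
    }
}
```

Buffering keeps the original order of the whitespace runs, so a later delivery reproduces the input faithfully.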
protected void process_open_tag(String tag, boolean isCharData)
Only method for processing an "open" tag.
Called from digest_nothing_open() for character data, from digest_look_for_tag() for an explicit tag, and from digest_verbatim().
MemScanner.accept().
This may be followed by characters which are consumed by a "chars parser" or an "enum" definition, again not by the MemScanner! In these cases the outer state machine is not altered, but returned to.
startInXslt=startXslt       // whether current frame is an xslt
    |
    | ascend into non-xslt, and NO xslt components are missing
    |     // if xslt itself is incomplete (CANNOT HAPPEN currently!),
    |     // then treat ALL missings as ERROR.
    V
found_weakMode=true         // use "weak_firsts" modified director map
    |                       // interpret all missing as "assumed"
    |
    | cross open element boundary
    V
all false, normal parsing   // report "assumed" and collect all FURTHER
                            // missings as errors
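The phases described above can be read as a small state machine. The following sketch is an illustration only: the class WeakModeSketch, the enums Phase and Kind, and the method names are hypothetical and merely mirror the comments in the diagram. In particular, mapping the "xslt itself is incomplete" case directly to normal parsing is an assumption of this sketch.

```java
public class WeakModeSketch {
    /** Hypothetical names for the three phases of the diagram. */
    public enum Phase { START_IN_XSLT, WEAK_MODE, NORMAL }

    /** How a missing element is reported. */
    public enum Kind { ASSUMED, ERROR }

    private Phase phase;

    public WeakModeSketch(boolean startInXslt) {
        this.phase = startInXslt ? Phase.START_IN_XSLT : Phase.NORMAL;
    }

    public Phase phase() {
        return phase;
    }

    /**
     * Ascending from an xslt frame into a non-xslt frame enters weak mode,
     * but only if no xslt components are missing (assumption: otherwise
     * all missing elements are plain errors, as in normal parsing).
     */
    public void ascendIntoNonXslt(boolean xsltComponentsMissing) {
        if (phase == Phase.START_IN_XSLT) {
            phase = xsltComponentsMissing ? Phase.NORMAL : Phase.WEAK_MODE;
        }
    }

    /** Crossing an open element boundary ends weak mode. */
    public void crossOpenElementBoundary() {
        if (phase == Phase.WEAK_MODE) {
            phase = Phase.NORMAL;
        }
    }

    /** In weak mode missing elements count as "assumed", otherwise as errors. */
    public Kind classifyMissing() {
        return phase == Phase.WEAK_MODE ? Kind.ASSUMED : Kind.ERROR;
    }

    public static void main(String[] args) {
        WeakModeSketch sketch = new WeakModeSketch(true);
        sketch.ascendIntoNonXslt(false);
        System.out.println(sketch.classifyMissing());
        sketch.crossOpenElementBoundary();
        System.out.println(sketch.classifyMissing());
    }
}
```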
protected void process_char_parser_error(String tag, MemString charsStart)
Called after a non-acceptance by a char parser or by an enumeration has been detected. See parsingState.skip_for_command.
protected static String[] sortKeys(Enumeration e)
@Opt protected static ResultingStructure consume_enumeration(MemScanner scan, Enumeration etype)
protected void process_close_char()
protected void process_close_tag(String tag, boolean force)
protected void process_close_tag_inner(@Opt String tag, boolean force, Location<String> closeTagDefLoc)
Search the state space for "nexttag" as a possible close tag.
nexttag - the tag of the element to close, or null to close the last opened element.
protected void return_to_upper_input_mode()
public static void insertPlainChars(ResultingStructure host, Definition tag, String chars)
protected void makeGlobalSkipContainer(ResultingStructure errormsg)
protected void report_assumed_xslt_output(State_singleton ss, CheckedList<Expression> misslist)
protected void report_missing_elements(boolean isOpen, String tag, boolean frameFound, CheckedList<Expression> misslist, List<SimpleMessage<String>> messlist, Location<String> closeTagDefLoc)
This method can be called from process_open_tag(String, boolean) or process_close_tag(String, boolean).
protected void digest_consume_characters()
protected void ignore_superfluous_end_tag(String tag, boolean defIsEmpty, Definition def, Location<String> startLoc)
Called after an empty tag, a parser or an enum has been consumed.
#tag 12[345]!!! #/tag
or #tag 12[345]!!! #///tag
or #tag 12[345]!!! #/ <-ws!
or #tag$ 12[345]!!! $
#tag continue text
  do NOT consume anything.
#tag#/tag continue text
  DO consume end-tag, but nothing more (esp. not the following blank!)
#tag# /tag continue text
  this is ok, and treated the same way
#tag #/tag continue text
  this is NOT OK, and should be rejected
#tag-continue word
  do NOT consume anything.
#tag!! continue
  do consume the parentheses (has been done by lexer).
#tag#///tag
  should possibly NOT be supported. If it is, then treat it like a simple end tag!
accept(), meaning "decode at current point of reading".
accept() has been performed, meaning "next lexer token to digest by the top level "digest" loop (=what follows after the ended element) is now reflected by the scanner output fields".
protected boolean accept_superfluous_end_tag(String tag, boolean defIsEmpty, Definition def, Location<String> startLoc)
See ignore_superfluous_end_tag.
protected void digest_verbatim()
Parses "verbatim input" mode, which (a) requires an explicit #/tag end tag, and (b) accepts only tags of sub-elements immediately contained in its definition.
protected void digest_skip_for_command()
protected void bicERROR(String text, boolean isCommand)
protected boolean builtInMetaCommands(String tag)
protected boolean parseVerbatimSuppres()
protected void deliver_to_singletonstate(ResultingStructure res, State_singleton tss)
protected void deliver_numeric(int val)
protected void deliver(ResultingStructure res)
protected void deliver_spontanuous(ResultingStructure res)
protected void deliver(ResultingChars chars)
protected void deliver(Udom res)
protected void deliver_last()
protected void addRestPerm(State_perm permstate, CheckedList<Expression> misslist)
protected State_singleton find_top_singleton()
Called from return_to_upper_input_mode().
and .... DELIVER()??? AND OPEN XSLTMODE ??
protected State_singleton find_top_singleton(boolean mustBeNonAtt)
See also the complete user documentation.