#d2d 2.0 text using  mtdocpage : webpage 

#title   #umod --- an Automated Model Generator  
// #htmlTitle bandm metatools umod, the Micro Model Generator
#lang en

#tableOfContents

#h1 #title Purpose and Way of Operation

#p
#umod  compiles a data model from a high-level definition language
into java code.
The input language is designed for utmost compactness. Although proprietary, it
closely follows mathematical notation.

#p
Java is a rather verbose language. So the definition of large data models
is a rather tedious, monotonous and error-prone
task, requiring lots of redundant typing.
#umod  automates this typing process, by generating from a compact, 
non-redundant notation the vast amount of code which is needed 
to realize #ldots
#list
#i the data type definitions,
#i methods for creating, modifying and inquiring the model,
#i different kinds of visitors,
#i methods for visualization,
#i methods for a "soap"-like (de-)serialization #cite soap.
#/list

Some of these features can be enabled separately, according to the user's need.

#p#kind missing 
instead of "tsoap" refer to mathml etc. ??


#p
In contrast to other similar projects, e.g. pizza, #umod  is not realized as a 
pre-processor to
arbitrarily arranged java source files.
Instead, it is a compiler which generates one(1) model from one(1) single, central
source file. (This is true conceptually; technically see #ref txt_splitInput.)
#p
This approach has both advantages and disadvantages.
E.g.  you cannot use the convenient #umod  notation for complicated
nested typed collection 
classes #ital ad hoc#/  in any source file, but only in such a central model 
definition file.
#p
As an advantage you have one central compact definition of the main model
of a project, --- mostly not much more than one single screen page!
This turned out to be really helpful when developing the
further code or writing documentation.
#p
Please note:#nl
1) Some consistency conditions are easily checked by the java compiler, but
only at much higher costs by the #umod  tool. Therefore in most such
cases the corresponding checks are 
left to the former, and error messages will come from the attempt to
compile the generated source text, not from the #umod  generation process.
This requires some experience and "type checking" skills from the user
(but still much less than the C++/STL error messages !-)
#footnote
For example: the "#src!ORDERED!" keyword initiates the generation of a 
"#src!compareTo(..)!" #java  method. 
// FIXME DOC MISSING
The generated code first compares the constructors, then steps
through the field values. In case of object references, simply recursive calls to 
"#src!compareTo(..)!" will
be generated. #umod   does  #emph!not guarantee! that such a method does exist.
This is because this test is quite different when the type of the field is 
a reference to (a)
a class of the same model, (b) to an external class given in binary form, or (c) given in
source text form. So we deliberately leave the test to the further processing.
#/footnote
#p
2) The generated code can always be by-passed by inserting #ital!verbatim!  java code.
So the data models generated by #umod  are largely, but not totally,
fool-proof. The usage of #umod  does not replace responsibility and oversight.


// ------------------------------------------------------------------

#h1 #title Input Notation Syntax and Generated #java  Classes
#p

The input syntax for #umod  definition files is somewhat graphic-oriented, for the 
sake of maximal compactness and clear arrangement.
#nl
The #umod  definition file is intended to be also used as #emph!documentation!, esp.
for the programmer during her/his coding work.

#p#kind src 
The syntax of #umod  input is defined in 
#link 3/eu/bandm/tools/umod/parser/umod.g 
#text this grammar file#/.


// ------------------------------------------------------------------
#h2 #title Model Declaration

#p
For the basic data types, i.e. the elements which make up the
data model,   #umod  supports two flavours of definitions:
#list
#i either "class" definitions, using a graphic oriented input format, and supporting
inheritance/specialization, 
#i or "type" definitions, denoted by a pure term notation and not supporting 
specialization. ((THE "TYPE" CONSTRUCT IS CURRENTLY NOT YET WELL SUPPORTED!))
#/list
#p
Additionally, there can be
#list
#i definitions of visitors,
#i definitions of simple interfaces,
#i definitions of simple enumeration types, 
#i import declarations of external, predefined classes,
#i and documentation text for most of these constructs.
#/list
After their declaration, most of these entities are
referred to by an #nonterminal identifier.

#cfRule identifier ::= identifier_lower | identifier_upper ; 

#cfRule identifier_lower ::= lowerCaseAlpha (lowerCaseAlpha | upperCaseAlpha | digit | "_" )* ; 

#cfRule identifier_upper ::= upperCaseAlpha (lowerCaseAlpha | upperCaseAlpha | digit | "_" )* ; 


#p
There is #emph!only one single name space!  for the identifiers of
all these different categories, and duplicate usage of an identifier will
yield an error.
#p
All #umod  identifiers must start with a Latin character, and may be continued with
characters, Arabic digits and the underscore "#src!_!". Consequently, all generated
Java objects with an identifier #emph!starting! with an underscore do not correspond
to a #umod   definition, but are additional and ancillary.
#nl
In #umod   there is a difference between lower and upper case identifiers:
Class, enumeration and visitor 
names must start with an upper case, field names must start with lower case.
External declarations, enumeration items and types can be lower or upper case.

#p
The top level syntax of each #umod  source file 
contains the declaration of the model name, like #ldots
#source
MODEL MyModel =
  .. 
  ..
END MODEL
#/source

#p
The underlying grammar can be described as #ldots

#cfRule umodFile  ::=  "MODEL" identifier_upper "=" (docEntry)? modelLines "END" "MODEL" ; 

#cfRule modelLines ::=   (visitorDeclaration)*  (importDeclaration)* 
   \nl (typeDef | enumDef | toplevelClassDef | extendClassDef | interfaceDef )+ ;

#p
The #umod  compiler then generates source files of a certain #emph!package!
(in the java sense). The name of the package must be given as a command line
parameter when calling the tool, cf. #ref txtcommandlineoptions below.
#p
For each class definition and type definition the #umod  compiler generates
one single java class.
#nl
Additionally, it generates sources of further java classes, realizing
the different purposes mentioned above and explained in detail
further below, e.g. visitors, serializers, visualizations, etc.
#p
The #umod  compiler can work either in #emph!package mode! or in 
#emph!monolithic mode!. This mode is also selected by a command line switch.
#p
In package mode all generated classes are contained in the named package,
and each class is contained in its own source file as a 
top-level #src!public! class.
Additionally, a #emph!model class! 
with the name of the model is created, which only contains
some central entry points and attributes valid for the model as a whole.
#p
In monolithic mode, this model class is created in the same way.
But all other generated classes are realized as  #emph!static inner classes! 
(in the java sense) of that one model class.
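#p
As a rough illustration of the difference (a hand-written sketch, not actual #umod  output;
the model name #src!MyModel! and the classes #src!A! and #src!B1! are assumed), the
monolithic layout corresponds to the following Java structure. In package mode,
#src!A! and #src!B1! would instead be top-level public classes in source files of their own.

```java
/* Hand-written sketch of the monolithic layout; not actual umod output. */
/* MyModel, A and B1 are assumed names for illustration. */
class MyModel {

    /* Every model element becomes a static nested class of the model class. */
    static class A {
    }

    static class B1 extends A {
    }

    /* Helper for this illustration only: the binary name of a nested class. */
    static String binaryNameOfB1() {
        return B1.class.getName();
    }
}
```

Client code then refers to the element classes through the model class, e.g. #src!MyModel.B1!.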
#p
#xemph!Attention:! Type definitions "should work", but
have not yet been tested thoroughly and are not covered
in the following documentation. Up to now, our extensive applications of #umod  did not 
make them appear really necessary.


// ------------------------------------------------------------------
#h2 #title Import Declaration
#p
A #umod   model can refer to any existing java class for 
defining field types and data types, and for declaring them as
superclasses of, or interfaces implemented by, a model class.
#p
This is prepared by an import declaration:
#cfRule importDeclaration ::= "EXT" ("SOURCE")? identifier "=" pathToClass ;
#cfRule pathToClass ::= identifier ("." identifier)* ; 
#p
//The identifier must begin with an upper case ascii character.#nl
The identifier must begin with an ascii character, either upper or lower case,
and can contain ascii characters, 
digits and the underscore "#src!_!".#nl
#p
The pathToClass is the fully qualified class name (in the java sense).
In the rest of the data model declaration this identifier will be used 
to refer to the specified external class, 
e.g. when declaring superclasses, interfaces or field types.
#p
If the modifier #src!SOURCE! is #emph!not! given, then the class
must be reachable and is loaded by the #umod  compiler.
#p
#umod  uses a second, #emph dedicated class loader#/  for this purpose.
If the class loading context in which #umod  itself is running
and the future context of the generated classes differ, then this
class loader can be parametrized with a command line switch, 
cf. #ref txtcommandlineoptions.

#p
The modifier #src!SOURCE! must always be added whenever the 
external class cannot be loaded at all during the run of #umod .
In this case the #umod  compiler will insert references to the class without 
testing whether it is reachable. The user must ensure that the java compiler will
later reach either the source or the class file.

// ------------------------------------------------------------------
#h2 #title Enumerations
#p
As auxiliary classes for field values, #umod  supports the easy definition of 
simple enumeration types. 

#cfRule enumDef ::=      "ENUM" identifier_upper (docEntry)? "="  enumItem ("," enumItem)*  ; 
#cfRule enumItem  ::=  identifier (docEntry)? ; 

#p
The identifiers can be used in the type language for fields, as described in 
#ref txt_typeExprs.

// ------------------------------------------------------------------
#h2 #title Class Hierarchy
#p
The syntax of class definitions is somewhat "graphic oriented", for the sake
of compactness. 
The fragment of input text  #ldots
#source
TOPLEVEL CLASS
   A
   | B1
   | | C1
   | | C2
   | B2
#/source
#p
#ldots defines "graphically" a generated hierarchy of classes,
namely #src!A! extending #src!java.lang.Object!, #src!B1! and #src!B2! extending 
#src!A!, and #src!C1! and #src!C2! extending #src!B1!.
#nl
In contrast to #java, class names #emph!must! begin with an upper-case letter.
Character sequences 
which are used by the #umod  front-end syntax as keywords, as described in
this document (e.g. "#src!TOPLEVEL!", "#src!CLASS!", "#src!JAVA!", #ldots)
are rejected as identifiers. This rejection is (currently) #emph!implicit! by 
the parsing process, and reported as a syntax error.
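#p
For orientation, the hierarchy defined by the tree above corresponds to Java declarations
of roughly the following shape. This is a hand-written skeleton only; the real generated
classes additionally carry fields, constructors and accessor methods.

```java
/* Skeleton of the class hierarchy from the example; bodies left empty on purpose. */
class A {
}

class B1 extends A {
}

class B2 extends A {
}

class C1 extends B1 {
}

class C2 extends B1 {
}
```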
#p
Please note that, in spite of the graphic appearance, the syntax definition
and the implemented parser of #umod  do #emph!not! treat line-breaks as 
significant.
#nl
The same example could have been written as 
#source  A | B1 | | C1 | | C2 | B2 
#/source

#p
For the sake of clarity the definitions can be broken into sub-graphs, which are
automatically combined, like:
#source
TOPLEVEL CLASS
   A
   | B1
   | B2

EXTEND CLASS
   B1
   | C1
   | C2
#/source
#p
The appearance of each class definition in the #emph!first! block should
leave out most details, so that this top-level class tree can serve as a
documentation for the basic skeleton of a model.
Then more details may be added later, in one or more "#src!EXTEND CLASS!" blocks.
#p
Every field definition (see #ref txtfielddefs) can be placed individually
at either appearance of "B1". The same holds for every modifying attribute
(see #ref abstractalgebraic).
Please note that an extensive usage of this feature will result in #emph!less! instead
of more clarity.

#p
The syntax of class definitions can be described more formally as #ldots

#cfRule toplevelClassDef ::= "TOPLEVEL" "CLASS" classDef ; 

#cfRule extendClassDef ::= "EXTEND" "CLASS" classDef ; 

#cfRule classDef ::=  identifier_upper classModifiers (docEntry)? 
\nl
(fieldDef | fieldDoc | superField)*  (subClassesDef)* ; 

#cfRule subClassesDef  ::=  ("|")+ classDef ; 


// ------------------------------------------------------------------
#h3 #title Extending and implementing external classes.

// #src!EXTENDS! and #sr.IMPLEMENTS . 
#label abstractalgebraic

#p
Following the class name there can appear different modifying attributes,
defined by #ldots

#cfRule classModifiers ::= "ABSTRACT"? "ALGEBRAIC"? 
   \nl ("EXTENDS" identifier)? ("IMPLEMENTS" identifier ("," identifier)*)? ;

#p
Their usage is explained by the following examples:

#source
TOPLEVEL CLASS
   A EXTENDS SomeExternalClass 
   | B1 IMPLEMENTS Interface0, Interface1
#/source
#p
Only #emph!toplevel! class definitions can be given an explicit superclass
by the #src!EXTENDS! keyword. If none is given, then a toplevel class
extends #src!java.lang.Object!.#nl
Only #emph!external! classes can be used for such a superclass declaration.
#p
Every class can be given a list of interfaces it implements
by the #src!IMPLEMENTS! keyword.#nl
Only #emph!external! classes can be used for interface declaration.

// ------------------------------------------------------------------
#h3 #title Declaring Classes as #src!ABSTRACT! and #src!ALGEBRAIC!

#source
TOPLEVEL CLASS
   A ALGEBRAIC
   | B1
#/source

#p
Every #emph!toplevel! class can be declared #src!ALGEBRAIC!. This
enforces "algebraic semantics" on the equality relation.
Consequently, in the generated code 
an #src!equals()! method is included which defines equality by comparing all
field contents, regardless of pointer identity. A #src!hashCode()! method
is constructed accordingly.
#p
The algebraic property is automatically distributed to all sub-classes
of the class it appears with.
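#p
What "algebraic semantics" means for the Java side can be sketched as follows. This is a
hand-written illustration under assumptions, not the code #umod  generates: the class
name #src!P! and the fields #src!n! and #src!s! are invented for the example.

```java
import java.util.Objects;

/* Hand-written sketch of "algebraic" equality; generated code may differ. */
/* Class name P and fields n, s are assumed for illustration. */
class P {
    private final int n;
    private final String s;

    P(int n, String s) {
        this.n = n;
        this.s = Objects.requireNonNull(s);
    }

    /* Equality is decided by field contents, not by pointer identity. */
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (other == null || getClass() != other.getClass()) {
            return false;
        }
        P that = (P) other;
        return n == that.n && s.equals(that.s);
    }

    /* Must agree with equals(): equal field contents give equal hash codes. */
    @Override
    public int hashCode() {
        return Objects.hash(n, s);
    }
}
```

Two separately created objects with the same field contents then compare as equal and can
replace each other as keys in hash-based collections.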
#p
Currently it is #emph not defined#/
 what an #src!ALGEBRAIC! modifier does when applied to
a non-toplevel class, and the authors can #emph!not!  imagine what this 
#emph!should! mean !-)

#p
Currently it is #emph not defined#/  what happens when a #src!float! field 
appears in an algebraic data type, because floats do #emph!not! have a
precise and canonical notion of "identity". There must be some "epsilon" value
to implement algebraic identity, and we do not yet know where to get this from.
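#p
The following plain Java sketch (independent of #umod ; the class name #src!FloatIdentity!
is invented) shows why floating point values resist a canonical notion of identity, and
what an "epsilon" comparison looks like. Note that an epsilon comparison is not transitive,
so it cannot serve directly as the basis of a consistent #src!equals()! and #src!hashCode()! pair.

```java
/* Plain Java illustration; FloatIdentity is an assumed name, not part of umod. */
class FloatIdentity {

    /* Rounding: the computed sum differs from the literal 0.3. Yields false. */
    static boolean sumEqualsLiteral() {
        return 0.1 + 0.2 == 0.3;
    }

    /* NaN is not equal to itself under the primitive comparison. Yields false. */
    static boolean nanEqualsItself() {
        double n = Double.NaN;
        return n == n;
    }

    /* The boxed comparison disagrees: Double.equals treats NaN as equal to NaN. */
    static boolean boxedNanEquals() {
        return Double.valueOf(Double.NaN).equals(Double.NaN);
    }

    /* Positive and negative zero are equal as primitives ... */
    static boolean zerosEqualPrimitive() {
        return 0.0 == -0.0;
    }

    /* ... but not equal as boxed values. */
    static boolean zerosEqualBoxed() {
        return Double.valueOf(0.0).equals(-0.0);
    }

    /* Comparison up to a tolerance; the caller has to supply eps. */
    static boolean nearlyEqual(double a, double b, double eps) {
        return Math.abs(a - b) <= eps;
    }
}
```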

#p#kind missing 
FIXME algebraic identity for floats missing 

#p
Please note that the algebraic semantics of class definitions
which have fields of compound types (sequences, 
sets, maps, etc.) rely on the correct implementation
of the "#src!equals()!" method in the corresponding runtime libraries.
For those employed as default by the automatically generated code,
this is guaranteed.
#p
Even more important: the objects which realize the
values of these fields, i.e. the employed #bold!collection objects are still
modifiable!! This is of course not optimal,
since in-place updates are forbidden and would better be prevented
by the generated code. But the alternative would have been to choose
as default the copy of these container objects into non-modifiable variants.
But since this must be done fully recursively, it can come out
to be very expensive, so we decided that the 
programmer stays responsible that all collection objects
which are referred to from any algebraic umod value indeed stay un-altered!
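#p
Why in-place updates are dangerous can be seen with plain #src!java.util! collections,
used here as a stand-in for the collection objects of a model (the class name
#src!MutationHazard! is invented). A value that is altered after it has been stored in a
hash-based container can no longer be found there, because its hash code has changed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/* Plain java.util illustration; MutationHazard is an assumed name. */
class MutationHazard {

    /* Stores a list in a hash set, alters the list in place, then looks it up again. */
    static boolean foundAfterMutation() {
        List<Integer> value = new ArrayList<>(Arrays.asList(1, 2));
        Set<List<Integer>> seen = new HashSet<>();
        seen.add(value);

        /* The forbidden in-place update: content and hash code of the list change. */
        value.add(3);

        /* The set still holds the very same object, but the lookup fails. */
        return seen.contains(value);
    }

    /* Without the update the lookup succeeds, also for an equal copy. */
    static boolean foundWithoutMutation() {
        List<Integer> value = new ArrayList<>(Arrays.asList(1, 2));
        Set<List<Integer>> seen = new HashSet<>();
        seen.add(value);
        return seen.contains(new ArrayList<>(Arrays.asList(1, 2)));
    }
}
```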


#p
Every class can be declared #src!ABSTRACT!. This (a) is translated into
an "#src!abstract!" declaration for the java compiler, and (b) some
parts of the code will not be generated for this class, e.g. constructors.#nl
The #src!ABSTRACT! attribute applies only to the class level it appears
on explicitly. 


// ------------------------------------------------------------------
#h3 #title #ital!Verbatim!  Java  Source Text in Class Definitions  
#label txtverbatimjava
#p 
You can insert free java source text into each
class definition. This text will be inserted "#ital!verbatim!" into the generated
java class. It is subject to syntax check and re-formatting. 

(This is implemented by calling the  
#link 2/eu/bandm/tools/metajava/GeneratedClass.html 
#loc addDeclarations(java.lang.String)
#text #src!GeneratedClass.addDeclarations()!#/link  method 
from the #link metajava.html #text metajava model#/.)#nl

It is not subject to context check or type check. Therefore some kinds of 
error will be reported by the subsequent attempt to run the java compiler.


#source
TOPLEVEL CLASS
   A
   | B1 
     JAVA public String myfunction(int i){ return ""+this+i;} $$
   | B2
#/source

#p 
Alternatively, you can insert free java source text into the
java source generated for the top-level, model representation class:
#source
TOPLEVEL CLASS
   A
   | B1 
<<JAVA public static String myfunction(int i){ return ""+i;} $$
   | B2
#/source
#p
Please note that also this construct has to appear #emph!inside! the class
hierarchy, in spite of resulting in top-level code. It can #emph!not! be
placed on the syntactic top-level of the #umod  input file.
#p
Both kinds are esp. useful for declaring #emph!instances! of model classes,
because #umod  itself has no language constructs on the instance level:
#source
TOPLEVEL CLASS
   A
   | B1 
<<JAVA public static final B1 CONST_B1 = new B1(); $$
   | B2
#/source


#p 
A special case is the #emph!toString()! method 
(see also #ref txttostring), which can be 
defined by simply giving the method #emph!body!:
#source
TOPLEVEL CLASS
   A
   | B1 
        TOSTRING JAVA return "[B1:"+super.toString()+"]"; $$
   | B2
#/source

#p
((Remark:#nl
The syntax #src!JAVA...$$! is certainly not very pretty.
A markup of this kind is required,
because these java fragments are by-passed
already on the #emph!lexer! level of the employed #antlr  lexer/parser architecture.
))

// ------------------------------------------------------------------
#h3 #title Per-Class Generated Methods: #src!doclone()! and #src!initFrom()!.
#p
For each generated class "#src!C!"  #umod  provides #ldots
#source
  class C {
    ... 
    public C doclone(){...}
    public C initFrom(Object o){...}
    ...
  }
#/source
#p
"#src!doclone()!" returns a shallow copy of the object it is called upon.
#p
"#src!initFrom()!" copies the values of all #emph!those! #umod  defined
fields from the argument 
object #src!o! to the object #src!this! , which are defined on the level of the 
"most special common superclass" of both objects, and on all levels above.
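#p
One possible reading of these two methods is sketched below. This is hand-written and
rests on assumptions: the class names #src!Base!, #src!Left!, #src!Right! and their fields
are invented, and the generated code may be organized differently. Each class level copies
its own fields only if the argument object also belongs to that level, which yields the
"most special common superclass" behaviour.

```java
/* Hand-written sketch; Base, Left, Right and their fields are assumed names. */
class Base implements Cloneable {
    protected int a;

    /* Shallow copy: field values are copied, referenced objects are shared. */
    public Base doclone() {
        try {
            return (Base) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }

    /* Copies the fields of this level if the argument has them. */
    public Base initFrom(Object o) {
        if (o instanceof Base) {
            a = ((Base) o).a;
        }
        return this;
    }
}

class Left extends Base {
    protected int b;

    @Override
    public Left doclone() {
        return (Left) super.doclone();
    }

    /* First the levels above, then this level, if the argument reaches down to it. */
    @Override
    public Left initFrom(Object o) {
        super.initFrom(o);
        if (o instanceof Left) {
            b = ((Left) o).b;
        }
        return this;
    }
}

class Right extends Base {
    protected int c;
}
```

Initializing a #src!Left! object from a #src!Right! object thus copies only the field
#src!a!, since #src!Base! is the most special common superclass of both.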

// ------------------------------------------------------------------
#h2 #title Field Definitions #label txtfielddefs
#p
Field definitions are interspersed into the class definitions. They
follow the syntax
#cfRule fieldDef ::= identifier_lower (abstrfield)? type (defaultValue)? 
  \nl (fieldPragmas)? (docEntry)? ;
#cfRule fieldPragmas   ::= "!" (traversalPragma|constructorPragma)+ ";" ;


#p
Field name and type #emph!must! be given. 
#p
For every field declaration (in the #umod  source)
the java class generated for the #umod  class
definition will be given a corresponding field (in the java sense),
together with a zoo of getter, setter and auxiliary functions, depending
on the type of the field.
#p
E.g.:
#source
   A
   | B1
     b11 int = "12" 
     b12 int = "my.package.Global.function(\"string\")"
   | | C1
       c1 MAP string TO C2 
   | | C2
   | B2
#/source

#p
#ldots defines two attributes for the class #src!B1!, named #src!b11! and #src!b12!, which both
have the simple #java  type #src!int! and an initial value, and an attribute of class #src!C1! which is
named #src!c1! and has an aggregate type, namely a map from #src!string! values 
to references to objects of class #src!C2!.

// ------------------------------------------------------------------
#h3 #title Field Names #label txtfieldnames
#p
In contrast to #java #ldots
#list
#i#ldots  a field name #emph!must! begin with a lower case letter,
#i#ldots  a certain field name may not appear more than once in a single
upward path of class definitions. I.e., the attempt of  "shadowing" is treated
as an error.
#/list
#p
The generated java classes will contain a field with the given name.
There is no mangling of field names.#nl
Therefore all lower-case identifiers which are #emph!reserved words! in java
are not allowed as field identifiers. They are rejected #emph!explicitly!
during the context analysis phase (in contrast
to forbidden class names, see above, which do not even pass through the
#umod  parser).
#p
The fields themselves will nevertheless not be accessible. 
In package mode they will be declared "#src!protected!", and in
monolithic mode they will be declared "#src!private!".
#p
Instead, code for getter and setter methods will be produced.
This guarantees certain
integrity conditions, esp. strictness of non-opt values, see
#ref txtstrictness  below.

// ------------------------------------------------------------------
#h3 #title Initial Field Values  #label txtfieldinitvalues
#p
Initial values can be given to each field.
They have to be denoted as string constants in double quotes which
contain directly inserted java source text (see example above).
Double quotes and backslashes can be used in the contained text by
escaping them with backslashes.
#nl
The java source text undergoes a syntax check, but no type check. 
The syntax check is done by calling 
#link 2/eu/bandm/tools/metajava/FormatClosure.html 
#loc expression(java.lang.String)
#text #src!FormatClosure.expression()!#/link 
from the #link metajava.html #text metajava model#/link.#nl

Most errors in these initialization texts will be reported by the 
subsequent run of the java compiler.

// ------------------------------------------------------------------
#h3 #title Per-Field Generated Methods
#p
For each field declaration
#source
EXTEND CLASS C
               f T 
#/source 
#p #ldots there will be #ldots
#commentchar\
#source
  class C {
    ...
    public T get_f() {...}
      // returns the current value
    public boolean set_f(T arg) 
      // raises umod.runtime.StrictnessException iff arg==null
      //   and T is not "OPT xxx"
      // returns true IFF a change is caused by the assignment,
      //   ie. oldvalue!=newvalue
      {...}
    ...
  }
#/source 
#commentchar/
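#p
The contract of the setter described above can be sketched in hand-written Java as follows.
The sketch makes several assumptions: the field type is fixed to #src!String!, the
exception class is a local stand-in for the one from the #umod  runtime, and the change
flag is computed by comparing references, taking the comment "oldvalue!=newvalue"
literally. The generated code may differ in all these points.

```java
/* Local stand-in for the umod strictness exception; the real class is not shown here. */
class StrictnessException extends RuntimeException {
    StrictnessException(String message) {
        super(message);
    }
}

/* Hand-written sketch of the accessor pair for a non-OPT field "f". */
class C {
    private String f = "";

    /* Returns the current value. */
    public String get_f() {
        return f;
    }

    /* Rejects null, stores the value, and reports whether it caused a change. */
    public boolean set_f(String arg) {
        if (arg == null) {
            throw new StrictnessException("field f is not OPT and cannot be null");
        }
        /* Assumption: "change" is decided by reference comparison. */
        boolean changed = f != arg;
        f = arg;
        return changed;
    }
}
```

The boolean result allows a caller to detect whether an assignment has actually altered
the model, e.g. for deciding whether dependent data must be recomputed.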

#p
If the command line switch (see #ref txtcommandlineoptions) "#src!--getterfunctions!"
is set to true, there will be additionally #ldots
#source
  class C {
    ...
    public static final ops.Function<C,T> get_f 
      = new ops.Function<C,T>(){ public T apply(C c){return c.f;}} ;
    ...
  }
#/source 

#p
If the command line switch (see #ref txtcommandlineoptions) "#src!--setterfunctions!"
is set to true, there will be additionally #ldots
#source
  class C {
    ...
    public static final ops.Consumer<C,T> set_f
      = new ops.Consumer<C,T>(){ public C consume (T arg, C state){
          state.set_f(arg); return state; }} ;
    ...
  }
#/source 

#p
Both these objects are very convenient for using the elegant way of programming
offered by
#link ops.html 
#text metatools'  "#src!ops!" package#/.
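#p
The benefit of such function objects can be illustrated with the standard
#src!java.util.function.Function! interface, which is used here only as a stand-in;
the interfaces of the "#src!ops!" package are not reproduced in this sketch, and the
class #src!Item! is invented. The getter object is passed around as a value and applied
to a whole collection.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

/* Sketch with java.util.function as a stand-in for the ops package; Item is assumed. */
class Item {
    private final String f;

    Item(String f) {
        this.f = f;
    }

    /* The getter as a first-class function object, next to the field it reads. */
    static final Function<Item, String> get_f = item -> item.f;

    /* Applies the getter object to every element of a list. */
    static List<String> allF(List<Item> items) {
        return items.stream().map(get_f).collect(Collectors.toList());
    }
}
```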

//#link 2/eu/bandm/tools/ops/package-summary.html
//#text metatools'  "#src!ops!" package#/.


// ------------------------------------------------------------------
#h3 #title Abstract Fields, Generalized Setter and Getter Methods
#p
By inserting an #nonterminal abstrfield  construct into an 
#nonterminal fieldDef, a "virtual" field can be declared:

#cfRule abstrfield ::= "ABSTRACT" ("GETTER" | "SETTER")? ; 
#p
The #src!ABSTRACT! keyword alone 
causes the generation of both a getter and a setter method.
No field is generated, but these functions rely on the content
of a field defined on a subclass level.
With the keywords #src!GETTER! and #src!SETTER! you can restrict
the generation to that method.
#nl
The definitions of the concrete fields may employ 
a more specific type, iff the type is a model element class and
the more specific type a sub-class thereof.
#nl
The concrete fields do not need to be present in every branch of the
sub-class tree.
#p
The different cases are as follows:

#source
       B
       | B1
       | B2

       A
          f     ABSTRACT B 
          g     ABSTRACT OPT B 
       | A1
          f     B1
          g     B1
       | A2
          f     B2
          g     OPT B2
       | A3
#/source

#p
This code will lead to getter and setter functions in A, A1 and A2, with
different #java  signatures:
#source
     class A  { ... B  get_f(); boolean set_f(B);  B  get_g(); boolean set_g(B); ...}
     class A1 { ... B1 get_f(); boolean set_f(B1); B1 get_g(); boolean set_g(B1); ...}
     class A2 { ... B2 get_f(); boolean set_f(B2); B2 get_g(); boolean set_g(B2); ...}
#/source

#p
For A1 and A2, the getter functions will simply
return the current value of the corresponding field.
#nl
"#src!A3.get_g()!" will return #src!null! as the default value for every
#src!OPT! type.
#nl
"#src!A3.get_f()!" will throw an #src!UnsupportedOperationException!, since
the value can not be delivered, and there is no global default.
#p
With the setter functions it is a little bit more complicated, but
also quite canonical:


#source
   x.set_f(a) [/x.set_g(a)]

           a.class==     null            B1       B2 
   x.class==
        A1               XPstrict        OK       XPtype
        A2               XPstrict [/OK]  XPtype   OK
        A3               XPunsp          XPunsp   XPunsp

 OK = store value and return change flag, as usual
 XPunsp = a special Unsupported Operation Exception 
 XPstrict = the special umod Strictness Exception
 XPtype = "normal" java runtime typing error, "class cast exception"
#/source  
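#p
How the cases of this table can arise in Java is sketched below for the field #src!f!
and the classes #src!A!, #src!A1! and #src!A3!. The sketch is hand-written and assumes
that the subclass overrides the general setter and narrows the argument by a cast;
the exception class is a local stand-in for the one from the #umod  runtime.

```java
/* Local stand-in for the umod strictness exception; the real class is not shown here. */
class StrictnessException extends RuntimeException {
    StrictnessException(String message) {
        super(message);
    }
}

class B {
}

class B1 extends B {
}

class B2 extends B {
}

/* The abstract field f exists on this level only as a getter/setter pair. */
class A {
    public B get_f() {
        throw new UnsupportedOperationException("get_f");
    }

    public boolean set_f(B value) {
        throw new UnsupportedOperationException("set_f");
    }
}

/* A1 realizes f by a concrete field of the more specific type B1. */
class A1 extends A {
    private B1 f = new B1();

    @Override
    public B1 get_f() {
        return f;
    }

    @Override
    public boolean set_f(B value) {
        if (value == null) {
            throw new StrictnessException("field f is not OPT");
        }
        /* The cast raises a ClassCastException when a B2 is passed in. */
        B1 narrowed = (B1) value;
        boolean changed = narrowed != f;
        f = narrowed;
        return changed;
    }
}

/* A3 defines no concrete field and inherits the unsupported variants. */
class A3 extends A {
}
```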

#p
This shows that this feature does lose some static type safety. But in practice this
turned out to be not really significant, and the benefits of more
specialization in the subclasses and elegant abstraction pay off.
#p
A common practice is to define an abstract field on the level of some superclass and
to realize its getters and setters
by a #umod   field definition in some of its sub-branches, but by verbatim
given Java methods in some others, see #ref txtverbatimjava.
#p
It is not necessary for non-algebraic types 
that a declared abstract getter function is also implemented
in every subclass, as long as it is not used during program execution.
(But this seems bad programming style ?-)
#nl
But it is necessary for algebraic types, because their "#src!hashCode()!" method
calls the getter function for each abstract field and not for the concrete implementations.

// ------------------------------------------------------------------
#h3 #title Pragmas for Field Definitions 
#p
The pragmas are used to control the generation of constructors and visitors,
and are explained in detail together with these, cf.
#ref txtconstructors and #ref txtvisitors.


// ------------------------------------------------------------------
#h2 #title Types #label txt_typeExprs
#p
The same kinds of type expressions can be used to declare the type of a field of 
some
class, or for creating #java  classes  on their own with a type definition statement.
In both cases the syntax is #ldots

#cfRule type ::= reference | primitiveType | constructedType ;

#cfRule reference  ::= classReference | enumReference | typeReference | externalClassReference ; 

#cfInf classReference #def a reference to a class (by its identifier) 
defined locally in this model #/cfInf

#cfInf enumReference #def a reference to an enumeration type (by its identifier) 
defined locally in this model #/cfInf

#cfInf externalClassReference #def a reference to a class declared as "EXT" in an
#nonterminal importDeclaration #/cfInf

#cfRule primitiveType ::= int | float | char | string | bool ;

#cfRule constructedType ::= 
  "OPT" type 
| type "->" type 
| "MAP" type "TO" type 
| type "<->" type 
| "REL" type "TO" type 
| "SET" type 
| "SEQ" type 
| type "*" type | type "+" type | type ("/" type)+ 
;

// ------------------------------------------------------------------
#h3 #title Primitive Types
#p
The primitive types are mapped to #java   types currently as follows
#table#border 1
#tr#td #umod : #td#src!int! #td#src!float! #td#src!char!#td#src!string!#td#src!bool!
#tr#td #java : #td#src!int! #td#src!double[float]!#td#src!char!#td#src!String!#td#src!boolean!
#tr#td boxed type 
             #td#src!Integer!#td#src!Double[Float]!#td#src!Character!#td --- #td#src!Boolean!
#tr#td missing/not yet supported:
               #td#src!long!   #td            #td         #td           #td 
#/table
#p
Please note that "#src!string!" in the #umod  sense is a scalar type, and therefore
written with a #emph!lower-case! initial character.
#p
The command line switch #src!--floatNotDouble t! causes the umod
"#src!float!" type to be realized by a Java "#src!float!". 
Cf. #ref txtcommandlineoptions. Without this, the Java type
"#src!double!" is employed.
#nl
(The Java types "#src!long!",  "#src!BigInteger!" and   "#src!BigDecimal!" are currently
not yet supported.)

// ------------------------------------------------------------------
#h3 #title Reference Types
#p
Reference types are denoted by identifiers.#nl
They have to correspond to either an external declaration, a class definition or
a type definition.#nl
They are translated into a reference to a java object of the corresponding
java class, #emph!but excluding any reference to  "null"!, cf. #ref txtstrictness.

// ------------------------------------------------------------------
#h3 #title Constructed Type #src!OPT! #label txtstrictness
#p
In java, reference types implicitly #emph!always! contain the additional object called
"#src!null!", but all primitive types #emph!never! do.
With #umod  this is treated in a more orthogonal way:
Types of both kinds do #emph!not!  include the #src!null! value.
But by applying the #src!OPT!  constructor you get a type which is "optional",
i.e. which includes the value "#src!null!" as an additional value in its
"carrier set".
#p
The #src!OPT! type constructor does not create new java class definitions on its own,
but it (a) modifies the code realizing the "#src!set_<>()!" functions
and the constructors, 
and (b) determines which proxy class will be selected 
for aggregate types (lists, sets, maps, etc.), when applied to their argument(s).
#p
By these means a #umod  model always guarantees #emph!strictness!,
i.e. that a value the type of which is
not #src!OPT! will never take the value #src!null!.

// ------------------------------------------------------------------
#h3 #title Constructed Types in General
#p
All type constructors are fully compositional, i.e. can be nested arbitrarily !-)
#nl
(Of course, some combinations do not make any sense, e.g. a multiple application
of #src!OPT!, which is idempotent.)
#p
Most constructed types are translated into parameterized instances of generic
classes, either directly from the "#src!java.util!" zoo, or from 
our own proxy classes in 
#link 2/eu/bandm/tools/umod/runtime/package-summary.html
#text #src!umod/runtime!.#/link
(The latter are needed
to guarantee the strictness condition ("#src !=null#/src") for all fields and
values which are not of type "#src!OPT!".)
#p
In both cases the interfaces for constructing, changing and inquiring 
follow the interface definitions of the corresponding collection
types from "#src!java.util!".
#p
The different type constructors and their notation are #ldots
#table 
#tr#td #src!SEQ! #ital!t! #td Sequence (= list).
#tr#td #src!SET! #ital!t! #td (Finite) power set.
#tr#td #src!MAP! #ital!t1! #src!TO! #ital!t2! #nl
       #ital!t1! #src!->! #ital!t2
!               #td finite (possibly partial) map
#tr#td #src!REL! #ital!t1! #src!TO! #ital!t2! #nl
       #ital!t1! #src!<->! #ital!t2
!               #td multimap, as defined in
#link 2/eu/bandm/tools/ops/Multimap.html
#text #src!ops/Multimap!.#/link


#tr#td #ital!t1! #src!*!  #ital!t2
!               #td pair, i.e. simultaneous combination of two instances of the two 
types.
#tr#td #ital!t1! #src!+!  #ital!t2
!               #td co-pair, i.e. alternative selection of left or right side.
#/table

#p
The default value for every field of a (non-optional!) aggregate type is
an #emph.empty instance.  of this aggregate, cf. #ref txtconstructors below.
#p
Whenever a new instance for such a field needs to be created explicitly,
the constructor call of the correctly instatiated run-time class must
be hand-coded explicitly. This can be very tedious, 
cf. #src!CheckedMap_LR<String, CheckedMap_L<Integer,CheckedSet<Integer>>>!.
It may be easier to create a dummy instance of some class definition and
make a #src!get_<field>()!" for retrieving a correctly typed empty instance.


// ------------------------------------------------------------------
#h3 #title Special "Un-Curry-ed" Treatment of Cascaded #src!MAP!s
#p
For sparse data, and for the sake of efficiency,
the following type transformation, called
"Currying", is often applied on the conceptual level:
#source
   (A * B * C) -> D  ==>  A -> B -> C -> D
#/source
#p
So the data is #emph!realized! as a map of maps of maps, but
we want to operate on it as on "one single three-dimensional map".
This interpretation requires the following operations:

#source
   m.containsKey(a,b,c) 
      = m.containsKey(a) ? m.get(a).containsKey(b) ? m.get(a).get(b).containsKey(c)
                                                   : false 
                         : false 

   m.get(a,b,c)   = m.get(a).get(b).get(c)
   m.put(a,b,c,d) = m.get(a).get(b).put(c,d) // <- AND CREATE all intermediate maps
                                             //    as necessary

#/source
#p
Let "#src!F!" be the name of a field definition and "#src!o!" an object reference.
Each field can be defined as "strict", i.e. non-null, simply by not prefixing its
type with "#src!OPT!". So the default for any top-level field of
type "map" is an #emph!empty! map.
But this is not the case on the further levels of nesting:
Initially, the map "#src!o.get_F().get(a)!" does not exist, i.e. "#src!a!" is
not contained in the domain ("as a key") of the map returned by #src!o.get_F()!, and
#src!o.get_F().get(a).get(b)!  consequently throws a null pointer exception.
#p
To support this "un-curried" view of the map,
#umod  generates code for two methods which are always
safe: You can always call "#src!o.put_F(a,b,c,d)!", and the necessary intermediate
maps will be constructed automatically. You can always call
"#src!o.containsKey_F(a,b,c)!". If this returns #src!true!, then you can
safely call "#src!o.get_F().get(a).get(b).get(c)!".
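#p
The intended semantics of these safe operations can be sketched on plain
#src!java.util! maps. The class #src!Uncurried3! and its method names are hypothetical
and serve only as an illustration; this is not the code generated by #umod , whose
methods are named after the field as described above.
#commentchar \

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, NOT the code generated by umod:
// a map A -> B -> C -> D which is operated on like one three-dimensional map.
final class Uncurried3<A, B, C, D> {
    private final Map<A, Map<B, Map<C, D>>> m = new HashMap<>();

    // Stores d under (a,b,c) and creates all intermediate maps as necessary.
    void put(A a, B b, C c, D d) {
        m.computeIfAbsent(a, k -> new HashMap<>())
         .computeIfAbsent(b, k -> new HashMap<>())
         .put(c, d);
    }

    // True iff every level contains its respective key.
    boolean containsKey(A a, B b, C c) {
        Map<B, Map<C, D>> level1 = m.get(a);
        if (level1 == null) return false;
        Map<C, D> level2 = level1.get(b);
        return level2 != null && level2.containsKey(c);
    }

    // Only defined if containsKey(a,b,c) holds; fails loudly otherwise.
    D get(A a, B b, C c) {
        if (!containsKey(a, b, c))
            throw new IllegalStateException("no entry for the given keys");
        return m.get(a).get(b).get(c);
    }
}
```
#commentchar /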

#p
An analogous mechanism exists for sets, lists and multimaps:
#source
  A
  | f  A -> B -> SET C 
  | g  A -> B -> SEQ  C 
  | h  A -> B <->  C 
#/source
#p #ldots generates code for  #ldots
#commentchar\
#source
  class A { ....
     public void add_f (a, b, c) {..}
     // adds c to the set selected by a and b,
     //   and creates this set and all intermediate maps iff necessary

     public void add_g (a, b, c) {..}
     // appends c to the end of the sequence selected by a and b,
     //   and creates this sequence and all intermediate maps iff necessary

     public void add_h (a, b, c) {..}
     // adds c as a value for the key b to the multi-map selected by a,
     //   and creates this multi-map iff necessary
  }
#/source
#commentchar/
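#p
A hand-written counterpart of #src!add_f! and #src!add_g! may look as follows.
This is only a sketch on plain #src!java.util! collections: the class name
#src!CascadedAdd! is hypothetical, and #src!String!, #src!Integer! and #src!Long!
are assumed stand-ins for the model types #src!A!, #src!B! and #src!C!.
#commentchar \

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration, NOT the code generated by umod.
final class CascadedAdd {
    final Map<String, Map<Integer, Set<Long>>>  f = new HashMap<>();
    final Map<String, Map<Integer, List<Long>>> g = new HashMap<>();

    // Adds c to the set selected by a and b; creates the set and the
    // intermediate map if they do not exist yet.
    void add_f(String a, Integer b, Long c) {
        f.computeIfAbsent(a, k -> new HashMap<>())
         .computeIfAbsent(b, k -> new HashSet<>())
         .add(c);
    }

    // Appends c to the sequence selected by a and b; creates the sequence
    // and the intermediate map if they do not exist yet.
    void add_g(String a, Integer b, Long c) {
        g.computeIfAbsent(a, k -> new HashMap<>())
         .computeIfAbsent(b, k -> new ArrayList<>())
         .add(c);
    }
}
```
#commentchar /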


// ------------------------------------------------------------------
#h3 #title Overloading of a #src!null! function result 
in standard java runtime libraries
#p
Please note that for the standard java implementations it holds that
#source
   m.containsKey(a) == false  ==> m.get(a) == null
#/source 
#p
We do #emph!not! follow this rather confusing overloading of "null".
Indeed, it does not make real sense when thinking in a "strongly typed way":
In case of #src!A->B! (with #src!B! non-optional) you want to be guaranteed 
#emph!never! to get a #src!null!. In case of #src!A->OPT B!,  a value of #src!null! 
contained in the map, and the key not being in the map at all, are two very different things.
#p
Therefore with #umod  the result of a "get" for a key for which "containsKey" does not
return #src!true! is undefined, and the attempt may result in an exception.
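#p
The ambiguity can be reproduced with a plain #src!java.util.HashMap!. The sketch
below also shows one possible "strict" lookup; the class #src!NullOverloading!, the
method #src!strictGet! and the choice of #src!NoSuchElementException! are assumptions
made for this illustration and do not describe the #umod  run-time classes.
#commentchar \

```java
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical illustration, NOT part of the umod run-time library.
final class NullOverloading {
    // With the standard get(), an absent key and a key mapped to null
    // cannot be told apart: both yield null.
    static <K, V> boolean looksAbsent(Map<K, V> m, K key) {
        return m.get(key) == null;
    }

    // Strict lookup: an absent key is an error and never yields a null result;
    // a null result therefore always means "null is stored under this key".
    static <K, V> V strictGet(Map<K, V> m, K key) {
        if (!m.containsKey(key))
            throw new NoSuchElementException("key not in domain: " + key);
        return m.get(key);
    }
}
```
#commentchar /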


// ------------------------------------------------------------------
#h2 #title Documentation
#p
Documentation can be entered in the #umod  source. It will be attached to the generated
code as "Java doc comments", and thus re-appears when generating API documentation of the
generated sources by applying "#src!javadoc!" etc.
Therefore the usual stylistic rules for writing doc comments should be observed,
esp. that the first sentence up to the character sequence "#src!. !" is 
quoted in survey tables.
#p
The #nonterminal docEntry   defines the format for any doc text entry.

#cfRule docEntry  ::=  "DOC"  (characters)* "$$" ;

#p
As shown in the rule #nonterminal umodFile, 
documentation can be attached to the module as a whole.
Similarly, this is allowed by rule
#nonterminal classDef for every class definition, 
by #nonterminal fieldDef  for every field definition,
by #nonterminal enumDef  for an enumeration type as a whole, and 
by #nonterminal enumItem  for every single enumeration value.

// visitor docu ??
// type definition, ??

Additionally, the construction #nonterminal fieldDoc  has been introduced especially 
into the definition #nonterminal classDef  for the 
separation of field declarations and their documentation.

#cfRule fieldDoc  ::=  identifier_lower "DOC"  (characters)* "$$" ;

#p
For all these places, more than one such construct may appear: They will be 
concatenated in text order.
#p
Whenever at least one such doc entry is contained, then #umod   runs in #emph!documented mode!,
otherwise #emph!undocumented!.
#p
In documented mode, a stream-lined pretty print of the original source (as navigable
HTML) will be included
in the "#src!doc-files!" directory, and the generated doc comment will frequently refer
to this (reconstructed) source. 
#p
The toplevel docu will be attached (as doc comment) to the model class; 
additionally it will be written
into the file "#src!package-info.java!", iff #umod   runs in non-monolithic mode (=package mode).
#nl
The toplevel docu will be followed by a second, synthesized comment which reflects
date and time of creation of the java sources, and the command line
parameters. This synthesized comment will always be attached, also in non-documented mode.
#p
All docu attached to fields and classes will be followed by a second, synthesized comment
which gives the link into the pretty-printed source.
#p
If running in documented mode, a warning text is inserted into the generated
Java API doc for every class #emph!not! having documentation text.

#p#kind missing
DOC of interface def

#p#kind missing
DOC of type declarations 


// ------------------------------------------------------------------
#h2 #title Constructors and Default Values for Fields #label txtconstructors
#p
Whenever a new object instance is constructed, the value for every
single field must be defined.
This can be done by
#list
#i implicit default
#i explicit default
#i constructor argument.
#/list

#p
For #emph!implicit! defaults it holds that #ldots
#list
#i a field of type #src!OPT(x)! has the value #src!null! as its default.
#i a field of a non-optional aggregate type (set, list, map, multimap, etc.) has
the #emph!empty aggregate! as its default.
#i all other fields (primitive types and references, which are not optional)
do #emph!not! have an implicit default. 
#/list

#p
An #emph!explicit! default can be given to any field by notation 
mentioned above in #ref txtfieldinitvalues.
A field with neither implicit nor explicit default value must appear as a constructor
argument and is called #emph!obligate field! for the rest of this section.
// NEU seit 20181218
#p
A #emph!minimal constructor! is a constructor the parameters of which are exactly the
values for the obligate fields.
The user can specify one minimal constructor explicitly, or more than one, if the
sequential order of the field values yields different type signatures.
If there is no single explicit minimal constructor, one minimal constructor is
supplied by the #umod  compiler implicitly; its parameters are the values for the obligate
fields in the source text order.
#footnote
Please note that this implicit constructor may lead to overloading conflicts 
between constructor type signatures, e.g. in the case
#nl#src(TOPLEVEL CLASS)
#nl#src(A)
#nl#src(   f1 int     ! C 0/0 ; )
#nl#src(   f2 int               )
#nl#src(   f3 int = 3 ! C 0/1 ; )
#nl
See #ref txt_consabmig.
#/footnote
#p
A special subcase of a minimal constructor is the 0ary constructor, iff there are no
obligate fields. The automated generation is suppressed iff the user defines a 
0ary constructor by explicit Java source, see #ref txtverbatimjava.
#footnote
In an older implementation implicit supply of minimal constructors had been restricted to
0ary constructors. A source text which failed to define at least one 
constructor covering all obligate fields (here and of all superclasses) had been rejected.
This behaviour is still available by the command line switch #src!--constructorsPre20181214 t!
#/footnote


/*  OLD VERSION =================================================================
#p
In case that #emph!all! fields of a class definition (including all those
inherited from a superclass) do have a default value, then a
#emph!0ary! constructor (a constructor with zero arguments) is created
for this class automatically.
=== */

#p
// pre 20181214 All other constructors, i.e. those with parameters, 
All non-minimal constructors 
must be declared #emph!explicitly! by the user.
If the class defines no obligate fields, all constructors of the superclass are
inherited (which is different from Java).
Otherwise, the signatures of these constructors can be extended explicitly to 
make them applicable. At least all obligate fields must be added.
#nl
(Such inheritance is only supported from a superclass which is a #umod  class definition, not from
an external, imported class.)
#p
The declaration of a constructor is done by pragmas following the
field definition, as mentioned above in #ref txtfielddefs, by appending
#nonterminal fieldPragmas . 
The syntax for constructor declaration is defined as #ldots

#cfRule constructorPragma ::= "C" (constructorNumber "/" sequentialOrder)+ ;

#p
E.g. #ldots

#source
TOPLEVEL CLASS
A
   f   int      ! C 0/0  C 1/0 ;
   g   OPT int  ! C 0/1        ;
#/source

#p
Declarations of constructors use a pragma starting with the keyword  "#src!C!".
#nl
The first number following the keyword is a number identifying the constructor.
#nl
The second number, after the slash, indicates the position of the argument
which will be used to initialize the field to which the pragma belongs.
#nl
These position indications only stand for their #emph!sequential order!.
The numbers can increase with arbitrary step width.
#nl
Every combination of constructor and argument number may only appear once 
with all field definitions of the same class definition level.
#nl
Every constructor must initialize all obligate fields.

#p
NB: Since the character "#src!C!" in these pragmas is parsed as
an "identifier", there must be whitespace between it and the first digit.
#p#kind src 
cf. #link 3/eu/bandm/tools/umod/parser/umod.g #text umod.g

#p
So the example above creates two constructors:
#source
  public A (int arg0, int arg1){ f = arg0 ; g = arg1 ; }
  public A (int arg0)          { f = arg0 ; }
#/source
#p
Please note that constructor "1" can only be defined because field "#src!g!" has
a default value (namely #src!null!).
#p
Please note further that constructor "1" is the only possible minimal constructor and would be 
synthesized implicitly if not specified explicitly.
/* ====
The #emph!definedness requirement! says that an
attempt to define a constructor which does #emph!not supply an argument
for all obligate fields // PRE 20181214 those fields which do not have a default value! 
is considered an error and signalled as such.
==*/
#p
In contrast to java, constructors are #emph!inherited! from (#umod -defined) 
superclasses.
This happens in different ways:
#list
#i 
if no pragma with the same constructor number appears in the subclass:
   #list 
   #i 
   if the sub-class introduces no obligate fields,
// only fields with default values (implicit or explicit), 
   then the constructor is inherited "as is" for this subclass.
   #i 
   if there are new obligate fields on this class level,
// fields which would require constructor arguments, 
   then the constructor is #emph!not! inherited by this subclass and any 
   further subclass, and a corresponding #emph!warning! is emitted.
   #/list
#i
if one or more pragmas with the same constructor number do appear in the subclass:

   #list
   #i 
   if the lowest argument number is larger than the largest argument number
   used in the superclass, then the constructor is "extended": The new arguments
   are appended to the list of the arguments of the superclass, the generated
   code assigns the values of the "new" arguments to the corresponding fields,
   after calling the constructor of the superclass with the sequence of
   inherited arguments.#nl
   Again, all obligate fields must be included.
//    Please note that the "definedness" requirement mentioned above applies
//   here accordingly.
   #i
   if the lowest argument number is equal to #src!0! , then the constructor
   number is "recycled" and a totally new constructor chain is started here.
//NEW 20181214
   Please note that all obligate fields must be included in such 
   a constructor explicitly, including those of all superclasses,
   using the "#src!^!" notation, see below.
/* === pre 20181213    Please note that, according to the definedness requirement, 
   #emph!every start of a constructor chain is only possible if the superclass has a 0ary
   constructor. ==*/

   #i 
   if the lowest argument number is not equal to #src!0! , but lower than
   the highest argument number used on the superclass level
   for this constructor, then it is an error.
   #/list
#/list
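
#p
The "extended" case may be sketched in Java as follows. The model in the leading comment
and the class name #src!ConstructorChainSketch! are assumptions made for this
illustration; the code actually generated by #umod  may differ in detail.
#commentchar \

```java
// Hypothetical sketch of an "extended" constructor. Assumed umod source:
//    A
//       f  int     ! C 0/0 ;
//    | B
//       h  string  ! C 0/1 ;
// Argument number 1 in B is larger than the largest number (0) used in A,
// so constructor 0 of A is extended instead of being replaced.
final class ConstructorChainSketch {
    static class A {
        protected int f;
        public A(int arg0) { f = arg0; }
    }

    static class B extends A {
        protected String h;
        public B(int arg0, String arg1) {
            super(arg0);   // the inherited arguments go to the superclass constructor
            h = arg1;      // the appended argument initializes the new obligate field
        }
    }
}
```
#commentchar /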

The special notation 
// "#src!^!<fieldname>" 

#cfRule superField ::= "^" identifier_lower fieldPragmas ; 

#p
(as contained in #nonterminal classDef) 
allows referring to a field of 
some higher level class definition for including its initialization into
a new constructor:
#source 
A
   f   OPT int
   g   OPT int
| B
| | C
   name string  ! C 2/0 ;
   ^f           ! C 2/1 ;
#/source
#p
// NEU 20181214:
Also with this device, one particular field name may appear at most once in a constructor
signature.


#p#kind missing 
link to non-umod superclass (e.g using 0ary constructor, which 
is called by all others !!)


// ------------------------------------------------------------------
#h3 #title Ambiguity in Overloading Resolution for Constructors
#label txt_consabmig
#p
In java constructors are identified by their type signature.
This can lead to overloading situations which cannot be resolved by a
java compiler. E.g.
#source
A 
  a1 OPT int     ! C 0/0      ;
  a2 OPT int     ! C     1/0  ;
#/source
#p
#umod  does #emph!not! warn you in these cases, but lets the java compiler
discover the problem.

// ------------------------------------------------------------------
#h3 #title Defining Constructors with #ital!Verbatim!  Inserted Java Source
#p
If the #ital!verbatim!  inserted java source 
(cf. #ref txtverbatimjava) defines a #emph!0ary! constructor explicitly, then the
implicit creation of such is suppressed.
#p
All other conflicts between explicitly specified constructors and
#ital!verbatim! inserted java source are discovered #emph!not before running the
java compiler!.

// ------------------------------------------------------------------
#h2 #title Pattern Handling Methods
#p
For use with the 
#link 2/eu/bandm/tools/paisley/package-summary.html #text Paisley#/link   
  pattern matching library, 
special pattern handling methods can be generated, two for every class and
one for every field definition.
This is controlled by the command line switch #src!--patterns!, see
#ref txtcommandlineoptions.

#p
The names and meanings of the generated methods are as follows:
#commentchar \
#source
   class A extends B {
     protected F1 f1 ; 
     protected F2 f2 ; 

     public static Pattern<A> get_f1 (Pattern<F1> p){..}
     public static Pattern<A> get_f2 (Pattern<F2> p){..}
     // These patterns match the object iff the pattern argument "p"
     //   matches the resp. field value.
   }

   class __Patterns {
     // ...

     public static Pattern<Object> cast_A (Pattern<? super A> p){..}
     // matches iff p matches and argument is instance of A 

     public static Pattern<? super A> term_A
        ( Pattern<? super B> superpattern,
          Pattern<? super F1> f1_pattern,
          Pattern<? super F2> f2_pattern){..}
     // matches iff object matches superpattern (= a pattern defined for the
     //   superclass) and all field values match the resp. patterns.
     // NOTE: one can treat any of these as "don't care" by setting them
     //         to "Pattern.any"

     // ...
   }


#/source
#commentchar /

// ------------------------------------------------------------------
#h2 #title Visitors #label txtvisitors
#p
#umod  supports the code generation for different types of visitors.
They are used in the traditional way, i.e. used as a superclass
for a user-defined class, which overrides only those methods which deal
with those parts of the model the user is interested in.
#p
In a #umod  source, the declaration of a visitor involves two steps:
#list 
#i definition of one or more different #emph.traversal orders..
#i declaration of the visitor classes.
#/list
#p
The different traversal orders are identified by numbers.
They are declared within the class definitions, by
appending pragmas to the field definitions.
This is similar to constructor declarations (#ref txtconstructors).
#p
The syntax is
#cfRule traversalPragma ::= "V" (traversalNumber "/" sequentialOrder (lrCode WS)* )+ ;
#cfRule lrCode ::= ( "L" | "R" )+ ; 
#p
So the first number after the leading "#src!V!" identifies the traversal
order. The second number, after the slash, indicates the sequential order
in which the corresponding field's contents  will be visited, relative to the
contents of the other fields of this level of class definition.
#p
(As with constructor declarations, there must be white-space after the "#src!V!".)
#p
As with constructor declarations, the position numbers are only relevant w.r.t. their
sequential order; they can increase with arbitrary step width.
#p
In contrast to constructor declarations, they are not related to position 
numbers used in the definition of the superclass. They only define
the traversal order among the fields of this level of class definition.
The traversal order w.r.t. the superclass cannot be influenced by their selection, 
but is defined by the "kind" of generated visitor, declared as described below.
So their sequence can start with an arbitrary numeric value.
#p
Visitor and constructor defining pragmas can arbitrarily be mixed in the
pragma section "#src!!!...#src!;!" at the end of a field definition.
#p
Example:
#source
   A
   | B1
     b1  B1               ! C 0/0  V 0/1 1/0 ;
     b2 SEQ B1            !        V 0/0     ;
   | | C1
       c1 string -> C2    !        V 0/2 V 1/20 C 0/20 ;
       c2 OPT int 
#/source
#p
Whenever the type of a field a visitor shall follow contains #src!MAP! or
#src!REL! constructs, an additional #nonterminal lrCode  can be inserted after
the numeric code. It indicates whether to visit the left or the right side of
each level of these binary type constructor applications. The codes can
enable  leaves or whole sub-branches, as in
#source
   A
   | a1  (A -> int) -> SEQ (A <-> A)  ! V 0/0 LL   V 1/0 LL R  ; 
#/source
#p
where traversal code #src!0! will only select the references to #src!A!  in the
domain of the domain, while #src!1! will visit additionally both sides of all 
contained multi-maps.
#emph!Please note! that the #nonterminal lrCode  does #emph!not! alter the sequential
order of visiting, but constitutes only an enabling condition.
#nl
(In general: Whenever you want to program an algorithm which depends on 
a "local" consequence of a "global" property, as it is the case with 
the sequential order of visiting, then the code is better maintainable 
when realizing this explicitly order-respecting behaviour
#emph!locally!. The required extra code is in most cases only a three-liner !-)
#p
At the beginning of each #umod   definition  file the generated
visitors are declared, as already mentioned in the grammar rule 
#nonterminal modelLines above.


#label txt_syntax_visitor_declaration
The syntax for these declarations is #ldots
#cfRule 
visitorDeclaration ::= "VISITOR" int identifier_upper 
  \nl                         ( "MULTIPHASE"
                        | "IS" "PRINTER"
                        | "IS" "REWRITER"
                        | "IS" "COREWRITER" 
                        )?    (docEntry)?           ";" ;

#p
#nonterminal identifier_upper  directly gives the name of the generated java class which
realizes the visitor.
#nl
The #nonterminal int  indicates which traversal order is used by the
generated visitor. Of course, one and the same number can be
used for more than one visitor.
#nl
Then follows the optional indication of the visitor kind. If this is omitted,
a "simple kind" visitor  is generated.


// ------------------------------------------------------------------
#h3 #title Common Base Class and Calling of Visitors
#label txtvisitorbasics
#p
The common base class generated by #umod  for all visitors is
one and the same abstract class. Let this be called
"#src!BaseVisitor!" in this and subsequent paragraphs. (Indeed, in the generated code it is 
currently named "#src!MATCH_ONLY_00!",
but this name is normally not visible to the user and may change without notice.)
#nl
For each class #src!C! of the model, #src!BaseVisitor!  provides a method 
#src.public void match (C x){}.. 
#nl
Additionally, it provides a method
#src!public void match (java.lang.Object x){}!, which dispatches on the
dynamic type of an arbitrary object. Neither the class of this object
needs to be known statically, nor even whether it is an instance of any
model class at all.
#p
Calling #src!match()! on an arbitrary object from the model 
is the most common way of activating a visitor.
Internally, the specialization on the argument is performed explicitly by a chain
of #src!if(x instanceof C'){...}else!-statements.
#p
Whenever the #emph!most special! model class
 #src!C'! of the visited object is identified, this information
is carried over into the static type information of the visitor's source code,
and the corresponding "#src!action()!" method is called by explicit casting.
#nl
The default "#src!action()!" method in #src!BaseVisitor!  
calls the #src!action()! method with the argument cast statically to its
superclass, or, if the class is a top-level class of the model,
a special #src!nomatch()! method.
The latter raises a #src!RuntimeException!  unless the #src!partial! flag is
set to true. This feature can be used to discover forgotten cases when
all cases are assumed to be covered.
#p
The different kinds of derived visitors, automatically generated or user defined,
differ in the contents of this #src!action()! method,
as described in the following sections.
#p
In case that the visited object is neither an 
instance of any model class, nor of an imported external class,
then the method #src!BaseVisitor.foreignObject(Object o)! is called.
This method #emph!must! be overridden whenever a #src!match()! shall be applied
also to objects of unknown classes. 
As a default, this method #emph!throws a Runtime Exception!
with the message that "o" is not an instance of a model class.
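#p
The dispatch described above can be sketched in a few lines of Java.
This is a hand-written miniature under assumptions, not the generated #src!BaseVisitor!:
the two model classes, the class name #src!DispatchSketch!, the visibility of the members
and the exception messages are hypothetical.
#commentchar \

```java
// Hypothetical miniature of the dispatch scheme, NOT the generated code.
final class DispatchSketch {
    static class A {}              // top-level class of the model
    static class B extends A {}

    static class BaseVisitor {
        protected boolean partial = false;

        // Finds the most special model class and calls action() by explicit casting.
        public void match(Object x) {
            if (x instanceof B) action((B) x);
            else if (x instanceof A) action((A) x);
            else foreignObject(x);
        }

        // Default: delegate to the action() for the superclass.
        public void action(B x) { action((A) x); }

        // Default for a top-level class: no case matched.
        public void action(A x) { nomatch(x); }

        protected void nomatch(Object x) {
            if (!partial)
                throw new RuntimeException("no action defined for " + x);
        }

        protected void foreignObject(Object o) {
            throw new RuntimeException(o + " is not an instance of a model class");
        }
    }
}
```
#commentchar /
#p
A derived visitor which overrides only #src!action(A)! is thus also reached
by objects of class #src!B!, via the default delegation.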
#p
Furthermore, for each field definition "#src!f!" which refers to an
aggregate (list, set, map) of instances of model classes, 
the generated code (for the class which contains this field)
provides the method "#src!public void descend_f(final BaseVisitor visitor){..}!".
#nl
This code loops over the contents of the aggregate automatically and can be
used #emph!from anywhere! for explicitly applying a visitor  to all
elements contained in a given aggregate field.
#nl
If a field f carries a visitor code with number n and an 
#nonterminal lrCode which selects only a subset of all branches, then 
a #src!descend_n_f(BaseVisitor)! method is generated which respects this
selection.
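
#p
For a field of list type, such a #src!descend_f()! method may be pictured as in the
following sketch. The classes #src!Owner! and #src!Item! and the recording visitor are
hypothetical; aggregates other than lists and the #nonterminal lrCode  selection are not
covered.
#commentchar \

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, NOT the code generated by umod.
final class DescendSketch {
    static class Item {
        final String name;
        Item(String name) { this.name = name; }
    }

    // Stand-in for the visitor base class; it merely records what it is applied to.
    static class BaseVisitor {
        final List<String> seen = new ArrayList<>();
        public void match(Object o) {
            if (o instanceof Item) seen.add(((Item) o).name);
        }
    }

    static class Owner {
        protected final List<Item> f = new ArrayList<>();

        // Applies the visitor to every element of the aggregate field f, in list order.
        public void descend_f(final BaseVisitor visitor) {
            for (Item x : f) visitor.match(x);
        }
    }
}
```
#commentchar /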

#p
Instances of "#src!BaseVisitor!" itself can be used for just classifying model objects
by overriding some "#src!action()!" methods, while
not providing any default descending behaviour.


#p#kind missing 
"partial" flag behaviour not supported on the next class levels ????


// ------------------------------------------------------------------
#h3 #title "Simple Kind" Visitor
#p
The #src!action()! method of a generated visitor of the "simple kind"
first calls the #src!match()! method on the sequence of fields, as
determined by the selected traversal order, and then calls
#src!action()! with the parameter cast statically to its superclass.
#p
#commentchar \
For example, assume a model definition like #ldots
#source
MODEL example = 
VISITOR 0  V0 ;  // simple kind
VISITOR 0  V1 MULTIPHASE ;

TOPLEVEL CLASS 
X
A
      a1 X             ! V 0/0 ; 
      a2 SEQ (X->X)    ! V 0/1 ; 
| B
      b1 X             ! V 0/0 ; 
      b2 SEQ (X->X)    ! V 0/1 ; 
| | C
      c1 X             ! V 0/0 ; 
      c2 SEQ (X->X )   ! V 0/1 ; 
<<< JAVA static class Derived extends V0 {
           public void action (B x){
             //do something
             super.action(x);
           }
         }   
#/source
#commentchar /

#p
Then a typical control flow when calling "#src!Derived.match(o)!" with an argument 
which happens to be of class "#src!C!" can be depicted as follows:

#commentchar \
#source
                    Derived.match(o)
                    /
         __________/
        /          
       V  
BaseVisitor.match(Object o) 
       |        
       V        
BaseVisitor.match(A x)  ........................> V0.action(A x)
       |           		             ^\                            
       |           		             | \                           
       |           		             |  +-> match(x.a1);
       |           		             |      x.descend_a2(this)
       V                                     |  
BaseVisitor.match(B x) ....> Derived.action(B x){     |
       |           //do something            |                             
       |           super.action(x);          |                      
       |         }   |                       |                          
       |             +-----------------> V0.action(B x)                       
       |         ^                            \                            
       |         | 		             . \                           
       |         | 		             .  +-> match(x.b1);      
       |         | 		             .      x.descend_b2(this);
       |         |		             .      this.action((A)x); 
       V         |		                  
BaseVisitor.match(C x) -- |  -------------------> V0.action(C x)                          
                 |		              \                            
                  \      	               \                           
                   \	   	                +-> match(x.c1);
                    \	   	                    x.descend_c2(this);
                     -----------------------------< this.action((B)x);                  		                  
#/source
#commentchar /
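
#p
The same control flow can be reproduced with a small, runnable Java sketch.
It is a simplification under assumptions: the fields #src!a2!, #src!b2! and #src!c2! are
omitted, #src!BaseVisitor! is folded into #src!V0!, the #src!instanceof! chain tests the
most special class first, and a #src!trace! list is added only to make the order of the calls
observable. None of this is the code generated by #umod .
#commentchar \

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of a "simple kind" visitor, NOT the generated code.
final class SimpleVisitorSketch {
    static class X { final String name; X(String name) { this.name = name; } }
    static class A { X a1; A(X a1) { this.a1 = a1; } }
    static class B extends A { X b1; B(X a1, X b1) { super(a1); this.b1 = b1; } }
    static class C extends B { X c1; C(X a1, X b1, X c1) { super(a1, b1); this.c1 = c1; } }

    static class V0 {
        final List<String> trace = new ArrayList<>();

        public void match(Object o) {
            if (o instanceof C) action((C) o);
            else if (o instanceof B) action((B) o);
            else if (o instanceof A) action((A) o);
            else if (o instanceof X) action((X) o);
            else throw new RuntimeException(o + " is not an instance of a model class");
        }

        public void action(X x) { trace.add("visit " + x.name); }

        // First descend into the fields of this level, then continue on the
        // superclass level with a statically cast argument.
        public void action(A x) { match(x.a1); }
        public void action(B x) { match(x.b1); action((A) x); }
        public void action(C x) { match(x.c1); action((B) x); }
    }

    static class Derived extends V0 {
        @Override public void action(B x) {
            trace.add("Derived.action(B)");   // "do something"
            super.action(x);
        }
    }
}
```
#commentchar /
#p
Calling #src!match()! of a #src!Derived! instance with an object of class #src!C! first visits
#src!c1!, then reaches #src!Derived.action(B)!, and only then visits #src!b1! and #src!a1!.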

/* ?????????????????????????????????????????????
#p
The picture shows clearly #bold that there is no inheritance/code reusage#/
 between the #src!action()! methods: In spite of being a sublass 
of #src!B!, the processing of an object of class #src!C!  never reaches
the method #src!action(B x)!.#nl
With this kind of visitors, all abstraction/code re-usage has to be
coded explicitly, by local subroutines called from multiple
#src!Derived.action()!-methods.
??????????????????????????????????????????????? */


// ------------------------------------------------------------------
#h3 #title Multiphase Visitor
#p
With these simple visitors the code re-usage (induced by a common 
superclass and thus a common "#src!action()!" method) does only
take place #emph!after! the specific descends have been carried out
(by the more specific "#src!action()!" method). 
#p
Visitors of the #src!MULTIPHASE! kind do separate the code abstraction 
and the descending into different phases.
//support a finer granularity of
//control by separating the visting process into three phases.

The generated code basically looks like this:


/* ========================== ???????????????????????
The #java "inheritance" axis is employed for deriving one visitor from
another.
Therefore explicit coding is needed to the support the #emph!re-usage! of
code w.r.t. the #em.visited objects..
This code is generated in the  #src!MULTIPHASE! kind of visitors.
#nl
Their #src!action! method is defined by #ldots
======================================= */


#source

  public static class V1 extends BaseVisitor {
    protected boolean haspre=true;
    protected boolean hasdescend=true;
    protected boolean haspost=true;

    public void action (C x){
      if (haspre) pre(x);
      if (hasdescend) descend(x);
      if (haspost) post(x);
    }

    public void pre (C x) {pre((B)x); }
    public void pre (B x) {pre((A)x); }
    public void pre (A x) {}

    public void descend (C x) { match(x.c1);
                                x.descend_c2(this);
                                descend((B) x);
                              }
    public void descend (B x) { match(x.b1);
                                x.descend_b2(this);
                                descend((A) x);
                              } 
    public void descend (A x) {  ...
                              }

    public void post (C x) {post((B)x); }
    public void post (B x) {post((A)x); }
    public void post (A x) {}
  }

#/source
#p
The variables #src!haspre!, #src!hasdescend! and #src!haspost! are "global 
switches" to enable these three phases independently. They can be overwritten
by the derived visitor's code. E.g. they can be set to "false" once,
at initialization time, or switched on and off dynamically during execution.
#p
This more complex schedule allows specialization/inheritance of activities, without
disturbing the inheritance w.r.t. descending.
Again, this may become clear when looking at a graphical representation of the
resulting control flow:


#source
match(Object o)       :
       |              :                   ..............
       V              :  user-defined ... :            :
match(A x)            --------------------+            :
       |                         ^             ^       :              
       |                         |             |       : ...specialized
       V                         |             |       :      processing           .
match(B x)                   pre(B x)  descend(B x);   +------------------
       |                         ^             ^\                  ^
       |                         |             | \=> call match()  |
       |                         |             |     for fields on |
       |                         |             |      "B"-level    |
       V                         |             |                   |
match(C x)--->action(C x)--> pre(C x); descend(C x);         post(C x)
                                             \
                                              \=> call match() for 
                                                   fields on "C"-level
#/source

#p
Please note that with this variant there is  #bold no inheritance#/ between 
the #src!action()! methods. 
For code re-usage you always have to to program the
#src!pre()! methods, possibly disabling both #src!descend()! and #src!post()!.
#p
Esp. when adding the #src!MULTIPHASE! behaviour to an existing "simple" visitor,
the existing #bold inheritance between#/bold  #src!action()! #bold  will 
be lost#/bold and
replaced by the described three(3) separate inheritances!


/* ========================== ?????????????????????????????
#p
#bold Please note#/ that there is (as with simple kind visitors)
still #bold no inheritance#/ between 
the #src!action()! methods, --- a fact the author of #umod  himself quite
frequently forgot when using it #src!!-)! //
For code re-usage you always have to to program the
#src!pre()! methods, possibly disabling both #src!descend()! and #src!post()!.
==================================================== */

// ------------------------------------------------------------------
#h3 #title Rewriters
#label txt_rewriters
#p
There are two kinds of rewriters: A visitor declared as #src!COREWRITER! can
deal with cycles, but #emph!always! creates copies, even if nothing changes.
#nl
A visitor declared as #src!REWRITER! cannot deal with cycles, but does
cloning only if necessary. It is most convenient for transforming
"term-like" data, and preserves sharing as far as possible.
#p
Both kinds of rewriters are non-destructive: Whenever only a single value
must be changed due to rewriting, a new copy of the containing object
is created, altered and used for the further rewriting process.

#p
The usage of both kinds of rewriters follows the pattern #ldots
#source
  MyRW rw = new MyRW();
  rw.match(o);
  Object rewritten_object = rw.get_result();
#/source
#p
For convenience this is the same as #ldots
#source
  Object rewritten_object = (new MyRW()).rewrite(o);
#/source
#p
#ldots and there is also a typed variant #ldots
#source
  A original ; 
  A rewritten_a = (new MyRW()).rewrite_typed(original);
#/source
#p
In case of a (non-co-)rewriter there is a second constructor
#source
  public RW (RW parent){..}
#/source
#p
which takes an existing rewriter as its argument.
This is made the "parent" rewriter, and all cache look-ups will
be passed to this parent, iff they are unsuccessful in the local cache.
So things like "nesting and inheritance of scopes" can easily be modelled.
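#p
The look-up through the parent chain can be pictured by the following sketch.
It is not the generated code: the class name "ScopedCache" and its methods are
hypothetical and chosen only for illustration, and a plain hash map is assumed
as the cache.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the cache part of a (non-co-)rewriter.
class ScopedCache {
  private final ScopedCache parent;                  // null for the outermost scope
  private final Map<Object,Object> cache = new HashMap<>();

  ScopedCache(ScopedCache parent) { this.parent = parent; }

  void put(Object original, Object result) { cache.put(original, result); }

  // Asks the local cache first and passes the look-up to the parent
  // only if it is unsuccessful locally. Returns null if no scope knows the object.
  Object lookUp(Object original) {
    for (ScopedCache s = this; s != null; s = s.parent) {
      if (s.cache.containsKey(original)) return s.cache.get(original);
    }
    return null;
  }
}
```

An inner scope thus shadows entries of its parent, while all other entries of
the parent stay visible.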
#p
When the generated (non-co-)rewriter is called directly, 
without any method being overridden by the user,
it performs an #emph!identity! transformation, i.e. it returns its argument unchanged.
#nl
But when the generated #emph!co-!rewriter is called directly, without any 
overriding by the user, it creates a #emph!deep copy!  of its argument.


#p
All generated rewriters contain two methods for every class definition 
#src!C!, namely #src!action(C)! and #src!rewriteFields(C)!.
#nl
#src!action()! is called by the #src!match()! cascade, as defined for the
general case and described above in 
#ref txtvisitorbasics.
It (1) performs the preparatory steps of rewriting, and (2) is not
specially concerned with the fields selected by the traversal order.
#p
Both kinds of methods may be overridden by the user.
A certain contract must be kept, which is best explained by looking
at the behaviour of the generated code. 

#p
#bold!For the (non-co-)rewriter!, the interface to use consists of #ldots
#commentchar\
#source
\\private   Map<Object,Object> cache  // can be set to null for disabling caching
\\private                             // it must not be manipulated explicitly,
   boolean lookUp(Object) // sets result/ismulti flag and returns true,
                          //   iff an entry exists in the rewriter's cache, 
                          //   or in that of its parent.
   void putToCache()      // memorize the currently set result (single or multiple)
   void useCache(boolean) // sets whether the generated "action()" may read the cache

   Object original ;      // must be readable and writeable by "action()"
   Object getResult();    // returns the most recently set result
   boolean isMulti();     // returns whether this is multiple (= a list)

   void revert();         // reset result to original
   void substitute(Object newresult);
                          // set newresult as result
   void substitute_multiple(List<Object> newresults);
                          // set newresults as multiple result
   void substitute_empty(); // set empty list as multiple result
#/source
#commentchar/

#p
In case of the (non-co-)rewriter, the 
generated method #src!RW.action(C o)! does the following:
#list
#i It looks up in the cache #src!RW.cache! whether
the object #src!o! has already been processed by this rewriter instance
(or by some rewriter in the #src!parent! chain).
In this case the result of the earlier visiting process is drawn from the cache and 
stored as result (single or multiple),
and the method returns immediately.
#nl
This cache look-up can explicitly be disabled by calling #src!useCache(false)!.

#i 
If no result is retrieved from the cache, then the object #src!o! itself 
is stored into #src!original! and memorized as the
(likely only intermediate!)  value of #src!result!.
#i 
A copy (i.e. a shallow clone) of #src!o! is created,
#i and the method  #src!rewriteFields(Object)! 
is called with that clone as an argument.
#i When this method returns,
whatever is currently the value of #src!result! is left there
(for the caller of #src!action(Object)!) and saved to the cache
as the rewriting result of the visited object #src!o!, by calling #src!putToCache()!.
#/list

The method #src!rewriteFields(C c)! generated for every class #src!C! performs the
non-generic, field structure specific rewriting.
Its argument is the clone of the object.
It first calls #src!rewriteFields((D)c)! for the superclass "D".
Then it saves the  current value of "result" into a local variable.
It assumes that this points either to the original or to the clone,
depending on whether changes to any field have happened in the super-class(es).
#p
Then for all those fields which are selected by the chosen traversal order,
#src!match()! is called on their contents. 
#p
Whatever this method returns in the variable #src!result! is compared
with the original value contained in the field. Iff a change has happened, the result
is stored into the field of the clone, 
and the overall local result of the method is overwritten to point to the clone.
#p
Finally, after all fields from the corresponding traversal selection
have been rewritten, the local result is copied to #src!result!, for communicating
it to the caller (which may be a #src rewriteFields()#/  of a sub-class or
the #src action()#/ method of the same class).
#p
If a field value is an aggregate (i.e., is of a "container type"), 
(1) a temporary new aggregate object is constructed.
Then (2)  #src!match()! is called sequentially on the contained objects,
and (3)  #src!result! is step by step treated accordingly, i.e. stored into
the temporary aggregate.
Here also the occurrence of changes is monitored in a similar way as described
for simple values. Whenever a change happens at an arbitrarily deep nesting level,
then the local result is re-adjusted to point to the clone.
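#p
The "clone only if necessary" discipline can be illustrated by a small
hand-written sketch. The classes "Term" and "RenameSketch" are hypothetical and
not generated by the tool; caching and the result/original variables are left
out, and only the pointer comparison and the switch from original to clone are
shown.

```java
// Hypothetical term class; in a real project it would come from the model.
class Term {
  final String label;
  Term left, right;                      // both null for leaves
  Term(String label, Term left, Term right) {
    this.label = label; this.left = left; this.right = right;
  }
  Term shallowCopy() { return new Term(label, left, right); }
}

class RenameSketch {
  // Replaces every leaf labelled "old" by a new leaf labelled "new".
  // Returns the very same object if nothing below it has changed.
  static Term rewrite(Term t) {
    if (t.left == null && t.right == null) {
      return t.label.equals("old") ? new Term("new", null, null) : t;
    }
    Term clone = t.shallowCopy();        // prepared in advance
    Term result = t;                     // so far the original is the result
    if (t.left != null) {
      Term r = rewrite(t.left);
      if (r != t.left) { clone.left = r; result = clone; }    // pointer comparison
    }
    if (t.right != null) {
      Term r = rewrite(t.right);
      if (r != t.right) { clone.right = r; result = clone; }
    }
    return result;
  }
}
```

Sub-terms without any change are returned as they are, so they stay shared
between the original and the rewritten structure.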
#p
Any user-defined, overriding method may behave similarly, e.g. define the
result of the rewriting by calling the methods from the interface above.
#p
A user-defined, overriding method may return more or less than one object
by calling #src!substitute_multiple(List<Object>)! or
#src!substitute_empty()!. This list of objects
will be inserted into the nearest enclosing list or map structure.
Up to this enclosing structure, multiplicity distributes! 
#p
E.g. having a structure and code  like (in a symbolic notation!)
#source
     A     a  SEQ (B * C)

     action (B b){ original = b ; substitute_multiple(new List(b1, b2)); }
     action (C c){ original = c ; substitute_multiple(new List(c1, c2, c3)); }

     a1 = { (B1,C2) }
#/source
#p
#ldots this will yield #ldots
#source
     rewrite(a1) = { (b1, c1), (b1, c2), (b1, c3), (b2, c1), (b2, c2), (b2, c3) }
#/source
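#p
The distribution of multiple results over a pair can be sketched as a simple
cartesian product. The class "DistributeSketch" is hypothetical and uses
strings instead of model objects; it only mirrors the symbolic example and is
not the generated code.

```java
import java.util.ArrayList;
import java.util.List;

class DistributeSketch {
  // Combines every replacement of the B component with every replacement
  // of the C component; the B component varies slowest.
  static List<String> distribute(List<String> forB, List<String> forC) {
    List<String> pairs = new ArrayList<>();
    for (String b : forB) {
      for (String c : forC) {
        pairs.add("(" + b + ", " + c + ")");
      }
    }
    return pairs;
  }
}
```

In this sketch an empty replacement list for one component makes the whole
pair vanish from the enclosing sequence.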

#p
Please note that the level which calls "#src!match!" always needs 
the #emph!pointer value! comparison #src original!=getResult()#/  for determining
whether a relevant change has happened. So it is part of the
contract of #src!action()! to set #emph!both! values before returning!

#p
#bold The operations of the #ital!co-!rewriter#/bold  are much simpler.
The interface is 
#commentchar\
#source
   boolean lookUp(Object) // sets the variable "result" and returns true,
                          //   iff an entry exists in the co-rewriter's cache, 

   void putToCache(origObj, newObj)
           // memorize newObj as the rewriting result of origObj
           // set the value of result=newObj
           // This can ONLY BE CALLED ONCE for each key

   Object getResult()     // returns the most recently set result
   rewriteDone(Object key)// restores the "result" value to the clone of "key"
                          //   (this is called before returning from the 
                          //    rewriting method, mostly "action(Object)")
#/source
#commentchar/

#p
The generated #src!action(o)! method #ldots
#list
#i 
#ldots first creates  #src!clone=o.doclone()!.
#i
Then it enters the clone into the cache #emph!in advance!.
Only this enables the generated co-rewriter to deal with cyclic data.
#i
Finally it calls #src!rewriteFields(clone)!.
#/list
The method  #src!rewriteFields(clone)! calls #src!match(clone.get_f())! on
all selected fields, as in the non-co-rewriter case. But it need not
monitor whether changes occur, since all objects are copied anyhow.
#p
When the user overrides the generated #src!action(Object o)! method, the user's code
should #ldots
#list
#i look up in the cache whether #src!o! has already been visited.
#i enter a new rewriting result into the cache by calling 
#src!putToCache(object, object)!
#emph!before! descending into sub-fields, whenever there could be a 
("cyclic") path in the sub-structure which leads back to the currently
rewritten object!
This method can only be called #emph!once! for each key and will throw an 
#src!InvalidStateException! iff the key is already contained in the map.
#i
call #src!match()! on field contents and update the fields of the clone
by setting them to the value of #src!getResult()!.
#i
immediately before returning, if any recursive descent has happened,
call  #src!rewriteDone(orig)! 
to restore the result variable to the clone which has been cached for #src!orig!.
#/list
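#p
Why the clone must be cached #emph!in advance! can be seen in the following
sketch of a deep copy for cyclic data. The classes "Cell" and "CoCopySketch"
are hypothetical; the sketch uses a plain return value instead of the
result variable of the generated co-rewriter and assumes an identity based cache.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical model class with a field which may close a cycle.
class Cell {
  final String name;
  Cell next;
  Cell(String name) { this.name = name; }
}

class CoCopySketch {
  private final Map<Cell,Cell> cache = new IdentityHashMap<>();

  Cell copy(Cell o) {
    if (o == null) return null;          // null values are not visited
    Cell done = cache.get(o);
    if (done != null) return done;       // already visited: re-use its clone
    Cell clone = new Cell(o.name);
    cache.put(o, clone);                 // enter the clone BEFORE descending
    clone.next = copy(o.next);           // this path may lead back to "o"
    return clone;
  }
}
```

If the cache entry were made only after descending, the recursion would never
terminate on a cycle.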


// ------------------------------------------------------------------
#h3 #title Rewriting of Aggregates
#p
When rewriting field contents of  #src!SET! and #src!SEQ! type,
the rewriting is done element-wise; in the
case of #src!SEQ! this happens from "left to right". But of course a declarative, sequence independent
style of coding is always more robust and more readable.
#p
This is esp. true when rewriting #src!MAP! and #src!REL! type values.
The #java/  libraries which realize the corresponding data structures have
imperative behaviour: E.g. the #src!map! class has  overwriting semantics:
Whatever is put LAST determines the current value. Of course this can
make programs very hard to understand.
#p
Here our approach is more declarative, and independent of this sequential order.
Consider the following diagram:

#source
                    M
           --------------------->
         |                        |
         |  L                     | R
         V                        V
           =====================>
                    M'
#/source
#p
Let "M" be the mapping which shall be rewritten. It can be a map or a
multi-map.
#nl
Then rewriting is applied to the domain of M, yielding a new, auxiliary
mapping "L", and then to the range of M, yielding "R".
When we allow  "#src!substitute_multiple()!", then one single (1) element can be
re-written to more than one (>1) elements, and we get multimaps for L and/or R.
Otherwise we get maps.
#p
As can easily be seen in the diagram, the result M' of rewriting M is defined as
#nl
#src!    L-inv o M o R!
#p
If the declared type of M (and consequently M') is a multimap, this works in any case.
#p
If the declared type of M (and M') is a map, then L-inv and R must be
maps, i.e. L can be a multi-map, but must be injective, while R must be
a map. If these conditions are violated by the user-defined rewriting rules
(which define L and R) when applied to the current data M, then an
exception is thrown.
The result is independent of any sequential order of API calls.
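#p
For the case that M is declared as a map, the composition can be sketched as
follows. The class "MapRewriteSketch" is hypothetical and works on strings;
the two function parameters stand for the user-defined rewriting of keys (L)
and of values (R), and a list with more than one element models
"#src!substitute_multiple()!". Which exception is thrown by the generated code
is not specified here; #src!IllegalStateException! is only an assumption.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class MapRewriteSketch {
  // Computes M' = L-inv o M o R for a result declared as single-valued map.
  static Map<String,String> rewriteMap(Map<String,String> m,
                                       Function<String,List<String>> left,
                                       Function<String,List<String>> right) {
    Map<String,String> result = new LinkedHashMap<>();
    Map<String,String> origin = new LinkedHashMap<>();   // new key -> old key
    for (Map.Entry<String,String> e : m.entrySet()) {
      List<String> values = right.apply(e.getValue());
      if (values.size() > 1)
        throw new IllegalStateException("R is not a map at " + e.getValue());
      for (String newKey : left.apply(e.getKey())) {
        String earlier = origin.put(newKey, e.getKey());
        if (earlier != null && !earlier.equals(e.getKey()))
          throw new IllegalStateException("L is not injective at " + newKey);
        for (String newValue : values) result.put(newKey, newValue);
      }
    }
    return result;
  }
}
```

Both checks depend only on L, R and the data M, not on the order in which the
entries are visited.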


// ------------------------------------------------------------------
#h3 #title Visiting and Rewriting "#src!null!"
#p
"The invention of #src!null! was a billion dollar mistake".
#p
Of course you should avoid using it, whenever possible.
One major achievement of #umod  is to eliminate illegal null references;
legal null references must be declared 
explicitly by the type constructor "#src!OPT!", see #ref txtstrictness.
#p
#src!null! behaves in a very irregular way.
#list
#i
It is a value which has a special type, which is sub-type of any other type.
#i
It can be cast into a certain, special type and thus used
for controlling overloading resolution.
#i
But the "#src!instanceof!" test again shows a different behaviour.
#/list

#p
On the static, text level, there are always different "types" of null,
so we could have supported
#src!match((A)null)! and  #src!action((A)null)!.
The latter could even make a static cast to the superclass, say "B",
symbolically written as
#nl
#src!action((A)null){  ! #nl
#src!  match((B)null); ! #nl
#src!}                 ! #nl
#p
But of course you cannot descend to any field in the null case, so at least when
#src!action()! starts doing so, the value #src!null! must be treated specially.
#p
But, even worse, you cannot store all these different null values to a cache!
They all are the same, when seen as a runtime value! (Of course you COULD
introduce auxiliary wrappers which tag all these different nulls.
But this would be a lot of work for a construct which, as demonstrated
above, is better avoided anyhow!)
#p
So we decided #bold  not to visit nor to rewrite any null value!#/  
As soon as any visitor or rewriter finds a value (contained in a field or
an aggregate) to be #src!==null!, it does not do anything but leaves it
unchanged.
#p
This is not really a problem, because you can treat the (rare!) cases when
field values may be #src!==null! explicitly, one step earlier, when visiting
the containing object itself! This is much more sensible also because at this point
the context of the #src!null! value is still known. A (theoretically possible,
but not implemented) 
visitation of a #src!null! value would require explicit passing of additional
information, anyhow, to be of any worth.
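#p
Treating the null case "one step earlier" can look like the following sketch.
The classes "Employee" and "BossReport" are hypothetical; the method
#src!action()! only plays the role of a user-defined visitor method and is
not derived from generated code.

```java
class Employee {
  final String name;
  final Employee boss;                   // declared optional: may be null
  Employee(String name, Employee boss) { this.name = name; this.boss = boss; }
}

class BossReport {
  final StringBuilder out = new StringBuilder();

  // Plays the role of a user-defined action() for the containing object.
  void action(Employee e) {
    if (e.boss == null) {
      // The null case is handled here, where its context is still known.
      out.append(e.name).append(" has no boss\n");
    } else {
      out.append(e.name).append(" reports to ").append(e.boss.name).append('\n');
      action(e.boss);                    // descend only into non-null values
    }
  }
}
```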


// ------------------------------------------------------------------
#h3 #title Diagnosis
#p
Each visitor-based processing code is somewhere in the middle
between "declarative" and "imperative" style of programming.
Being determined by the selection of the methods overridden
as well as by the processed data, the resulting flow of control
can be quite surprising.
Therefore interactive debugging is supported by the generated code.
#p
Every visitor/rewriter has a field
#source
    protected java.io.PrintStream _visitor_debug_stream = null;
#/source
#p
Whenever this value is #src(!=null), some intermediate steps 
#nl
((currently: only the replacing of the result by the clone AND
sub-change in MAPs  in the (non-co-)rewriting process))
#nl
 are dumped to this #src!PrintStream!.
#p
When the #emph!command line switch! "#src!--visitordebug! " is set to true
for code generation, more debugging code will be included,
see #ref txtcommandlineoptions.

#p#kind missing
Clean up !!
//for (non-co-)rewriting sets and lists the results of each descending,
//and #emph!before!  descending in a multiphase visitor.
#nl


// ------------------------------------------------------------------
#h3 #title Optimization
#p
By setting the command line switch "#src!--visitoroptimize!" to true
(see #ref txtcommandlineoptions), all generated visitor and rewriter code
will incorporate the following  optimization tactics:
#list
#i
When compiling the model, #umod  makes an "SCC" analysis of all model
classes w.r.t. the "associations" (in UML speak) defined by the types
of the field definitions, 
#i and then analyses which SCCs are reachable by every distinct field.
#i The results of this analysis are encoded into some static final data, and
thus available at runtime.
#i Whenever (at runtime)
the code of a user defined class, derived from a certain generated
visitor/rewriter, is  #emph!loaded!, this code is inspected for the set of
classes for which an override of any method does exist.
#nl
(This analysis is performed on the #emph!binary! code, at class loading time,
but this is only for technical reasons. The semantically identical results
could be drawn out of the source text.)
#i
From the "overridden classes" we can abstract to "overridden SCCs", 
#i
and finally can conclude which fields ("associations") #emph!never! need to
be followed, because they
only lead to one or more SCCs for which no user-defined visitor/rewriter method
exists.
#/list
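#p
The last step of this analysis can be sketched as a reachability test. The
following code is a simplification and not the implementation used by the
tool: it works directly on classes instead of SCCs, ignores inheritance, and
the class "PruneSketch" with its methods is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PruneSketch {
  // "edges" maps each class name to the classes its fields refer to.
  // Returns all classes from which at least one overridden class is
  // reachable, including the overridden classes themselves.
  static Set<String> relevant(Map<String,List<String>> edges, Set<String> overridden) {
    Set<String> result = new HashSet<>(overridden);
    for (String start : edges.keySet()) {
      Deque<String> todo = new ArrayDeque<>();
      Set<String> seen = new HashSet<>();
      todo.push(start);
      while (!todo.isEmpty()) {
        String c = todo.pop();
        if (!seen.add(c)) continue;                      // already inspected
        if (overridden.contains(c)) { result.add(start); break; }
        for (String next : edges.getOrDefault(c, List.of())) todo.push(next);
      }
    }
    return result;
  }

  // A field referring to class "target" needs to be followed only
  // if "target" is relevant in the above sense.
  static boolean mustFollow(String target, Set<String> relevantClasses) {
    return relevantClasses.contains(target);
  }
}
```

Working on SCCs instead of single classes gives the same answers, since all
classes of one SCC reach each other.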

#p
This optimization should make sense with data models which decompose into
disjoint spheres with only few connections, for all those visitors/rewriters
which process not all of these spheres.
These results were presented in #cite lt11a at "ICMT 2011";
for further info please refer to the paper or to 
#link 0/markuslepper.eu/papers/zuerich.pdf #text the slides of the talk#/link.

// FIXME better use a BIB ENTRY here ?????

#p#kind missing 
what with CO-rewriters ????


// ------------------------------------------------------------------
#h2 #title Visualization
#p
Visualization of a data model (or parts thereof) 
is supported by different means.

// ------------------------------------------------------------------
#h3 #title User-Defined Visualization by the Modifier #src!TOSTRING!#label txttostring

#source
   A 
   | B1
     b1 B1 
     b2 char  
   | | C1
       c1 MAP string TO C2
       c2 int 
       TOSTRING JAVA return b1.toString().substring(2)+">>"+c1 ; $$
       FORMAT "c2 '==>' b1 ; c1"
#/source

#p
The #src!TOSTRING! directive can appear in a class definition at any 
position like a field definition.
It is followed by a fragment of java source text enclosed in "#src!JAVA...$$!", like
all other #emph!verbatim! java.
#nl
This source text fragment must consist of a statement or a 
sequence of statements (in the java sense).
#nl
It will be #ital verbatim #/ encapsulated in the method declaration
#src!public String toString(){ ... }!, so it has to end with the
type-correct #src!return! statement.
#p
W.r.t. error reporting, the same rules apply as with other #ital!verbatim! java
source text, cf. #ref txtverbatimjava above.


// ------------------------------------------------------------------
#h3 #title User-Defined Visualization by the Modifier #src.FORMAT.
#label txt_format_frontend_language

#p
The #src!FORMAT!  directive can appear in a class definition at any 
position like a field definition.
The #src!FORMAT!  keyword is followed by a string constant which contains
a format description. This is compiled  into a method of a dedicated
visitor, which constructs a #src!Format! object for visualizing 
an instance of this class.
#p
The syntax of the format directives is an instance of the
#link format.html #loc txt_format_frontend_language
#text generic syntax for format front-ends#/link.
#p
The #src!Format! object which represents an object #src!a! of class #src!A! of a 
model class #src!M! can be generated by calling #ldots
#source  
  class A {
    public Format format() {...}
  }
#/source
#p
#ldots which is a wrapper for #ldots 
#source
  class M {
    public static Format toFormat (Object o) {...}
  }
--or--
  class M {
    class Formatter {
      public static Format process (Object o) {...}
    }
  }
#/source

#p #ldots which is a wrapper for (the protected method!) #ldots
#source
  new M.__Formatter().toFormat(Object o)
#/source

#p
Further there is a static function with a #src!mode!  parameter. It
first sets a global variable in the 
#src!Formatter! object which normally defaults to #src!0!(zero), 
and which is used in the #src!$switch $mode{..}! expressions as described
#link format.html #loc txt_switch_mode#text in the format front-end documentation#/.

#source
  class M {
    public static Format toFormat (Object o, int mode) {...}
  }
#/source


#p
Whenever a format directive needs to embed a format for an object of a
model class for which no #src!FORMAT! directive is given, 
a call is compiled to #ldots
#source
  protected Format M.__Formatter.defaultformat(Object o){
    result = Format.literal(String.valueOf(o)) ;
  }
#/source

#p
Whenever a format directive needs to embed a format for an object which is not
part of the #umod  model (i.e. whenever a field with a type
defined by an #src!EXT! import appears in the format directive), 
a call is compiled to #ldots

#source
  protected  Format M.__Formatter.foreignObject(Object o)
#/source
#p 
This method tests whether #src!o! implements
#link 2/eu/bandm/tools/format/Formattable.html
#text !/format/Formattable #/link, and in this case #src!format()! is called,
otherwise #src!Format.literal(String.valueOf(o))! is used.
#p
Both of these methods can be  overridden by deriving a
new formatting visitor from #src!M.__Formatter!.
#p
Whenever the user wants to change the values of 
#src!nulltext!, #src!mode!, #src!format_empty! and/or #src!default_indent!, he/she
cannot use the above-mentioned wrappers, but
(1) has to create an explicit instance anyhow 
(e.g. by #src!formatter = new M.__Formatter()!), (2) assign to these public
fields, and (3) create the format by calling #src!format = formatter.process(o)!
explicitly.


// ------------------------------------------------------------------
#h3 #title Automated Swing #src!Tree! Generation
#p
If activated by a command line switch (see #ref txtcommandlineoptions),
code is generated for a java swing tree representation.
#p
For a certain model definition #src!Mymod! and a reference to an
object #src!myobj!, a swing tree is generated by calling
#source
 JTree tree = new JTree(new Mymod.__TREEGEN__().growRoot(myobj)) ;
#/source
#p
For convenience, there is a runtime class offering the static method
#link 2/eu/bandm/tools/umod/runtime/SwingBrowser.html 
#loc model2swingpanel(java.lang.String,java.lang.Object,java.lang.Class)
#text #src!umod.runtime.SwingBrowser model2swingpanel(...)! #/link, which
opens a top-level window with scroll bars etc. and all you need for
browsing a #umod  model via a GUI.


/* ============================ ALT 
#p
An example can be found at the end of 
#link 1/src/de/tu_berlin/cs/uebb/tools/umod/UModMain.java#blank 
text !/.../umod/UModMain.java#/.
===*/

#p#kind src 
Our implementation is based on #emph!suspensions!, i.e.
implements #emph!lazy! generation of tree objects.
The source is found in 
#link 3/eu/bandm/tools/util/SwingForester.java#/link, and is rather instructive to read.


// ------------------------------------------------------------------
#h3 #title User-Directed Visitor-Based Dump Routines
#p
Esp. for debugging purposes, a dedicated visitor can be generated
which is declared "#src!IS PRINTER!", cf. the syntax of
visitor declarations in  #ref txt_syntax_visitor_declaration.
#p
The constructor of each such visitor takes a #src!java.io.PrintStream! as its
only argument.
Whenever the #src!match()! method of this visitor is called for a 
certain object, this object is printed to this stream as follows:
#list 
#i first a sequence of characters like "#src!| | |...!", indicating the 
match call's nesting level, 
#i then a simple #src!toString()! representation of the object, 
#i then the name/value pairs of all fields 
which are #emph!not! marked for descending (using #src!toString()!),
#i followed by the output caused by (1) recursively matching all fields
marked for descending (by the traversal order selected for this visitor),
after (2) the above-mentioned nesting level has been increased.
#/list

#p
When a field has to be printed which is of aggregate type, then (1)
a new line is opened for every item in the aggregate, and (2)  the
name of this field and the current index position is printed  in "#src![....]!"
before the call to #src!match()! of the current value.

#p
Please note that this is currently still a primitive implementation, 
and #bold cannot deal with cyclic data !#/bold


/* === OLD 
For an example please refer to 
#link 1/src2/de/tu_berlin/cs/uebb/tools/dtd/Test.java#blank #/, which also
demonstrates the TSoap-serialization.
== */


#p#kind missing 
The name of the field currently causing the descending is #emph!not! printed.


// ------------------------------------------------------------------
#h2 #title XML Encoded Serialization/Deserialization
#label txt_tsoap
//#h2 #title "Typed SOAP" #xml -Based Serialization

// DISCUSSION of the variants in MeMo090331
// and in umod/TODO_umod.txt 
#p
The basic umod xml-serialization is based on rules which try to
combine simplicity, readability and non-redundancy.
#p
#list
#i all primitive types are encapsulated into their type name used as a tag.
#i all sequences, sets, maps, multimaps simply serialize their
content (in canonical order), and encapsulate this into one and the
same generic "aggregate" tag
#i all fields are tagged with the field name
#i the type-driven tags as listed above (=primitives and generic aggregates),
are #emph!omitted! when directly
under a field tag.
#i all objects are tagged with the class name
#i on top-level of field contents, null values MAY be read, but are not written.
Instead, fields with a null value are simply omitted.
#i empty structures on top-level, which are not optional, are also omitted.
(this corresponds to the default rules when constructing an object:
only in case of 
#src!OPT! types there is a difference between null and an empty aggregate!)
#i the "right-not-left" case of a co-pair is wrapped into a dedicated element.
#/list

Additionally, there is a special empty #emph!reference! element
which realizes (by an "idref" attribute)
a pointer to some object defined at some other place (i.e.
earlier when writing, or earlier or later when reading).
The code generated by #umod  starts every serialization of objects with
one(1) single certain root object, descending in a
#emph!depth-first! discipline. Therefore #emph!back-patching! is never required
in this case, but the de-serialization code does support it.
The first reference to an object always leads to an "in-place" dump
of the complete object structure, as defined above.
This is fine in case of objects which are only referred to once.
It is esp. fine for human readers in case of "algebraic" objects, which do
not have an "identity" beyond their structure. In this case, the
usage of "id", "idref" and "reference" would only serve as a kind of 
shorthand notation. This is different with non-algebraic objects,
where identity (and "self-identity" and "non-identity" !-) does carry more semantics
than the collection of field values!

#p
The writing out of an instance of model "#src!M!" is started by some code like 
#source
   final java.io.PrintWriter p0 = new java.io.PrintWriter(outstream);
   final eu.bandm.tools.util.ContentPrinter cp = new ContentPrinter(p0); 
   final eu.bandm.tools.umod.runtime.XMLconfiguration conf = new XMLconfiguration();
   final M.SAX_Writer dumper = new M.SAX_Writer(cp, conf);
   dumper.match(myTopLevelObjectForWriting);
#/source
#p
The concrete tag strings and attribute names to use are configurable and
are initialized via the 
#link 2/eu/bandm/tools/umod/runtime/XMLconfiguration.html
#text XMLconfiguration #/ object.
#p
When writing, no errors should occur, only failures. These are reported
via thrown exceptions.

#p#kind src 
An example is in 
#link 3/eu/bandm/tools/d2d2/base/Main.java
#text the main file of triple-dee#/.

#p#kind src 
CURRENTLY there is a test like
#src!
~/metatools/src/eu/bandm/tools/d2d2/test>make test_ddf2tsoap MODULE=basic OUT=dumpbasic.xml ; less dumpbasic.xml
!
which works quite nicely!

#p
The reading works as follows:
#source 
  FIXME MISSING 
#/source
#p
On reading of course errors can occur, esp. when the external representation is not
"valid" w.r.t. the implicit syntax rules of the DTD, which reflect the object
structure of the umod model.

#p#kind missing  
EXPLANATION of HOW to read ?? WHERE is an EXAMPLE ???


#p#kind missing 
Link to EXTERNALLY defined serialization/deserialization for external classes.


//  -------------------------------------------------------
#h1 #title  Using the #umod  Tool 
//  -------------------------------------------------------------

//  -------------------------------------------------------------
#h2 #title  Command Line Options #label txtcommandlineoptions

#p
The options for the current implementation of the #umod  tool are as follows:

#p
#cmdline_option_documentation ../../src/eu/bandm/tools/umod/umodOptions.xml #lang en

#p
#xemph!Attention! #src!--setterfunctions! and  #src!--getterfunctions! are deprecated.

//  -------------------------------------------------------------
#h2 #title  Splitting the input text into input files
#label txt_splitInput
#p
Currently more than one input file can be supplied. 
All these files must follow the same syntax, as described
above, and use the same module name. 
All contained declarations will be processed as if they were contained in one single source file.
So this allows the separation e.g. of documentation and declaration, or of different
trees of the forest. But this is only a  #emph!provisional! means, until real modularization
and parametrization are introduced.


//  -------------------------------------------------------------
#h2 #title  Error Messages 

#p
The basic philosophy is to delegate most error messages to the subsequent
step, i.e. the execution of a Java compiler. Some problems cannot be 
detected without detailed analysis, and we do not want to re-implement things
done by the Java compiler anyhow.
#nl
This implies that the error messages generated there must be "calculated backwards"
to find their cause in the #umod  source.
#nl
Nevertheless basic errors and warnings will be generated by the #umod   tool
on its own.

#p (More to come)

#p#kind missing  FIXME ERROR MESSAGES MISSING


#eof