[all pages:] introduction message / location / muli format dtd xantlr tdom ops paisley metajava umod option auxiliaries d2d downloads & licenses people bibliography APPENDICES:: white papers white papers 2 white papers 3 project struct proposal cygwin tips SOURCE:option.dtd SOURCE:dtd.umod DOC:deliverables.ddf DOC-DE:deliverables.ddf DOC:mtdocpage.ddf DOC-DE:mtdocpage.ddf SOURCE:basic.dd2 SOURCE:xslt.dd2
umod | bandm meta_tools | auxiliaries |
option --- Standardized Command Line Parsing and GUI Editing
(related API documentation: package option.runtime )
1 Purpose and Way of Operation
2 Data Model and Description Format
2.1 Historic Predecessors and Paradigms
2.2 General Data Model
2.2.1 Data Types for Arguments
2.2.2 Positional Options
2.3 Input Data Format
2.3.1 Multilingual Text for Documentation
2.3.2 Enumeration Types
2.3.3 Grouping Documentation Comments
2.3.4 Option Declarations
2.3.5 Types
2.3.6 Default Values
2.3.7 Special Empty Type
2.3.8 Enabling Conditions
3 Operation of the Tool and the Generated Code
3.1 Compiler for Model and Gui Code
3.2 Generated Model Code
3.2.1 Setter Functions
3.3 Command Line Parsing
3.3.1 String Arguments containing Whitespace
3.3.2 Representing Boolean values
3.3.3 Multiple File Names, e.g. for "class path"
3.3.4 Argument values starting with a "minus-sign" Character
3.3.5 Fragmented Lists
3.4 Help Function "usage()"
3.5 Unparsing
3.6 Usage of the GUI
3.7 Automated Integration of Options' Descriptions into HTML Based Manuals
The purpose of the option package is the parametrization of applications by command line parameters and/or by a graphical user interface (GUI).
Historically, there have been several attempts at standardization in both areas, but none has definitely prevailed. The option package is our approach to automating both by source code generation.
The option compiler is realized by the class <mt>.option.Compiler. The required run-time code is in <mt>.option.runtime.
The compiler takes a specification (including documentation) of all options of an application, encoded in XML, and generates ...
The central aim of this package is to treat different operating systems,
different ways of starting an application, different styles of use,
different languages,
and differently socialized users transparently and in arbitrary combination.
At first glance, one might assume the task to be rather trivial, but it is not.
Historically, there exist several standardizations for the parametrization of an application by command line arguments:
Our approach basically follows getopt_long(), but goes beyond it w.r.t. typing and type checking of the options' parameters.
The terminology in this area has grown historically, and many words are heavily overloaded. We use in the following:
The main issue with our data model is that it is more abstract than a merely command-line oriented one. It shall serve different representations, text as well as GUI, which makes the task non-trivial. From this, the most important consequences in the following model are ...
The resulting underlying data model can be more easily described following the command-line version:
The data type assigned to each option can be defined by the following grammar:
    simpletypes  = int | float | string | char | bool | enum | enumset | specialtypes
    specialtypes = uri     // more to come: | file | action | ...
    type         = simpletypes*, rep?
    rep          = RepKind, simpletypes+
    RepKind      = star | plus
The sequence of arguments of a certain option must comply with the sequence of types as specified.
The types and corresponding arguments contained in the "rep" construct are called the "repeating group" in the following. This group may be repeated arbitrarily often. It needs to appear at least once in the "plus" case, and may appear zero times in the "star" case.
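The length rule for an argument list can be sketched in Java as follows. This is only an illustration of the grammar above: the class and method names are hypothetical and not part of the option package, and the type check of the individual arguments is left out.

```java
/** Hypothetical helper for illustration; not part of the option package. */
final class ArgumentCountCheck {
    /**
     * @param prefixLen number of simple types before the "rep" construct
     * @param groupLen  number of simple types in the repeating group (0 = no "rep")
     * @param plus      true for kind "plus", false for kind "star"
     * @param argCount  number of arguments found for the option
     * @return whether argCount is an admissible length for this type pattern
     */
    static boolean fits(int prefixLen, int groupLen, boolean plus, int argCount) {
        int rest = argCount - prefixLen;
        if (rest < 0) return false;              // the prefix is incomplete
        if (groupLen == 0) return rest == 0;     // no repeating group declared
        if (rest % groupLen != 0) return false;  // the last group is incomplete
        int repetitions = rest / groupLen;
        return !plus || repetitions >= 1;        // "plus" needs at least one group
    }
}
```

For a pattern with one prefix type and a two-element group, 1, 3, 5, ... arguments fit in the "star" case, and 3, 5, ... in the "plus" case.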
Concerning the command line parsing, there are some further rules for
modeling the historic behaviour:
The POSIX standard and other publications frequently describe the syntax of a certain
application instantiation by examples like
    utility_name[-a][-b][-c option_argument][-d|-e][-f[option_argument]][operand...]
Therein "utility_name" corresponds to our notion of "application",
and "option_argument" is nearly the same as with us.
Now "operands" are all those tokens on the command line following the
last option, if any, which (1) are not arguments to the preceding (last) option,
and (2) are not options themselves.
For us, (2) means that their text does not start with
a minus-sign character.
(In POSIX they could also be explicitly declared not to be options
by the special separator "--" which ends the option list. We ignore this case.)
These "operands" are parameters to the application, but are not options in the sense defined so far. But of course we want to treat all command line components in a unified way, and we have to map them to some GUI. Therefore we define some further rules:
A standard way to support this historic way of parametrizing is to give those positional parameters "speaking" long names, as in
    -m / --mode      : what to do with the files
    -C / --classpath : where the libraries are searched for
    -0 / --inputfile
    -1 / --outputfile
    -2 / --logfile
All option specifications must be contained in one single XML document. This must be declared as
    <!DOCTYPE optionlist PUBLIC "+//IDN bandm.eu//DTD option//EN" "">
The syntax of the input is defined in the dtd file contained in the APPENDIX: option dtd file
The toplevel definition is
    <!ELEMENT optionlist (enumeration|option|comment)+ >
A text element type is employed ubiquitously for multi-lingual documentation:
    <!ELEMENT text (#PCDATA)>
    <!ATTLIST text lang NMTOKEN #REQUIRED>
The lang attribute should be used following the same rules as common practice with xml:lang, i.e. following RFC 4646 / RFC 4647 / "[IETF BCP 47], Tags for the Identification of Languages".
(Currently, we refer to ISO 639-2 instead, as listed e.g. in [isolanguage], because the URL for IETF BCP 47 referred to at the end of [xml] is broken.)
The set of supported languages may be chosen arbitrarily (including the empty set), but it should be the same in all lists of text elements.
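The consistency recommendation can be checked mechanically. The following is a minimal sketch, assuming only the element and attribute names from the DTD fragments shown here; the class LangConsistency is hypothetical and not part of the option package. It uses the standard JAXP DOM parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical checker for illustration; not part of the option package. */
final class LangConsistency {
    /** True if all lists of "text" elements offer the same set of languages. */
    static boolean consistent(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList texts = doc.getElementsByTagName("text");
            // collect the "lang" values per parent element (desc, comment, ...)
            Map<Node, Set<String>> langsByParent = new IdentityHashMap<>();
            for (int i = 0; i < texts.getLength(); i++) {
                Element text = (Element) texts.item(i);
                langsByParent.computeIfAbsent(text.getParentNode(), k -> new TreeSet<>())
                             .add(text.getAttribute("lang"));
            }
            // consistent if at most one distinct language set occurs
            return new HashSet<>(langsByParent.values()).size() <= 1;
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse option list", e);
        }
    }
}
```

A document without any text elements counts as consistent, which matches the remark that the empty set of languages is allowed.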
Enumeration types which shall serve as data type for option arguments must be declared according to ...
    <!ELEMENT enumeration (desc?, (enumitem)+) >
    <!ATTLIST enumeration name NMTOKEN #IMPLIED>
    <!ELEMENT enumitem (desc)? >
    <!ATTLIST enumitem value      CDATA   #REQUIRED
                       compilable NMTOKEN #IMPLIED >
Multilingual documentation text may be added to enumeration types as a whole, and/or to every item separately. (As mentioned above, the support of the different languages should be consistent over all objects carrying doc texts.)
The front-end appearance of an enumeration item (on the command line or shown in a GUI) may be an arbitrary string, containing arbitrary character data. If this string is not a valid Java identifier, the attribute compilable must be given; its value is used in the generated enum{..} code instead.
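Whether an item needs the compilable attribute can be tested with the standard class javax.lang.model.SourceVersion. The sketch below is an illustration only; the class EnumItemNames and the sample values are hypothetical, and the option compiler itself may decide differently.

```java
import javax.lang.model.SourceVersion;

/** Hypothetical helper for illustration; not part of the option package. */
final class EnumItemNames {
    /** True if the front-end value can serve as a Java enum constant as it stands. */
    static boolean usableAsJavaIdentifier(String value) {
        // isIdentifier() also accepts keywords, so these are excluded explicitly
        return SourceVersion.isIdentifier(value) && !SourceVersion.isKeyword(value);
    }
}
```

A value like "text2xml" passes, while "xml-2-text" or "default" would need a compilable replacement.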
comment elements can be interspersed into the sequence of option declarations. Currently, they only lead to additional text labels in the generated GUI.
    <!ELEMENT comment (text)+>
    <!ATTLIST comment name NMTOKEN #IMPLIED>
(The "name" attribute shall be used later to switch on/off a whole group of options in the GUI.)
In the following definitions it holds that ...
    <!ELEMENT option (desc, type?, condition?) >
    <!ATTLIST option name   NMTOKEN #IMPLIED
                     abbrev NMTOKEN #IMPLIED >
    <!ELEMENT desc (text)+>
Basically each option can have arbitrarily many arguments.
The prefix of the argument sequence has to follow a first sequence of
types, the rest of the arguments can follow a repeated pattern of types.
By this means argument lists of varying length can be specified.
Currently arguments which are purely optional are not supported
(with one exception, see below Section 2.3.7).
    <!ENTITY % simpletypes '(int | float | bool | char | string | uri | enum | enumset | action)'>
    <!ELEMENT simpletypes (%simpletypes;)>
    <!ELEMENT type ( (%simpletypes;)*, rep? )>
    <!ELEMENT rep ((%simpletypes;)+, defaults?) >
    <!ATTLIST rep kind (plus|star) #REQUIRED>
Only the enum and enumset type identifiers have a required attribute, namely the name of the enumeration declaration it refers to.
All option arguments are parsed into a variable with the intuitively corresponding Java type.
    <!ELEMENT int EMPTY>     <!ATTLIST int default NMTOKEN #IMPLIED>
    <!ELEMENT float EMPTY>   <!ATTLIST float default NMTOKEN #IMPLIED>
    <!ELEMENT bool EMPTY>    <!ATTLIST bool default NMTOKEN #IMPLIED>
    <!ELEMENT char EMPTY>    <!ATTLIST char default NMTOKEN #IMPLIED>
    <!ELEMENT string EMPTY>  <!ATTLIST string default NMTOKEN #IMPLIED>
    <!ELEMENT enum EMPTY>    <!ATTLIST enum name NMTOKEN #REQUIRED default NMTOKEN #IMPLIED>
    <!ELEMENT enumset EMPTY> <!ATTLIST enumset name NMTOKEN #REQUIRED default NMTOKENS #IMPLIED>
    <!ELEMENT uri EMPTY>     <!ATTLIST uri default NMTOKEN #IMPLIED>
    <!ELEMENT action EMPTY>  <!ATTLIST action name NMTOKEN #IMPLIED>
Each single argument may optionally be given a default value. This value will be used as the initial value for the corresponding field of the model class (Section 3.2).
Please note that the default value must verbatim be a valid Java expression for initializing a field of this type. The check for correctness is left to the subsequent application of the Java compiler.
(The defaults of enumsets are currently not yet implemented! FIXME)
With rep elements the situation is more complicated.
First method: the elements which represent the single simple types of the repeated type pattern can each be given their own default, as described above. They define the default values for any newly created instance of the Java class which is generated to represent this repeating group. One or zero repeating groups (with these values) are created implicitly, depending on the "plus" or "star" flavour.
Second method: the rep element itself may be given a defaults element, consisting of a sequence of v elements, each of which initializes one of the repeated primitive arguments. The text values must correspond to the repeated types, and their number must be an integral multiple of the length of the repeated sequence.
For example, the following default declaration leads to three instantiations of the argument pattern:
    <option name="vle">
      <desc><text lang="en">variable length default example</text></desc>
      <type>
        <int default="0"/>
        <rep>
          <int/> <char/>
          <defaults>
            <v>1</v>   <v>'c'</v>
            <v>1+1</v> <v>'d'</v>
            <v>3</v>   <v>'\n'</v>
          </defaults>
        </rep>
      </type>
    </option>
Again, type and syntax errors are left to the Java compiler.
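How the flat sequence of v elements maps onto instances of the repeating group can be sketched as follows. This is an illustration only; the class RepDefaults is hypothetical and does not show the code the option compiler generates.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the option package. */
final class RepDefaults {
    /**
     * Splits the texts of the "v" elements into one list per group instance.
     * @param values   the texts of all "v" elements, in document order
     * @param groupLen the number of simple types in the repeating group
     */
    static List<List<String>> groups(List<String> values, int groupLen) {
        if (groupLen <= 0 || values.size() % groupLen != 0) {
            throw new IllegalArgumentException(
                "number of defaults is not a multiple of the group length");
        }
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i < values.size(); i += groupLen) {
            result.add(new ArrayList<>(values.subList(i, i + groupLen)));
        }
        return result;
    }
}
```

Applied to the six v elements of the example and a group length of two, this yields three instances, the second one holding the expressions 1+1 and 'd'.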
An option with no argument at all
carries as its only semantics its presence or
absence on a command-line input.
Historically, such options are often called "switches" or "flags".
But there is no natural correspondence to "presence" when using a GUI.
Therefore we decided above that the mere presence or absence of
an option with additional parameters
should better not carry any semantics. Instead, the
parameters should be given a default value which indicates their
"absence" in the application's logic.
Since parameter-less "switches" are nevertheless often valuable, we define a dedicated default value for boolean arguments named "presence", and implicitly give every "empty type"
    <option abbrev="x"><desc><text lang="en">a simple switch</text></desc>
      <type/>
    </option>
...the meaning of something like ...
    <option abbrev="x"><desc><text lang="en">a simple switch</text></desc>
      <type><bool default="presence"/></type>
    </option>
The value "presence" is virtual and cannot be used explicitly by the user.
On later parsing it will result in a default value of "true" if the
option is present, and of "false" if not (thereby also defining the actual value
as "false"!). This is the only case of an optional argument!
So all of these are valid command lines:
    applic             // leads to x = false
    applic -x          // leads to x = true
    applic -x false    // leads to x = false
    applic -x true     // leads to x = true
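The four cases can be reproduced by a small Java sketch. It is an illustration only and not the generated parser: the class SwitchDemo is hypothetical, it knows just the one switch "-x", and it accepts only the literals "true" and "false" as explicit values.

```java
/** Hypothetical demo of the "presence" default; not the generated parser. */
final class SwitchDemo {
    /** Returns the value of the switch "-x" for the given command line. */
    static boolean valueOfX(String... args) {
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("-x")) {
                if (i + 1 < args.length) {
                    String next = args[i + 1];
                    if (next.equalsIgnoreCase("true")) return true;
                    if (next.equalsIgnoreCase("false")) return false;
                }
                return true;   // present without explicit value: "presence" yields true
            }
        }
        return false;          // absent: "presence" yields false
    }
}
```

Calling valueOfX() gives false, valueOfX("-x") gives true, and an explicit "false" or "true" after "-x" is taken as given.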
Some options only make sense in certain modes of operation, as defined by other options. For convenience, a GUI can indicate this, e.g. by deactivating the input widgets of an option which is currently not applicable.
In our model, each option can be assigned an "enabling condition", according to the following grammar:
    Condition         ::= PrimeCondition | CompoundCondition
    CompoundCondition ::= not, Condition | and, Condition+ | or, Condition+
    PrimeCondition    ::= testset Ident Number
                        | testEqual TestValue TestValue
                        | testGreater TestValue TestValue
    TestValue         ::= constant | optionvalue, Ident, Number
So (1) it can be tested whether a certain argument of a certain option is equal to or greater than a constant value or some other argument, and (2) these tests can be combined by standard logical operators.
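The evaluation of such conditions can be sketched with a few combinators in Java. This is an illustration under simplifying assumptions, not the implementation of the option package: all names are hypothetical, argument values are restricted to integers, and "testset" is left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical evaluator for illustration; not part of the option package. */
final class Conditions {
    /** A condition is evaluated against the current argument values of all options. */
    interface Condition { boolean holds(Map<String, List<Integer>> values); }
    /** A test value is either a constant or a numbered argument of an option. */
    interface TestValue { int get(Map<String, List<Integer>> values); }

    static TestValue constant(int c) { return v -> c; }
    static TestValue optionValue(String ident, int number) {
        return v -> v.get(ident).get(number);
    }
    static Condition testEqual(TestValue a, TestValue b) { return v -> a.get(v) == b.get(v); }
    static Condition testGreater(TestValue a, TestValue b) { return v -> a.get(v) > b.get(v); }
    static Condition not(Condition c) { return v -> !c.holds(v); }
    static Condition and(Condition... cs) {
        return v -> Arrays.stream(cs).allMatch(c -> c.holds(v));
    }
    static Condition or(Condition... cs) {
        return v -> Arrays.stream(cs).anyMatch(c -> c.holds(v));
    }
}
```

A condition like and(testEqual(optionValue("mode", 0), constant(2)), not(testGreater(...))) is then re-evaluated whenever one of the referenced values changes, which is where cyclic dependencies would become a problem.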
Please note that there should be no cyclic dependencies, because then the current, straightforward implementation of the GUI could run into a deadlock.
The option compiler is realized by the class <mt>.option.Compiler.
Its run-time code is in <mt>.option.runtime.
The compiler is called like
    $(JAVA) eu.bandm.tools.option.Compiler <inputFile> \
        <packageName> <modelClassName> <guiClassName> \
        <rootOfJavaSources>
The input file is an XML file, as described above. The compiler generates one or two Java source files: the model class source is always generated, with the given name, in a package with the given name. The source is written to the file system, relative to the given root, descending the directory levels which correspond to the package name.
The gui class is only generated if the gui class name is not the empty string.
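The location of a generated source file follows the usual Java convention and can be sketched as below. The class GeneratedSourceLocation and the package name used in the usage note are hypothetical; the sketch only illustrates the mapping from package name to directory levels.

```java
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of the option package. */
final class GeneratedSourceLocation {
    /** Resolves root/<package as directories>/<className>.java */
    static Path sourceFile(Path root, String packageName, String className) {
        Path dir = root;
        for (String part : packageName.split("\\.")) {
            dir = dir.resolve(part);   // one directory level per package component
        }
        return dir.resolve(className + ".java");
    }
}
```

With the root "src", the package "eu.bandm.demo" and the model class name "MyOpts", the file would be src/eu/bandm/demo/MyOpts.java.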
In the following examples, let "MyOpts" be the name of the model class, and "MyGui" be the name of the GUI class.
The generated model class MyOpts is typically used as follows:
    public static void main (final String[] args){
      final MessageReceiver<Message> m0 = new MessagePrinter<Message>();
      final MessageCounter<Message> m1 = new MessageCounter<Message>();
      final MessageTee<Message> m3 = new MessageTee<Message>(m0,m1);
      final MyOpts myOpts = new MyOpts();
      myOpts.parse(args, m3);
      if (m1.getCriticalCount()>0) System.exit(99);
      // ---- etc. ----
    }
It offers the following API:
    public class MyOpts extends <mt>/options/runtime/Model {
      ...
      public int get_grmmpf_0();
      public double get_grmmpf_1();
      public String get_grmmpf_2();
      public int repcount_grmmpf();
      // repeating group starts counting arguments from zero again:
      public char get_grmmpf_0(int);
      public boolean get_grmmpf_1(int);
      // additionally indicating whether the option "grmmpf" did appear at all:
      // this SHOULD NOT be used!
      public boolean has_grmmpf ;
    }
In case a certain option does not have a long name, the short name is taken for constructing these identifiers.
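The naming scheme visible in the listing can be written down as a small function. This is inferred from the example identifiers only; the class GetterNames is hypothetical, and the actual compiler may apply further rules.

```java
/** Hypothetical helper for illustration; not part of the option package. */
final class GetterNames {
    /**
     * @param longName the long name of the option, or null/empty if it has none
     * @param abbrev   the short (one-character) name of the option
     * @param argIndex the position of the argument, counting from zero
     */
    static String getter(String longName, String abbrev, int argIndex) {
        // fall back to the short name when no long name is declared
        String base = (longName != null && !longName.isEmpty()) ? longName : abbrev;
        return "get_" + base + "_" + argIndex;
    }
}
```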
Whenever the generated option model class shall be used for representing configuration data / task specifications in a more advanced mode, e.g. programmatically controlled, the declaration
    <optionlist setterFunction="yes" >...
will cause the compiler to additionally generate setter functions which allow changing the options' parameter values.
Currently not yet supported!
The parsing of the options and their parameters may seem trivial, but it is not, because different operating systems are involved. The IEEE POSIX specification "Utility Argument Syntax" (esp. point 2.b, to make it more confusing!) uses the notion of "argument string" for the portioning already done by some "shell" program, before the getopt() code starts working. In the case of Java, this is the portioning reflected by the array structure of "static void main(final String[] args)".
This also has to be considered here, as well as the different shell processors and system services which lead to the appearance of this "array of strings" or "argument strings".
We took a brute-force approach, which is always a good thing to do when things get too historically determined. So we arrived at the following rules, which might appear somewhat rude, but do ensure portability:
The last point shows clearly that the "positional options" make the parsing process much more complicated. They are only supported for historic reasons, and should be abandoned when the calls to applications are generated, e.g. by make scripts. The difference is small, like
    d2d --path $(HOME)/lib/documents $(PWD) --mode text2xml source.d2d tmp.xml
  ... vs ...
    d2d --path $(HOME)/lib/documents $(PWD) --mode text2xml -0 source.d2d -1 tmp.xml
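A script that generates such calls can rewrite trailing operands into the numbered form mechanically. The following sketch assumes that the operands have already been separated from the rest of the command line; the class PositionalOperands is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the option package. */
final class PositionalOperands {
    /** Turns [a, b, ...] into [-0, a, -1, b, ...]. */
    static List<String> asNumberedOptions(List<String> operands) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < operands.size(); i++) {
            out.add("-" + i);            // the positional option's one-character name
            out.add(operands.get(i));    // its single argument
        }
        return out;
    }
}
```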
The parsing algorithm recognizes only the "gaps" between the "argument strings" as separating whitespace, i.e. the gaps between the string objects which make up the String[] passed as an argument to static void main(String[]). All whitespace contained in one such string is masked and treated as character data. Therefore it will end up in the value of an option argument whenever it can be parsed as such, e.g. whenever the type of this argument is "string" (or "character" or "URI", special cases which can be ignored in the context of this discussion).
The string array which arrives at the start of main() is constructed by the co-operative efforts of some shell program and a call to the system service "exec", or similar. Therefore it depends on their correct handling whether some whitespace can enter the value of an option's argument. Consequently, this is out of scope w.r.t. the option module. Phew!
E.g. normally any input in "double quotes", like
    $(JAVA) eu.bandm.Application --title "this is a headline with whitespace"
will stimulate bash or any other shell-like program to pass the contents (sic!) of these quotes as one single argument string to the system call named "exec", or similar. Therefore it will arrive as one single token at the option parser. But the concrete details of these rules may be very complicated and OS-specific.
The following are recognized as Boolean values (ignoring case):
    t  true   1  +   --> true
    f  false  0  -   --> false
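The table translates directly into a small Java function. The class BooleanLiterals is a hypothetical illustration and not the parser of the run-time library; in particular, the reaction to unknown tokens is an assumption.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not part of the option package. */
final class BooleanLiterals {
    /** Maps the recognized spellings to a boolean, ignoring case. */
    static boolean parse(String token) {
        switch (token.toLowerCase(Locale.ROOT)) {
            case "t": case "true": case "1": case "+":
                return true;
            case "f": case "false": case "0": case "-":
                return false;
            default:
                // assumption: anything else counts as a parsing error
                throw new IllegalArgumentException("not a Boolean value: " + token);
        }
    }
}
```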
It is an old habit of UNIX programs to concatenate file names using a colon ":" in the value of arguments which represent more than one position in the file system.
We recommend using multiple arguments instead. This is OS-independent, whereas some shells will try to resolve the colon as a separator for "drive letters".
  Instead of
    $(JAVA) eu.bandm.Application --libraries a.b:c.d:e.f
  better use
    $(JAVA) eu.bandm.Application --libraries a.b c.d e.f
Please note that whether a token starts with a minus-sign character is evaluated only where a new option may start. As long as the type specification of the current option's arguments still requires character data, a minus-sign is never interpreted as the lead-in of a new option keyword.
As soon as a parsing error occurs, input is skipped up to the next minus-sign character, and a new option name is assumed to follow. Of course, this does not happen once the parsing of the positional options has been entered!
When constructing command lines automatically, parts of an option's repeating group may come from different sources. In this case it may be helpful to allow multiple occurrences of an option, the arguments being concatenated. This is activated by the declaration
    <optionlist fragmentedLists="yes" >...
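The effect of this rule can be illustrated by a strongly simplified parser in which every option has a fixed number of string arguments. The class MinusSignDemo and the option names in the usage note are hypothetical; the real parser works from the full type specification.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified parser for illustration; not the generated code. */
final class MinusSignDemo {
    /**
     * @param arity maps each option name to its fixed number of arguments
     * @return the arguments found for each option, in order of appearance
     */
    static Map<String, List<String>> parse(Map<String, Integer> arity, String... args) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        int i = 0;
        while (i < args.length) {
            String name = args[i++];              // only here is the minus sign relevant
            Integer n = arity.get(name);
            if (!name.startsWith("-") || n == null) {
                throw new IllegalArgumentException("unknown option: " + name);
            }
            List<String> values = new ArrayList<>();
            for (int k = 0; k < n; k++) {
                if (i >= args.length) {
                    throw new IllegalArgumentException("missing argument for " + name);
                }
                values.add(args[i++]);            // taken verbatim, even if it starts with '-'
            }
            result.put(name, values);
        }
        return result;
    }
}
```

With the options "--offset" and "--title" declared to take one argument each, the command line --offset -5 --title -x- yields the values "-5" and "-x-", and neither is mistaken for an option.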
This will allow:
  declaration:
    <option name="foo"><type><int/><string/><rep kind="plus"><int/><bool/></rep></type>
    </option>
  command line:
    --foo 3 "stringvalue" 3 true --otheroption ... --foo 7 false
The prefix before the repeating group must nevertheless be given completely, with the first appearance of the option.
As usual with UNIX programs, the compiler generates help functions which list the options, the types of their arguments and the description texts in various languages. Assume that the source file contains descriptions for the languages en, de and sv. Then the following functions will exist:
    public class MyOpts extends <mt>/options/runtime/Model {
      ...
      public void usage_en (PrintStream p){...}
      public void usage_de (PrintStream p){...}
      public void usage_sv (PrintStream p){...}
      public void usage (String lang, PrintStream p){...}
      public void usage (PrintStream p){...}
        // defaults to a language chosen at random, only supplied for
        // historic reasons!
    }
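The concatenation of fragments can be sketched as follows. This is a deliberately simplified illustration: the class FragmentedLists is hypothetical, every token starting with "--" is taken as an option name (the real parser does so only where a new option may start), and no type checking is done.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified collector for illustration; not the generated code. */
final class FragmentedLists {
    /** Collects the arguments per option, concatenating repeated occurrences. */
    static Map<String, List<String>> collect(String... args) {
        Map<String, List<String>> byOption = new LinkedHashMap<>();
        List<String> current = null;
        for (String token : args) {
            if (token.startsWith("--")) {
                // a repeated option continues its earlier argument list
                current = byOption.computeIfAbsent(token.substring(2), k -> new ArrayList<>());
            } else if (current != null) {
                current.add(token);
            } else {
                throw new IllegalArgumentException("argument without option: " + token);
            }
        }
        return byOption;
    }
}
```

For the command line above, the arguments collected for "foo" are 3, stringvalue, 3, true, 7, false, i.e. the prefix followed by two instances of the repeating group.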
The function
    public class MyOpts extends <mt>/options/runtime/Model {
      ...
      public String serialize (){...}
    }
delivers a textual representation of the state of the model which can be used for persistent storage and which will reconstruct this state when submitted to command line parsing.
In most cases it is also wise to insert it into the produced output, together with the name of the program, its version, the date, etc., as supported by <mt>.format.java.CommentFormats.
The generated GUI class is simply a specialization of a swing JPanel. It contains on the left side (in a "grid bag layout") the short and long names of the options, and on the right side a sequence of input widgets, corresponding to the option's argument types.
When pointing to the names, a tool tip will appear with the description of the option in the current language.
Before the GUI can be operated, i.e. edited by the user, it must
be initialized by the values taken from some well-defined model.
A command line parsing process may precede a graphic editing phase,
but thanks to the default parameters, this is not required.
Vice versa, after editing is complete, the values have to be transferred back
to a model instance, to be retrievable as described above.
For this the GUI class offers the following methods:
    public class MyGui extends <mt>/options/runtime/Gui {
      ...
      public static MyGui makeInstance(MyOpts model)
      public void model2view(MyOpts model)
      public void view2model(MyOpts model)
      ...
    }
The JPanel can be integrated into arbitrary swing containers and programming contexts.
For convenience, it offers (currently just one) default method which
runs a dialog, detects input errors and adds some "meta-"buttons to the
option panel.
There is more than one call pattern for this method, supplying
different default arguments. The user language must always be given explicitly.
The result indicates whether the user ended the dialog with the "ok" or
the "cancel" button.
    public class MyGui extends <mt>/options/runtime/Gui {
      ...
      public boolean editGraphically(String userlanguage)
      public boolean editGraphically(String userlanguage, int width, int height)
      public boolean editGraphically(String userlanguage, int width, int height,
                                     boolean languageSwitchable)
      ...
    }
The example from above can thus be continued:
    final MyGui myGui = MyGui.makeInstance(myOpts);
    final boolean userPressedOk = myGui.editGraphically("en", 400, 500);
    if (!userPressedOk){
      System.err.println("Leaving program due to user's cancellation.");
      System.exit(99);
    }
    final MyOpts editedOpts = new MyOpts();
    myGui.view2model(editedOpts);
    // perform task according to the settings in "editedOpts"
    // of course, the "old" model could be re-used to store
    // the new parametrization, by performing
    // myGui.view2model(myOpts);
The d2d/xml/xhtml-based documentation system, as employed in the doc of meta_tools itself, can integrate a user-readable form of the options specification as a table into the html text. This is part of the "d2d_gp" application architecture, and described in more detail with the module technicalDoc.commandLineDoc.
The d2d-frontend representation works like
    #cmdline_option_documentation ../../src/eu/bandm/tools/umod/umodOptions.xml #lang en
This corresponds to the XML element
    <cmdline_option_documentation>
      <url>../../src/eu/bandm/tools/umod/umodOptions.xml</url>
      <lang>en</lang>
    </cmdline_option_documentation>
The rendering process as defined by "docpage_xml2xhtml.xsl" will create a table with names, descriptions, type patterns and default values.
The result can be seen e.g. in the umod documentation.
The sequential order of the print-out can be defined independently of the source text by
    <optionlist defaultSorting="0AaB"> ....
  with
    <!ENTITY % sorting '( 0AaB | 0ABa | AaB0 | ABa0 | AaB | ABa )'>
The meaning of these "sort strategies" is:
  AaB -> treat lower case and upper case equivalently
  ABa -> first upper case, then lower case (like the old-fashioned ASCII table !-)
  Sort strategy contains "0" -> sort according to abbreviations (= one-character names); afterwards all those which do not have an abbreviation. The position of the "0" is the position of the numeric abbreviations (= positional parameters).
  Sort strategy does not contain "0" -> sort according to long names; afterwards all those which do not have a long name.
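For the strategies containing "0", the ordering of the abbreviations can be sketched as a Java comparator. This is an illustration of the description above, not the stylesheet's implementation: the class SortStrategies is hypothetical, the tie-break between "A" and "a" in the AaB case is an assumption, and options without abbreviation are not handled.

```java
import java.util.Comparator;

/** Hypothetical comparator factory for illustration; not part of the option package. */
final class SortStrategies {
    /** @param strategy one of "0AaB", "0ABa", "AaB0", "ABa0" */
    static Comparator<String> forAbbreviations(String strategy) {
        final boolean digitsLast = strategy.endsWith("0");
        final Comparator<String> letters = strategy.contains("AaB")
            // case-equivalent; assumption: upper case first among equals
            ? String.CASE_INSENSITIVE_ORDER.thenComparing(Comparator.<String>naturalOrder())
            // plain character order: upper case before lower case
            : Comparator.<String>naturalOrder();
        return (a, b) -> {
            boolean da = isNumeric(a), db = isNumeric(b);
            if (da != db) {
                // numeric abbreviations go before or after all letters
                return (da ^ digitsLast) ? -1 : 1;
            }
            return letters.compare(a, b);
        };
    }

    private static boolean isNumeric(String abbrev) {
        return !abbrev.isEmpty() && Character.isDigit(abbrev.charAt(0));
    }
}
```

Sorting the abbreviations b, A, 0, a, B gives 0, A, a, B, b under "0AaB" and A, B, a, b, 0 under "ABa0".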
made 2019-01-30_09h47 by lepper on linux-q699.site, produced with eu.bandm.metatools.d2d and XSLT