Class ULex<T>

java.lang.Object
  extended by ULex<T>
Direct Known Subclasses:
ULex.Alt, ULex.CharSet, ULex.Const, ULex.ConstMap, ULex.End, ULex.Fail, ULex.Length, ULex.Negate, ULex.Opt, ULex.Plus, ULex.SemanticPattern, ULex.Seq, ULex.SkipWhitespace, ULex.Star, ULex.ToLower

public abstract class ULex<T>
extends Object

"u-lex" stands for "micro lexer". The class supports small-scale text analysis by means of a fully typed combinator library.
Each instance of a subclass of ULex is either a primitive lexical scanner, or a combinator.
Each subclass of ULex<T> is parametrized with "T", which is the type of the result returned by successful parsing.
Parsing operates on java.lang.String values, and is performed by calling the method parse(State).
Such a ULex.State object encapsulates the data to be parsed and the current position.
In most cases the user creates instances of subclasses of ULex and parametrizes them according to the text fragments which shall be accepted. A number of factory methods are provided for this, mainly so that Java type inference can be used.
In other cases, a subclass must be derived to override one or more methods, e.g. as in ULex.Seq and ULex.SemanticPattern, in order to define the outcome of the parsing in the semantic domain <T>.
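The two styles of use described above can be sketched with a miniature stand-in. This is not the ULex source: the classes Lex, Digits, Const and Seq, the factory seq1 and the use of IllegalStateException for a mismatch are hypothetical names chosen for illustration only. The sketch shows a factory method whose type arguments are inferred, and an anonymous subclass of a sequence combinator that overrides combine to define the semantic result.

```java
/** Hypothetical miniature model; names and failure handling are assumptions, not the ULex API. */
final class SeqSketch {

    /** Input text plus the current read index. */
    static final class State {
        final String input;
        int position;
        State(String input) { this.input = input; }
    }

    abstract static class Lex<T> {
        /** Returns the result, or throws IllegalStateException on a mismatch (sketch only). */
        abstract T tryParse(State s);
    }

    /** Primitive scanner: one or more decimal digits, returned as an Integer. */
    static final class Digits extends Lex<Integer> {
        @Override Integer tryParse(State s) {
            int start = s.position;
            while (s.position < s.input.length()
                    && Character.isDigit(s.input.charAt(s.position))) {
                s.position++;
            }
            if (s.position == start) throw new IllegalStateException("digit expected");
            return Integer.valueOf(s.input.substring(start, s.position));
        }
    }

    /** Primitive scanner: accepts exactly the given text and returns it. */
    static final class Const extends Lex<String> {
        final String text;
        Const(String text) { this.text = text; }
        @Override String tryParse(State s) {
            if (!s.input.startsWith(text, s.position)) {
                throw new IllegalStateException("expected " + text);
            }
            s.position += text.length();
            return text;
        }
    }

    /** Combinator: runs a, then b; the subclass decides the result via combine. */
    abstract static class Seq<A, B, R> extends Lex<R> {
        final Lex<A> a;
        final Lex<B> b;
        Seq(Lex<A> a, Lex<B> b) { this.a = a; this.b = b; }
        abstract R combine(A x, B y);
        @Override R tryParse(State s) {
            A x = a.tryParse(s);
            B y = b.tryParse(s);
            return combine(x, y);
        }
    }

    /** Factory method: A and B are inferred from the arguments; keeps the first result. */
    static <A, B> Seq<A, B, A> seq1(Lex<A> a, Lex<B> b) {
        return new Seq<A, B, A>(a, b) {
            @Override A combine(A x, B y) { return x; }
        };
    }

    /** Parser for "n+m" which yields the sum as its semantic result. */
    static Lex<Integer> sum() {
        Lex<Integer> left = seq1(new Digits(), new Const("+"));
        return new Seq<Integer, Integer, Integer>(left, new Digits()) {
            @Override Integer combine(Integer x, Integer y) { return x + y; }
        };
    }

    public static void main(String[] args) {
        System.out.println(sum().tryParse(new State("12+30"))); // prints 42
    }
}
```

The call seq1(new Digits(), new Const("+")) needs no explicit type arguments, whereas the constructor call in sum() has to spell out all three.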

Technical implementation:
The method parse(State) is called only once, from outside, by the user, to initiate the parsing of a given String value. Internally, all ULex objects then perform parsing by calling tryParse(State). This method can throw a ULex.ExceptionFail as soon as it can no longer succeed.
Parser combinators like "alternative" (Alt) and "star closure" (Star) catch this exception and then try their alternative solutions.
Whenever an ExceptionFail reaches the toplevel parse(State) method, it is translated to null. This value hence indicates that parsing was not successful.
Independently of this, an Opt also returns null whenever its sub-parser fails, indicating the absence of a match of that sub-parser.
These two meanings of null are inconsistent with each other. FIXME: this should be done better.
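The control flow described above can be sketched as follows. This is a hypothetical miniature model, not the ULex source: the names Lex, Fail, FAIL, konst, alt and opt are stand-ins, the failure exception is assumed to be unchecked and shared as a single instance, and resetting the position before trying the next alternative is an assumption of this sketch.

```java
/** Hypothetical miniature model of the failure scheme; not the ULex source. */
final class FailSketch {

    /** Input text plus the current read index. */
    static final class State {
        final String input;
        int position;
        State(String input) { this.input = input; }
    }

    /** Stand-in for the failure exception; created once, without a stack trace. */
    static final class Fail extends RuntimeException {
        Fail() { super(null, null, false, false); }
    }
    static final Fail FAIL = new Fail();

    abstract static class Lex<T> {
        /** Internal entry point: returns a result or throws FAIL. */
        abstract T tryParse(State s);

        /** Toplevel entry point: translates the failure exception to null. */
        final T parse(State s) {
            try { return tryParse(s); } catch (Fail f) { return null; }
        }
    }

    /** Accepts exactly the given text and returns it. */
    static Lex<String> konst(final String text) {
        return new Lex<String>() {
            @Override String tryParse(State s) {
                if (!s.input.startsWith(text, s.position)) throw FAIL;
                s.position += text.length();
                return text;
            }
        };
    }

    /** Catches the failure of a and then tries b from the same position. */
    static <T> Lex<T> alt(final Lex<T> a, final Lex<T> b) {
        return new Lex<T>() {
            @Override T tryParse(State s) {
                int start = s.position;
                try { return a.tryParse(s); }
                catch (Fail f) { s.position = start; return b.tryParse(s); }
            }
        };
    }

    /** Catches the failure of sub and reports "no match" as null. */
    static <T> Lex<T> opt(final Lex<T> sub) {
        return new Lex<T>() {
            @Override T tryParse(State s) {
                int start = s.position;
                try { return sub.tryParse(s); }
                catch (Fail f) { s.position = start; return null; }
            }
        };
    }

    static Lex<String> yesNo() { return alt(konst("yes"), konst("no")); }

    public static void main(String[] args) {
        System.out.println(yesNo().parse(new State("no")));            // no
        System.out.println(yesNo().parse(new State("maybe")));         // null: parsing failed
        System.out.println(opt(konst("x")).parse(new State("maybe"))); // null: parsing succeeded
    }
}
```

The last two calls return the same value although one parse failed and the other succeeded with an absent optional part, which is the ambiguity noted above.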
Currently all subclasses of ULex are also static nested classes of ULex, because the author prefers not to have many small source files. This is not good practice, since each subclass thereby becomes (via inheritance) a nested class of itself.
This should be changed.

Currently no backtracking is supported.
The alternatives of a combinator are tested only until the first one succeeds; the remaining alternatives are never revisited, even if a later parser fails. E.g.

    ("ab" | "a" ) "bc"
will never match "abc".
There is a "backtracking-library" by bt, which could (or perhaps should) be used to realize a backtracking version wherever appropriate.
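The example above can be reproduced with a small stand-alone model of ordered choice without backtracking. It is only a sketch: the interface P and the helpers lit, alt, seq and matches are hypothetical, and failure is signalled by -1 instead of an exception to keep the code short.

```java
/** Hypothetical model of committed (non-backtracking) choice; not the ULex source. */
final class NoBacktrackSketch {

    /** A parser maps (input, position) to the new position, or to -1 on failure. */
    interface P { int run(String in, int pos); }

    /** Accepts exactly the literal text t. */
    static P lit(String t) {
        return (in, pos) -> in.startsWith(t, pos) ? pos + t.length() : -1;
    }

    /** Ordered choice: commits to the first alternative that succeeds. */
    static P alt(P a, P b) {
        return (in, pos) -> {
            int r = a.run(in, pos);
            return r >= 0 ? r : b.run(in, pos);
        };
    }

    /** Sequence: b starts where a stopped; a is never asked for another result. */
    static P seq(P a, P b) {
        return (in, pos) -> {
            int r = a.run(in, pos);
            return r >= 0 ? b.run(in, r) : -1;
        };
    }

    /** True iff p consumes the whole input. */
    static boolean matches(P p, String in) {
        return p.run(in, 0) == in.length();
    }

    /** ("ab" | "a") "bc" */
    static P committed() { return seq(alt(lit("ab"), lit("a")), lit("bc")); }

    /** ("a" | "ab") "bc" */
    static P reordered() { return seq(alt(lit("a"), lit("ab")), lit("bc")); }

    public static void main(String[] args) {
        System.out.println(matches(committed(), "abc"));  // false: "ab" wins, then "bc" cannot match "c"
        System.out.println(matches(committed(), "abbc")); // true
        System.out.println(matches(reordered(), "abc"));  // true
        System.out.println(matches(reordered(), "abbc")); // false: "a" wins, then "bc" cannot match "bbc"
    }
}
```

Reordering the alternatives only moves the problem: each ordering accepts one of the two inputs and rejects the other, whereas a backtracking implementation would accept both.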

Nested Class Summary
static class ULex.Alt<A>
          Parser which accepts one of two sub-parsers.
static class ULex.CharSet
          Accepts the next character iff it is contained/not contained in the given character set (encoded as a String).
static class ULex.Concatenate
          Convenience sub-class of ULex.Seq to concatenate two string results.
static class ULex.Const
          Accepts given constant String value and returns it, or throws ULex.ExceptionFail.
static class ULex.ConstMap<T>
          Accepts a longest prefix match from a set of constant String values and returns the values defined by the map argument; or throws ULex.ExceptionFail.
static class ULex.DecimalDigit
          Returns a parsed decimal digit 0..9
static class ULex.End
          Parser which accepts the end of the input string.
protected static class ULex.ExceptionFail
static class ULex.Fail<T>
          Parser which accepts nothing
static class ULex.Int
          Returns a parsed integer
static class ULex.Length<T>
          Parser which delivers the length of items accepted by its sub-parser
static class ULex.Natural
          Returns a parsed natural number > 0, iff possible
static class ULex.Natural_0
          Returns a parsed natural number >= 0, iff possible
static class ULex.Negate
          Returns the integer "0-r" whenever its sub-parser returns the integer "r".
static class ULex.Opt<S>
          Parser which returns null in case the sub-parser throws ULex.ExceptionFail
static class ULex.Pattern
          Convenience sub-class of ULex.SemanticPattern in which the accepted string itself is the result returned from parsing.
protected static class ULex.PatternInteger
          Returns a parsed integer
static class ULex.PatternLength
          Convenience sub-class of ULex.SemanticPattern in which the character count of the accepted string is the result returned from parsing.
static class ULex.Plus<S>
          Parser which accepts one or more instances of its sub-parser and returns them in one List<S> datum.
static class ULex.SemanticPattern<R>
          Accepts a regular expression pattern as defined by Pattern and returns what the user-defined method ULex.SemanticPattern.semantics(String) calculates from it.
static class ULex.Seq<A,B,R>
          Parser which accepts a sequence of two sub-parsers and returns what is calculated by the user-defined method ULex.Seq.combine(A, B).
static class ULex.Seq_1<A,B>
          A predefined subclass of ULex.Seq which discards the result of the second sub-parser and returns the result of the first sub-parser.
static class ULex.Seq_2<A,B>
          A predefined subclass of ULex.Seq which discards the result of the first sub-parser and returns the result of the second sub-parser.
static class ULex.SkipWhitespace<S>
          Skips whitespace and then executes the sub-parser.
static class ULex.Star<S>
          Parser which accepts zero or more instances of its sub-parser and returns them in one List<S> datum.
static class ULex.State
          Encapsulates the input data, output channels and the current state of the parsing process; indeed, only the "current read index" ULex.State.position is dynamic.
static class ULex.ToLower<S>
          Executes the sub-parser on a lower-case version of the input string.
Field Summary
static ULex.ExceptionFail EXCEPTION_FAIL
protected  T result
Constructor Summary
Method Summary
<S> ULex.Alt<S>
alt(ULex<S> a, ULex<S> b)
          Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.
<S> ULex.Alt<S>
alt(ULex<S> a, ULex<S> b, ULex<S> c)
          Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.
<S> ULex.Alt<S>
alt(ULex<S> a, ULex<S> b, ULex<S> c, ULex<S> d)
          Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.
<S> ULex.Alt<S>
alt(ULex<S> a, ULex<S> b, ULex<S> c, ULex<S> d, ULex<S> e)
          Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.
static ULex.CharSet charSet(String value, boolean positive)
          Convenience wrapper around constructor call.
static ULex.Concatenate concatenate(ULex<String> a, ULex<String> b)
          Convenience wrapper around constructor call.
static ULex.Concatenate concatenate(ULex<String> a, ULex<String> b, ULex<String> c)
          Convenience wrapper around constructor call.
static ULex.Concatenate concatenate(ULex<String> a, ULex<String> b, ULex<String> c, ULex<String> d)
          Convenience wrapper around constructor call.
<S> ULex.ConstMap<S>
constMap(Map<String,S> map)
          Convenience wrapper around constructor call.
static ULex.End end()
          Convenience wrapper around constructor call.
<T> ULex<T>
fail()
          Convenience wrapper around constructor call.
static ULex.Const konst(String s)
<S> ULex.Length<S>
length(ULex<List<S>> sub)
          Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.
static ULex.Negate negate(ULex<Integer> sub)
          Convenience wrapper around constructor call.
<S> ULex.Opt<S>
opt(ULex<S> sub)
          Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.
 T parse(ULex.State state)
          Toplevel entry point.
static ULex.Pattern pattern(String s)
          Convenience wrapper around constructor call.
static ULex.PatternLength patternLength(String s)
          Convenience wrapper around constructor call.
<S> ULex.Plus<S>
plus(ULex<S> sub)
          Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.
<A,B> ULex.Seq_1<A,B>
seq_1(ULex<A> a, ULex<B> b)
          Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.
<A,B> ULex.Seq_2<A,B>
seq_2(ULex<A> a, ULex<B> b)
          Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.
<S> ULex.SkipWhitespace<S>
skipWhitespace(ULex<S> sub)
          Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.
<S> ULex.Star<S>
star(ULex<S> sub)
          Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.
static ULex.State state(String s)
          Convenience wrapper around constructor call.
static ULex.State state(String s, int pos)
          Convenience wrapper around constructor call.
<S> ULex.ToLower<S>
toLower(ULex<S> sub)
          Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.
protected abstract  T tryParse(ULex.State state)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail


protected T result


public static final ULex.ExceptionFail EXCEPTION_FAIL
Constructor Detail


public ULex()
Method Detail


public T parse(ULex.State state)
Toplevel entry point.

Parameters:
state - the running state. Only "position" is changed; the other fields are treated as read-only.
Returns:
null in case of failure, so null cannot be a member of the result type.


protected abstract T tryParse(ULex.State state)


public static ULex.State state(String s)
Convenience wrapper around constructor call.


public static ULex.State state(String s,
                               int pos)
Convenience wrapper around constructor call.


public static <T> ULex<T> fail()
Convenience wrapper around constructor call.


public static ULex.End end()
Convenience wrapper around constructor call.


public static <S> ULex.Opt<S> opt(ULex<S> sub)
Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.


public static <S> ULex.Star<S> star(ULex<S> sub)
Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.


public static <S> ULex.Plus<S> plus(ULex<S> sub)
Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.


public static <S> ULex.Length<S> length(ULex<List<S>> sub)
Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.


public static <A,B> ULex.Seq_1<A,B> seq_1(ULex<A> a,
                                          ULex<B> b)
Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.


public static <A,B> ULex.Seq_2<A,B> seq_2(ULex<A> a,
                                          ULex<B> b)
Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.


public static ULex.Concatenate concatenate(ULex<String> a,
                                           ULex<String> b)
Convenience wrapper around constructor call.


public static ULex.Concatenate concatenate(ULex<String> a,
                                           ULex<String> b,
                                           ULex<String> c)
Convenience wrapper around constructor call.


public static ULex.Concatenate concatenate(ULex<String> a,
                                           ULex<String> b,
                                           ULex<String> c,
                                           ULex<String> d)
Convenience wrapper around constructor call.


public static <S> ULex.Alt<S> alt(ULex<S> a,
                                  ULex<S> b)
Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.


public static <S> ULex.Alt<S> alt(ULex<S> a,
                                  ULex<S> b,
                                  ULex<S> c)
Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.


public static <S> ULex.Alt<S> alt(ULex<S> a,
                                  ULex<S> b,
                                  ULex<S> c,
                                  ULex<S> d)
Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.


public static <S> ULex.Alt<S> alt(ULex<S> a,
                                  ULex<S> b,
                                  ULex<S> c,
                                  ULex<S> d,
                                  ULex<S> e)
Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.


public static <S> ULex.SkipWhitespace<S> skipWhitespace(ULex<S> sub)
Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.


public static <S> ULex.ToLower<S> toLower(ULex<S> sub)
Convenience wrapper around constructor call; especially useful because it supports type inference, in contrast to the unwrapped constructor call.


public static ULex.Const konst(String s)


public static ULex.CharSet charSet(String value,
                                   boolean positive)
Convenience wrapper around constructor call.


public static <S> ULex.ConstMap<S> constMap(Map<String,S> map)
Convenience wrapper around constructor call.


public static ULex.Pattern pattern(String s)
Convenience wrapper around constructor call.


public static ULex.PatternLength patternLength(String s)
Convenience wrapper around constructor call.


public static ULex.Negate negate(ULex<Integer> sub)
Convenience wrapper around constructor call.