The alea Language Reference

Draft (2025-01-26)

1          Conventions
1.1          Unicode
1.2          Code Examples
1.3          Type Signatures
2          Files
3          Comments
3.1          Line Comments
3.2          Block Comments
4          Variables
4.1          Assignment
5          Functions
5.1          Overloading
6          Numbers
6.1          Subtypes
6.2          Numerical Constants
6.2.1          Types of Numbers
6.3          Arithmetic Operators
6.3.1          Powers
6.3.2          Multiplication and Division
6.3.3          Addition and Subtraction
6.3.4          Comparisons
6.3.5          Conjunction and Disjunction
6.4          Mathematical Functions
6.4.1          Minimum and Maximum
6.4.2          Rounding
6.4.3          Absolute Value and Sign
6.5          Ranges
6.6          Numerical Case Distinction
7          Collections
7.1          Shapes, Optionality and Element Types
7.2          Building a Collection
7.3          Transforming a Collection
7.3.1          Comprehension Syntax
7.3.2          Comprehension Semantics
7.4          Evaluating a Collection
7.4.1          Reduction Semantics
7.5          Collection Operators
7.6          Collection Functions
7.6.1          Counting Elements
7.6.2          Changing the Collection Shape
7.6.3          Multiplicities
8          Records
9          Tags
10          Distributions

^{^ToC} 1 Conventions

^{^ToC} 1.1 Unicode

A wide variety of Unicode characters may occur in alea source code. The mathematical notation uses the dedicated unicode characters preferentially, but for each character outside the ASCII range, an alternative in terms of combinations of ASCII characters is provided. Thus alea code can be written with any editor and keyboard.

Where this document discusses the characters for a notation, ASCII characters are simply displayed as such, e.g. *. Other Unicode characters are referenced explicitly by code point and name, e.g. ¬ (U+00AC not sign).

^{^ToC} 1.2 Code Examples

Valid examples of code are displayed in boxes.

  valid example code

When discussion alternatives of notation, the meta-separator code!⤏! is used to indicate that the more specific left-hand side notation can be reduced, by rewriting or evaluation, to the right-hand side.

  expression ⤏ alternative or result

Invalid examples of code, demonstrating common errors, are displayed in boxes of a distinct appearance.

  invalid example code

^{^ToC} 1.3 Type Signatures

Type signatures of functions, operators and probability distributions are displayed in boxes and use a standard mathematical notation.

f : A → B

^{^ToC} 2 Files

alea source code can be stored in arbitrarily named files. The content should be a single (commented) expression, and be encoded in UTF-8.

^{^ToC} 3 Comments

Comments are sections of the source code that are ignored for program execution, and exist only for the purpose of the human reader.

^{^ToC} 3.1 Line Comments

The character sequence -- or the single character — (U+2014 em dash) starts a comment that extends to the end of the same line.

  1 + 1     -- a very large number, beware!

A line comment must always be terminated by an explicit line break; it is an error if the file ends before that.

^{^ToC} 3.2 Block Comments

^{^ToC} 4 Variables

^{^ToC} 4.1 Assignment

A subexpression can be assigned to a variable and the result referred to subsequently in another subexpression, the body — a so-called let construct. Assignment uses the operator := or ≝ (U+225D equal to by definition). The subexpressions are separated by ;. This construct is right-associative, so that several assignments can be chained.

  two := 1 + 1; four := two * two; two * four ⤏ 8

For more complex expressions, it is recommended to start a new line after each ;.

If the body is omitted, the overall result is the (last) one assigned.

  a := 42 ⤏ 42

All other operators bind more strongly than := and ;. Hence a let construct nested inside an expression must be enclosed in parentheses ( and ).

The binding of a value to a variable by assignment is only effective for the following body; it expires at the end of the let construct. Variables can be reused in nested subexpressions; in that case the previous value is unaffected outside the body.

  a := 1; (a := 2; a + a) + a ⤏ 5

^{^ToC} 5 Functions

alea comes with a library of predefined functions. Every function is identified uniquely by a name. Predefined functions shall be introduced below, in the respective section dealing with the kind of data that they operate on. Operators built into the language are merely functions with special names and application syntax.

In the current version of alea, there is no mechanism for the user to define their own functions.

^{^ToC} 5.1 Overloading

Each named function is conceived as having a single implementation that operates partially on the universe of untyped values. In order for a function application to be valid, it must be ensured that the function is defined for that particular argument value. This check is done statically, such that invalid function applications are ruled out without having to (try and fail to) evaluate them.

Every expression is assigned a type that approximates the set of its possible values. Functions can be given one or more type signatures of the form f : A → B, meaning the following contract: "If the value of the argument x is in the set denoted by type A, then the result of function application f(x) is defined and in the set denoted by type B." If a function is given more than one type signature, they all apply simultaneously.

Static checking of a function application starts with determining the type T of the argument expression x. Then the type U of the function application f(x) is the greatest lower bound of the B in all type signatures f : A → B where A is a supertype of T.

^{^ToC} 6 Numbers

Numerical computations operate exactly on the rational numbers, the set ℚ. On the one hand, there is no limit to the precision; the number of significant digits can be as large as necessary. On the other hand, operations that conceptually return irrational values, e.g., the square root or the natural logarithm, are not supported. There are no approximate computations on floating-point numbers.

Besides the numbers proper, there is the special value NaN. Arithmetically undefined operations do not fail, but return NaN instead.

^{^ToC} 6.1 Subtypes

Four different sets of numbers are recognized as types:

The rational numbers, i.e., the mathematical set ℚ, including NaN.
The integer numbers, i.e., the mathematical set ℤ, including NaN.
The natural numbers, i.e., the mathematical set ℕ, including zero and NaN.
The booleans, i.e., the mathematical set 𝔹, not including NaN.

These form a linear subtype hierarchy: A boolean is also a natural; a natural is also an integer; an integer is also a rational.

^{^ToC} 6.2 Numerical Constants

Numerical constants are written as decimal numbers. Numbers that are not integers can be written either as decimal fractions or as true fractions with the separator / or ⁄ (U+2044 fraction slash).

  1024, −1, 3.14142, 22/7

The notation of numbers follows traditional mathematical conventions:

Leading zeroes are discouraged but have no effect.
A negative sign, if present, is not part of the number constant itself, but a special case of subtraction; see Section 6.3.3 ("Addition and Subtraction").
If there is a decimal point, it must be surrounded on both sides by at least one digit each.

      123. + .456

The special value NaN can be written as an undefined fraction.

0/0

^{^ToC} 6.2.1 Types of Numbers

Since the number types overlap, a particular number may be an instance of more than one type. But for every number there is a unique principal type that is subtype of all types the number is an instance of:

The numbers zero and one are principally booleans.
All other positive integers and NaN are principally naturals.
All negative integers are principally integers.
All other numbers are principally rationals.

^{^ToC} 6.3 Arithmetic Operators

The notation of arithmetic expressions largely follows traditional mathematical conventions. The available operators are listed here in decreasing order of precedence. Parentheses (...) can be used for grouping.

^{^ToC} 6.3.1 Powers

A number is raised to a power with the operator ^.

2^32, 10^(−12)

Nested powers are right-associative.

a^b^c ⤏ a^(b^c)

The power operation admits several combinations of argument types.

(^) : ℚ × ℤ → ℚ

(^) : ℤ × ℕ → ℤ

(^) : ℕ × ℕ → ℕ

Negative exponents must be parenthesized.

  n^−1

Note that fractional exponents are not supported semantically, even though the syntax may be well-formed.

  2^12^(-1)

^{^ToC} 6.3.2 Multiplication and Division

Numbers are multiplied with the operator *. For division, there are three distinct operators, namely / or ∕ (U+2215 division slash) for exact rational division, // or ⫽ (U+2AFD double solidus operator) for truncating integer division, and \\ for modulo.

Note that the use of specialized Unicode slash symbols can disambiguate between a fractional constant and the division of two integers.

All four operators have equal precedence, can be mixed freely, and are jointly left-associative.

a * b / c \\ d ⤏ ((a * b) / c) \\ d

Multiplicative operators bind less strongly than powers.

6.032 * 10^23

Negative factors must be parenthesized.

  3 * −5

The multiplicative operators admit several, distinct combinations of argument types.

(*) : ℚ × ℚ → ℚ

(*) : ℤ × ℤ → ℤ

(*) : ℕ × ℕ → ℕ

Note that multiplying two booleans does not yield a boolean. This is for symmetry with addition; see Section 6.3.3 ("Addition and Subtraction"). For the proper conjunction operator, see Section 6.3.5 ("Conjunction and Disjunction").

(/) : ℚ × ℚ → ℚ

(//) : ℚ × ℕ → ℤ

(//) : ℕ × ℕ → ℕ

(\\) : ℚ × ℕ → ℚ

(\\) : ℤ × ℕ → ℕ

The truncating division always yields the largest integer less than or equal to the result of exact division; compare Section 6.4.2 ("Rounding"). The modulo operation yields the positive remainder of division.

x / n = n * (x // n) + x \\ n

Note that a negative or fractional modulus is not supported semantically, even though the syntax may be well-formed.

  17 // (-4)

^{^ToC} 6.3.3 Addition and Subtraction

Numbers are added with the operator +. For subtraction, there are two distinct operators, namely - or − (U+2212 minus sign) for signed numbers, and \ or ∖ (U+2216 set minus) for natural numbers.

All three operators have equal precedence, can be mixed freely, and are jointly left-associative.

a − b + c ⤏ (a − b) + c

Additive operators bind less strongly than multiplicative operators.

a + b * c ⤏ a + (b * c)

The multiplicative operators admit several, distinct combinations of argument types. Note that + and \ are also used for collections; see Section 7.5 ("Collection Operators").

(+) : ℚ × ℚ → ℚ

(+) : ℤ × ℤ → ℤ

(+) : ℕ × ℕ → ℕ

Note that adding two booleans does not yield a boolean. This is useful for counting. For the proper disjunction operator, see Section 6.3.5 ("Conjunction and Disjunction").

(−) : ℚ × ℚ → ℚ

(−) : ℤ × ℤ → ℤ

Unary sign operators are understood as having an implicit zero left operand.

−a ⤏ 0 − a

Note that it follows from this reading that −0 is not a boolean, but principally an integer.

The nonnegative subtraction operation a \ b is the same as a − b if positive, otherwise zero.

(\) : ℕ × ℕ → ℕ

^{^ToC} 6.3.4 Comparisons

Numbers can be compared with the relational operators = for equality, and /= or ≠ (U+2260 not equal to) for inequality. They can be compared for order with <, >, <= or ≤ (U+2264 less-than or equal to), and >= or ≥ (U+2265 greater-than or equal to).

Comparisons are not associative, but can be chained. The meaning is the conjunction of all comparisons of pairwise adjacent operands, with each operand evaluated only once; see Section 4.1 ("Assignment") and Section 6.3.5 ("Conjunction and Disjunction").

  a ≤ b < c ⤏ tmp := b; a ≤ tmp ∧ tmp < c

Note that no transitive relation between non-adjacent operands is implied.

  a < b > c ≠ d

Comparison operators bind less strongly than arithmetic operators.

  a^2 + b^2 ≤ c^2

The result of any comparison is boolean.

(=) : ℚ × ℚ → 𝔹

(≠) : ℚ × ℚ → 𝔹

(<) : ℚ × ℚ → 𝔹

(>) : ℚ × ℚ → 𝔹

(≤) : ℚ × ℚ → 𝔹

(≥) : ℚ × ℚ → 𝔹

The operators = and ≠ should be used for the logical operators if-and-only-if and exclusive-or, respectively.

^{^ToC} 6.3.5 Conjunction and Disjunction

Boolean conjunction and disjunction are expressed with the operators && or ∧ (U+2227 logical and), and || or ∨ (U+2228 logical or), respectively.

Each of these operators is left-associative, but does not mix with the other.

a ∧ (b ∨ c) ∧ d

Logical operators bind less strongly than comparison operators.

0 ≤ i ∧ i < n

Logical operators operate on boolean operands only.

(∧) : 𝔹 × 𝔹 → 𝔹

(∨) : 𝔹 × 𝔹 → 𝔹

^{^ToC} 6.4 Mathematical Functions

^{^ToC} 6.4.1 Minimum and Maximum

The minimum and maximum functions return the lesser and the greater of two operands, respectively. The minimum function operates uniformly on each number type.

min : ℚ × ℚ → ℚ

min : ℤ × ℤ → ℤ

min : ℕ × ℕ → ℕ

min : 𝔹 × 𝔹 → 𝔹

Because the subset relation between integers and naturals is not order-symmetric, the maximum function admits more refined combinations of argument types.

max : ℚ × ℚ → ℚ

max : ℤ × ℤ → ℤ

max : ℤ × ℕ → ℕ

max : ℕ × ℤ → ℕ

max : 𝔹 × 𝔹 → 𝔹

Note that, by contrast, the minimum of a boolean and a natural is not generally a boolean, because of NaN.

As a special case on 𝔹, min and max are the same as ∧ and ∨, respectively.

^{^ToC} 6.4.2 Rounding

^{^ToC} 6.4.3 Absolute Value and Sign

abs : ℚ → ℚ

abs : ℤ → ℕ

sign : ℚ → ℤ

sign : ℕ → ℕ

Note that sign applied to a natural number does not yield a boolean, since the sign of NaN is NaN.

^{^ToC} 6.5 Ranges

Where multiple integer values are to given, they can be enumerated explicitly separated by ,. Contiguous intervals can be abbreviated by giving just the limits (both inclusive) with the separator ...

  2 .. 5 ⤏ 2, 3, 4, 5

Note that what a multitude of integers actually means depends on the context.

^{^ToC} 6.6 Numerical Case Distinction

An expression of a number type can be used to choose between alternative cases, very much like the switch statement of the C language family. The operator ? takes a numerical condition expression and a set of cases.

The set of cases is enclosed in curly brackets { and }.
Each case is indexed with one or more numbers, given individually or as a range.
Cases must not have overlapping indices.
A default case is written with the pseudo-index _ instead of a number.
All possible numbers, according to the type of the condition expression, must be covered. Since all number types except the booleans are infinite, it follows that a default case must always be given for non-boolean conditions.
An arrow -> or → (U+2192 rightwards arrow) separates the indices and the body expression of each case.
Cases are separated by ;.
The textual order of cases has no meaning.
The result of the whole expression is the value of the body expression of the case that is indexed by the value of the condition expression.

5 ? {
  1 -> a;
  2, 3 -> b;
  4 .. 6 -> c;
  _ -> d }
⤏ c

a ? b : c ⤏ a ? { 1 -> b; 0 -> c }

^{^ToC} 7 Collections

Collections are composite data aggregated from an arbitrary finite number of heterogeneous elements.

^{^ToC} 7.1 Shapes, Optionality and Element Types

Collections come in three different shapes:

Lists are ordered collections of possibly repeated elements.
Bags are unordered collections of possibly repeated elements.
Sets are unordered collections of unique elements.

Some operations on a collection only make sense if it is nonempty. The type system recognizes expressions that are certain, without any context, to yield nonempty collections.

The elements of a collection are not processed individually, but by bulk jobs that apply the same basic operation uniformly to all elements. Hence these operations can only assume a type of argument of which all collection elements are instances. Such a type is referred to as the element type of a collection.

^{^ToC} 7.2 Building a Collection

The collection shapes are notated each with a distinctive shape of brackets.

Lists use square brackets [ and ].
Sets use curly brackets { and }.
Bags use either angled brackets ⟨ (U+27E8 mathematical left angle bracket) and ⟩ (U+27E9 mathematical right angle bracket), or baggy brackets ⟅ (U+27C5 left s-shaped bag delimiter) and ⟆ (U+27C6 right s-shaped bag delimiter). Alternatively, "poor man's" angled brackets < and > can be used, but this is discouraged due to poor typography and confusion with comparison operators. Opening and closing bracket must be of the same style.

A collection can be built by simply enumerating expressions for its values, individually or as ranges of integers. The separators , and .. bind less strongly than all operators.

[ 1, 2, 1 + 2 ], { 1 .. n }

For ranges, both limits must evaluate to integers, and are understood as inclusive. The resulting collection is nonempty if and only if the upper limit is greater than or equal to the lower limit. It follows that a range denotes an empty collection if one of the limits is NaN.

The element type of a collection-building expression is the least upper bound of the types of the given element expressions.

^{^ToC} 7.3 Transforming a Collection

Collections are transformed to other collections in terms of map and filter operations. These are written in the form of comprehensions.

^{^ToC} 7.3.1 Comprehension Syntax

Comprehensions are written with a bracket indicating the shape of the resulting collection, containing a vertical bar |. On the left hand side of the bar, there is one or more element expressions as in Section 7.2 ("Building a Collection") above. On the right hand side, there is a sequence of zero or more generator and filter clauses, separated by ;.

A generator clause consists of a variable, or the dummy _, and a collection expression, separated by an arrow <- or ← (U+2190 leftwards arrow).
A filter clause is a boolean expression.

[ n^2 | n ← [1..10] ]

{ a+b, a*b | a ← S; b ← T; a ≤ b }

The scope of the variable bound by a generator clause covers all following clauses, and the left hand side of the comprehension.

⟨ x | b ← B; x ← b ⟩

The expression in a generator clause must yield a collection of the same shape as the resulting collection, or a naturally convertible one.

Lists are naturally convertible to bags (forgetting the order of elements).
Bags are naturally convertible to sets (forgetting the multiplicity of elements).
Lists are naturally convertible to sets (forgetting both).

{ n | n ← [1..3] }

[ n | n ← {1..3} ]

As a shortcut that is common in set theory notation, a generator can take the place on the left hand side of the bar. Then it functions both as the leftmost clause and as the element expression.

{ x ← S | x > 0 } ⤏ { x | x ← S; x > 0 }

^{^ToC} 7.3.2 Comprehension Semantics

The meaning of a comprehension can be defined inductively, by successively simplifying or eliminating the leftmost clause, immediately right of the bar.

If the leftmost clause is a generator clause that draws from an empty collection, then the result is empty.

[ es | x ← []; cs ] ⤏ []

Otherwise, if the generator clause draws from a singleton collection, then it is equivalent to a let assignment followed by the rest of the comprehension.

[ es | x ← [v]; cs ] ⤏ let x := v; [ es | cs ]

Otherwise, if the generator clause draws from a collection that can be decomposed into two parts, then it is equivalent to the combination of two analogous comprehensions that draw from the parts.

[ es | x ← l + r; cs ] ⤏ [ es | x ← l; cs ] + [ e | x ← r; cs ]

Otherwise, if the leftmost clause is a filter clause, then it means the same as a conditional that chooses between the rest of the comprehension and the empty collection.

[ es | b; cs ] ⤏ b ? [ es | cs ] : []

When there are no clauses left, the comprehension is equivalent to a plain collection-building expression.

[ es | ] ⤏ [es]

^{^ToC} 7.4 Evaluating a Collection

^{^ToC} 7.4.1 Reduction Semantics

f[] ⤏ n

f[v] ⤏ v

f(l + r) ⤏ f(f(l), f(r))

^{^ToC} 7.5 Collection Operators

(∧) : S(T, *) × S(T, *) → S(T, *)

(∨) : S(T, *) × S(T, *) → S(T, *)

(∨) : S(T, *) × S(T, +) → S(T, +)

(∨) : S(T, +) × S(T, *) → S(T, +)

(+) : L(T, *) × L(T, *) → L(T, *)

(+) : L(T, +) × L(T, *) → L(T, +)

(+) : L(T, *) × L(T, +) → L(T, +)

(+) : B(T, *) × B(T, *) → B(T, *)

(+) : B(T, +) × B(T, *) → B(T, +)

(+) : B(T, *) × B(T, +) → B(T, +)

(\) : B(T, *) × B(T, *) → B(T, *)

(\) : S(T, *) × S(T, *) → S(T, *)

min : B(T, *) × B(T, *) → B(T, *)

max : B(T, *) × B(T, *) → B(T, *)

max : B(T, +) × B(T, *) → B(T, +)

max : B(T, *) × B(T, +) → B(T, +)

(=) : B(T, *) × B(T, *) → 𝔹

...

^{^ToC} 7.6 Collection Functions

^{^ToC} 7.6.1 Counting Elements

size : L(T, *) → ℕ

size : B(T, *) → ℕ

size : S(T, *) → ℕ

^{^ToC} 7.6.2 Changing the Collection Shape

supp : L(T, *) → S(T, *)

supp : L(T, +) → S(T, +)

supp : B(T, *) → S(T, *)

supp : B(T, +) → S(T, +)

shuffle : L(T, *) → B(T, *)

shuffle : L(T, +) → B(T, +)

once : S(T, *) → B(T, *)

once : S(T, +) → B(T, +)

^{^ToC} 7.6.3 Multiplicities

The function mults maps a bag (with arbitrary element type) to the bag of multiplicities of its elements.

mults⟨a, b, c, b, a⟩ ⤏ ⟨1, 2, 2⟩

Note that each occurrence of NaN is not equal to others, and hence counts with multiplicity one.

The bag of multiplicities of a nonempty bag is nonempty.

mults : B(T, *) → B(ℕ, *)

mults : B(T, +) → B(ℕ, +)

^{^ToC} 8 Records

^{^ToC} 9 Tags

^{^ToC} 10 Distributions

produced with eu.bandm.metatools.d2d and XSLT 1.0

^ToC 1 Conventions

^ToC 1.1 Unicode

^ToC 1.2 Code Examples

^ToC 1.3 Type Signatures

^ToC 2 Files

^ToC 3 Comments

^ToC 3.1 Line Comments

^ToC 3.2 Block Comments

^ToC 4 Variables

^ToC 4.1 Assignment

^ToC 5 Functions

^ToC 5.1 Overloading

^ToC 6 Numbers

^ToC 6.1 Subtypes

^ToC 6.2 Numerical Constants

^ToC 6.2.1 Types of Numbers

^ToC 6.3 Arithmetic Operators

^ToC 6.3.1 Powers

^ToC 6.3.2 Multiplication and Division

^ToC 6.3.3 Addition and Subtraction

^ToC 6.3.4 Comparisons

^ToC 6.3.5 Conjunction and Disjunction

^ToC 6.4 Mathematical Functions

^ToC 6.4.1 Minimum and Maximum

^ToC 6.4.2 Rounding

^ToC 6.4.3 Absolute Value and Sign

^ToC 6.5 Ranges

^ToC 6.6 Numerical Case Distinction

^ToC 7 Collections

^ToC 7.1 Shapes, Optionality and Element Types

^ToC 7.2 Building a Collection

^ToC 7.3 Transforming a Collection

^ToC 7.3.1 Comprehension Syntax

^ToC 7.3.2 Comprehension Semantics

^ToC 7.4 Evaluating a Collection

^ToC 7.4.1 Reduction Semantics

^ToC 7.5 Collection Operators

^ToC 7.6 Collection Functions

^ToC 7.6.1 Counting Elements

^ToC 7.6.2 Changing the Collection Shape

^ToC 7.6.3 Multiplicities

^ToC 8 Records

^ToC 9 Tags

^ToC 10 Distributions