This file documents the Mercury programming language, version 0.12.2.

Copyright (C) 1995-2006 The University of Melbourne.

Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided also that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions.


The Mercury Language Reference Manual, version 0.12.2



1 Introduction

Mercury is a new general-purpose programming language, designed and implemented by a small group of researchers at the University of Melbourne, Australia. Mercury is based on the paradigm of purely declarative programming, and was designed to be useful for the development of large and robust “real-world” applications. It improves on existing logic programming languages by providing increased productivity, reliability and efficiency, and by avoiding the need for non-logical program constructs. Mercury provides the traditional logic programming syntax, but also allows the syntactic convenience of user-defined functions, smoothly integrating logic and functional programming into a single paradigm.

Mercury requires programmers to supply type, mode and determinism declarations for the predicates and functions they write. The compiler checks these declarations, and rejects the program if it cannot prove that every predicate or function satisfies its declarations. This improves reliability, since many kinds of errors simply cannot happen in successfully compiled Mercury programs. It also improves productivity, since the compiler pinpoints many errors that would otherwise require manual debugging to locate. The fact that declarations are checked by the compiler makes them much more useful than comments to anyone who has to maintain the program. The compiler also exploits the guaranteed correctness of the declarations to significantly improve the efficiency of the code it generates.

To facilitate programming-in-the-large, to allow separate compilation, and to support encapsulation, Mercury has a simple module system. Mercury's standard library has a variety of pre-defined modules for common programming tasks — see the Mercury Library Reference Manual.



2 Syntax



2.1 Syntax overview

Mercury's syntax is similar to the syntax of Prolog, with some additional declarations for types, modes, determinism, the module system, and pragmas, and with the distinction that function symbols may stand for invocations of user-defined functions as well as for data constructors.

A Mercury program consists of a set of modules. Each module is a file containing a sequence of items (declarations and clauses). Each item is a term followed by a period. Each term is composed of a sequence of tokens, and each token is composed of a sequence of characters. Like Prolog, Mercury has the Definite Clause Grammar (DCG) notation for clauses.
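
As a minimal sketch of this structure, the following complete module consists of six items: five declarations and one clause, each terminated by a period. The module name hello and the message text are illustrative choices only; io.write_string is taken from the standard library's io module.

     :- module hello.
     :- interface.
     :- import_module io.

     :- pred main(io::di, io::uo) is det.

     :- implementation.

     % The only clause in the module: a predicate rule defining main/2.
     main(!IO) :-
         io.write_string("Hello, world!\n", !IO).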



2.2 Tokens

Tokens in Mercury are the same as in ISO Prolog, with two exceptions: the #line token, which is used as a line number directive (see below), and the backquote (`) token.

The different tokens are as follows. Tokens may be separated by whitespace or line number directives.

line number directive
A line number directive consists of the character #, a positive integer specifying the line number, and then a newline. A #line directive's only role is to specify the line number; it is otherwise ignored by the syntax. Line number directives may occur anywhere a token may occur. They are used in conjunction with the pragma source_file declaration to indicate that the Mercury code following was generated by another tool. They associate each line in the Mercury code with the source file name and line number of the original source from which the Mercury code was derived, so that the Mercury compiler can issue more informative error messages using the original source code locations. A #line directive specifies the line number for the immediately following line. Line numbers for lines after that are incremented as usual, so the second line after a #100 directive would be considered to be line number 101.
string
A string is a sequence of characters enclosed in double quotes ("). Within a string, two adjacent double quotes stand for a single double quote. For example, the string """" is a string of length one, containing a single double quote: the outermost pair of double quotes encloses the string, and the innermost pair stand for a single double quote. Strings may also contain backslash escapes. \a stands for “alert” (a beep character), \b for backspace, \r for carriage-return, \f for form-feed, \t for tab, \n for newline, \v for vertical-tab. An escaped backslash, single-quote, or double-quote stands for itself. The sequence \x introduces a hexadecimal escape; it must be followed by a sequence of hexadecimal digits and then a closing backslash. It is replaced with the character whose character code is identified by the hexadecimal number. Similarly, a backslash followed by an octal digit is the beginning of an octal escape; as with hexadecimal escapes, the sequence of octal digits must be terminated with a closing backslash. A backslash followed immediately by a newline is deleted; thus an escaped newline can be used to continue a string over more than one source line. (String literals may also contain embedded newlines.)
name
A name is either an unquoted name or a quoted name. An unquoted name is a lowercase letter followed by zero or more letters, underscores, and digits. A quoted name is any sequence of zero or more characters enclosed in single quotes ('). Within a quoted name, two adjacent single quotes stand for a single single quote. Quoted names can also contain backslash escapes of the same form as for strings.
variable
A variable is an uppercase letter or underscore followed by zero or more letters, underscores, and digits. A variable token consisting of a single underscore is treated specially: each instance of _ denotes a distinct variable. (In addition, variables starting with an underscore are presumed to be “don't-care” variables; the compiler will issue a warning if a variable that does not start with an underscore occurs only once, or if a variable starting with an underscore occurs more than once in the same scope.)
integer
An integer is either a decimal, binary, octal, hexadecimal, or character-code literal. A decimal literal is any sequence of decimal digits. A binary literal is 0b followed by any sequence of binary digits. An octal literal is 0o followed by any sequence of octal digits. A hexadecimal literal is 0x followed by any sequence of hexadecimal digits. A character-code literal is 0' followed by any single character.
float
A floating point literal consists of a sequence of decimal digits, a decimal point and a sequence of digits (the fraction part), and the letter E and another sequence of decimal digits (the exponent). The fraction part or the exponent (but not both) may be omitted.
open_ct
A left parenthesis, (, that is not preceded by whitespace.
open
A left parenthesis, (, that is preceded by whitespace.
close
A right parenthesis, ).
open_list
A left square bracket, [.
close_list
A right square bracket, ].
open_curly
A left curly bracket, {.
close_curly
A right curly bracket, }.
ht_sep
A “head-tail separator”, i.e. a vertical bar, |.
comma
A comma, ,.
end
A full stop (period), ..
eof
The end of file.
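
The integer and string token forms described above can be tried out with a small program such as the following sketch. The module name literals and the particular values are illustrative; the decimal values in the comments follow directly from the definitions of each literal form (the character code assumes an ASCII-compatible character set).

     :- module literals.
     :- interface.
     :- import_module io.

     :- pred main(io::di, io::uo) is det.

     :- implementation.

     main(!IO) :-
         io.write_int(0b1010, !IO), io.nl(!IO),    % binary literal, decimal 10
         io.write_int(0o17, !IO), io.nl(!IO),      % octal literal, decimal 15
         io.write_int(0xff, !IO), io.nl(!IO),      % hexadecimal literal, decimal 255
         io.write_int(0'a, !IO), io.nl(!IO),       % character code of a, decimal 97
         io.write_string("say ""hi""\n", !IO),     % doubled quote inside a string
         io.write_string("one\ttwo\n", !IO).       % backslash escapes for tab and newline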



2.3 Terms

Syntactically, terms in Mercury are exactly the same as in ISO Prolog, except that as extensions we permit higher-order terms and the introduction of infix operators by the use of grave accents (backquotes), as described below, and we support an extended set of builtin operators. See Builtin Operators. Also, the constructor for list terms in Mercury is [|]/2, not ./2 as in Prolog.

Note, however, that the meaning of some terms in Mercury is different to that in Prolog. See Data-terms.

A term is either a variable or a functor.

A functor is an integer, a float, a string, a name, a compound term, or a higher-order term.

A compound term is a simple compound term, a list term, a tuple term, an operator term, or a parenthesized term.

A simple compound term is a name followed without any intervening whitespace by an open parenthesis (i.e. an open_ct token), a sequence of argument terms separated by commas, and a close parenthesis.

A list term is an open square bracket (i.e. an open_list token) followed by a sequence of argument terms separated by commas, optionally followed by a vertical bar (i.e. an ht_sep token) followed by a term, followed by a close square bracket (i.e. a close_list token). An empty list term is an open_list token followed by a close_list token. List terms are parsed as follows:

     parse('[' ']') = [].
     parse('[' List) = parse_list(List).
     parse_list(Head ',' Tail) = '[|]'(parse_term(Head), parse_list(Tail)).
     parse_list(Head '|' Tail ']') = '[|]'(parse_term(Head), parse_term(Tail)).
     parse_list(Head ']') = '[|]'(parse_term(Head), []).

The following terms are all equivalent:

     [1, 2, 3]
     [1, 2, 3 | []]
     [1, 2 | [3]]
     [1 | [2, 3]]
     '[|]'(1, '[|]'(2, '[|]'(3, [])))
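
One way to check this equivalence is to unify the different spellings with one another, as in the following sketch. The module name list_terms and the variable names are illustrative; the list type comes from the standard library's list module.

     :- module list_terms.
     :- interface.
     :- import_module io.

     :- pred main(io::di, io::uo) is det.

     :- implementation.
     :- import_module list.

     main(!IO) :-
         A = [1, 2, 3],
         B = [1, 2 | [3]],
         C = [1 | [2, 3]],
         D = '[|]'(1, '[|]'(2, '[|]'(3, []))),
         % All four variables are bound to the same list, so the
         % unifications in the condition are expected to succeed.
         ( if A = B, B = C, C = D then
             io.write_string("all four terms are equal\n", !IO)
         else
             io.write_string("the terms differ\n", !IO)
         ).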

A tuple term is a left curly bracket (i.e. an open_curly token) followed by a sequence of argument terms separated by commas, and a right curly bracket. For example, {1, '2', "three"} is a valid tuple term.

An operator term is a term specified using operator notation, as in Prolog. Operators can also be formed by enclosing a name, a module qualified name (see The module system), or a variable between grave accents (backquotes). Any name or variable may be used as an operator in this way. If fun is a variable or name, then a term of the form X `fun` Y is equivalent to fun(X, Y). The operator is left associative and binds more tightly than every operator other than ^ (see Builtin Operators).
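
The following sketch illustrates the backquote notation using the max function from the standard library's int module; the module name backquote_demo is an illustrative choice. Both variables should end up with the same value, since the two spellings denote the same term.

     :- module backquote_demo.
     :- interface.
     :- import_module io.

     :- pred main(io::di, io::uo) is det.

     :- implementation.
     :- import_module int.

     main(!IO) :-
         A = 3 `max` 7,          % operator notation
         B = max(3, 7),          % the equivalent prefix notation
         io.write_int(A, !IO), io.nl(!IO),
         io.write_int(B, !IO), io.nl(!IO).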

A parenthesized term is just an open parenthesis followed by a term and a close parenthesis.

A higher-order term is a “closure” term, which can be any term other than a name or an operator term, followed without any intervening whitespace by an open parenthesis (i.e. an open_ct token), a sequence of argument terms separated by commas, and a close parenthesis. A higher-order term is equivalent to a simple compound term whose functor is the empty name, and whose arguments are the closure term followed by the argument terms of the higher-order term. That is, a term such as Term(Arg1, ..., ArgN) is parsed as ''(Term, Arg1, ..., ArgN). Note that the closure term can be a parenthesized term; for example, (Term ^ FieldName)(Arg1, Arg2) is a higher-order term, and so it gets parsed as if it were ''((Term ^ FieldName), Arg1, Arg2).



2.4 Builtin Operators

The following table lists all of Mercury's builtin operators. Operators with a low “Priority” bind more tightly than those with a high “Priority”. For example, given that + has priority 500 and * has priority 400, the term 2 * X + Y would parse as (2 * X) + Y.

The “Specifier” field indicates what structure terms constructed with an operator are allowed to take. “f” represents the operator and “x” and “y” represent arguments. “x” represents an argument whose priority must be strictly lower than that of the operator. “y” represents an argument whose priority is lower than or equal to that of the operator. For example, “yfx” indicates a left-associative infix operator, while “xfy” indicates a right-associative infix operator.

     
     Operator          Specifier Priority
     
     .                 yfx       10
     @                 xfx       50
     ^                 xfy       99
     ^                 fx        100
     `op`              yfx       120
     **                xfy       200
     -                 fx        200
     \                 fx        200
     *                 yfx       400
     //                yfx       400
     /                 yfx       400
     <<                yfx       400
     >>                yfx       400
     div               yfx       400
     mod               xfx       400
     rem               xfx       400
     ++                xfy       500
     +                 yfx       500
     +                 fx        500
     --                yfx       500
     -                 yfx       500
     /\                yfx       500
     \/                yfx       500
     aditi_bottom_up   fx        500
     aditi_top_down    fx        500
     ..                xfx       550
     :                 yfx       600
     :=                xfx       650
     =^                xfx       650
     <                 xfx       700
     =..               xfx       700
     =:=               xfx       700
     =<                xfx       700
     ==                xfx       700
     =\=               xfx       700
     =                 xfx       700
     >=                xfx       700
     >                 xfx       700
     @<                xfx       700
     @=<               xfx       700
     @>=               xfx       700
     @>                xfx       700
     \==               xfx       700
     \=                xfx       700
     ~=                xfx       700
     is                xfx       701
     and               xfy       720
     or                xfy       740
     func              fx        800
     impure            fy        800
     pred              fx        800
     semipure          fy        800
     \+                fy        900
     not               fy        900
     when              xfx       900
     ~                 fy        900
     <=>               xfy       920
     <=                xfy       920
     =>                xfy       920
     all               fxy       950
     lambda            fxy       950
     some              fxy       950
     ,                 xfy       1000
     &                 xfy       1025
     ->                xfy       1050
     ;                 xfy       1100
     then              xfx       1150
     if                fx        1160
     else              xfy       1170
     ::                xfx       1175
     ==>               xfx       1175
     where             xfx       1175
     --->              xfy       1179
     type              fx        1180
     end_module        fx        1199
     import_module     fx        1199
     include_module    fx        1199
     instance          fx        1199
     inst              fx        1199
     mode              fx        1199
     module            fx        1199
     pragma            fx        1199
     promise           fx        1199
     rule              fx        1199
     typeclass         fx        1199
     use_module        fx        1199
     -->               xfx       1200
     :-                xfx       1200
     :-                fx        1200
     ?-                fx        1200
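
As a small illustration of how priorities and specifiers determine parsing, the following sketch evaluates three arithmetic expressions. The module name priority_demo and the values are illustrative; the arithmetic operators are those of the standard library's int module.

     :- module priority_demo.
     :- interface.
     :- import_module io.

     :- pred main(io::di, io::uo) is det.

     :- implementation.
     :- import_module int.

     main(!IO) :-
         X = 3,
         Y = 4,
         A = 2 * X + Y,        % * (400) binds tighter than + (500): (2 * X) + Y = 10
         B = 2 * (X + Y),      % parentheses override the priorities: 14
         C = 10 - 4 - 3,       % - is yfx, so left-associative: (10 - 4) - 3 = 3
         io.write_int(A, !IO), io.nl(!IO),
         io.write_int(B, !IO), io.nl(!IO),
         io.write_int(C, !IO), io.nl(!IO).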
     



2.5 Items

Each item in a Mercury module is either a declaration or a clause. If the top-level functor of the term is :-/1, the item is a declaration, otherwise it is a clause. There are three types of clauses. If the top-level functor of the item is :-/2, the item is a rule. If the top-level functor is -->/2, the item is a DCG rule. Otherwise, the item is a fact. There are two types of rules and facts. If the top-level functor of the head of a rule is =/2, the rule is a function rule, otherwise it is a predicate rule. If the top-level functor of the head of a fact is =/2, the fact is a function fact, otherwise it is a predicate fact.



2.6 Declarations

The allowed declarations are:

     :- type
     :- pred
     :- func
     :- inst
     :- mode
     :- typeclass
     :- instance
     :- pragma
     :- promise
     :- module
     :- interface
     :- implementation
     :- import_module
     :- use_module
     :- include_module
     :- end_module

The type, pred and func declarations are used for the type system, the inst and mode declarations are for the mode system, the pragma declarations are for the C interface, and for compiler hints about inlining, and the remainder are for the module system. They are described in more detail in their respective chapters.



2.7 Facts

A function fact is an item of the form Head = Result. A predicate fact is an item of the form Head, where the top-level functor of Head is not :-/1, :-/2, -->/2, or =/2. In both cases, the Head term must not be a variable. The top-level functor of the Head determines which predicate or function the fact belongs to; the predicate or function must have been declared in a preceding pred or func declaration in this module. The Result (if any) and the arguments of the Head must be valid data-terms (optionally annotated with a mode qualifier; see Different clauses for different modes).

A fact is equivalent to a rule whose body is true.



2.8 Rules

A function rule is an item of the form Head = Result :- Body. A predicate rule is an item of the form Head :- Body where the top-level functor of Head is not =/2. In both cases, the Head term must not be a variable. The top-level functor of the Head determines which predicate or function the clause belongs to; the predicate or function must have been declared in a preceding pred or func declaration in this module. The Result and the arguments of the Head must be valid data-terms (optionally annotated with a mode qualifier; see Different clauses for different modes). The Body must be a valid goal.
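
The four kinds of facts and rules can be seen side by side in the following sketch. All predicate and function names (small, double, factorial) and the module name items_demo are hypothetical and chosen purely for illustration.

     :- module items_demo.
     :- interface.
     :- import_module io.

     :- pred main(io::di, io::uo) is det.

     :- implementation.
     :- import_module int.

     :- pred small(int::in) is semidet.
     small(0).                          % predicate fact
     small(1).                          % predicate fact

     :- func double(int) = int.
     double(X) = 2 * X.                 % function fact

     :- func factorial(int) = int.
     factorial(N) = F :-                % function rule
         ( if N =< 0 then F = 1 else F = N * factorial(N - 1) ).

     main(!IO) :-                       % predicate rule
         ( if small(1) then
             io.write_int(double(factorial(3)), !IO)
         else
             io.write_int(0, !IO)
         ),
         io.nl(!IO).

Since small(1) succeeds and factorial(3) is 6, this program should print 12.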



2.9 Goals

A goal is a term of one of the following forms:

some Vars Goal
An existential quantification. Vars must be a list of variables. Goal must be a valid goal.

Each existential quantification introduces a new scope. The variables in Vars are local to the goal Goal: for each variable named in Vars, any occurrences of variables with that name in Goal are considered to name a different variable than any variables with the same name that occur outside of the existential quantification.

Operationally, existential quantification has no effect, so apart from its effect on variable scoping, some Vars Goal is the same as Goal.

Mercury's rules for implicit quantification (see Implicit quantification) mean that variables are often implicitly existentially quantified. There is usually no need to write existential quantifiers explicitly.

all Vars Goal
A universal quantification. Vars must be a list of variables. Goal must be a valid goal. This is an abbreviation for not (some Vars not Goal).
Goal1, Goal2
A conjunction. Goal1 and Goal2 must be valid goals.
Goal1 ; Goal2
where Goal1 is not of the form Goal1a -> Goal1b: a disjunction. Goal1 and Goal2 must be valid goals.
true
The empty conjunction. Always succeeds.
fail
The empty disjunction. Always fails.
not Goal
\+ Goal
A negation. The two different syntaxes have identical semantics. Goal must be a valid goal. Both forms are equivalent to if Goal then fail else true.
Goal1 => Goal2
An implication. This is an abbreviation for not (Goal1, not Goal2).
Goal1 <= Goal2
A reverse implication. This is an abbreviation for not (Goal2, not Goal1).
Goal1 <=> Goal2
A logical equivalence. This is an abbreviation for (Goal1 => Goal2), (Goal1 <= Goal2).
if CondGoal then ThenGoal else ElseGoal
CondGoal -> ThenGoal ; ElseGoal
An if-then-else. The two different syntaxes have identical semantics. CondGoal, ThenGoal, and ElseGoal must be valid goals. Note that the “else” part is not optional.

The declarative semantics of an if-then-else is given by ( CondGoal, ThenGoal ; not(CondGoal), ElseGoal), but the operational semantics are different, and it is treated differently for the purposes of determinism inference (see Determinism). Operationally, it executes the CondGoal, and if that succeeds, then execution continues with the ThenGoal; otherwise, i.e. if CondGoal fails, it executes the ElseGoal. Note that CondGoal can be nondeterministic — unlike Prolog, Mercury's if-then-else does not commit to the first solution of the condition if the condition succeeds.

Term1 = Term2
A unification. Term1 and Term2 must be valid data-terms.
Term1 \= Term2
An inequality. Term1 and Term2 must be valid data-terms. This is an abbreviation for not (Term1 = Term2).
call(Closure)
call(Closure1, Arg1)
call(Closure2, Arg1, Arg2)
call(Closure3, Arg1, Arg2, Arg3)
...
A higher-order predicate call. The closure and arguments must be valid data-terms. call(Closure) just calls the specified closure. The other forms append the specified arguments onto the argument list of the closure before calling it. See Higher-order.
Var
Var(Arg1)
Var(Arg1, Arg2)
Var(Arg1, Arg2, Arg3)
...
A higher-order predicate call. Var must be a variable. The semantics are exactly the same as for the corresponding higher-order call using the call/N syntax, i.e. call(Var), call(Var, Arg1), etc.
aditi_bulk_delete(...)
aditi_bulk_insert(...)
aditi_bulk_modify(...)
aditi_delete(...)
aditi_insert(...)
These goal forms are used for the Aditi database interface. See Aditi update syntax.
Call
Any goal which does not match any of the above forms must be a predicate call. The top-level functor of the term determines the predicate called; the predicate must be declared in a pred declaration in the module or in the interface of an imported module. The arguments must be valid data-terms.
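
Several of the goal forms above are combined in the following sketch: a chained if-then-else, a negation whose variable X is implicitly quantified inside the negated goal, and a higher-order call written in both the Var(...) and call/N styles. The names describe and apply_twice, and the module name goals_demo, are hypothetical.

     :- module goals_demo.
     :- interface.
     :- import_module io.

     :- pred main(io::di, io::uo) is det.

     :- implementation.
     :- import_module int, list.

     :- pred describe(list(int)::in, string::out) is det.
     describe(Xs, Msg) :-
         ( if Xs = [] then
             Msg = "empty"
         else if not ( list.member(X, Xs), X < 0 ) then
             Msg = "no negative elements"
         else
             Msg = "has a negative element"
         ).

     :- pred apply_twice(pred(int, int)::in(pred(in, out) is det),
         int::in, int::out) is det.
     apply_twice(P, X0, X) :-
         P(X0, X1),              % higher-order call, variable syntax
         call(P, X1, X).         % the same kind of call, call/N syntax

     main(!IO) :-
         describe([1, -2, 3], Msg),
         io.write_string(Msg, !IO), io.nl(!IO),
         apply_twice((pred(A::in, B::out) is det :- B = A + 1), 5, R),
         io.write_int(R, !IO), io.nl(!IO).

With these inputs the program should print "has a negative element" followed by 7.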



2.10 State variables

Clauses may use state variables as a shorthand for naming intermediate values in a sequence. That is, where in the plain syntax one might write

     	main(IO0, IO) :-
     		io.write_string("The answer is ", IO0, IO1),
     		io.write_int(calculate_answer(...), IO1, IO2),
     		io.nl(IO2, IO).

using state variable syntax one could write

     	main(!IO) :-
     		io.write_string("The answer is ", !IO),
     		io.write_int(calculate_answer(...), !IO),
     		io.nl(!IO).

A state variable is written !.X or !:X, denoting the “current” or “next” value of the sequence labelled X. An argument !X is shorthand for two state variable arguments !.X, !:X; that is, p(..., !X, ...) is parsed as p(..., !.X, !:X, ...).

Within each clause, a transformation converts state variables into sequences of ordinary logic variables. The syntactic conversion is described in terms of the notional transform function defined next.

The transformation is applied once for each state variable X with some fresh variables which we shall call ThisX and NextX.

The expression substitute(Term, X, ThisX, NextX) stands for a copy of Term with free occurrences of !.X replaced with ThisX and occurrences of !:X replaced with NextX. (A free occurrence is one not bound by the head of a clause or lambda or by explicit quantification.)

State variables obey special scope rules. A state variable X must be explicitly introduced either in the head of the clause or lambda (in which case it may appear as either or both of !.X or !:X) or in an explicit quantification (in which case it must appear as !X). A state variable X in the enclosing scope of a lambda or if-then-else expression may only be referred to as !.X (unless the enclosing X is masked by a more local state variable of the same name).

For instance, the following goal employing an if-then-else expression

     	p((if q(!X), r(!X) then A else B), !X)

is illegal because it implicitly refers to !:X in the condition of the if-then-else expression. However

     	p((if some[!X] (q(!X), r(!X)) then A else B), !X)

is acceptable because the state variable X is locally scoped to the condition and then-goal of the if-then-else expression, hence !:X may appear therein.

There are three restrictions concerning state variables in lambdas: first, !X is not a legitimate function result, since it stands for two arguments, rather than one; second, !X may not appear as a parameter term in the head of a lambda since there is no syntax for specifying the modes of the two implied parameters; third, !X may not appear as an argument in a function application since this would not make sense given the usual interpretation of state variables and functions.

Head :- Body
          transform((Head :- Body), X, ThisX, NextX) =
          substitute(Head, X, ThisX, NextX) :- transform(Body, X, ThisX, NextX)
     

Head --> Body
          transform((Head --> Body), X, ThisX, NextX) =
          substitute(Head, X, ThisX, NextX) :- transform(Body, X, ThisX, NextX)
     

Goal1, Goal2
          transform((Goal1, Goal2), X, ThisX, NextX) =
          transform(Goal1, X, ThisX, TmpX), transform(Goal2, X, TmpX, NextX)
     

for some fresh variable TmpX.

Goal1 ; Goal2
          transform((Goal1 ; Goal2), X, ThisX, NextX) =
          transform(Goal1, X, ThisX, NextX) ; transform(Goal2, X, ThisX, NextX)
     

not Goal
\+ Goal
A negation. The two different syntaxes have identical semantics.
          transform((not Goal), X, ThisX, NextX) =
          not transform(Goal, X, ThisX, DummyX), NextX = ThisX
     

for some fresh variable DummyX.

if Goal1 then Goal2 else Goal3
Goal1 -> Goal2 ; Goal3
An if-then-else. The two different syntaxes have identical semantics.
          transform((if Goal1 then Goal2 else Goal3), X, ThisX, NextX) =
          if transform(Goal1, X, ThisX, TmpX) then transform(Goal2, X, TmpX,  NextX)
                                              else transform(Goal3, X, ThisX, NextX)
     

for some fresh variable TmpX.

Goal1 => Goal2
Goal2 <= Goal1
An implication. The two different syntaxes have identical semantics.
          transform((Goal1 => Goal2), X, ThisX, NextX) =
          transform(Goal1, X, ThisX, TmpX) => transform(Goal2, X, TmpX,  NextX),
          NextX = ThisX
     

for some fresh variable TmpX.

all Vars Goal
          transform((all Vars Goal), X, ThisX, NextX) =
          all Vars transform(Goal, X, ThisX, DummyX), NextX = ThisX
     

for some fresh variable DummyX.

some Vars Goal
          transform((some Vars Goal), X, ThisX, NextX) =
          some Vars transform(Goal, X, ThisX, NextX)
     

Call_or_Unification
If !:X does not appear in Call_or_Unification then
          transform(Call_or_Unification, X, ThisX, NextX) =
          substitute(Call_or_Unification, X, ThisX, NextX), NextX = ThisX
     

If !:X does appear in Call_or_Unification then

          transform(Call_or_Unification, X, ThisX, NextX) =
          substitute(Call_or_Unification, X, ThisX, NextX)
     

This transformation can lead to the introduction of chains of unifications for variables that do not otherwise play a role in the definition. Such chains are removed transparently.
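
To make the effect of the transformation concrete, the following sketch defines the same accumulator predicate twice: once with state variable syntax, and once with the variables threaded by hand in roughly the way the rules above would expand it. The names tally and tally_plain, the module name statevar_demo, and the numbered variable names are hypothetical; the variable names actually generated by the compiler are not specified here.

     :- module statevar_demo.
     :- interface.
     :- import_module io.

     :- pred main(io::di, io::uo) is det.

     :- implementation.
     :- import_module int, list.

     % Written with state variables.
     :- pred tally(list(int)::in, int::in, int::out) is det.
     tally([], !Sum).
     tally([X | Xs], !Sum) :-
         !:Sum = !.Sum + X,
         tally(Xs, !Sum).

     % The same predicate with the threading written out by hand.
     :- pred tally_plain(list(int)::in, int::in, int::out) is det.
     tally_plain([], Sum, Sum).
     tally_plain([X | Xs], Sum0, Sum) :-
         Sum1 = Sum0 + X,
         tally_plain(Xs, Sum1, Sum).

     main(!IO) :-
         tally([1, 2, 3], 0, A),
         tally_plain([1, 2, 3], 0, B),
         io.write_int(A, !IO), io.nl(!IO),
         io.write_int(B, !IO), io.nl(!IO).

Both calls should compute the same sum, so the program should print 6 twice.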

The following code fragments illustrate appropriate use of state variable syntax.

Threading the IO state
          main(!IO) :-
          	io.write_string("The 100th prime is ", !IO),
          	X = prime(100),
          	io.write_int(X, !IO),
          	io.nl(!IO).
     

Handling accumulators (1)
          foldl2(_, [], !A, !B).
          
          foldl2(P, [X | Xs], !A, !B) :-
          	P(X, !A, !B),
          	foldl2(P, Xs, !A, !B).
     

Handling accumulators (2)
          iterate_while2(P, F, !A, !B) :-
          	( if P(!.A, !.B) then
          		F(!A, !B),
          		iterate_while2(P, F, !A, !B)
          	  else
          	  	true
          	).
     



2.11 DCG-rules

(DCG notation is intended for writing parsers and sequence generators in a particular style; in the past it has also been used to thread an implicit state variable, typically the IO state, through code. As a matter of style, we recommend that in future DCG notation be reserved for writing parsers and sequence generators and that state variable syntax be used for passing state threads.)

DCG-rules in Mercury have identical syntax and semantics to DCG-rules in Prolog.

A DCG-rule is an item of the form Head --> Body. The Head term must not be a variable. A DCG-rule is an abbreviation for an ordinary rule with two additional implicit arguments appended to the arguments of Head. These arguments are fresh variables which we shall call V_in and V_out. The Body must be a valid DCG-goal, and is an abbreviation for an ordinary goal. The next section defines a mathematical function DCG-transform(V_in, V_out, DCG-goal), written there simply as transform, which specifies how DCG-goals are transformed into ordinary goals. (The DCG-transform function is purely for the purposes of exposition, to define the semantics; it is not part of the language.)



2.12 DCG-goals

A DCG-goal is a term of one of the following forms:

some Vars DCG-goal
A DCG existential quantification. Vars must be a list of variables. DCG-goal must be a valid DCG-goal.

Semantics:

          transform(V_in, V_out, some Vars DCG_goal) =
          some Vars transform(V_in, V_out, DCG_goal)
     

all Vars DCG-goal
A DCG universal quantification. Vars must be a list of variables. DCG-goal must be a valid DCG-goal.

Semantics:

          transform(V_in, V_out, all Vars DCG_goal) =
          all Vars transform(V_in, V_out, DCG_goal)
     

DCG-goal1, DCG-goal2
A DCG sequence. Intuitively, this means “parse DCG-goal1 and then parse DCG-goal2” or “do DCG-goal1 and then do DCG-goal2”. (Note that the only way this construct actually forces the desired sequencing is by the modes of the implicit DCG arguments.) DCG-goal1 and DCG-goal2 must be valid DCG-goals.

Semantics:

          transform(V_in, V_out, (DCG_goal1, DCG_goal2)) =
          (transform(V_in, V_new, DCG_goal1),
           transform(V_new, V_out, DCG_goal2))
     

where V_new is a fresh variable.

DCG-goal1 ; DCG-goal2
A DCG disjunction. DCG-goal1 and DCG-goal2 must be valid DCG-goals. DCG-goal1 must not be of the form DCG-goal1a -> DCG-goal1b. (If it is, then the goal is an if-then-else, not a disjunction.)

Semantics:

          transform(V_in, V_out, (DCG_goal1 ; DCG_goal2)) =
          ( transform(V_in, V_out, DCG_goal1)
          ; transform(V_in, V_out, DCG_goal2) )
     

{ Goal }
A brace-enclosed ordinary goal. Goal must be a valid goal.

Semantics:

          transform(V_in, V_out, { Goal }) = (Goal, V_out = V_in)
     
[Term, ...]
A DCG input match. Unifies the implicit DCG input variable V_in, which must have type list(_), with a list whose initial elements are the terms specified and whose tail is the implicit DCG output variable V_out. The terms must be valid data-terms.

Semantics:

          transform(V_in, V_out, [Term, ...]) = (V_in = [Term, ... | V_out])
     

[]
The null DCG goal (an empty DCG input match). Equivalent to { true }.

Semantics:

          transform(V_in, V_out, []) = (V_out = V_in)
     

not DCG-goal
\+ DCG-goal
A DCG negation. The two different syntaxes have identical semantics. DCG-goal must be a valid DCG-goal.

Semantics:

          transform(V_in, V_out, not DCG_goal) =
          (not transform(V_in, V_new, DCG_goal), V_out = V_in)
     

where V_new is a fresh variable.

if CondGoal then ThenGoal else ElseGoal
CondGoal -> ThenGoal ; ElseGoal
A DCG if-then-else. The two different syntaxes have identical semantics. CondGoal, ThenGoal, and ElseGoal must be valid DCG-goals.

Semantics:

          transform(V_in, V_out, if CondGoal then ThenGoal else ElseGoal) =
          if transform(V_in, V_cond, CondGoal) then
                  transform(V_cond, V_out, ThenGoal)
          else
                  transform(V_in, V_out, ElseGoal)
     

=(Term)
A DCG unification. Unifies Term with the implicit DCG argument. Term must be a valid data-term.

Semantics:

          transform(V_in, V_out, =(Term)) = (Term = V_in, V_out = V_in)
     

:=(Term)
A DCG output unification. Unifies Term with the implicit DCG output argument, ignoring the input DCG argument. Term must be a valid data-term.

Semantics:

          transform(V_in, V_out, :=(Term)) = (V_out = Term)
     

Term =^ field_list
A DCG field selection. Unifies Term with the result of applying the field selection field_list to the implicit DCG argument. Term must be a valid data-term. field_list must be a valid field list. See Record syntax.

Semantics:

          transform(V_in, V_out, Term =^ field_list) =
                  (Term = V_in ^ field_list, V_out = V_in)
     

^ field_list := Term
A DCG field update. Replaces a field in the implicit DCG argument. Term must be a valid data-term. field_list must be a valid field list. See Record syntax.

Semantics:

          transform(V_in, V_out, ^ field_list := Term) =
                  (V_out = V_in ^ field_list := Term)
     

DCG-call
Any term which does not match any of the above forms must be a DCG predicate call. If the term is a variable Var, it is treated as if it were call(Var). Then, the two implicit DCG arguments are appended to the specified arguments.

Semantics:

          transform(V_in, V_out, p(A1, ..., AN)) =
          p(A1, ..., AN, V_in, V_out)
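
As an illustration of the input-match and DCG-call rules, the following sketch places a small DCG clause next to the ordinary clause that applying the transformation by hand would give. The module name and the predicates greeting/2, greeting_expanded/2 and word/3 are hypothetical names chosen for this example only.

```mercury
:- module dcg_example.
:- interface.
:- import_module list.

:- pred greeting(list(string)::in, list(string)::out) is semidet.
:- pred greeting_expanded(list(string)::in, list(string)::out) is semidet.

:- implementation.

    % DCG form: match the token "hello", then call word/3 with the
    % two implicit DCG arguments appended.
greeting -->
    ["hello"],
    word("world").

    % The same clause written out by hand following the rules above.
greeting_expanded(V_in, V_out) :-
    V_in = ["hello" | V_new],
    word("world", V_new, V_out).

:- pred word(string::in, list(string)::in, list(string)::out) is semidet.

    % Succeeds if the next token is W, leaving the rest as output.
word(W) --> [W].
```

Note that word/3 is declared with three arguments but appears to take only one inside the DCG clauses.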
     



2.13 Data-terms

Syntactically, a data-term is just a term.

There are a couple of differences from Prolog. The first is that double-quoted strings are atomic in Mercury; they are not abbreviations for lists of character codes. The second is that Mercury provides several extensions to Prolog's term syntax: Mercury terms may contain record field selection and field update expressions, conditional (if-then-else) expressions, function applications, higher-order function applications, lambda expressions, and explicit type qualifications.

A data-term is either a variable, a data-functor, or a special data-term. A special data-term is a conditional expression, a record syntax expression, a unification expression, a lambda expression, a higher-order function application, or an explicit type qualification.



2.13.1 Data-functors

A data-functor is an integer, a float, a string, a character literal (any single-character name), a name, or a compound data-term. A compound data-term is a compound term which does not match the form of a special data-term (see Data-terms), and whose arguments are data-terms. If a data-functor is a name or a compound data-term, its top-level functor must name a function, predicate, or data constructor declared in the program or in the interface of an imported module.



2.13.2 Record syntax

Record syntax provides a convenient way to select or update fields of data constructors, independent of the definition of the constructor. Record syntax expressions are transformed into sequences of calls to field selection or update functions (see Field access functions).

A field specifier is a name or a compound data-term. A field list is a list of field specifiers separated by ^. field, field1 ^ field2 and field1(A) ^ field2(B, C) are all valid field lists.

If the top-level functor of a field specifier is field/N, there must be a visible selection function field/(N + 1). If the field specifier occurs in a field update expression, there must also be a visible update function named 'field :='/(N + 2).

Record syntax expressions have one of the following forms. There are also record syntax DCG goals (see DCG-goals), which provide similar functionality to record syntax expressions, except that they act on the DCG arguments of a DCG clause.

Term ^ field_list
A field selection. For each field specifier in field_list, apply the corresponding selection function in turn.

Term must be a valid data-term. field_list must be a valid field list.

A field selection is transformed using the following rules:

          transform(Term ^ Field(Arg1, ...)) = Field(Arg1, ..., Term).
          transform(Term ^ Field(Arg1, ...) ^ Rest) =
                          transform(Field(Arg1, ..., Term) ^ Rest).
     

Examples:

Term ^ field is equivalent to field(Term).

Term ^ field(Arg) is equivalent to field(Arg, Term).

Term ^ field1(Arg1) ^ field2(Arg2, Arg3) is equivalent to field2(Arg2, Arg3, field1(Arg1, Term)).

Term ^ field_list := FieldValue
A field update, returning a copy of Term with the value of the field specified by field_list replaced with FieldValue.

Term must be a valid data-term. field_list must be a valid field list.

A field update is transformed using the following rules:

          transform(Term ^ Field(Arg1, ...) := FieldValue) =
                          'Field :='(Arg1, ..., Term, FieldValue).
          
          transform(Term0 ^ Field(Arg1, ...) ^ Rest := FieldValue) = Term :-
                  OldFieldValue = Field(Arg1, ..., Term0),
                  NewFieldValue = transform(OldFieldValue ^ Rest := FieldValue),
                  Term = 'Field :='(Arg1, ..., Term0, NewFieldValue).
     

Examples:

Term ^ field := FieldValue is equivalent to 'field :='(Term, FieldValue).

Term ^ field(Arg) := FieldValue is equivalent to 'field :='(Arg, Term, FieldValue).

Term ^ field1(Arg1) ^ field2(Arg2) := FieldValue is equivalent to the code

          OldField1 = field1(Arg1, Term),
          NewField1 = 'field2 :='(Arg2, OldField1, FieldValue),
          Result = 'field1 :='(Arg1, Term, NewField1)
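
The following sketch shows nested field selection and update in a complete module. The types point and segment, their field labels, and the function shift_start_x/2 are hypothetical names introduced only for illustration.

```mercury
:- module record_example.
:- interface.

:- type point ---> point(x :: int, y :: int).
:- type segment ---> segment(start :: point, finish :: point).

    % Return a copy of the segment with the x coordinate of its
    % start point shifted by the given amount.
:- func shift_start_x(segment, int) = segment.

:- implementation.
:- import_module int.

shift_start_x(S0, D) = S :-
    % Nested selection: equivalent to x(start(S0)).
    X0 = S0 ^ start ^ x,
    % Nested update: rebuilds the start point, then the segment.
    S = (S0 ^ start ^ x := X0 + D).
```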
     



2.13.3 Unification expressions

A unification expression is an expression of the form

     X @ Y

where X and Y are data-terms.

The meaning of a unification expression is that the arguments are unified, and the expression is equivalent to the unified value.

The strict sequential operational semantics (see Semantics) of an expression X @ Y is that the expression is replaced by a fresh variable Z, and immediately after Z is evaluated, the conjunction Z = X, Z = Y is evaluated.

For example

     p(X @ f(_, _), X).

is equivalent to

     p(H1, H2) :-
             H1 = X,
             H1 = f(_, _),
             H2 = X.

Unification expressions are most useful when writing switches (see Determinism checking and inference). The arguments of a unification expression are examined when checking for switches. The arguments of an equivalent user-defined function would not be.
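
A typical use, sketched below, is to name a whole argument while still exposing its top-level functor in the clause head so that the clauses can form a switch. The function merge_sorted/2 is a hypothetical example, not a library function.

```mercury
:- module unify_example.
:- interface.
:- import_module list.

    % Merge two ascending lists of integers into one ascending list.
:- func merge_sorted(list(int), list(int)) = list(int).

:- implementation.
:- import_module int.

merge_sorted([], Bs) = Bs.
    % As names the whole non-empty list; [_ | _] exposes its functor.
merge_sorted(As @ [_ | _], []) = As.
merge_sorted(As @ [A | As1], Bs @ [B | Bs1]) =
    ( if A =< B then
        [A | merge_sorted(As1, Bs)]
    else
        [B | merge_sorted(As, Bs1)]
    ).
```

Without the unification expressions, the third clause would have to rebuild [A | As1] and [B | Bs1] in the recursive calls.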



2.13.4 Conditional expressions

A conditional expression is an expression of either of the two following forms

     (if Goal then Expression1 else Expression2)
     (Goal -> Expression1 ; Expression2)

Goal is a goal; Expression1 and Expression2 are both data-terms. The semantics of a conditional expression is that if Goal is true, then the expression has the meaning of Expression1, else the expression has the meaning of Expression2.

If Goal takes the form some [X, Y, Z] ... then the scope of X, Y, and Z includes Expression1.
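
As a small sketch of both syntaxes, the hypothetical functions below use a conditional expression as the entire function result.

```mercury
:- module cond_example.
:- interface.

:- func abs_diff(int, int) = int.
:- func clamp_to_zero(int) = int.

:- implementation.
:- import_module int.

    % The "if ... then ... else" form.
abs_diff(X, Y) = ( if X > Y then X - Y else Y - X ).

    % The "->" / ";" form; negative inputs are replaced by zero.
clamp_to_zero(X) = ( X < 0 -> 0 ; X ).
```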



2.13.5 Lambda expressions

A lambda expression is a compound term of one of the following forms

     lambda([Arg1::Mode1, Arg2::Mode2, ...] is Det, Goal)
     pred(Arg1::Mode1, Arg2::Mode2, ...) is Det :- Goal
     pred(Arg1::Mode1, Arg2::Mode2, ..., DCGMode0, DCGMode1) is Det --> DCGGoal
     func(Arg1::Mode1, Arg2::Mode2, ...) = (Result::Mode) is Det :- Goal
     func(Arg1, Arg2, ...) = (Result) is Det :- Goal
     func(Arg1, Arg2, ...) = Result :- Goal

where Arg1, Arg2, ... are zero or more data-terms, Result is a data-term, Mode1, Mode2, ... are zero or more modes (see Modes), DCGMode0 and DCGMode1 are modes (see Modes), Det is a determinism (see Determinism), Goal is a goal (see Goals), and DCGGoal is a DCG Goal (see DCG-goals). The :- Goal part is optional; if it is not specified, then :- true is assumed. A lambda expression denotes a higher-order predicate or function term whose value is the predicate or function of the specified arguments determined by the specified goal. See Higher-order.

A lambda expression introduces a new scope: any variables occurring in the arguments Arg1, Arg2, ... are locally quantified, i.e. any occurrences of variables with that name in the lambda expression are considered to name a different variable than any variables with the same name that occur outside of the lambda expression. For variables which occur in Result or Goal, but not in the arguments, the usual Mercury rules for implicit quantification apply (see Implicit quantification).

The form of lambda expression using lambda as its top level functor is deprecated; please use the form using pred instead.

The form of lambda expression using --> as its top level functor is a syntactic abbreviation: an expression of the form

     pred(Var1::Mode1, Var2::Mode2, ..., DCGMode0, DCGMode1) is Det --> DCGGoal

is equivalent to

     pred(Var1::Mode1, Var2::Mode2, ...,
         DCGVar0::DCGMode0, DCGVar1::DCGMode1) is Det :- Goal

where DCGVar0 and DCGVar1 are fresh variables, and Goal is the result of DCG-transform(DCGVar0, DCGVar1, DCGGoal) where DCG-transform is the function specified in DCG-goals.
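
The sketch below passes one func lambda and one pred lambda to the library predicates list.map and list.filter. The wrapper names add_to_all/2 and keep_positive/2 are hypothetical. The lambdas are parenthesised because an expression containing :- cannot appear unparenthesised as an argument.

```mercury
:- module lambda_example.
:- interface.
:- import_module list.

:- func add_to_all(int, list(int)) = list(int).
:- pred keep_positive(list(int)::in, list(int)::out) is det.

:- implementation.
:- import_module int.

    % X is local to the lambda; N comes from the enclosing clause.
add_to_all(N, Xs) = list.map((func(X) = X + N), Xs).

    % A semidet pred lambda used as a filter test.
keep_positive(Xs, Ys) :-
    list.filter((pred(X::in) is semidet :- X > 0), Xs, Ys).
```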



2.13.6 Higher-order function applications

A higher-order function application is a compound term of one of the following two forms

     apply(Func, Arg1, Arg2, ..., ArgN)
     FuncVar(Arg1, Arg2, ..., ArgN)

where N >= 0, Func is a term of type func(T1, T2, ..., TN) = T, FuncVar is a variable of that type, and Arg1, Arg2, ..., ArgN are terms of types T1, T2, ..., TN. The type of the higher-order function application term is T. It denotes the result of applying the specified function to the specified arguments. See Higher-order.
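
The two forms can be sketched side by side as follows; twice/2 and twice_apply/2 are hypothetical names and are intended to behave identically.

```mercury
:- module apply_example.
:- interface.

:- func twice((func(T) = T), T) = T.
:- func twice_apply((func(T) = T), T) = T.

:- implementation.

    % Variable-as-functor form.
twice(F, X) = F(F(X)).

    % Explicit apply/N form.
twice_apply(F, X) = apply(F, apply(F, X)).
```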



2.13.7 Explicit type qualification

Explicit type qualifications are occasionally useful to resolve ambiguities that can arise from overloading or polymorphic types.

An explicit type qualification expression is a term of the form

     with_type(Term, Type)

or equivalently, as it is more commonly written,

     Term `with_type` Type

Term must be a valid data-term. Type must be a valid type (see Types).

An explicit type qualification expression constrains the specified term to have the specified type. Apart from that, the meaning of an explicit type qualification expression is just the same as the specified Term.
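
A minimal sketch, assuming the `with_type` syntax described in this section: nothing else in the clause determines the element type of the empty list, so the qualification fixes it as int. The module name is hypothetical.

```mercury
:- module qual_example.
:- interface.
:- import_module io.

:- pred main(io.state::di, io.state::uo) is det.

:- implementation.
:- import_module list.

main(!IO) :-
    % Constrain [] to have type list(int).
    Xs = [] `with_type` list(int),
    io.write(Xs, !IO),
    io.nl(!IO).
```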



2.14 Variable scoping

There are three sorts of variables in Mercury: ordinary variables, type variables, and inst variables.

Variables occurring in types are called type variables. Variables occurring in insts or modes are called inst variables. Variables that occur in data-terms, and that are not inst variables or type variables, are called ordinary variables.

(Type variables can occur in data-terms in the right-hand [Type] operand of an explicit type qualification. Inst variables can occur in data-terms in the right-hand [Mode] operand of an explicit mode qualification. Apart from that, all other variables in data-terms are ordinary variables.)

The three different variable sorts occupy different namespaces: there is no semantic relationship between two variables of different sorts (e.g. a type variable and an ordinary variable) even if they happen to share the same name. (However, as a matter of programming style, it is generally a bad idea to use the same name for variables of different sorts in the same clause.)

The scope of ordinary variables is the clause or declaration in which they occur, unless they are quantified, either explicitly (see Goals) or implicitly (see Implicit quantification).

The scope of type variables in a predicate or function's type declaration extends over any explicit type qualifications (see Explicit type qualification) in the clauses for that predicate or function, and over pragma type_spec (see Type specialization) declarations for that predicate or function, so that explicit type qualifications and pragma type_spec declarations can refer to those type variables. The scope of any type variables in an explicit type qualification which do not occur in the predicate or function's type declaration is the clause in which they occur.

The scope of inst variables is the clause or declaration in which they occur.



2.15 Implicit quantification

The rule for implicit quantification in Mercury is not the same as the usual one in mathematical logic. In Mercury, variables that do not occur in the head of a clause are implicitly existentially quantified around their closest enclosing scope (in a sense to be made precise in the following paragraphs). This allows most existential quantifiers to be omitted, and leads to more concise code.

An occurrence of a variable is in a negated context if it is in a negation, in a universal quantification, in the condition of an if-then-else, in an inequality, or in a lambda expression.

Two goals are parallel if they are different disjuncts of the same disjunction, or if one is the “else” part of an if-then-else and the other goal is either the “then” part or the condition of the if-then-else, or if they are the goals of disjoint (distinct and non-overlapping) lambda expressions.

If a variable occurs in a negated context and does not occur outside of that negated context other than in parallel goals (and in the case of a variable in the condition of an if-then-else, other than in the “then” part of the if-then-else), then that variable is implicitly existentially quantified inside the negation.
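
As a sketch of this rule, the two hypothetical predicates below are intended to be equivalent: in the first, X occurs only inside the negation, so it is implicitly existentially quantified there; the second writes the quantifier explicitly.

```mercury
:- module quant_example.
:- interface.
:- import_module list.

:- pred no_negatives(list(int)::in) is semidet.
:- pred no_negatives_explicit(list(int)::in) is semidet.

:- implementation.
:- import_module int.

    % X is implicitly quantified inside the negation.
no_negatives(Xs) :-
    not (
        list.member(X, Xs),
        X < 0
    ).

    % The same goal with the quantifier spelled out.
no_negatives_explicit(Xs) :-
    not (
        some [X] (
            list.member(X, Xs),
            X < 0
        )
    ).
```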



2.16 Elimination of double negation

The treatment of inequality, universal quantification, implication, and logical equivalence as abbreviations can cause the introduction of double negations which could make otherwise well-formed code mode-incorrect. To avoid this problem, the language specifies that after syntax analysis and implicit quantification, and before mode analysis is performed, the implementation must delete any double negations and must replace any negations of conjunctions of negations with disjunctions. (Both of these transformations preserve the logical meaning and type-correctness of the code, and they preserve or improve mode-correctness: they never transform code fragments that would be well-moded into ones that would be ill-moded.)



3 Types

The type system is based on many-sorted logic, and supports polymorphism, type classes (see Type classes), and existentially quantified types (see Existential types).



3.1 Builtin types

Certain special types are builtin, or are defined in the Mercury library:

Primitive types: char, int, float, string.
There is a special syntax for constants of type int, float, and string. (For char, the standard syntax suffices.)
Predicate types: pred, pred(T), pred(T1, T2), ...
Function types: (func) = T, func(T1) = T, func(T1, T2) = T, ...
These higher-order function and predicate types are used to pass procedure addresses and closures to other predicates. See Higher-order.
Tuple types: {}, {T}, {T1, T2}, ....
A tuple type is equivalent to a discriminated union type (see Discriminated unions) with declaration
          :- type {Arg1, Arg2, ..., ArgN}
                  ---> { {Arg1, Arg2, ..., ArgN} }.
     

The universal type: univ.
The type univ is defined in the standard library module std_util, along with the predicates type_to_univ/2 and univ_to_type/2. With those predicates, any type can be converted to the universal type and back again. The universal type is useful for situations where you need heterogeneous collections.
The “state-of-the-world” type: io.state.
The type io.state is defined in the standard library module io, and represents the state of the world. Predicates which perform I/O are passed the old state of the world and produce a new state of the world. In this way, we can give a declarative semantics to code that performs I/O.



3.2 User-defined types

New types can be introduced with :- type declarations. There are several categories of derived types:



3.2.1 Discriminated unions

These encompass both enumeration and record types in other languages. A derived type is defined using :- type type ---> body. (Note there are three dashes in that arrow. It should not be confused with the two-dash arrow used for DCGs or the one-dash arrow used for if-then-else.) If the type term is a functor of arity zero (i.e. one having zero arguments), it names a monomorphic type. Otherwise, it names a polymorphic type; the arguments of the functor must be distinct type variables. The body term is defined as a sequence of constructor definitions separated by semi-colons.

Ordinarily, each constructor definition must be a functor whose arguments (if any) are types. Ordinary discriminated union definitions must be transparent: all type variables occurring in the body must also occur in the type.

However, constructor definitions can optionally be existentially typed. In that case, the functor will be preceded by an existential type quantifier and can optionally be followed by an existential type class constraint. For details, see Existential types. Existentially typed discriminated union definitions need not be transparent.

The arguments of constructor definitions may be labelled. These labels cause the compiler to generate functions which can be used to conveniently select and update fields of a term in a manner independent of the definition of the type (see Field access functions). A labelled argument has the form fieldname :: Type. It is an error for two fields in the same module to have the same label.

Here are some examples of discriminated union definitions:

     :- type fruit
             --->    apple
             ;       orange
             ;       banana
             ;       pear.
     
     :- type strange
             --->    foo(int)
             ;       bar(string).
     
     :- type employee
             --->    employee(
                            name        :: string,
                            age         :: int,
                            department  :: string
                     ).
     
     :- type tree
             --->    empty
             ;       leaf(int)
             ;       branch(tree, tree).
     
     :- type list(T)
             --->    []
             ;       [T | list(T)].
     
     :- type pair(T1, T2)
             --->    T1 - T2.
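
To show how such a type is consumed, the sketch below repeats the tree definition in a self-contained module and adds a hypothetical function tree_sum/1 with one clause per constructor.

```mercury
:- module tree_example.
:- interface.

:- type tree
        --->    empty
        ;       leaf(int)
        ;       branch(tree, tree).

    % Sum of all the integers stored in the leaves.
:- func tree_sum(tree) = int.

:- implementation.
:- import_module int.

tree_sum(empty) = 0.
tree_sum(leaf(N)) = N.
tree_sum(branch(L, R)) = tree_sum(L) + tree_sum(R).
```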

If the body of a discriminated union type definition contains a term whose top-level functor is ';'/2, the semi-colon is normally assumed to be a separator. This makes it difficult to define a type whose constructors include ';'/2. To allow this, curly braces can be used to quote the semi-colon. It is then also necessary to quote curly braces. The following example illustrates this:

     :- type tricky
             --->    { int ; int }
             ;       { { int } }.

This defines a type with two constructors, ';'/2 and '{}'/1, whose argument types are all int. We recommend against using constructors named '{}' because of the possibility of confusion with the builtin tuple types.

Each discriminated union type definition introduces a distinct type. Mercury considers two discriminated union types that have the same bodies to be distinct types (name equivalence). Having two different definitions of a type with the same name and arity in the same module is an error.

Constructors may be overloaded among different types: there may be any number of constructors with a given name and arity, so long as they all have different types. However, there must not be more than one constructor with the same name, arity, and result type in the same module. (There is no particularly good reason for this restriction; in the future we may allow several such functors as long as they have different argument types.) Note that excessive overloading of constructors can slow down type checking and can make the program confusing for human readers, so overloading should not be over-used.



3.2.2 Equivalence types

These are type abbreviations. They are defined using == as follows. They may be polymorphic.

     :- type money == int.
     :- type assoc_list(KeyType, ValueType)
             == list(pair(KeyType, ValueType)).

Equivalence type definitions must be transparent. Unlike discriminated union type definitions, equivalence type definitions must not be cyclic; that is, the type on the left hand side of the == (assoc_list and money in the examples above) must not occur on the right hand side of the ==.

Mercury treats an equivalence type as an abbreviation for the type on the right hand side of the definition; the two are equivalent in all respects in scopes where the equivalence type is visible.



3.2.3 Abstract types

These are types whose implementation is hidden. The type declarations

     :- type t1.
     :- type t2(T1, T2).

declare types t1/0 and t2/2 to be abstract types. Such declarations are only useful in the interface section of a module. This means that the type names will be exported, but the constructors (functors) for these types will not be exported. The implementation section of a module must give the definition of all the abstract types named in the interface section of the module. Abstract types may be defined as either discriminated union types or as equivalence types.
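
A minimal sketch of a module that exports an abstract type follows. The module counter and its functions are hypothetical. Client modules can use the exported functions but cannot construct or deconstruct counter values directly.

```mercury
:- module counter.
:- interface.

    % Abstract: the constructor is not exported.
:- type counter.

:- func init = counter.
:- func increment(counter) = counter.
:- func value(counter) = int.

:- implementation.
:- import_module int.

    % The definition required for the abstract type declared above.
:- type counter ---> counter(int).

init = counter(0).
increment(counter(N)) = counter(N + 1).
value(counter(N)) = N.
```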



3.3 Predicate and function type declarations

The argument types of each predicate must be explicitly declared with a :- pred declaration. The argument types and return type of each function must be explicitly declared with a :- func declaration. For example:

     :- pred is_all_uppercase(string).
     
     :- func strlen(string) = int.

Predicates and functions can be polymorphic; that is, their declarations can include type variables. For example:

     :- pred member(T, list(T)).
     
     :- func length(list(T)) = int.

A predicate or function can be declared to have a given higher-order type (see Higher-order) by using `with_type` in the type declaration. This is useful where several predicates or functions need to have the same type signature, which often occurs for typeclass method implementations (see Type classes), and for predicates to be passed as higher-order terms.

For example,

     :- type foldl_pred(T, U) == pred(T, U, U).
     :- type foldl_func(T, U) == (func(T, U) = U).
     
     :- pred p(int) `with_type` foldl_pred(T, U).
     :- func f(int) `with_type` foldl_func(T, U).

is equivalent to

     :- pred p(int, T, U, U).
     :- func f(int, T, U) = U.

Type variables in predicate and function declarations are implicitly universally quantified by default; that is, the predicate or function may be called with arguments and (in the case of functions) return value whose actual types are any instance of the types specified in the declaration. For example, the function length/1 declared above could be called with the argument having type list(int), or list(float), or list(list(int)), etc.

Type variables in predicate and function declarations can also be existentially quantified; this is discussed in Existential types.

There must only be one predicate with a given name and arity in each module, and only one function with a given name and arity in each module. It is an error to declare the same predicate or function twice.

There must be at least one clause defined for each declared predicate or function, except for those defined using the foreign language interface (see Foreign language interface and C interface). However, Mercury implementations are permitted to provide a method of processing Mercury programs in which such errors are not reported until and unless the predicate or function is actually called. (The University of Melbourne Mercury implementation provides this with its --allow-stubs option. This can be useful during program development, since it allows you to execute parts of a program while the program's implementation is still incomplete.)

Note that a predicate defined using DCG notation (see DCG-rules) will appear to be defined with two fewer arguments than it is declared with. It will also appear to be called with two fewer arguments when called from predicates defined using DCG notation. However, when called from an ordinary predicate or function, it must have all the arguments it was declared with.

The compiler infers the types of data-terms, and in particular the types of variables and overloaded constructors, functions, and predicates. A type assignment is an assignment of a type to every variable and of a particular constructor, function, or predicate to every name in a clause. A type assignment is valid if it satisfies the following conditions.

Each constructor in a clause must have been declared in at least one visible type declaration. The type assigned to each constructor term must match one of the type declarations for that constructor, and the types assigned to the arguments of that constructor must match the argument types specified in that type declaration.

The type assigned to each function call term must match the return type of one of the :- func declarations for that function, and the types assigned to the arguments of that function must match the argument types specified in that type declaration.

The type assigned to each predicate argument must match the type specified in one of the :- pred declarations for that predicate. The type assigned to each head argument in a predicate clause must exactly match the argument type specified in the corresponding :- pred declaration.

The type assigned to each head argument in a function clause must exactly match the argument type specified in the corresponding :- func declaration, and the type assigned to the result term in a function clause must exactly match the result type specified in the corresponding :- func declaration.

The type assigned to each data-term with an explicit type qualification (see Explicit type qualification) must match the type specified by the type qualification expression.

(Here “match” means to be an instance of, i.e. to be identical to for some substitution of the type parameters, and “exactly match” means to be identical up to renaming of type parameters.)

One type assignment A is said to be more general than another type assignment B if there is a binding of the type parameters in A that makes it identical (up to renaming of parameters) to B. If there is more than one valid type assignment, the compiler must choose the most general one. If there are two valid type assignments which are not identical up to renaming and neither of which is more general than the other, then there is a type ambiguity, and the compiler must report an error. A clause is type-correct if there is a unique (up to renaming) most general valid type assignment. Every clause in a Mercury program must be type-correct.



3.4 Field access functions

Fields of constructors of discriminated union types may be labelled (see Discriminated unions). These labels cause the compiler to generate functions which can be used to select and update fields of a term in a manner independent of the definition of the type.

The Mercury language includes syntactic sugar to make it more convenient to select and update fields inside nested terms (see Record syntax) and to select and update fields of the DCG arguments of a clause (see DCG-goals).



3.4.1 Field selection

     field(Term)

Each field label field in a constructor causes generation of a field selection function field/1, which takes a data-term of the same type as the constructor and returns the value of the labelled field, failing if the top-level constructor of the argument is not the constructor containing the field.

If the declaration of the field is in the interface section of the module, the corresponding field selection function is also exported from the module.

By default, this function has no declared modes — the modes are inferred at each call to the function. However, the type and modes of this function may be explicitly declared, in which case it will have only the declared modes.

To create a higher-order term from a field selection function, an explicit lambda expression must be used, unless a single mode declaration is supplied for the field selection function.
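
The sketch below wraps a field selection in an explicit lambda expression so that it can be passed to list.map. It uses a cut-down, hypothetical version of the employee type and a hypothetical function names/1.

```mercury
:- module field_example.
:- interface.
:- import_module list.

:- type employee ---> employee(name :: string, age :: int).

    % Collect the name field of every employee in the list.
:- func names(list(employee)) = list(string).

:- implementation.

    % The field selection function is wrapped in an explicit
    % lambda expression rather than passed by name.
names(Es) = list.map((func(E) = E ^ name), Es).
```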



3.4.2 Field update

     'field :='(Term, ValueTerm)

Each field label field in a constructor causes generation of a field update function 'field :='/2. The first argument of this function is a data-term of the same type as the constructor. The second argument is a data-term of the same type as the labelled field. The return value is a copy of the first argument with the value of the labelled field replaced by the second argument. 'field :='/2 fails if the top-level constructor of the first argument is not the constructor containing the labelled field.

If the declaration of the field is in the interface section of the module, the corresponding field update function is also exported from the module.

By default, this function has no declared modes — the modes are inferred at each call to the function. However, the type and modes of this function may be explicitly declared, in which case it will have only the declared modes.

To create a higher-order term from a field update function, an explicit lambda expression must be used, unless a single mode declaration is supplied for the field update function.

Some fields cannot be updated using field update functions. For the constructor unsettable/2 below, neither field may be updated because the resulting term would not be well-typed. A future release may allow multiple fields to be updated by a single expression to avoid this problem.

     :- type unsettable
             ---> some [T] unsettable(
                     unsettable1 :: T,
                     unsettable2 :: T
             ).



3.4.3 User-supplied field access function declarations

Type and mode declarations for compiler-generated field access functions for fields of constructors local to a module may be placed in the interface section of the module. This allows the implementation of a type to be hidden while still allowing client modules to use record syntax to manipulate values of the type. Supplying a type declaration and a single mode declaration also allows higher-order terms to be created from a field access function without using explicit lambda expressions.

Declarations for field access functions for fields occurring in the interface section of a module must also occur in the interface section.

Declarations and clauses for field access functions can also be supplied for fields which are not a part of any type. This is useful when the data structures of a program change so that a value which was previously stored as part of a type is now computed each time it is requested. It also allows record syntax to be used for type class methods.

User-declared field access functions may take extra arguments. For example, the Mercury standard library module map contains the following functions:

     :- func elem(K, map(K, V)) = V is semidet.
     :- func 'elem :='(K, map(K, V), V) = map(K, V).

Field access syntax may be used at the top-level of func and mode declarations and in the head of clauses. For instance:

     :- func map(K, V) ^ elem(K) = V.
     :- mode in        ^ in      = out is semidet.
     Map ^ elem(Key) = map.lookup(Map, Key).
     
     :- func (map(K, V) ^ elem(K) := V)  = map(K, V).
     :- mode (in        ^ in      := in) = out is det.
     (Map ^ elem(Key) := Value) = map.set(Map, Key, Value).

The Mercury standard library modules array and bt_array define similar functions.



3.4.4 Field access examples

The examples make use of the following type declarations:

     :- type type1
             ---> type1(
                     field1 :: type2,
                     field2 :: string
             ).
     
     :- type type2
             ---> type2(
                     field3 :: int,
                     field4 :: int
             ).
     

The compiler generates some field access functions for field1. The functions generated for the other fields are similar.

     :- func type1 ^ field1 = type2.
     type1(Field1, _) ^ field1 = Field1.
     
     :- func (type1 ^ field1 := type2) = type1.
     (type1(_, Field2) ^ field1 := Field1) = type1(Field1, Field2).

Using these functions and the syntactic sugar described in Record syntax, programmers can write code such as

     :- func type1 ^ increment_field3 = type1.
     
     Term0 ^ increment_field3 =
             Term0 ^ field1 ^ field3 := Term0 ^ field1 ^ field3 + 1.

The compiler expands this into

     increment_field3(Term0) = Term :-
             OldField3 = field3(field1(Term0)),
     
             OldField1 = field1(Term0),
             NewField1 = 'field3 :='(OldField1, OldField3 + 1),
             Term = 'field1 :='(Term0, NewField1).
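
For readers who want to try the nested update, here is a sketch of a complete module built around the same type declarations. The module name increment_example and the sample values are assumptions made for illustration only.

     :- module increment_example.
     :- interface.
     :- import_module io.
     :- pred main(io.state::di, io.state::uo) is det.
     
     :- implementation.
     :- import_module int, string.
     
     :- type type1
             ---> type1(
                     field1 :: type2,
                     field2 :: string
             ).
     
     :- type type2
             ---> type2(
                     field3 :: int,
                     field4 :: int
             ).
     
     :- func type1 ^ increment_field3 = type1.
     Term0 ^ increment_field3 =
             (Term0 ^ field1 ^ field3 := Term0 ^ field1 ^ field3 + 1).
     
     main(!IO) :-
             T0 = type1(type2(1, 2), "a"),
             T = T0 ^ increment_field3,
             io.write_int(T ^ field1 ^ field3, !IO),    % prints 2
             io.nl(!IO).

The other fields of T0 are carried over unchanged; only field3 of the nested type2 term differs in T.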

The field access functions defined in the Mercury standard library module map can be used as follows:

     :- func update_field_in_map(map(int, type1), int, string)
                     = map(int, type1) is semidet.
     
     update_field_in_map(Map, Index, Value) =
                     Map ^ elem(Index) ^ field2 := Value.



3.5 Solver types

Solver types are an experimental addition to the language supporting the implementation of constraint solvers. A program may place constraints on and between variables of a solver type, limiting the values those variables may take on before they are actually bound. For example, if X and Y are variables belonging to a constrained integer solver type, we might place constraints upon them such that X > 3 + Y and Y =< 7. A later attempt to unify Y with 10 will fail (it would violate the second constraint); similarly an attempt to unify X with 5 and Y with 4 would fail (it would violate the first constraint).



3.5.1 The any inst

Variables with solver types can have one of three possible insts: free, ground or any. A variable with a solver type with inst any may not (yet) be semantically ground, in the following sense: if a variable is semantically ground then the set of values it unifies with forms an equivalence class; if a variable is non-ground then the set of values it unifies with does not form an equivalence class.

More formally, X is ground if for values Y and Z that unify with X, it is the case that Y and Z also unify with each other. X is non-ground if there are values Y and Z that unify with X, but which do not unify with each other.

A non-solver type value will have inst any if it is constructed using one or more inst any values.

The built-in modes ia and oa are equivalent to in(any) and out(any) respectively.



3.5.2 Abstract solver type declarations

The type declarations

     :- solver type t1.
     :- solver type t2(T1, T2).

declare types t1/0 and t2/2 to be abstract solver types. Abstract solver type declarations are identical to ordinary abstract type declarations except for the solver keyword.



3.5.3 Solver type definitions

A solver type definition takes the following form:

     :- solver type solver_type
             where   representation is representation_type,
                     initialisation is initialisation_pred,
                     ground         is ground_inst,
                     any            is any_inst,
                     equality       is equality_pred,
                     comparison     is comparison_pred.

The representation and initialisation attributes are mandatory (initialization is allowed as a synonym for initialisation), ground_inst and any_inst default to ground, the equality and comparison attributes are optional (although a solver type without equality would not be very useful), and attributes must appear in the order shown.

The representation_type is the type used to implement the solver_type. A two-tier scheme of this kind is necessary for a number of reasons, including

The initialisation_pred is the name of a predicate defined in the same module as the solver type, with the following signature:

     :- pred initialisation_pred(solver_type::out(any)) is det.

Calls to this predicate are inserted automatically by the compiler when a free solver_type variable has to be given inst any. (The initialisation predicate is responsible for registering the new, unbound variable with the corresponding constraint solver state.)

The ground_inst is the inst associated with representation_type values denoting ground solver_type values.

The any_inst is the inst associated with representation_type values denoting any solver_type values.

The compiler constructs four impure functions for converting between solver_type values and representation_type values (name is the function symbol used to name solver_type and arity is the number of type parameters it takes):

     :- impure func 'representation of ground name/arity'(solver_type) =
                             representation_type.
     :-        mode 'representation of ground name/arity'(in) =
                             out(ground_inst) is det.
     
     :- impure func 'representation of any name/arity'(solver_type) =
                             representation_type.
     :-        mode 'representation of any name/arity'(in(any)) =
                             out(any_inst) is det.
     
     :- impure func 'representation to ground name/arity'(representation_type) =
                             solver_type.
     :-        mode 'representation to ground name/arity'(in(ground_inst)) =
                             out is det.
     
     :- impure func 'representation to any name/arity'(representation_type) =
                             solver_type.
     :-        mode 'representation to any name/arity'(in(any_inst)) =
                             out(any) is det.

These functions are impure because of the semantic gap issue mentioned above.

These functions are constructed in-line as part of a source-to-source transformation, hence it is an error to define a solver type in the interface section of a module.

If solver_type is exported then it is a requirement that representation_type, initialisation_pred, and, if specified, equality_pred and comparison_pred are also exported from the same module.



3.5.4 Implementing solver types

A solver type is an abstraction, implemented using a combination of a private representation type and a constraint store.

The constraint store is an (impure) piece of state used to keep track of the extant constraints on variables of the solver type. This will typically be implemented using foreign code.

It is important that changes to the constraint store are properly trailed (see Trailing) so that changes can be undone on backtracking.

The solver type implementation should provide functions and predicates to



3.5.5 Solver type constraints and negated contexts

It is a (currently unchecked) error to place a constraint on a solver type variable in a negated context. The reason for this is that the compiler does not understand what constraints mean and therefore cannot enforce their negation outside a negated context.

For this reason, soundness of a program is not guaranteed if constraints are placed on solver type variables inside negations or the conditions of if-then-else goals. Particular care is needed since unification is a primary form of constraint.



3.5.6 Polymorphic solver types

Under the current design, the compiler does not have enough information to initialise variables belonging to polymorphic parameters that are themselves expected to be solver types. At some point we intend to lift this restriction.



4 Modes



4.1 Insts, modes, and mode definitions

The mode of a predicate, or function, is a mapping from the initial state of instantiation of the arguments of the predicate, or the arguments and result of a function, to their final state of instantiation. To describe states of instantiation, we use information provided by the type system. Types can be viewed as regular trees with two kinds of nodes: or-nodes representing types and and-nodes representing constructors. The children of an or-node are the constructors that can be used to construct terms of that type; the children of an and-node are the types of the arguments of the constructors. We attach mode information to the or-nodes of type trees.

An instantiatedness tree is an assignment of an instantiatedness — either free or bound — to each or-node of a type tree, with the constraint that all descendants of a free node must be free.

A term is approximated by an instantiatedness tree if for every node in the instantiatedness tree,

When an instantiatedness tree tells us that a variable is bound, there may be several alternative function symbols to which it could be bound. The instantiatedness tree does not tell us which of these it is bound to; instead for each possible function symbol it tells us exactly which arguments of the function symbol will be free and which will be bound. The same principle applies recursively to these bound arguments.

Mercury's mode system allows users to declare names for instantiatedness trees using declarations such as

     :- inst listskel == bound( [] ; [free | listskel] ).

This instantiatedness tree describes lists whose skeleton is known but whose elements are distinct variables. As such, it approximates the term [A,B] but not the term [H|T] (only part of the skeleton is known), the term [A,2] (not all elements are variables), or the term [A,A] (the elements are not distinct variables).

As a shorthand, the mode system provides free and ground as names for instantiatedness trees all of whose nodes are free and bound respectively (with the exception of solver type values which may be semantically ground, but be defined in terms of non-ground solver type values; see Solver types for more detail). The shape of these trees is determined by the type of the variable to which they apply.

A more concise, alternative syntax exists for bound instantiatedness trees:

     :- inst maybeskel ---> no ; yes(ground).

which is equivalent to writing

     :- inst maybeskel == bound(no ; yes(ground)).

As execution proceeds, variables may become more instantiated. A mode mapping is a mapping from an initial instantiatedness tree to a final instantiatedness tree, with the constraint that no node of the type tree is transformed from bound to free. Mercury allows the user to specify mode mappings directly by expressions such as inst1 >> inst2, or to give them a name using declarations such as

     :- mode m == inst1 >> inst2.
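
As a small illustration of naming an inst and a mode mapping, the sketch below declares an inst for non-empty lists and uses it so that a head-of-list predicate can be det. The names nonempty_example, nonempty, in_nonempty and first are hypothetical and introduced only for this example.

     :- module nonempty_example.
     :- interface.
     :- import_module io.
     :- pred main(io.state::di, io.state::uo) is det.
     
     :- implementation.
     :- import_module list.
     
     % A list that is known to be bound to a cons cell.
     :- inst nonempty == bound([ground | ground]).
     
     % Input mode for such lists: the inst is unchanged by the call.
     :- mode in_nonempty == nonempty >> nonempty.
     
     % No clause for [] is needed, since the inst rules it out.
     :- pred first(list(int)::in_nonempty, int::out) is det.
     first([X | _], X).
     
     main(!IO) :-
             first([1, 2, 3], X),
             io.write_int(X, !IO),    % prints 1
             io.nl(!IO).

A call such as first([], X) would be rejected by mode analysis, because the inst of [] is not compatible with nonempty.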

It is also possible to write mode declarations using :: and -> instead of == and >> respectively; however, this syntax is deprecated and may not be supported in future.

Two standard shorthand modes are provided, corresponding to the standard notions of inputs and outputs:

     :- mode in == ground >> ground.
     :- mode out == free >> ground.

Prolog fans who want to use the symbols + and - can do so by simply defining them using a mode declaration:

     :- mode (+) == in.
     :- mode (-) == out.

These two modes are enough for most functions and predicates. Nevertheless, Mercury's mode system is sufficiently expressive to handle more complex data-flow patterns, including those involving partially instantiated data structures. (The current implementation does not handle partially instantiated data structures yet.)

For example, consider an interface to a database that associates data with keys, and provides read and write access to the items it stores. To represent accesses to the database over a network, you would need declarations such as

     :- type operation
             --->    lookup(key, data)
             ;       set(key, data).
     :- inst request
             --->    lookup(ground, free)
             ;       set(ground, ground).
     :- mode create_request == free >> request.
     :- mode satisfy_request == request >> ground.

inst and mode declarations can be parametric. For example, the following declaration

     :- inst listskel(Inst) == bound( [] ; [Inst | listskel(Inst)] ).

defines the inst listskel(Inst) to be a list skeleton whose elements have inst Inst; you can then use insts such as listskel(listskel(free)), which represents the instantiation state of a list of lists of free variables. The standard library provides the parametric modes

     :- mode in(Inst) == Inst >> Inst.
     :- mode out(Inst) == free >> Inst.

so that, for example, the mode create_request defined above could have been defined as

     :- mode create_request == out(request).

There must not be more than one inst definition with the same name and arity in the same module. Similarly, there must not be more than one mode definition with the same name and arity in the same module.



4.2 Predicate and function mode declarations

A predicate mode declaration assigns a mode mapping to each argument of a predicate. A function mode declaration assigns a mode mapping to each argument of a function, and a mode mapping to the function result. Each mode of a predicate or function is called a procedure. For example, given the mode names defined by

     :- mode out_listskel ==
             free >> listskel.
     :- mode in_listskel ==
             listskel >> listskel.

the (type and) mode declarations of the function length and predicate append are as follows:

     :- func length(list(T)) = int.
     :- mode length(in_listskel) = out.
     :- mode length(out_listskel) = in.
     
     :- pred append(list(T), list(T), list(T)).
     :- mode append(in, in, out).
     :- mode append(out, out, in).

Note that functions may have more than one mode, just like predicates; functions can be reversible.

Alternatively, the mode declarations for length could use the standard library modes in/1 and out/1:

     :- func length(list(T)) = int.
     :- mode length(in(listskel)) = out.
     :- mode length(out(listskel)) = in.

As for type declarations, a predicate or function can be defined to have a given higher-order inst (see Higher-order modes) by using `with_inst` in the mode declaration.

For example,

     :- inst foldl_pred == (pred(in, in, out) is det).
     :- inst foldl_func == (func(in, in) = out is det).
     
     :- mode p(in) `with_inst` foldl_pred.
     :- mode f(in) `with_inst` foldl_func.

is equivalent to

     :- mode p(in, in, in, out) is det.
     :- mode f(in, in, in) = out is det.

(is det is explained in Determinism.)

If a predicate or function has only one mode, the pred and mode declaration can be combined:

     :- func length(list(T)::in) = (int::out).
     :- pred append(list(T)::in, list(T)::in, list(T)::out).
     
     :- pred p `with_type` foldl_pred(T, U) `with_inst` foldl_pred.

If there is no mode declaration for a function, the compiler assumes a default mode for the function in which all the arguments have mode in and the result of the function has mode out. (However, there is no requirement that a function have such a mode; if there is any explicit mode declaration, it overrides the default.)

A function or predicate mode declaration is an assertion by the programmer that for all possible argument terms and (if applicable) result term for the function or predicate that are approximated (in our technical sense) by the initial instantiatedness trees of the mode declaration and all of whose free variables are distinct, if the function or predicate succeeds then the resulting binding of those argument terms and (if applicable) result term will in turn be approximated by the final instantiatedness trees of the mode declaration, with all free variables again being distinct. We refer to such assertions as mode declaration constraints. These assertions are checked by the compiler, which rejects programs if it cannot prove that their mode declaration constraints are satisfied.

Note that with the usual definition of append, the mode

     :- mode append(in_listskel, in_listskel, out_listskel).

would not be allowed, since it would create aliasing between the different arguments — on success of the predicate, the list elements would be free variables but they would not be distinct.

In Mercury it is always possible to call a procedure with an argument that is more bound than the initial inst specified for that argument in the procedure's mode declaration. In such cases, the compiler will insert additional unifications to ensure that the argument actually passed to the procedure will have the inst specified. For example, if the predicate p/1 has mode p(out), you can still call p(X) if X is ground. The compiler will transform this code to p(Y), X = Y where Y is a fresh variable. It is almost as if the predicate p/1 has another mode p(in); we call such modes “implied modes”.
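
The following sketch shows an implied mode in use. The predicate answer/1 is declared only with mode out, yet it is called with an argument that is already ground; the names implied_mode_example, answer and Guess are hypothetical.

     :- module implied_mode_example.
     :- interface.
     :- import_module io.
     :- pred main(io.state::di, io.state::uo) is det.
     
     :- implementation.
     :- import_module int.
     
     :- pred answer(int::out) is det.
     answer(42).
     
     main(!IO) :-
             Guess = 42,
             % Guess is already ground, so this call uses the implied
             % mode answer(in): in effect answer(Y), Guess = Y.
             ( answer(Guess) ->
                     io.write_string("matches\n", !IO)
             ;
                     io.write_string("differs\n", !IO)
             ).

Because the inserted unification can fail, the call in the implied mode is semidet even though the declared mode is det, which is why it appears here as the condition of an if-then-else.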

To make this concept precise, we introduce the following definition. A term satisfies an instantiatedness tree if for every node in the instantiatedness tree,

The mode set for a predicate or function is the set of mode declarations for the predicate or function. A mode set is an assertion by the programmer that the predicate should only be called with argument terms that satisfy the initial instantiatedness trees of one of the mode declarations in the set (i.e. the specified modes and the modes they imply are the only allowed modes for this predicate or function). We refer to the assertion associated with a mode set as the mode set constraint; these are also checked by the compiler.

A predicate or function p is well-moded with respect to a given mode declaration if given that the predicates and functions called by p all satisfy their mode declaration constraints, there exists an ordering of the conjuncts in each conjunction in the clauses of p such that

We say that a predicate or function is well-moded if it is well-moded with respect to all the mode declarations in its mode set, and we say that a program is well-moded if all its predicates and functions are well-moded.

The mode analysis algorithm checks one procedure at a time. It abstractly interprets the definition of the predicate or function, keeping track of the instantiatedness of each variable, and selecting a mode for each call and unification in the definition. To ensure that the mode set constraints of called predicates and functions are satisfied, the compiler may reorder the elements of conjunctions; it reports an error if no satisfactory order exists. Finally it checks that the resulting instantiatedness of the procedure's arguments is the same as the one given by the procedure's declaration.

The mode analysis algorithm annotates each call with the mode used.



4.3 Constrained polymorphic modes

Mode declarations for predicates and functions may also have inst parameters. However, such parameters must be constrained to be compatible with some other inst. In a predicate or function mode declaration, an inst of the form InstParam =< Inst, where InstParam is a variable and Inst is an inst, states that InstParam is constrained to be compatible with Inst, that is, InstParam represents some inst that can be used anywhere where Inst is required. If an inst parameter occurs more than once in a declaration, it must have the same constraint on each occurrence.

For example, in the mode declaration

     :- mode append(in(list_skel(I =< ground)), in(list_skel(I =< ground)),
             out(list_skel(I =< ground))).

I is an inst parameter which is constrained to be ground. If append is called with the first two arguments having an inst of, say, list_skel(bound(f)) then after append returns, all three arguments will have inst list_skel(bound(f)). If the mode of append had been simply

     :- mode append(in(list_skel(ground)), in(list_skel(ground)),
             out(list_skel(ground))).

then we would only have been able to infer an inst of list_skel(ground) for the third argument, not the more specific inst.

Note that attempting to call append when the first two arguments do not have ground insts (e.g. list_skel(bound(g(free)))) is a mode error because it violates the constraint on the inst parameter.

To avoid having to repeat a constraint everywhere that an inst parameter occurs, it is possible to list the constraints after the rest of the mode declaration, following a <=. E.g. the above example could have been written as

     :- mode append(in(list_skel(I)), in(list_skel(I)), out(list_skel(I)))
                   <= I =< ground.

Also, if the constraint on an inst parameter is ground then it is not necessary to give the constraint in the declaration. The example can be further shortened to

     :- mode append(in(list_skel(I)), in(list_skel(I)), out(list_skel(I))).

Constrained polymorphic modes are particularly useful when passing objects with higher-order types to polymorphic predicates since they allow the higher-order mode information to be retained (see Higher-order).



4.4 Different clauses for different modes

Because the compiler automatically reorders conjunctions to satisfy the modes, it is often possible for a single clause to satisfy different modes. However, occasionally reordering of conjunctions is not sufficient; you may want to write different code for different modes.

For example, the usual code for list append

     append([], Ys, Ys).
     append([X|Xs], Ys, [X|Zs]) :- append(Xs, Ys, Zs).

works fine in most modes, but is not very satisfactory for the append(out, in, in) mode of append, because although every call in this mode only has at most one solution, the compiler's determinism inference will not be able to infer that. This means that using the usual code for append in this mode will be inefficient, and the overly conservative determinism inference may cause spurious determinism errors later.

For this mode, it is better to use a completely different algorithm:

     append(Prefix, Suffix, List) :-
             list.length(List, ListLength),
             list.length(Suffix, SuffixLength),
             PrefixLength = ListLength - SuffixLength,
             list.split_list(PrefixLength, List, Prefix, Suffix).

However, that code doesn't work in the other modes of append.

To handle such cases, you can use mode annotations on clauses, which indicate that particular clauses should only be used for particular modes. To specify that a clause only applies to a given mode, each argument Arg of the clause head should be annotated with the corresponding argument mode Mode, using the :: mode qualification operator, i.e. Arg :: Mode.

For example, if append was declared as

     :- pred append(list(T), list(T), list(T)).
     :- mode append(in, in, out).
     :- mode append(out, out, in).
     :- mode append(in, out, in).
     :- mode append(out, in, in).

then you could implement it as

     append(L1::in,  L2::in,  L3::out) :- usual_append(L1, L2, L3).
     append(L1::out, L2::out, L3::in)  :- usual_append(L1, L2, L3).
     append(L1::in,  L2::out, L3::in)  :- usual_append(L1, L2, L3).
     append(L1::out, L2::in,  L3::in)  :- other_append(L1, L2, L3).
     
     usual_append([], Ys, Ys).
     usual_append([X|Xs], Ys, [X|Zs]) :- usual_append(Xs, Ys, Zs).
     
     other_append(Prefix, Suffix, List) :-
             list.length(List, ListLength),
             list.length(Suffix, SuffixLength),
             PrefixLength = ListLength - SuffixLength,
             list.split_list(PrefixLength, List, Prefix, Suffix).

This language feature can be used to write “impure” code that doesn't have any consistent declarative semantics. For example, you can easily use it to write something similar to Prolog's (in)famous var/1 predicate:

     :- mode var(in).
     :- mode var(free>>free).
     var(_::in) :- fail.
     var(_::free>>free) :- true.

As you can see, in this case the two clauses are not equivalent.

Because of this possibility, predicates or functions which are defined using different code for different modes are by default assumed to be impure; the programmer must either (1) carefully ensure that the logical meaning of the clauses is the same for all modes, in which case a pragma promise_pure declaration can be used or (2) declare the predicate or function as impure. See Impurity.

In the example with append above, the two ways of implementing append do have the same declarative semantics, so we can safely use the first approach:

     :- pragma promise_pure(append/3).

In the example with var/1 above, the two clauses have different semantics, so the predicate must be declared as impure:

     :- impure pred var(T).



5 Unique modes

Mode declarations can also specify so-called “unique modes”. Mercury's unique modes are similar to “linear types” in some functional programming languages such as Clean. They allow you to specify when there is only one reference to a particular value, and when there will be no more references to that value. If the compiler knows there will be no more references to a value, it can perform “compile-time garbage collection” by automatically inserting code to deallocate the storage associated with that value. Even more importantly, the compiler can also simply reuse the storage immediately, for example by destructively updating one element of an array rather than making a new copy of the entire array in order to change one element. Unique modes are also the mechanism Mercury uses to provide declarative I/O.

We have not yet implemented unique modes fully, and the details are still in a state of flux. So the following should be considered tentative.



5.1 Destructive update

In addition to the insts mentioned above (free, ground, and bound(...)), Mercury also provides “unique” insts unique and unique(...) which are like ground and bound(...) respectively, except that they carry the additional constraint that there can only be one reference to the corresponding value. There is also an inst dead which means that there are no references to the corresponding value, so the compiler is free to generate code that reuses that value. There are three standard modes for manipulating unique values:

     % unique output
     :- mode uo == free >> unique.
     
     % unique input
     :- mode ui == unique >> unique.
     
     % destructive input
     :- mode di == unique >> dead.

Mode uo is used to create a unique value. Mode ui is used to inspect a unique value without losing its uniqueness. Mode di is used to deallocate or reuse the memory occupied by a value that will not be used.
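
The most familiar use of di and uo is the I/O state. The sketch below threads it explicitly, without state variable syntax, so that each intermediate state is visibly consumed once and replaced by a new one. The module name unique_io and the variable names are hypothetical.

     :- module unique_io.
     :- interface.
     :- import_module io.
     :- pred main(io.state::di, io.state::uo) is det.
     
     :- implementation.
     
     main(IO0, IO) :-
             % IO0 is passed with mode di and is dead after this call;
             % IO1 is produced with mode uo and is the only live state.
             io.write_string("first\n", IO0, IO1),
             io.write_string("second\n", IO1, IO).

Referring to IO0 again after the first call, for instance by passing it to the second call instead of IO1, would be reported as a unique mode error.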

Note that a value is not considered unique if it might be needed on backtracking. This means that unique modes are generally only useful for code whose determinism is det or cc_multi (see Determinism).

Unlike bound instantiatedness trees, there is no alternative syntax for unique instantiatedness trees.



5.2 Backtrackable destructive update

“Well it just so happens that your friend here is only mostly dead.
There's a big difference between mostly dead and all dead...
Now, mostly dead is slightly alive.
Now, all dead — well, with all dead, there's usually only one thing that you can do.”

“What's that?”

“Go through his clothes and look for loose change!”

— from the movie “The Princess Bride”.

To allow for backtrackable destructive updates — that is, updates whose effect is undone on backtracking, perhaps by recording the overwritten values on a “trail” so that they can be restored after backtracking — Mercury also provides “mostly unique” modes. The insts mostly_unique and mostly_dead are equivalent to unique and dead, except that only references which will be encountered during forward execution are counted — it is OK for mostly_unique or mostly_dead values to be needed again on backtracking.

Mercury defines some standard modes for manipulating “mostly unique” values, just as it does for unique values:

     % mostly unique output
     :- mode muo == free >> mostly_unique.
     
     % mostly unique input
     :- mode mui == mostly_unique >> mostly_unique.
     
     % mostly destructive input
     :- mode mdi == mostly_unique >> mostly_dead.



5.3 Limitations of the current implementation

The implementation of the mode analysis algorithm is not quite complete; as a result, it is not possible to use nested unique modes, i.e. modes in which anything but the top level of a variable is unique. If you do, you will get unique mode errors when you try to get a unique field of a unique data structure. It is also not possible to use unique-input modes; only destructive-input and unique-output modes work.

The Mercury compiler does not (yet) reuse dead values. The only destructive update in the current implementation occurs in library modules, e.g. for I/O and arrays. We do however plan to implement structure reuse and compile-time garbage collection in the very near future.



6 Determinism



6.1 Determinism categories

For each mode of a predicate or function, we categorise that mode according to how many times it can succeed, and whether or not it can fail before producing its first solution.

If all possible calls to a particular mode of a predicate or function which return to the caller (calls which terminate, do not throw an exception and do not cause a fatal runtime error)

If no possible calls to a particular mode of a predicate or function can return to the caller, then that mode has a determinism of erroneous.

The determinism annotation erroneous is used on the library predicates require.error/1 and exception.throw/1, but apart from that determinism annotations erroneous and failure are generally not needed.

To summarize:

                     Maximum number of solutions
     Can fail?       0               1               > 1
     no              erroneous       det             multi
     yes             failure         semidet         nondet

(Note: the “Can fail?” column here indicates only whether the procedure can fail before producing at least one solution; attempts to find a second solution to a particular call, e.g. for a procedure with determinism multi, are always allowed to fail.)

The determinism of each mode of a predicate or function is indicated by an annotation on the mode declaration. For example:

     :- pred append(list(T), list(T), list(T)).
     :- mode append(in, in, out) is det.
     :- mode append(out, out, in) is multi.
     :- mode append(in, in, in) is semidet.
     
     :- func length(list(T)) = int.
     :- mode length(in) = out is det.
     :- mode length(in(list_skel)) = out is det.
     :- mode length(in) = in is semidet.

An annotation of det or multi is an assertion that for every value of each of the inputs, there exists at least one value of the outputs for which the predicate is true, or (in the case of functions) for which the function term is equal to the result term. Conversely, an annotation of det or semidet is an assertion that for every value of each of the inputs, there exists at most one value of the outputs for which the predicate is true, or (in the case of functions) for which the function term is equal to the result term. These assertions are called the mode-determinism assertions; they can play a role in the semantics, because in certain circumstances they may allow an implementation to perform optimizations that would not otherwise be allowed, such as optimizing away a goal with no outputs even though it might infinitely loop.

If the mode of the predicate is given in the :- pred declaration rather than in a separate :- mode declaration, then the determinism annotation goes on the :- pred declaration (and similarly for functions). In particular, this is necessary if a predicate does not have any argument variables. If the determinism declaration is given on a :- func declaration without the mode, the function is assumed to have the default mode (see Modes for more information on default modes of functions).

For example:

     :- pred loop(int::in) is erroneous.
     loop(X) :- loop(X).
     
     :- pred p is det.
     p.
     
     :- pred q is failure.
     q :- fail.

If there is no mode declaration for a function, then the default mode for that function is considered to have been declared as det. If you want to write a partial function, i.e. one whose determinism is semidet, then you must explicitly declare the mode and determinism.
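
The difference between a defaulted and an explicitly declared function determinism can be sketched as follows. This is a minimal illustration only; the module name partial_demo and the functions count/1 and first/1 are hypothetical and are not part of the standard library.

```mercury
:- module partial_demo.
:- interface.
:- import_module list.

    % No mode or determinism is given, so the default mode applies:
    % the argument is in, the result is out, and the determinism is det.
:- func count(list(T)) = int.

    % A partial function: it has no value for the empty list, so its
    % determinism must be declared explicitly as semidet.
:- func first(list(T)) = T is semidet.

:- implementation.
:- import_module int.

count([]) = 0.
count([_ | Xs]) = 1 + count(Xs).

first([X | _]) = X.
```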

In Mercury, a function is supposed to be a true mathematical function of its arguments; that is, the value of the function's result should be determined only by the values of its arguments. Hence, for any mode of a function that specifies that all the arguments are fully input (i.e. for which the initial inst of all the arguments is a ground inst), the determinism of that mode can only be det, semidet, erroneous, or failure.

The determinism categories form this lattice:

                  erroneous
                   /     \
               failure   det
                  \     /   \
                  semidet  multi
                      \     /
                       nondet

The higher up this lattice a determinism category is, the more the compiler knows about the number of solutions of procedures of that determinism.



6.2 Determinism checking and inference

The determinism of goals is inferred from the determinism of their component parts, according to the rules below. The inferred determinism of a procedure is just the inferred determinism of the procedure's body.

For procedures that are local to a module, the determinism annotations may be omitted; in that case, their determinism will be inferred. (To be precise, the determinism of procedures without a determinism annotation is defined as the least fixpoint of the transformation which, given an initial assignment of the determinism det to all such procedures, applies those rules to infer a new determinism assignment for those procedures.)

It is an error to omit the determinism annotation for procedures that are exported from their containing module.

If a determinism annotation is supplied for a procedure, the declared determinism is compared against the inferred determinism. If the declared determinism is greater than or not comparable to the inferred determinism (in the partial ordering above), it is an error. If the declared determinism is less than the inferred determinism, it is not an error, but the implementation may issue a warning.
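
As a rough sketch of how this comparison plays out, consider the following module. The module and predicate names are hypothetical, and whether a warning is actually issued depends on the implementation and its options.

```mercury
:- module det_check_demo.
:- interface.
:- import_module list.

    % Declared semidet, but the clauses form a switch that covers both
    % [] and [_ | _] and whose arms cannot fail, so the inferred
    % determinism is det. The declared determinism is less than the
    % inferred one: this is accepted, possibly with a warning.
:- pred is_empty_flag(list(T)::in, int::out) is semidet.

    % Declared and inferred determinism agree: the single clause only
    % matches [_ | _], so a call can fail.
:- pred first(list(T)::in, T::out) is semidet.

    % Declaring first/2 as `is det' instead would be an error, because
    % det is greater than the inferred semidet in the lattice.

:- implementation.

is_empty_flag([], 1).
is_empty_flag([_ | _], 0).

first([X | _], X).
```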

The determinism category of each goal is inferred according to the following rules. These rules work with the two components of determinism category: whether the goal can fail without producing a solution, and the maximum number of solutions of the goal (0, 1, or more). If the inference process below reports that a goal can succeed more than once, but the goal generates no outputs that are visible from outside the goal, and the goal is not impure (see Impurity), then the final determinism of the goal will be based on the goal succeeding at most once, since the compiler will implicitly prune away any duplicate solutions.

Calls
The determinism category of a call is the determinism declared or inferred for the called mode of the called procedure.
Unifications
The determinism of a unification is either det, semidet, or failure, depending on its mode.

A unification that assigns the value of one variable to another is deterministic. A unification that constructs a structure and assigns it to a variable is also deterministic. A unification that tests whether a variable has a given top function symbol is semideterministic, unless the compiler knows the top function symbol of that variable, in which case its determinism is either det or failure depending on whether the two function symbols are the same or not. A unification that tests two variables for equality is semideterministic, unless the compiler knows that the two variables are aliases for one another, in which case the unification is deterministic, or unless the compiler knows that the two variables have different function symbols in the same position, in which case the unification has a determinism of failure.

The compiler knows the top function symbol of a variable if the previous part of the procedure definition contains a unification of the variable with a function symbol, or if the variable's type has only one function symbol.

Conjunctions
The determinism of the empty conjunction (the goal true) is det. The conjunction (A, B) can fail if either A can fail, or if A can succeed at least once, and B can fail. The conjunction can succeed at most zero times if either A or B can succeed at most zero times. The conjunction can succeed more than once if either A or B can succeed more than once and both A and B can succeed at least once. (If e.g. A can succeed at most zero times, then even if B can succeed many times the maximum number of solutions of the conjunction is still zero.) Otherwise, i.e. if both A and B succeed at most once, the conjunction can succeed at most once.
Switches
A disjunction is a switch if each disjunct has near its start a unification that tests the same bound variable against a different function symbol. For example, consider the common pattern
          (
                  L = [], empty(Out)
          ;
                  L = [H|T], nonempty(H, T, Out)
          )
     

If L is input to the disjunction, then the disjunction is a switch on L.

A switch can fail if the various arms of the switch do not cover all the function symbols in the type of the switched-on variable, or if the code in some arms of the switch can fail, bearing in mind that in each arm of the switch, the unification that tests the switched-on variable against the function symbol of that arm is considered to be deterministic. A switch can succeed several times if some arms of the switch can succeed several times, possibly because there are multiple disjuncts that test the switched-on variable against the same function symbol. A switch can succeed at most zero times only if all arms of the switch can succeed at most zero times.

Only unifications may occur before the test of the switched-on variable in each disjunct. Tests of the switched-on variable may occur within existential quantification goals.

The following example is a switch.

          (
                  Out = 1, L = []
          ;
                  some [H, T] (
                          L = [H|T],
                          nonempty(H, T, Out)
                  )
          )
     

The following example is not a switch because the call in the first disjunct occurs before the test of the switched-on variable.

          (
                  empty(Out), L = []
          ;
                  L = [H|T], nonempty(H, T, Out)
          )
     

Disjunctions
The determinism of the empty disjunction (the goal fail) is failure. A disjunction (A ; B) that is not a switch can fail if both A and B can fail. It can succeed at most zero times if both A and B can succeed at most zero times. It can succeed at most once if one of A and B can succeed at most once and the other can succeed at most zero times. Otherwise, i.e. if either A or B can succeed more than once, or if both A and B can succeed at least once, it can succeed more than once.
If-then-else
If the condition of an if-then-else cannot fail, the if-then-else is equivalent to the conjunction of the condition and the “then” part, and its determinism is computed accordingly. Otherwise, an if-then-else can fail if either the “then” part or the “else” part can fail. It can succeed at most zero times if the “else” part can succeed at most zero times and if at least one of the condition and the “then” part can succeed at most zero times. It can succeed more than once if any one of the condition, the “then” part and the “else” part can succeed more than once.
Negations
If the determinism of the negated goal is erroneous, then the determinism of the negation is erroneous. If the determinism of the negated goal is failure, the determinism of the negation is det. If the determinism of the negated goal is det or multi, the determinism of the negation is failure. Otherwise, the determinism of the negation is semidet.
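
The rules above can be seen at work in a small sketch. All names below are hypothetical; each declared determinism is the one that these rules would be expected to infer for the corresponding body.

```mercury
:- module infer_demo.
:- interface.
:- import_module list.

:- pred first_two(list(T)::in, T::out, T::out) is semidet.
:- pred small(int::out) is multi.
:- pred nonneg_flag(int::in, int::out) is det.
:- pred not_empty(list(T)::in) is semidet.

:- implementation.
:- import_module int.

    % Conjunction: both unifications test the top function symbol of an
    % input, so each is semidet. The conjunction can fail and has at
    % most one solution.
first_two(L, A, B) :-
    L = [A | T],
    T = [B | _].

    % Disjunction that is not a switch: X is an output, so each disjunct
    % is a det assignment. Neither disjunct can fail and both can
    % succeed, so the disjunction can succeed more than once.
small(X) :-
    (
        X = 1
    ;
        X = 2
    ).

    % If-then-else: the condition can fail, but neither the "then" part
    % nor the "else" part can, and no part succeeds more than once.
nonneg_flag(N, Flag) :-
    ( N < 0 ->
        Flag = 0
    ;
        Flag = 1
    ).

    % Negation: the negated goal is a semidet test, so the negation is
    % semidet as well.
not_empty(L) :-
    not (L = []).
```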



6.3 Replacing compile-time checking with run-time checking

Note that “perfect” determinism inference is an undecidable problem, because it requires solving the halting problem. (For instance, in the following example

     :- pred p(T, T).
     :- mode p(in, out) is det.
     
     p(A, B) :-
             (
                     something_complicated(A, B)
             ;
                     B = A
             ).

p/2 can have more than one solution only if something_complicated can succeed.) Sometimes, the rules specified by the Mercury language for determinism inference will infer a determinism that is not as precise as you would like. However, it is generally easy to overcome such problems. The way to do this is to replace the compiler's static checking with some manual run-time checking. For example, if you know that a particular goal should never fail, but the compiler infers that goal to be semidet, you can check at runtime that the goal does succeed, and if it fails, call the library predicate error/1.

     :- pred q(T, T).
     :- mode q(in, out) is det.
     
     q(A, B) :-
             ( goal_that_should_never_fail(A, B0) ->
                     B = B0
             ;
                     error("goal_that_should_never_fail failed!")
             ).

The predicate error/1 has determinism erroneous, which means the compiler knows that it will never succeed or fail, so the inferred determinism for the body of q/2 is det. (Checking assumptions like this is good coding style anyway. The small amount of up-front work that Mercury requires is paid back in reduced debugging time.) Mercury's mode analysis knows that computations with determinism erroneous can never succeed, which is why it does not require the “else” part to generate a value for B. The introduction of the new variable B0 is necessary because the condition of an if-then-else is a negated context, and can export the values it generates only to the “then” part of the if-then-else, not directly to the surrounding computation. (If the surrounding computations had direct access to values generated in conditions, they might access them even if the condition failed.)



6.4 Interfacing nondeterministic code with the real world

Normally, attempting to call a nondet or multi mode of a predicate from a predicate declared as semidet or det will cause a determinism error. So how can we call nondeterministic code from deterministic code? There are several alternative possibilities.

If you just want to see if a nondeterministic goal is satisfiable or not, without needing to know what variable bindings it produces, then there is no problem: determinism analysis considers nondet and multi goals with no non-local output variables to be semidet and det respectively.
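
A minimal sketch of this case, using hypothetical module and predicate names: the call to list.member/2 below is nondeterministic, but X is local to the body, so the goal as a whole has no outputs and the predicate can be declared semidet.

```mercury
:- module sat_demo.
:- interface.
:- import_module list.

    % Succeeds if the list contains at least one even number.
:- pred has_even(list(int)::in) is semidet.

:- implementation.
:- import_module int.

has_even(L) :-
    % list.member(X, L) may succeed once for each element, but X is not
    % visible outside the body, so only satisfiability matters here.
    list.member(X, L),
    X mod 2 = 0.
```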

If you want to use the values of output variables, then you need to ask yourself which one of the possibly many solutions to a goal you want. If you want all of them, you need to use the predicate solutions/2 in the standard library module std_util, which collects all of the solutions to a goal into a list (see Higher-order).

If you just want one solution and don't care which, the calling predicate should be declared nondet or multi. The nondeterminism should then be propagated up the call tree to the point at which it can be pruned. In Mercury, pruning can be achieved in several ways.

The first way is the one mentioned above: if a goal has no non-local output variables then the implementation will only attempt to satisfy the goal once. Any potential duplicate solutions will be implicitly pruned away.

The second way is to rely on the fact that the implementation will only seek a single solution to main/2, so alternative solutions to main/2 (and hence also to nondet or multi predicates called directly or indirectly from main/2) are implicitly pruned away. This is one way to achieve “don't care” style nondeterminism in Mercury.

The other situation in which you may want pruning and committed choice style nondeterminism is when you know that all the solutions returned will be equivalent. For example, you might want to find the maximum element in a set by iterating over the elements in the set. Iterating over the elements in a set in an unspecified order is a nondeterministic operation, but no matter in which order you remove them, the maximum value in the set should be the same.

If you know that there will only ever be at most one distinct solution, then you can use the function promise_only_solution/1, which is defined as a builtin function in the Mercury standard library.

     :- func promise_only_solution(pred(T)) = T.
     :- mode promise_only_solution(pred(out) is cc_multi) = out is det.
     :- mode promise_only_solution(pred(out) is cc_nondet) = out is semidet.

A call to that function, e.g. promise_only_solution(Pred), constitutes a promise on the part of the caller that the argument Pred has at most one solution, i.e. that

     not some [X1, X2] (Pred(X1), Pred(X2), X1 \= X2)

holds. promise_only_solution(Pred) presumes that this assumption is satisfied, and returns the value of X for which Pred(X) is true, if any. If the assumption is not satisfied, then the behaviour is undefined.

Note that specifying a user-defined equivalence relation as the equality predicate for user-defined types (see User-defined equality and comparison) means that the promise_only_solution/1 function can be used to express more general forms of equivalence. For example, if you define a set type which represents sets as unsorted lists, you would want to define a user-defined equivalence relation for that type, which could sort the lists before comparing them. The promise_only_solution/1 function could then be used for sets even though the lists used to represent the sets might not be in the same order in every solution.



6.5 Committed choice nondeterminism

In addition to the determinism annotations described earlier, there are “committed choice” versions of multi and nondet, called cc_multi and cc_nondet. These can be used instead of multi or nondet if all calls to that mode of the predicate (or function) occur in a context in which only one solution is needed.

Such single-solution contexts are determined as follows.

The compiler will check that all calls to a committed-choice mode of a predicate (or function) do indeed occur in a single-solution context.

You can declare two different modes of a predicate (or function) which differ only in “cc-ness” (i.e. one being multi and the other cc_multi, or one being nondet and the other cc_nondet). In that case, the compiler will select the appropriate one for each call depending on whether the call comes from a single-solution context or not. Calls from single-solution contexts will call the committed choice version, while calls which are not from single-solution contexts will call the backtracking version.

There are several reasons to use committed choice determinism annotations. One reason is for efficiency: committed choice annotations allow the compiler to generate much more efficient code. Another reason is for doing I/O, which is allowed only in det or cc_multi predicates, not in multi predicates. Another is for dealing with types that use non-canonical representations (see User-defined equality and comparison). And there are a variety of other applications.
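
The I/O case can be sketched as follows. This is an illustration under stated assumptions: the module name cc_demo and the predicate pick/1 are hypothetical, the state variable syntax (!IO) is assumed to be available, and the call to pick/1 is assumed to count as a single-solution context because main/2 is declared cc_multi and the goals after the call cannot fail. Which of the two solutions gets printed is not specified.

```mercury
:- module cc_demo.
:- interface.
:- import_module io.

    % main/2 performs I/O, so it is declared cc_multi rather than multi.
:- pred main(io.state::di, io.state::uo) is cc_multi.

:- implementation.

    % A committed choice predicate: it has two solutions, but callers
    % commit to whichever one is found first.
:- pred pick(int::out) is cc_multi.

pick(X) :-
    (
        X = 1
    ;
        X = 2
    ).

main(!IO) :-
    pick(X),
    io.write_int(X, !IO),
    io.nl(!IO).
```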



7 User-defined equality and comparison

When defining abstract data types, it is often convenient to use a non-canonical representation, that is, one for which a single abstract value may have more than one possible concrete representation. For example, you may wish to implement an abstract type set by representing a set as an (unsorted) list.

     :- module set_as_unsorted_list.
     :- interface.
     :- type set(T).
     
     :- implementation.
     :- import_module list.
     :- type set(T) ---> set(list(T)).

In this example, the concrete representations set([1,2]) and set([2,1]) would both represent the same abstract value, namely the set containing the elements 1 and 2.

For types such as this, which do not have a canonical representation, the standard definition of equality is not the desired one; we want equality on sets to mean equality of the abstract values, not equality of their representations. To support such types, Mercury allows programmers to specify a user-defined equality predicate for user-defined types:

     :- type set(T) ---> set(list(T))
             where equality is set_equals.

Here set_equals is the name of a user-defined predicate that is used for equality on the type set(T). It could for example be defined in terms of a subset predicate.

     :- pred set_equals(set(T)::in, set(T)::in) is semidet.
     set_equals(S1, S2) :-
             subset(S1, S2),
             subset(S2, S1).

A comparison predicate can also be supplied.

     :- type set(T) ---> set(list(T))
             where equality is set_equals, comparison is set_compare.
     
     :- pred set_compare(builtin.comparison_result::uo,
                     set(T)::in, set(T)::in) is det.
     set_compare(promise_only_solution(set_compare_2(Set1, Set2)), Set1, Set2).
     
     :- pred set_compare_2(set(T)::in, set(T)::in,
                     builtin.comparison_result::uo) is cc_multi.
     set_compare_2(set(List1), set(List2), Result) :-
             builtin.compare(Result, list.sort(List1), list.sort(List2)).

If a comparison predicate is supplied and the unification predicate is omitted, a unification predicate is generated by the compiler in terms of the comparison predicate. For the set example, the generated predicate would be:

     set_equals(S1, S2) :-
             set_compare((=), S1, S2).

If a unification predicate is supplied without a comparison predicate, the compiler will generate a comparison predicate which throws an exception of type require.software_error when called.

A type declaration for a type foo(T1, ..., TN) may contain a where equality is equalitypred specification only if it declares a discriminated union type or a foreign type (see Using foreign types from Mercury) and the following conditions are satisfied:

Types with user-defined equality can only be used in limited ways. Because there are multiple representations for the same abstract value, any attempt to examine the representation of such a value is a conceptually non-deterministic operation. In Mercury this is modelled using committed choice nondeterminism.

The semantics of specifying where equality is equalitypred on the type declaration for a type T are as follows:

A type declaration for a type foo(T1, ..., TN) may contain a where comparison is comparepred specification only if it declares a discriminated union type or a foreign type (see Using foreign types from Mercury) and the following conditions are satisfied:

For each type for which the declaration has a where comparison is comparepred specification, any calls to the standard library predicate builtin.compare/3 with arguments of that type are evaluated as if they were calls to comparepred.

A type declaration may contain a where equality is equalitypred, comparison is comparepred specification only if in addition to the conditions above, all [X, Y] (comparepred((=), X, Y) <=> equalitypred(X, Y)). The compiler is not required to check this.



8 Higher-order programming

Mercury supports higher-order functions and predicates with currying, closures, and lambda expressions. (To be pedantic, it would be more accurate to say that Mercury supports higher-order procedures: in Mercury, when you construct a higher-order term, you only get one mode of a predicate or function; if you want multiple modes, you must pass multiple higher-order procedures.)



8.1 Creating higher-order terms

To create a higher-order predicate or function term, you can use a lambda expression, or, if the predicate or function has only one mode and it is not a zero-arity function, you can just use its name. For example, if you have declared a predicate

     :- pred sum(list(int), int).
     :- mode sum(in, out) is det.

the following three unifications have the same effect:

     X = lambda([List::in, Length::out] is det, sum(List, Length))
     Y = (pred(List::in, Length::out) is det :- sum(List, Length))
     Z = sum

In the above example, the type of X, Y, and Z is pred(list(int), int), which means a predicate of two arguments of types list(int) and int respectively.

The syntax using lambda is deprecated; please use the syntax using pred instead. [The syntax using lambda was supported to enable programs to work in both Mercury and Prolog, because the syntax using pred can't be easily emulated in Prolog. Now that we have implemented better debugging environments for Mercury, there is no need for this.]

Similarly, given

     :- func scalar_product(int, list(int)) = list(int).
     :- mode scalar_product(in, in) = out is det.

the following three unifications have the same effect:

     X = (func(Num, List) = NewList :- NewList = scalar_product(Num, List))
     Y = (func(Num::in, List::in) = (NewList::out) is det
             :- NewList = scalar_product(Num, List))
     Z = scalar_product

In the above example, the type of X, Y, and Z is func(int, list(int)) = list(int), which means a function of two arguments, whose types are int and list(int), with a return type of list(int). As with :- func declarations, if the modes and determinism of the function are omitted in a higher-order function term, then the modes default to in for the arguments, out for the function result, and the determinism defaults to det.

If the predicate or function has more than one mode, you must use an explicit lambda expression to specify which mode you want.

You can also create higher-order function terms of non-zero arity and higher-order predicate terms by “currying”, i.e. specifying the first few arguments to a predicate or function, but leaving the remaining arguments unspecified. For example, the unification

     Sum123 = sum([1,2,3])

binds Sum123 to a higher-order predicate term of type pred(int). Similarly, the unification

     Double = scalar_product(2)

binds Double to a higher-order function term of type func(list(int)) = list(int).

For higher-order predicate expressions that thread an accumulator pair, we have syntax that allows you to use DCG notation in the goal of the expression. For example,

     Pred = (pred(Strings::in, Num::out, di, uo) is det -->
         io.write_string("The strings are: "),
         { list.length(Strings, Num) },
         io.write_strings(Strings),
         io.nl
     )

is equivalent to

     Pred = (pred(Strings::in, Num::out, IO0::di, IO::uo) is det :-
         io.write_string("The strings are: ", IO0, IO1),
         list.length(Strings, Num),
         io.write_strings(Strings, IO1, IO2),
         io.nl(IO2, IO)
     )

Higher-order function terms of zero arity can only be created using an explicit lambda expression; you have to use e.g. (func) = foo rather than plain foo, because the latter denotes the result of evaluating the function, rather than the function itself.
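
A small sketch of the zero-arity case, with hypothetical names (the module zero_arity_demo and the function answer/0) and assuming the state variable syntax (!IO) is available:

```mercury
:- module zero_arity_demo.
:- interface.
:- import_module io.

:- pred main(io.state::di, io.state::uo) is det.

:- implementation.

:- func answer = int.

answer = 42.

main(!IO) :-
    % Plain `answer' would denote the value of the function, so an
    % explicit lambda expression is used to obtain the function itself.
    F = ((func) = answer),
    % apply/1 invokes the zero-arity function term.
    N = apply(F),
    io.write_int(N, !IO),
    io.nl(!IO).
```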

Note that when constructing a higher-order term, you cannot just use the name of a builtin language construct such as =, \=, call, or apply, and nor can such constructs be curried. Instead, you must either use an explicit lambda expression, or you must write a forwarding predicate or function. For example, instead of

     list.filter([1,2,3], \=(2), List)

you must write either

     list.filter([1,2,3], (pred(X::in) is semidet :- X \= 2), List)

or

     list.filter([1,2,3], not_equal(2), List)

where you have defined not_equal using

     :- pred not_equal(T::in, T::in) is semidet.
     not_equal(X, Y) :- X \= Y.

Another case when this arises is when you want to curry a higher-order term. Suppose, for example, that you have a higher-order predicate term OldPred of type pred(int, char, float), and you want to construct a new higher-order predicate term NewPred of type pred(char, float) from OldPred by supplying a value for just the first argument. The solution is the same: use an explicit lambda expression or a forwarding predicate. In either case, the body of the lambda expression or the forwarding predicate must contain a higher-order call with all the arguments supplied.
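
The lambda expression approach might look like the sketch below. The module name curry_demo and the predicate curry_first/3 are hypothetical, and the modes chosen for OldPred (in, in, out, det) are an assumption made only for the sake of the example.

```mercury
:- module curry_demo.
:- interface.
:- import_module char.

    % Builds NewPred from OldPred by fixing the first argument to N.
:- pred curry_first(pred(int, char, float), int, pred(char, float)).
:- mode curry_first(pred(in, in, out) is det, in,
    out(pred(in, out) is det)) is det.

:- implementation.

curry_first(OldPred, N, NewPred) :-
    % The body of the lambda expression is a higher-order call with
    % all three arguments of OldPred supplied.
    NewPred = (pred(C::in, F::out) is det :- call(OldPred, N, C, F)).
```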



8.2 Calling higher-order terms

Once you have created a higher-order predicate term (sometimes known as a closure), the next thing you want to do is to call it. For predicates, you use the builtin goal call/N:

     call(Closure)
     call(Closure1, Arg1)
     call(Closure2, Arg1, Arg2)
     ...

A higher-order predicate call. call(Closure) just calls the specified higher-order predicate term. The other forms append the specified arguments onto the argument list of the closure before calling it.

For example, the goal

     call(Sum123, Result)

would bind Result to the sum of [1, 2, 3], i.e. to 6.

For functions, you use the builtin expression apply/N:

     apply(Closure)
     apply(Closure1, Arg1)
     apply(Closure2, Arg1, Arg2)
     ...

A higher-order function application. Such a term denotes the result of invoking the specified higher-order function term with the specified arguments.

For example, given the definition of Double above, the goal

     List = apply(Double, [1, 2, 3])

would be equivalent to

     List = scalar_product(2, [1, 2, 3])

and so for a suitable implementation of the function scalar_product/2 this would bind List to [2, 4, 6].

One extremely useful higher-order predicate in the Mercury standard library is solutions/2, which has the following declaration:

     :- pred solutions(pred(T), list(T)).
     :- mode solutions(pred(out) is nondet, out) is det.

The term which you pass to solutions/2 is a higher-order predicate term. You can pass the name of a one-argument predicate, or you can pass a several-argument predicate with all but one of the arguments supplied (a closure). The declarative semantics of solutions/2 can be defined as follows:

     solutions(Pred, List) is true iff
             all [X] (call(Pred, X) <=> list.member(X, List))
             and List is sorted.

where call(Pred, X) invokes the higher-order predicate term Pred with argument X, and where list.member/2 is the standard library predicate for list membership. In other words, solutions(Pred, List) finds all the values of X for which call(Pred, X) is true, collects these solutions in a list, sorts the list, and returns that list as its result. Here's an example: the standard library defines a predicate list.perm(List0, List)

     :- pred list.perm(list(T), list(T)).
     :- mode list.perm(in, out) is nondet.

which succeeds iff List is a permutation of List0. Hence the following call to solutions

     solutions(list.perm([3,1,2]), L)

should return all the possible permutations of the list [3,1,2] in sorted order:

     L = [[1,2,3],[1,3,2],[2,1,3],[2,3,1],[3,1,2],[3,2,1]].

See also unsorted_solutions/2 and solutions_set/2, which are defined in the standard library module std_util and documented in the Mercury Library Reference Manual.



8.3 Higher-order modes

In Mercury, the mode and determinism of a higher-order predicate or function term are part of that term's inst, not its type. This allows a single higher-order predicate to work on argument predicates of different modes and determinism, which is particularly useful for library predicates such as list.map and list.foldl.

The language contains builtin inst values

     pred is Determinism
     pred(Mode) is Determinism
     pred(Mode1, Mode2) is Determinism
     ...
     (func) = Mode is Determinism
     func(Mode1) = Mode is Determinism
     func(Mode1, Mode2) = Mode is Determinism
     ...

These insts represent the instantiation state of variables bound to higher-order predicate and function terms with the appropriate mode and determinism. For example, pred(out) is det represents the instantiation state of being bound to a higher-order predicate term which is det and accepts one output argument; the term sum([1,2,3]) from the example above is one such higher-order predicate term which matches this instantiation state.
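
As an illustration of higher-order insts in mode declarations, the sketch below declares one predicate with two modes that differ only in the determinism of the argument predicate. The module name ho_mode_demo and the predicate apply_twice/3 are hypothetical.

```mercury
:- module ho_mode_demo.
:- interface.

    % Applies P to X0 and then applies P again to the result.
:- pred apply_twice(pred(T, T), T, T).
:- mode apply_twice(pred(in, out) is det, in, out) is det.
:- mode apply_twice(pred(in, out) is semidet, in, out) is semidet.

:- implementation.

apply_twice(P, X0, X) :-
    call(P, X0, X1),
    call(P, X1, X).
```

The type pred(T, T) is the same in both modes; only the inst of the first argument changes, which is what lets a single predicate accept argument predicates of different determinism.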

As a convenience, the language also contains builtin mode values of the same name (and they are what we have been using in the examples up to now). These modes map from the corresponding inst to itself. It is as if they were defined by

     :- mode (pred is Determinism) == in(pred is Determinism).
     :- mode (pred(Inst) is Determinism) ==
         in(pred(Inst) is Determinism).
     ...

using the parametric inst in/1 mentioned in Modes which maps an inst to itself.

If you want to define a predicate which returns a higher-order predicate term, you would use a mode such as free >> pred(...) is ..., or out(pred(...) is ... ). For example:

     :- pred foo(pred(int)).
     :- mode foo(free >> pred(out) is det) is det.
     
     foo(sum([1,2,3])).

Note that in Mercury it is an error to attempt to unify two higher-order terms. This is because equivalence of higher-order terms is undecidable in the general case.

For example, given the definition of foo above, the goal

             foo((pred(X::out) is det :- X = 6))

is illegal. If you really want to compare higher-order predicates for equivalence, you must program it yourself; for example, the above goal could legally be written as

             P = (pred(X::out) is det :- X = 6),
             foo(Q),
             all [X] (call(P, X) <=> call(Q, X)).

Note that the compiler will only catch direct attempts at higher-order unifications; indirect attempts (via polymorphic predicates, for example list.append([], [P], [Q])) may result in an error at run-time rather than at compile-time.

In order to call a higher-order term, the compiler must know its higher-order inst. This can cause problems when higher-order terms are placed into a polymorphic collection type and then extracted, since the declared mode for the extraction will typically be out and the higher-order inst information will be lost. To partially alleviate this problem, and to make higher-order functional programming easier, if the term to be called has a function type, but no higher-order inst information, we assume that it has the default higher-order function inst func(in, ..., in) = out is Determinism.

As a consequence of this, it is a mode error to pass a higher-order function term that does not match this standard mode to somewhere where its higher-order inst information may be lost, such as to a polymorphic predicate where the argument mode is in.
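
The default function inst can be illustrated with a small sketch. Here apply_all is a hypothetical name introduced only for this example; the functions are stored in an ordinary list and extracted with plain mode in, so their higher-order inst information is lost, yet each one can still be applied because it has a function type and is therefore assumed to have the default inst.

```mercury
:- import_module int, list.

    % apply_all(Fs, X): apply each function in Fs to X in turn.
    % (Hypothetical example; not part of the standard library.)
:- func apply_all(list(func(int) = int), int) = int.

apply_all([], X) = X.
apply_all([F | Fs], X) = apply_all(Fs, F(X)).

    % Example use:
    %   apply_all([(func(N) = N + 1), (func(N) = N * 2)], 5)
    % evaluates to 12, i.e. (5 + 1) * 2.
```

The same pattern would not work for a list of predicate terms, since no default inst is assumed for terms of predicate type.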



9 Modules



9.1 The module system

The Mercury module system is relatively simple and straightforward.

Each module must start with a :- module ModuleName declaration, specifying the name of the module.

An :- interface. declaration indicates the start of the module's interface section: this section specifies the entities that are exported by this module. Mercury provides support for abstract data types, by allowing the definition of a type to be kept hidden, with the interface only exporting the type name. The interface section may contain definitions of types, type classes, data constructors, instantiation states, and modes, and declarations for abstract data types, abstract type class instances, functions, predicates, and (sub-)modules. The interface section may not contain definitions for functions or predicates (i.e. clauses), or definitions of (sub-)modules.

An :- implementation. declaration indicates the start of the module's implementation section. Any entities declared in this section are local to the module (and its sub-modules) and cannot be used by other modules. The implementation section must contain definitions for all abstract data types, abstract instance declarations, functions, predicates, and sub-modules exported by the module, as well as for all local types, type class instances, functions, predicates, and sub-modules. The implementation section can be omitted if it is empty.

The module may optionally end with a :- end_module ModuleName declaration; the name specified in the end_module must be the same as that in the corresponding module declaration.

If a module wishes to make use of entities exported by other modules, then it must explicitly import those modules using one or more :- import_module Modules or :- use_module Modules declarations, in order to make those declarations visible. In both cases, Modules is a comma-separated list of fully-qualified module names. These declarations may occur either in the interface or the implementation section. If the imported entities are used in the interface section, then the corresponding import_module or use_module declaration must also be in the interface section. If the imported entities are only used in the implementation section, the import_module or use_module declaration should be in the implementation section.

The names of predicates, functions, constructors, constructor fields, types, modes, insts, type classes, and (sub-)modules can be explicitly module qualified using the . operator, e.g. module.name or module.submodule.name. This is useful both for readability and for resolving name conflicts. Uses of entities imported using use_module declarations must be explicitly module qualified.
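
As a minimal sketch of the difference between the two kinds of import, the following hypothetical module hello imports list with import_module and string with use_module. It assumes the standard library predicate list.length/2 and function string.int_to_string/1.

```mercury
:- module hello.
:- interface.

    % io.state is used in the interface, so io is imported here.
:- import_module io.

:- pred main(io.state::di, io.state::uo) is det.

:- implementation.

:- import_module list.
:- use_module string.

main -->
        % list was imported with import_module, so length/2
        % may be used without a module qualifier.
    { length([1, 2, 3], Len) },
        % string was imported with use_module, so its entities
        % must be explicitly module qualified.
    { Str = string.int_to_string(Len) },
    io.write_string(Str),
    io.nl.

:- end_module hello.
```

Because string is only "used", the unqualified name length can only refer to list.length here, which is one way use_module helps to resolve name conflicts.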

Currently we also support __ and : as alternative module qualifiers, so you can write module__name or module:name instead of module.name. We intend to change : to be a type qualifier in a future version; its use as a module qualifier is therefore deprecated.

Certain optimizations require information or source code for predicates defined in other modules to be as effective as possible. At the moment, inlining and higher-order specialization are the only optimizations that the Mercury compiler can perform across module boundaries.

One module must export a predicate main/2, which must be declared as either

     :- pred main(io.state::di, io.state::uo) is det.

or

     :- pred main(io.state::di, io.state::uo) is cc_multi.

(or any declaration equivalent to one of the two above).

Mercury has a standard library which includes modules for lists, stacks, queues, priority queues, sets, bags (multi-sets), maps (dictionaries), random number generation, input/output and filename and directory handling. See the Mercury Library Reference Manual for details.



9.2 An example module

For illustrative purposes, here is the definition of a simple module for managing queues:

     :- module queue.
     :- interface.
     
     % Declare an abstract data type.
     
     :- type queue(T).
     
     % Declare some predicates which operate on the abstract data type.
     
     :- pred empty_queue(queue(T)).
     :- mode empty_queue(out) is det.
     :- mode empty_queue(in) is semidet.
     
     :- pred put(queue(T), T, queue(T)).
     :- mode put(in, in, out) is det.
     
     :- pred get(queue(T), T, queue(T)).
     :- mode get(in, out, out) is semidet.
     
     :- implementation.
     
     % Queues are implemented as lists. We need the `list' module
     % for the declaration of the type list(T), with its constructors
     % '[]'/0 and '.'/2, and for the declaration of the predicate
     % list.append/3.
     
     :- import_module list.
     
     % Define the queue ADT.
     
     :- type queue(T) == list(T).
     
     % Define the exported predicates.
     
     empty_queue([]).
     
     put(Queue0, Elem, Queue) :-
              list.append(Queue0, [Elem], Queue).
     
     get([Elem | Queue], Elem, Queue).
     
     :- end_module queue.
     



9.3 Sub-modules

As mentioned above, modules may contain sub-modules. There are two kinds of sub-modules, called nested sub-modules and separate sub-modules; the difference is that nested sub-modules are defined in the same source file as the containing module, whereas separate sub-modules are defined in separate source files. Implementations should support separate compilation of separate sub-modules.

A module may not contain more than one sub-module with the same name.



9.3.1 Nested sub-modules

Nested sub-modules within a module are delimited by matching :- module and :- end_module declarations. (Note that :- end_module declarations for nested sub-modules are mandatory, not optional, even if the nested sub-module is the last thing in the source file. Also note that the module name in a :- module or :- end_module declaration need not be fully-qualified.) The sequence of items thus delimited is known as a sub-module item sequence.

The interface and implementation parts of a nested sub-module may be specified in two different sub-module declarations. If a sub-module item sequence includes an interface section, then it is a declaration of that sub-module; if it includes an implementation section, then it is a definition of that sub-module; and if it includes both, then it is both declaration and definition.

It is an error to declare a sub-module twice, or to define it twice. It is an error to define a sub-module without declaring it. As mentioned earlier, it is an error to define a sub-module in the interface section of its parent module.

If a sub-module is declared but not explicitly defined, then there is an implicit definition with an empty implementation section for that sub-module (this will result in an error, if the interface section includes declarations but not definitions for any types, predicates, modes, or (doubly) nested sub-modules).
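
The split between declaration and definition might look as follows. This is only a sketch; the module names parent and child and the function answer are hypothetical.

```mercury
:- module parent.
:- interface.

        % Declaration of the nested sub-module: this item
        % sequence contains only an interface section.
    :- module child.
    :- interface.
    :- func answer = int.
    :- end_module child.

:- implementation.

        % Definition of the nested sub-module: this item
        % sequence contains only an implementation section.
    :- module child.
    :- implementation.
    answer = 42.
    :- end_module child.

:- end_module parent.
```

If the second child item sequence were omitted, the implicit empty definition would be an error here, because answer is declared but never defined.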



9.3.2 Separate sub-modules

Separate sub-modules are declared using :- include_module Modules declarations. Each :- include_module declaration specifies a comma-separated list of sub-modules.

     :- include_module Module1, Module2, ..., ModuleN.

Each of the named sub-modules in an :- include_module declaration must be defined in a separate source file. The mapping between module names and source file names is implementation-defined. (For a module named foo.bar.baz, the University of Melbourne Mercury implementation requires the source to be located in a file named foo.bar.baz.m, bar.baz.m, or baz.m.) The separate source file must contain the declaration (interface) and definition (implementation) of the sub-module. It must start with a :- module declaration which matches that in the :- include_module declaration in the parent, followed by the interface and (if necessary) implementation sections, and it may optionally end with a :- end_module declaration. (Note: the module names in the :- module, :- end_module, and :- include_module declarations need not be fully-qualified. However, if the file name used for a particular module does not include all the module qualifiers, then the University of Melbourne Mercury implementation requires the module name in the :- module declaration for that module to be fully qualified.)

The semantics of separate sub-modules are identical to those of nested sub-modules. The procedure to transform a separate sub-module into a nested sub-module is as follows:

  1. Replace the :- include_module submodule declaration with the interface section of the sub-module enclosed within :- module submodule and :- end_module submodule declarations.
  2. Place the implementation section of the sub-module enclosed within :- module submodule and :- end_module submodule declarations in the implementation section of the parent module.

For example

     :- module x.
     :- interface.
     :- include_module y.
     :- end_module x.

is equivalent to

     :- module x.
     :- interface.
         :- module y.
         % interface section of module y
         :- end_module y.
     :- implementation.
         :- module y.
         % implementation section of module y
         :- end_module y.
     :- end_module x.



9.3.3 Visibility rules

Any declarations in the parent module, including those in the parent module's implementation section, are visible in the parent's sub-modules, including indirect sub-modules (i.e. sub-sub-modules, etc.). Similarly, declarations in the interfaces of any modules imported using an :- import_module or a :- use_module in the parent module are visible in the parent's sub-modules, including indirect sub-modules.

Declarations in a child module are not visible in the parent module, or in “sibling” modules (other children of the same parent), or in other unrelated modules unless the child is explicitly imported using an :- import_module or :- use_module declaration. It is an error to import a module without importing all of its parent modules.

Note that a sub-module for which the :- module or :- include_module declaration occurs only in the implementation section of the parent module may only be imported or used by its parent module or by sub-modules of its parent module.

Note that as mentioned previously, all :- import_module and :- use_module declarations must use fully-qualified module names.



9.3.4 Implementation bugs and limitations

The current implementation of sub-modules has a couple of minor limitations.



10 Type classes

Mercury supports constrained polymorphism in the form of type classes. Type classes allow the programmer to write predicates and functions which operate on variables of any type (or sequence of types) for which a certain set of operations is defined.



10.1 Typeclass declarations

A type class is a name for a set of types (or a set of sequences of types) for which certain predicates and/or functions, called the methods of that type class, are defined. A typeclass declaration defines a new type class, and specifies the set of predicates and/or functions that must be defined on a type (or sequence of types) for it (them) to be considered to be an instance of that type class.

The typeclass declaration gives the name of the type class that it is defining, the names of the type variables which are parameters to the type class, and the operations (i.e. methods) which form the interface of the type class.

For example,

     :- typeclass point(T) where [
             % coords(Point, X, Y):
             %       X and Y are the cartesian coordinates of Point
             pred coords(T, float, float),
             mode coords(in, out, out) is det,
     
             % translate(Point, X_Offset, Y_Offset) = NewPoint:
             %       NewPoint is Point translated X_Offset units in the X direction
             %       and Y_Offset units in the Y direction
             func translate(T, float, float) = T
     ].

declares the type class point, which represents points in two dimensional space.

pred, func and mode declarations are the only legal declarations inside a typeclass declaration. The mode and determinism of type class methods must be explicitly declared or (for functions) defaulted, not inferred. In other words, for each predicate declared in a type class, there must be at least one mode declaration, and each mode declaration in a type class must include an explicit determinism annotation. Functions with no explicit mode declaration get the usual default mode (see Modes): all arguments have mode in, the result has mode out, and the determinism is det.

The number of parameters to the type class (e.g. T) is not limited. For example, the following is allowed:

     :- typeclass a(T1, T2) where [...].

The parameters must be distinct variables. Each typeclass declaration must have at least one parameter.

It is OK for a typeclass declaration to declare no methods, e.g.

     :- typeclass foo(T) where [].

There must not be more than one type class declaration with the same name and arity in the same module.



10.2 Instance declarations

Once the interface of the type class has been defined in the typeclass declaration, we can use an instance declaration to define how a particular type (or sequence of types) satisfies the interface declared in the typeclass declaration.

An instance declaration has the form

     :- instance classname(typename(typevar, ...), ...)
             where [methoddefinition, methoddefinition, ...].

An instance declaration gives a type for each parameter of the type class. Each of these types must be either a type with no arguments, or a polymorphic type whose arguments are all distinct type variables. For example int, list(T) and bintree(K,V) are allowed, but T, list(int) and bintree(T,T) are not. The types in an instance declaration must not be abstract types which are elsewhere defined as equivalence types. A program may not contain more than one instance declaration for a particular type (or sequence of types, in the case of a multi-parameter type class) and typeclass. These restrictions ensure that there are no overlapping instance declarations, i.e. for each typeclass there is at most one instance declaration that may be applied to any type (or sequence of types).

Each methoddefinition entry in the where [...] part of an instance declaration defines the implementation of one of the class methods for this instance. There are two ways of defining methods. The first way is to define a method by giving the name of the predicate or function which implements that method. In this case, the methoddefinition must have one of the following forms:

     pred(methodname/arity) is predname
     func(methodname/arity) is funcname

The predname or funcname must name a function or predicate of the specified arity whose type, modes, determinism, and purity are at least as permissive as the declared type, modes, determinism, and purity of the class method with the specified methodname and arity, after the types of the arguments in the instance declaration have been substituted in place of the parameters in the type class declaration.

The second way of defining methods is by listing the clauses for the definition inside the instance declaration. A methoddefinition can be a clause. These clauses are just like the clauses used to define ordinary predicates or functions (see Items), and so they can be facts, rules, or DCG rules. The only difference is that in instance declarations, clauses are separated by commas rather than being terminated by periods, and so rules and DCG rules in instance declarations must normally be enclosed in parentheses. As with ordinary predicates, you can have more than one clause for each method. The clauses must satisfy the declared type, modes, determinism and purity for the method, after the types of the arguments in the instance declaration have been substituted in place of the parameters in the type class declaration.

These two ways are mutually exclusive: each method must be defined either by a single naming definition (using the pred(...) is predname or func(...) is funcname form), or by a set of one or more clauses, but not both.

Here's an example of an instance declaration and the different kinds of method definitions that it can contain:

     :- typeclass foo(T) where [
     	func method1(T, T) = int,
     	func method2(T) = int,
     	pred method3(T::in, int::out) is det,
     	pred method4(T::in, io.state::di, io.state::uo) is det,
     	func method5(bool, T) = T
     ].
     
     :- instance foo(int) where [
     	% method defined by naming the implementation
     	func(method1/2) is (+),
     
     	% method defined by a fact
     	method2(X) = X + 1,
     
     	% method defined by a rule
     	(method3(X, Y) :- Y = X + 2),
     
     	% method defined by a DCG rule
     	(method4(X) --> io.print(X), io.nl),
     
     	% method defined by multiple clauses
     	method5(no, _) = 0,
     	(method5(yes, X) = Y :- X + Y = 0)
     ].

Each instance declaration must define an implementation for every method declared in the corresponding typeclass declaration. It is an error to define more than one implementation for the same method within a single instance declaration.

Any call to a method must have argument types (and in the case of functions, return type) which are constrained to be a member of that method's type class, or which match one of the instance declarations visible at the point of the call. A method call will invoke the predicate or function specified for that method in the instance declaration that matches the types of the arguments to the call.

Note that even if a type class has no methods, an explicit instance declaration is required for a type to be considered an instance of that type class.

Here's an example of some code using an instance declaration:

     :- type coordinate
             ---> coordinate(
                     float,           % X coordinate
                     float            % Y coordinate
             ).
     
     :- instance point(coordinate) where [
             pred(coords/3) is coordinate_coords,
             func(translate/3) is coordinate_translate
     ].
     
     
     :- pred coordinate_coords(coordinate, float, float).
     :- mode coordinate_coords(in, out, out) is det.
     
     coordinate_coords(coordinate(X, Y), X, Y).
     
     :- func coordinate_translate(coordinate, float, float) = coordinate.
     
     coordinate_translate(coordinate(X, Y), Dx, Dy) = coordinate(X + Dx, Y + Dy).

We have now made the coordinate type an instance of the point type class. If we introduce a new type, coloured_coordinate which represents a point in two dimensional space with a colour associated with it, it can also become an instance of the type class:

     :- type rgb
             ---> rgb(
                     int,
                     int,
                     int
             ).
     
     :- type coloured_coordinate
             ---> coloured_coordinate(
                     float,
                     float,
                     rgb
             ).
     
     :- instance point(coloured_coordinate) where [
             pred(coords/3) is coloured_coordinate_coords,
             func(translate/3) is coloured_coordinate_translate
     ].
     
     
     :- pred coloured_coordinate_coords(coloured_coordinate, float, float).
     :- mode coloured_coordinate_coords(in, out, out) is det.
     
     coloured_coordinate_coords(coloured_coordinate(X, Y, _), X, Y).
     
     :- func coloured_coordinate_translate(coloured_coordinate, float, float)
             = coloured_coordinate.
     
     coloured_coordinate_translate(coloured_coordinate(X, Y, Colour), Dx, Dy)
             = coloured_coordinate(X + Dx, Y + Dy, Colour).

If we call translate/3 with the first argument having type coloured_coordinate, this will invoke coloured_coordinate_translate. Likewise, if we call translate/3 with the first argument having type coordinate, this will invoke coordinate_translate.

Further instances of the type class could be made, e.g. a type that represents the point using polar coordinates.
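
As a rough sketch of such an instance, the following fragment could be added to a module in which the point type class is visible. The names polar_coordinate, polar_coords and polar_translate are hypothetical; the code assumes the standard library functions math.cos/1, math.sin/1, math.sqrt/1 and math.atan2/2.

```mercury
:- import_module float, math.

:- type polar_coordinate
        ---> polar_coordinate(
                float,           % radius
                float            % angle, in radians
        ).

:- instance point(polar_coordinate) where [
        pred(coords/3) is polar_coords,
        func(translate/3) is polar_translate
].

:- pred polar_coords(polar_coordinate, float, float).
:- mode polar_coords(in, out, out) is det.

polar_coords(polar_coordinate(R, Theta), X, Y) :-
        X = R * math.cos(Theta),
        Y = R * math.sin(Theta).

:- func polar_translate(polar_coordinate, float, float) = polar_coordinate.

    % Translate by converting to cartesian coordinates, adding the
    % offsets, and converting the result back to polar form.
polar_translate(Point, Dx, Dy) = polar_coordinate(R, Theta) :-
        polar_coords(Point, X0, Y0),
        X = X0 + Dx,
        Y = Y0 + Dy,
        R = math.sqrt(X * X + Y * Y),
        Theta = math.atan2(Y, X).
```

Code that uses only the methods coords/3 and translate/3 does not need to know which of the representations it has been given.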



10.3 Abstract typeclass declarations

Abstract typeclass declarations are typeclass declarations whose definitions are hidden. An abstract typeclass declaration has the same form as a typeclass declaration, but without the where [...] part. An abstract typeclass declaration defines a name for a set of (sequences of) types, but does not define what methods must be implemented for instances of the type class.

Like abstract type declarations, abstract typeclass declarations are only useful in the interface section of a module. Each abstract typeclass declaration must be accompanied by a corresponding non-abstract typeclass declaration that defines the methods for that type class.

Non-abstract instance declarations can only be made in scopes where the non-abstract typeclass declaration is visible.



10.4 Abstract instance declarations

Abstract instance declarations are instance declarations whose implementations are hidden. An abstract instance declaration has the same form as an instance declaration, but without the where [...] part. An abstract instance declaration declares that a sequence of types is an instance of a particular type class without defining how the type class methods are implemented for those types. Like abstract type declarations, abstract instance declarations are only useful in the interface section of a module. Each abstract instance declaration must be accompanied by a corresponding non-abstract instance declaration that defines how the type class methods are implemented.

Here's an example:

     :- module hashable.
     :- interface.
     :- import_module int, string.
     
     :- typeclass hashable(T) where [func hash(T) = int].
     :- instance hashable(int).
     :- instance hashable(string).
     
     :- implementation.
     
     :- instance hashable(int) where [func(hash/1) is hash_int].
     :- instance hashable(string) where [func(hash/1) is hash_string].
     
     :- func hash_int(int) = int.
     hash_int(X) = X.
     
     :- func hash_string(string) = int.
     hash_string(S) = H :-
             % use the standard library predicate string.hash/2
             string.hash(S, H).
     
     :- end_module hashable.



10.5 Type class constraints on predicates and functions

Mercury allows a type class constraint to appear as part of a predicate or function's type signature. This constrains the values that can be taken by type variables in the signature to belong to particular type classes.

A type class constraint has the form:

             <= Typeclass(Type, ...), ...

where Typeclass is the name of a type class and Type is a type. Any variable that appears in Type must also appear in the predicate's or function's type signature. Each type class constraint in a predicate or function declaration must contain at least one variable.

For example

     :- pred distance(P1, P2, float) <= (point(P1), point(P2)).
     :- mode distance(in, in, out) is det.
     
     distance(A, B, Distance) :-
             coords(A, Xa, Ya),
             coords(B, Xb, Yb),
             XDist = Xa - Xb,
             YDist = Ya - Yb,
             Distance = sqrt(XDist*XDist + YDist*YDist).

In the above example, the distance predicate is able to calculate the distance between any two points, regardless of their representation, as long as the coords operation has been defined. These constraints are checked at compile time.
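
Because P1 and P2 are separate type variables, the two arguments need not even have the same type. As a small sketch, using the coordinate, coloured_coordinate and rgb types from the instance declaration examples (the function example_distance is a hypothetical name):

```mercury
:- func example_distance = float.

example_distance = Distance :-
        A = coordinate(0.0, 0.0),
        B = coloured_coordinate(3.0, 4.0, rgb(255, 0, 0)),
            % P1 = coordinate and P2 = coloured_coordinate; both
            % types are instances of point, so the constraints
            % on distance/3 are satisfied.
        distance(A, B, Distance).
            % Distance = sqrt(3.0*3.0 + 4.0*4.0) = 5.0
```

Passing a value of a type that has no point instance, such as an int, would be rejected by the compiler.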



10.6 Type class constraints on type class declarations

Type class constraints may also appear in typeclass declarations, meaning that one type class is a “superclass” of another.

The arguments of a constraint on a type class declaration must be either type variables or ground types. Each constraint must contain at least one variable argument and all variables that appear in the arguments must also be arguments to the type class in question.

For example, the following declares the ring type class, which describes types with a particular set of numerical operations defined:

     :- typeclass ring(T) where [
             func zero = (T::out) is det,               % '+' identity
             func one = (T::out) is det,                % '*' identity
             func plus(T::in, T::in) = (T::out) is det, % '+'/2 (forward mode)
             func mult(T::in, T::in) = (T::out) is det, % '*'/2 (forward mode)
             func negative(T::in) = (T::out) is det     % '-'/1 (forward mode)
     ].

We can now add the following declaration:

     :- typeclass euclidean(T) <= ring(T) where [
             func div(T::in, T::in) = (T::out) is det,
             func mod(T::in, T::in) = (T::out) is det
     ].

This introduces a new type class, euclidean, of which ring is a superclass. The operations defined by the euclidean type class are div, mod, as well as all those defined by the ring type class. Any type declared to be an instance of euclidean must also be declared to be an instance of ring.
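
One consequence is that code constrained only by euclidean(T) may also use the methods of ring. The following sketch of Euclid's algorithm illustrates this; gcd is a hypothetical function introduced for the example, and it assumes that repeated application of mod eventually yields zero for the instance in question.

```mercury
:- func gcd(T, T) = T <= euclidean(T).

    % zero is a method of ring, and mod is a method of euclidean;
    % both are available because ring is a superclass of euclidean.
gcd(X, Y) = ( if Y = zero then X else gcd(Y, mod(X, Y)) ).
```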

Typeclass constraints on type class declarations give rise to a superclass relation. This relation must be acyclic. That is, it is an error if a type class is its own (direct or indirect) superclass.



10.7 Type class constraints on instance declarations

Typeclass constraints may also be placed upon instance declarations. The arguments of such constraints must be either type variables or ground types. Each constraint must contain at least one variable argument and all variables that appear in the arguments must be type variables that appear in the types in the instance declaration.

For example, consider the following declaration of a type class of types that may be printed:

     :- typeclass portrayable(T) where [
             pred portray(T::in, io.state::di, io.state::uo) is det
     ].

The programmer could declare instances such as

     :- instance portrayable(int) where [
             pred(portray/3) is io.write_int
     ].
     
     :- instance portrayable(char) where [
             pred(portray/3) is io.write_char
     ].

However, when it comes to writing the instance declaration for a type such as list(T), we want to be able to print out the list elements using the portray/3 for the particular type of the list elements. This can be achieved by placing a type class constraint on the instance declaration, as in the following example:

     :- instance portrayable(list(T)) <= portrayable(T) where [
             pred(portray/3) is portray_list
     ].
     
     :- pred portray_list(list(T), io.state, io.state) <= portrayable(T).
     :- mode portray_list(in, di, uo) is det.
     
     portray_list([]) -->
             [].
     portray_list([X|Xs]) -->
     	portray(X),
     	io.write_char(' '),
     	portray_list(Xs).

For abstract instance declarations, the type class constraints on an abstract instance declaration must exactly match the type class constraints on the corresponding non-abstract instance declaration that defines that instance.
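
For instance, if the portrayable instance for lists were exported abstractly, the two declarations might be written as in the following sketch, which reuses portray_list from the example above:

```mercury
:- interface.

    % Abstract instance declaration: the constraint is stated,
    % but there is no `where [...]' part.
:- instance portrayable(list(T)) <= portrayable(T).

:- implementation.

    % The corresponding non-abstract instance declaration must
    % carry exactly the same constraint.
:- instance portrayable(list(T)) <= portrayable(T) where [
        pred(portray/3) is portray_list
].
```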



11 Existential types

Existentially quantified type variables (or simply “existential types” for short) are useful tools for data abstraction. In combination with type classes, they allow you to write code in an “object oriented” style that is similar to the use of interfaces in Java or abstract base classes in C++.

Mercury supports existential type quantifiers on predicate and function declarations, and in data type definitions. You can put type class constraints on existentially quantified type variables.



11.1 Existentially typed predicates and functions



11.1.1 Syntax for explicit type quantifiers

Type variables in type declarations for polymorphic predicates or functions are normally universally quantified. However, it is also possible to existentially quantify such type variables, by using an explicit existential quantifier of the form some Vars before the pred or func declaration, where Vars is a list of variables.

For example:

     % Here the type variable `T' is existentially quantified.
     :- some [T] pred foo(T).
     
     % Here the type variables `T1' and `T2' are existentially quantified.
     :- some [T1, T2] func bar(int, list(T1), set(T2)) = pair(T1, T2).
     
     % Here the type variable `T2' is existentially quantified,
     % but the type variables `T1' and `T3' are universally quantified.
     :- some [T2] pred foo(T1, T2, T3).

Explicit universal quantifiers, of the form all Vars, are also permitted on pred and func declarations, although they are not necessary, since universal quantification is the default. (If both universal and existential quantifiers are present, the universal quantifiers must precede the existential quantifiers.) For example:

     % Here the type variable `T2' is existentially quantified,
     % but the type variables `T1' and `T3' are universally quantified.
     :- all [T3] some [T2] pred foo(T1, T2, T3).



11.1.2 Semantics of type quantifiers

If a type variable in the type declaration for a polymorphic predicate or function is universally quantified, this means the caller will determine the value of the type variable, and the callee must be defined so that it will work for all types which are an instance of its declared type.

For an existentially quantified type variable, the situation is the converse: the callee must determine the value of the type variable, and all callers must be defined so as to work for all types which are an instance of the called procedure's declared type.

When type checking a predicate or function, if a variable has a type that occurs as a universally quantified type variable in the predicate or function declaration, or a type that occurs as an existentially quantified type variable in the declaration of one of the predicates or functions that it calls, then its type is treated as an opaque type. This means that there are very few things which it is legal to do with such a variable — basically you can only pass it to another procedure expecting the same type, unify it with another value of the same type, put it in a polymorphic data structure, or pass it to a polymorphic procedure whose argument type is universally quantified. (Note, however, that the standard library includes some quite powerful procedures such as io.write which can be useful in this context.)

A non-variable type (i.e. a type that is not a type variable) is considered more general than an existentially quantified type variable. Type inference will therefore never infer an existentially quantified type for a predicate or function unless that predicate or function calls (directly or indirectly) a predicate or function which was explicitly declared to have an existentially quantified type.

Note that an existentially typed procedure is not allowed to have different types for its existentially typed arguments in different clauses (even mode-specific clauses) or in different subgoals of a single clause; however, the same effect can be achieved in other ways (see Some idioms using existentially quantified types).

For procedures involving calls to existentially-typed predicates or functions, the compiler's mode analysis must take account of the modes for type variables in all polymorphic calls. Universally quantified type variables have mode in, whereas existentially quantified type variables have mode out. As usual, the compiler's mode analysis will attempt to reorder the elements of conjunctions in order to satisfy the modes.
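As a minimal sketch of the opacity rule, consider the module below. The module name `opaque_demo` and the predicate `e_answer` are hypothetical, and the sketch assumes that state variable syntax (`!IO`) is available. The callee chooses the type bound to T, so the caller may only use the result in ways that work for every type.

```mercury
:- module opaque_demo.
:- interface.
:- import_module io.

:- pred main(io.state::di, io.state::uo) is det.

:- implementation.

    % The callee determines the type bound to T (here int).
:- some [T] pred e_answer(T).
:- mode e_answer(out) is det.
e_answer(42).

main(!IO) :-
    e_answer(X),
    % X has an opaque type in this clause.  Passing it to a
    % polymorphic procedure such as io.write is legal ...
    io.write(X, !IO),
    io.nl(!IO).
    % ... but a goal such as `Y = X + 1' would be a type error,
    % because the caller may not assume that T is int.
```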



11.1.3 Examples of correct code using type quantifiers

Here are some examples of type-correct code using universal and existential types.

     /* simple examples */
     
     :- pred foo(T).
     foo(_).
     	% ok
     
     :- pred call_foo.
     call_foo :- foo(42).
     	% ok (T = int)
     
     :- some [T] pred e_foo(T).
     e_foo(X) :- X = 42.
     	% ok (T = int)
     
     :- pred call_e_foo.
     call_e_foo :- e_foo(_).
     	% ok
     
     /* examples using higher-order functions */
     
     :- func bar(T, T, func(T) = int) = int.
     bar(X, Y, F) = F(X) + F(Y).
     	% ok
     
     :- func call_bar = int.
     call_bar = bar(2, 3, (func(X) = X*X)).
     	% ok (T = int)
     	% returns 13 (= 2*2 + 3*3)
     
     :- some [T] pred e_bar(T, T, func(T) = int).
     :-          mode e_bar(out, out, out(func(in) = out is det)).
     e_bar(2, 3, (func(X) = X * X)).
     	% ok (T = int)
     
     :- func call_e_bar = int.
     call_e_bar = F(X) + F(Y) :- e_bar(X, Y, F).
     	% ok
     	% returns 13 (= 2*2 + 3*3)
     



11.1.4 Examples of incorrect code using type quantifiers

Here are some examples of code using universal and existential types that contains type errors.

     /* simple examples */
     
     :- pred bad_foo(T).
     bad_foo(42).
     	% type error
     
     :- some [T] pred e_foo(T).
     e_foo(42).
     	% ok
     
     :- pred bad_call_e_foo.
     bad_call_e_foo :- e_foo(42).
     	% type error
     
     :- some [T] pred e_bar1(T).
     e_bar1(42).
     e_bar1(42).
     e_bar1(43).
     	% ok (T = int)
     
     :- some [T] pred bad_e_bar2(T).
     bad_e_bar2(42).
     bad_e_bar2("blah").
     	% type error (cannot unify types `int' and `string')
     
     :- some [T] pred bad_e_bar3(T).
     bad_e_bar3(X) :- e_foo(X).
     bad_e_bar3(X) :- e_foo(X).
     	% type error (attempt to bind type variable `T' twice)
     



11.2 Existential class constraints

Existentially quantified type variables are especially useful in combination with type class constraints.

Type class constraints can be either universal or existential. Universal type class constraints are written using <=, as described in Type class constraints on predicates and functions; they signify a constraint that the caller must satisfy. Existential type class constraints are written in the same syntax as universal constraints, but using => instead of <=; they signify a constraint that the callee must satisfy. (If a declaration has both universal and existential constraints, then the existential constraints must precede the universal constraints.)

For example:

     % Here `c1(T2)' and `c2(T2)' are existential constraints,
     % and `c3(T1)' is a universal constraint.
     :- all [T1] some [T2] ((pred p(T1, T2) => (c1(T2), c2(T2))) <= c3(T1)).

Existential constraints must only constrain type variables that are explicitly existentially quantified. Likewise, universal constraints must only constrain type variables that are universally quantified, although in this case the quantification does not have to be explicit because universal quantification is the default (see Syntax for explicit type quantifiers).
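The following sketch shows an existential constraint seen from both sides. All names in it (`constraint_demo`, `showable`, `pick`, `describe`) are chosen for illustration only. The callee `pick` must return a value whose type is an instance of `showable`. The caller `describe` may not assume which type that is, but it may rely on the constraint and so may call the `show` method.

```mercury
:- module constraint_demo.
:- interface.

:- func describe = string.

:- implementation.
:- import_module string.

:- typeclass showable(T) where [ func show(T) = string ].
:- instance showable(int) where [ func(show/1) is string.int_to_string ].

    % The callee must satisfy the existential constraint:
    % here T = int, and showable(int) is declared above.
:- some [T] pred pick(T) => showable(T).
:- mode pick(out) is det.
pick(42).

    % The caller only knows that the type of X satisfies showable.
describe = Str :-
    pick(X),
    Str = show(X).
```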



11.3 Existentially typed data types

Type variables occurring in the body of a discriminated union type definition may be existentially quantified. Constructor definitions within discriminated union type definitions may be preceded by an existential type quantifier and followed by one or more existential type class constraints.

For example:

     % A simple heterogeneous list type
     :- type list_of_any
     	---> nil_any
     	;    some [T] cons_any(T, list_of_any).
     
     % A heterogeneous list type with a type class constraint
     :- typeclass showable(T) where [ func show(T) = string ].
     :- type showable_list
     	---> nil
     	;    some [T] (cons(T, showable_list) => showable(T)).
     
     % A different way of doing the same kind of thing, this
     % time using the standard type list(T).
     :- type showable ---> some [T] (s(T) => showable(T)).
     :- type list_of_showable == list(showable).
     
     % Here's an arbitrary example involving multiple
     % type variables and multiple constraints
     :- typeclass foo(T1, T2) where [ /* ... */ ].
     :- type bar(T)
     	---> f1
     	;    f2(T)
     	;    some [T1]
     	     f3(T1)
     	;    some [T1, T2]
     	     (f4(T1, T2, T) => (showable(T1), showable(T2)))
     	;    some [T1, T2]
     	     (f5(list(T1), T2) => foo(T1, list(T2)))
     	.

Construction and deconstruction of existentially quantified data types are inverses: when constructing a value of an existentially quantified data type, the “existentially quantified” functor acts for purposes of type checking like a universally quantified function: the caller will determine the values of the type variables. Conversely, for deconstruction the functor acts like an existentially quantified function: the caller must be defined so as to work for all possible values of the existentially quantified type variables which satisfy the declared type class constraints.

In order to make this distinction clear to the compiler, whenever you want to construct a value using an existentially quantified functor, you must prepend new onto the functor name. This tells the compiler to treat it as though it were universally quantified: the caller can bind that functor's existentially quantified type variables to any type which satisfies the declared type class constraints. Conversely, any occurrence without the new prefix must be a deconstruction, and is therefore existentially quantified: the caller must not bind the existentially quantified type variables, but the caller is allowed to depend on those type variables satisfying the declared type class constraints, if any.

For example, the function make_list constructs a value of type showable_list containing a sequence of values of different types, all of which are instances of the showable class

     :- instance showable(int).
     :- instance showable(float).
     :- instance showable(string).
     
     :- func make_list = showable_list.
     make_list = List :-
     	Int = 42,
     	Float = 1.0,
     	String = "blah",
     	List =  'new cons'(Int,
     		'new cons'(Float,
     		'new cons'(String, nil))).

while the function process_list below applies the show method of the showable class to the values in such a list.

     :- func process_list(showable_list) = list(string).
     process_list(nil) = [].
     process_list(cons(Head, Tail)) = [show(Head) | process_list(Tail)].



11.4 Some idioms using existentially quantified types

The standard library module std_util provides an abstract type named univ which can hold values of any type. You can form heterogeneous containers (containers that can hold values of different types at the same time) by using data structures that contain univs, e.g. list(univ).

The interface to std_util includes the following:

     % `univ' is a type which can hold any value.
     :- type univ.
     
     % The function univ/1 takes a value of any type and constructs
     % a `univ' containing that value (the type will be stored along
     % with the value)
     :- func univ(T) = univ.
     
     % The function univ_value/1 takes a `univ' argument and extracts
     % the value contained in the `univ' (together with its type).
     % This is the inverse of the function univ/1.
     :- some [T] func univ_value(univ) = T.

The univ type in the standard library is in fact a simple example of an existentially typed data type. It could be implemented as follows:

     :- implementation.
     :- type univ ---> some [T] mkuniv(T).
     univ(X) = 'new mkuniv'(X).
     univ_value(mkuniv(X)) = X.
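As a usage sketch, the module below builds a heterogeneous list(univ) and writes each element out. It relies only on univ/1 and univ_value/1 from the interface shown above; the module name `univ_demo` and the helper `print_item` are hypothetical.

```mercury
:- module univ_demo.
:- interface.
:- import_module io.

:- pred main(io.state::di, io.state::uo) is det.

:- implementation.
:- import_module list, std_util.

main(!IO) :-
    % Three values of different types stored in one list.
    Items = [univ(42), univ("blah"), univ(1.5)],
    list.foldl(print_item, Items, !IO).

    % univ_value/1 returns a value of an existentially quantified
    % type, which may be passed to the polymorphic io.write.
:- pred print_item(univ::in, io.state::di, io.state::uo) is det.
print_item(U, !IO) :-
    io.write(univ_value(U), !IO),
    io.nl(!IO).
```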

An existentially typed procedure is not allowed to have different types for its existentially typed arguments in different clauses or in different subgoals of a single clause. For instance, both of the following examples are illegal:

     :- some [T] pred bad_example(string, T).
     bad_example("foo", 42).
     bad_example("bar", "blah").
     	% type error (cannot unify `int' and `string')
     
     :- some [T] pred bad_example2(string, T).
     bad_example2(Name, Value) :-
     	( Name = "foo", Value = 42
     	; Name = "bar", Value = "blah"
     	).
     	% type error (cannot unify `int' and `string')

However, using univ, it is possible for an existentially typed predicate or function to return values of different types at each invocation.

     :- some [T] pred good_example(string, T).
     good_example(Name, univ_value(Univ)) :-
     	( Name = "foo", Univ = univ(42)
     	; Name = "bar", Univ = univ("blah")
     	).

Using univ doesn't work if you also want to use type class constraints. If you want to use type class constraints, then you must define your own existentially typed data type, analogous to univ, and use that:

     :- type univ_showable ---> some [T] (mkshowable(T) => showable(T)).
     
     :- some [T] pred harder_example(string, T) => showable(T).
     harder_example(Name, Showable) :-
     	( Name = "foo", Univ = 'new mkshowable'(42)
     	; Name = "bar", Univ = 'new mkshowable'("blah")
     	),
     	Univ = mkshowable(Showable).

The issue can also arise for mode-specific clauses (see Different clauses for different modes). For instance, the following example is illegal:

     :- some [T] pred bad_example3(string, T).
     :-          mode bad_example3(in(bound("foo")), out) is det.
     :-          mode bad_example3(in(bound("bar")), out) is det.
     :- pragma promise_pure(bad_example3/2).
     bad_example3("foo"::in(bound("foo")), 42::out).
     bad_example3("bar"::in(bound("bar")), "blah"::out).
     	% type error (cannot unify `int' and `string')

The solution is similar, although in this case an intermediate predicate is required:

     :- some [T] pred good_example3(string, T).
     :-          mode good_example3(in(bound("foo")), out) is det.
     :-          mode good_example3(in(bound("bar")), out) is det.
     good_example3(Name, univ_value(Univ)) :-
     	good_example3_univ(Name, Univ).
     
     :- pred good_example3_univ(string, univ).
     :- mode good_example3_univ(in(bound("foo")), out) is det.
     :- mode good_example3_univ(in(bound("bar")), out) is det.
     :- pragma promise_pure(good_example3_univ/2).
     good_example3_univ("foo"::in(bound("foo")), univ(42)::out).
     good_example3_univ("bar"::in(bound("bar")), univ("blah")::out).



12 Semantics

A legal Mercury program is one that complies with the syntax, type, mode, determinism, and module system rules specified in earlier chapters. If a program does not comply with those rules, the compiler must report an error.

For each legal Mercury program, there is an associated predicate calculus theory whose language is specified by the type declarations in the program and whose axioms are the completion of the clauses for all predicates in the program, plus the usual equality axioms extended with the completion of the equations for all functions in the program, plus axioms corresponding to the mode-determinism assertions (see Determinism), plus axioms specifying the semantics of library predicates and functions. The declarative semantics of a legal Mercury program is specified by this theory.

Mercury implementations must be sound: the answers they compute must be true in every model of the theory. Mercury implementations are not required to be complete: they may fail to compute an answer in finite time, or they may exhaust the resource limitations of the execution environment, even though an answer is provable in the theory. However, there are certain minimum requirements that they must satisfy with respect to completeness.

There is an operational semantics of Mercury programs called the strict sequential operational semantics. In this semantics, the program is executed top-down, starting from main/2, and function calls within a goal, conjunctions and disjunctions are all executed in depth-first left-to-right order. Conjunctions and function calls are “minimally” reordered as required by the modes: the order is determined by selecting the first mode-correct sub-goal (conjunct or function call), executing that, then selecting the first of the remaining sub-goals which is now mode-correct, executing that, and so on. (There is no interleaving of different individual conjuncts or function calls, however; the sub-goals are reordered, not split and interleaved.) Function application is strict, not lazy.
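The “minimal” reordering described above can be illustrated with a small sketch; the module and function names are hypothetical. The first conjunct as written is not mode-correct on entry, because B is still unbound, so the first mode-correct conjunct is selected and executed before it.

```mercury
:- module reorder_demo.
:- interface.

:- func square_of_successor(int) = int.

:- implementation.
:- import_module int.

square_of_successor(A) = C :-
    % Written first, but not mode-correct yet: B is unbound.
    C = B * B,
    % The first mode-correct conjunct, so it is executed first.
    % After it has bound B, the conjunct above becomes mode-correct.
    B = A + 1.
```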

Mercury implementations are required to provide a method of processing Mercury programs which is equivalent to the strict sequential operational semantics.

There is another operational semantics of Mercury programs called the strict commutative operational semantics. This semantics is equivalent to the strict sequential operational semantics except that there is no requirement that function calls, conjunctions and disjunctions be executed left-to-right; they may be executed in any order, and may even be interleaved. Furthermore, the order may even be different each time a particular goal is entered.

As well as providing the strict sequential operational semantics, Mercury implementations may optionally provide additional implementation-defined operational semantics, provided that any such implementation-defined operational semantics are at least as complete as the strict commutative operational semantics. An implementation-defined semantics is “at least as complete” as the strict commutative semantics if and only if the implementation-defined semantics guarantees to compute an answer in finite time for any program for which an answer would be computed in finite time for all possible executions under the strict commutative semantics (i.e. for all possible orderings of conjunctions and disjunctions).

Thus, to summarize, there are in fact a variety of different operational semantics for Mercury. In one of them, the strict sequential semantics, there is no nondeterminism — the behaviour is always specified exactly. Programs are executed top-down using SLDNF (or something equivalent), mode analysis does “minimal” reordering (in a precisely defined sense), function calls, conjunctions and disjunctions are executed depth-first left-to-right, and function evaluation is strict. All implementations are required to support the strict sequential semantics, so that a program which works on one implementation using this semantics will be guaranteed to work on any other implementation. However, implementations are also allowed to support other operational semantics, which may have non-determinism, so long as they are sound with respect to the declarative semantics, and so long as they meet a minimum level of completeness (they must be at least as complete as the strict commutative semantics, in the sense that every program which terminates for all possible orderings must also terminate in any implementation-defined operational semantics).

This compromise allows Mercury to be used in several different ways. Programmers who care more about ease of programming and portability than about efficiency can use the strict sequential semantics, and can then be guaranteed that if their program works on one correct implementation, it will work on all correct implementations. Compiler implementors who want to write optimizing implementations that do lots of clever code reorderings and other high-level transformations or that want to offer parallelizing implementations which take maximum advantage of parallelism can define different semantic models. Programmers who care about efficiency more than portability can write code for these implementation-defined semantic models. Programmers who care about efficiency and portability can achieve this by writing code for the commutative semantics. Of course, this is not quite as easy as using the strict sequential semantics, since it is in general not sufficient to test your programs on just one implementation if you are to be sure that it will be able to use the maximally efficient operational semantics on any implementation. However, if you do write code which works for all possible executions under commutative semantics (i.e. for all possible orderings of conjunctions and disjunctions), then you can be guaranteed that it will work correctly on every implementation, under every possible implementation-defined semantics.

The University of Melbourne Mercury implementation offers eight different semantics, which can be selected with different combinations of the --no-reorder-conj, --no-reorder-disj, and --fully-strict options. (The --fully-strict option prevents the compiler from improving completeness by optimizing away infinite loops or calls to require.error/1 or exception.throw/1.) The default semantics are the commutative semantics. Enabling all of these options gives you the strict sequential semantics. Enabling just some of them gives you a semantics somewhere in between.
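As a hedged sketch of code whose behaviour could in principle depend on the chosen semantics, consider the hypothetical predicates below. In `safe_div`, the guard precedes the division, so under the strict sequential semantics the division is never attempted with a zero divisor. Under a semantics that permits conjuncts to be reordered, an implementation would be free to evaluate the division first, since both orders are mode-correct. The second version expresses the guard as the condition of an if-then-else, which is not among the constructs (function calls, conjunctions and disjunctions) that the text above lists as reorderable.

```mercury
:- module order_demo.
:- interface.

:- pred safe_div(int::in, int::in, int::out) is semidet.
:- pred safe_div_ite(int::in, int::in, int::out) is semidet.

:- implementation.
:- import_module int.

    % Relies on the guard being executed before the division.
safe_div(X, Y, Z) :-
    Y \= 0,
    Z = X // Y.

    % Does not rely on the order of conjuncts: the division
    % appears only in the else branch, where Y is known to be non-zero.
safe_div_ite(X, Y, Z) :-
    ( Y = 0 ->
        fail
    ;
        Z = X // Y
    ).
```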

Future implementations of Mercury may wish to offer other operational semantics. For example, they may wish to provide semantics in which function evaluation is lazy, rather than strict; semantics with a guaranteed fair search rule; and so forth.



13 Foreign language interface

This chapter documents the new foreign language interface. This is intended as a successor to the existing C interface for Mercury, which is documented in C interface. However, the new foreign language interface is not yet complete (it does not yet include equivalents to pragma import and pragma export in the C interface) and is not as well tested as the existing C interface. Furthermore, it is possible that incompatible changes will be needed in future versions of this interface.

In view of this, we currently support both the old C interface and the new foreign language interface. We advise people writing new code to use the new foreign language interface, but existing code that uses the old C interface can continue to do so, and we do not recommend rewriting such code at this point in time.



13.1 Calling foreign code from Mercury

Mercury procedures can be implemented using fragments of foreign language code using pragma foreign_proc.



13.1.1 pragma foreign_proc

A declaration of the form

     :- pragma foreign_proc("Lang", Pred(Var1::Mode1, Var2::Mode2, ...),
             Attributes, Foreign_Code).

or

     :- pragma foreign_proc("Lang", Func(Var1::Mode1, Var2::Mode2, ...) = (Var::Mode),
             Attributes, Foreign_Code).

means that any calls to the specified mode of Pred or Func will result in execution of the foreign code given in Foreign_Code written in language Lang, if Lang is selected as the foreign language code by this implementation. See the “Foreign Language Interface” chapter of the Mercury User's Guide, for more information about how the implementation selects the appropriate foreign_proc to use.

The foreign code fragment may refer to the specified variables (Var1, Var2, ..., and Var) directly by name. It is an error for a variable to occur more than once in the argument list. These variables will have foreign language types corresponding to their Mercury types, as determined by language and implementation specific rules.

All foreign_proc implementations are assumed to be impure. If they are actually pure or semipure, they must be explicitly promised as such by the user (either by using foreign language attributes specified below, or a promise_pure or promise_semipure pragma as specified in Impurity).

Additional restrictions on the foreign language interface code depend on the foreign language and compilation options. For more information, including the list of supported foreign languages and the strings used to identify them, see the language specific information in the “Foreign Language Interface” chapter of the Mercury User's Guide.

If there is a pragma foreign_proc declaration for any mode of a predicate or function, then there must be either a clause or a pragma foreign_proc declaration for every mode of that predicate or function.

Here is an example of code using pragma foreign_proc. The following code defines a Mercury function sin/1 which calls the C function sin() of the same name.

     :- func sin(float) = float.
     :- pragma foreign_proc("C", sin(X::in) = (Sin::out),
             [may_call_mercury],
             "Sin = sin(X);").

If the foreign language code does not recursively invoke Mercury code, as in the above example, then you can use will_not_call_mercury in place of may_call_mercury in the declarations above. This allows the compiler to use a slightly more efficient calling convention. (If you use this form, and the C code does invoke Mercury code, then the behaviour is undefined — your program may misbehave or crash.)

If there are both Mercury definitions and foreign_proc definitions for a procedure and/or foreign_proc definitions for different languages, it is implementation defined which definition is used.

For pure and semipure procedures, the declarative semantics of the foreign_proc definitions must be the same as that of the Mercury code. The only thing that is allowed to differ is the efficiency (including the possibility of non-termination) and the order of solutions.



13.1.2 Foreign code attributes

As described above, pragma foreign_proc declarations may include a list of attributes describing properties of the given foreign function or code. All Mercury implementations must support the attributes listed below. They may also support additional attributes.

The attributes which must be supported by all implementations are as follows:

may_call_mercury/will_not_call_mercury
This attribute declares whether or not execution inside this foreign language code may call back into Mercury. The default, in case neither is specified, is may_call_mercury. Specifying will_not_call_mercury may allow the compiler to generate more efficient code. If you specify will_not_call_mercury, but the foreign language code does invoke Mercury code, then the behaviour is undefined.
promise_pure/promise_semipure
This attribute promises that the purity of the given predicate or function definition is pure or semipure. It is equivalent to a corresponding pragma promise_pure or pragma promise_semipure declaration (see Impurity). If omitted, the clause specified by the foreign_proc is assumed to be impure.
thread_safe/not_thread_safe
This attribute declares whether or not it is safe for multiple threads to execute this foreign language code concurrently. The default, in case neither is specified, is not_thread_safe. If the foreign language code is declared thread_safe, then the Mercury implementation is permitted to execute the code concurrently from multiple threads without taking any special precautions. If the foreign language code is declared not_thread_safe, then the Mercury implementation must not invoke the code concurrently from multiple threads. If the Mercury implementation does use multithreading, then it must take appropriate steps to prevent this. (The experimental multithreaded version of the current University of Melbourne Mercury implementation protects not_thread_safe code using a mutex: C code that is not thread-safe has code inserted around it to obtain and release a mutex. All non-thread-safe foreign language code shares a single mutex.)

Additional attributes which are supported by the Melbourne Mercury compiler are as follows:

tabled_for_io
This attribute should be attached to foreign procedures that do I/O. It tells the debugger to make calls to the foreign procedure idempotent. This allows the debugger to safely retry across such calls and also allows safe declarative debugging of code containing such calls. For more information see the I/O tabling section of the Mercury User's Guide. If the foreign procedure contains gotos or static variables then the pragma no_inline directive should also be given (see pragma c_code).
terminates/does_not_terminate
This attribute specifies the termination properties of the given predicate or function definition. It is equivalent to the corresponding pragma terminates or pragma does_not_terminate declaration. If omitted, the termination property of the procedure is determined by the value of the may_call_mercury/will_not_call_mercury attribute. See Termination analysis for more details.
max_stack_size(Size)
This attribute declares the maximum stack usage of a particular piece of code. The unit that Size is measured in depends upon the foreign language being used. Currently this attribute is only used (and is in fact required) by the IL foreign language interface, and is measured in units of stack items.
will_not_throw_exception
This attribute promises that the given predicate or function will not make calls back to Mercury that may result in an exception being thrown. It is an error to apply this attribute to procedures that have determinism erroneous. This attribute is ignored for code that is declared as not making calls back to Mercury via the will_not_call_mercury attribute. Note: Predicates or functions that have polymorphic arguments but do not explicitly throw an exception, via a call to exception.throw/1 or require.error/1, may still throw exceptions because they may be called with arguments whose types have user-defined equality or comparison predicates. If these user-defined equality or comparison predicates throw exceptions then unifications or comparisons involving these types may also throw exceptions. As such, we recommend that only implementors of the Mercury system use this annotation for polymorphic predicates and functions.
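The sketch below combines several of these attributes on one procedure. The module name `c_math` and the function name `cosine` are hypothetical, and the sketch assumes that a `pragma foreign_decl` declaration is available for including the C header. The C code does not call back into Mercury, has no side effects, and uses no shared state, so it is declared will_not_call_mercury, promise_pure and thread_safe.

```mercury
:- module c_math.
:- interface.

:- func cosine(float) = float.

:- implementation.

    % Assumed: foreign_decl makes the C declaration of cos() visible.
:- pragma foreign_decl("C", "#include <math.h>").

:- pragma foreign_proc("C", cosine(X::in) = (Cos::out),
        [will_not_call_mercury, promise_pure, thread_safe],
        "Cos = cos(X);").
```

If promise_pure were omitted, the procedure would be assumed to be impure, as described above.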



13.2 Data passing conventions

For each supported foreign language, we explain how to map a Mercury type to a type in that foreign language. We also map the Mercury parameter passing convention to the foreign language's parameter passing convention.



13.2.1 C data passing conventions

This section is currently documented in Passing data to and from C.



13.2.2 IL and C# data passing conventions

The Mercury types int, float, char, and string are mapped to the Common Language Runtime types System.Int32, System.Double, System.Char and System.String respectively, which correspond to the C# types int, double, char, and string, and to the IL assembler types int32, float64, char and string.

Mercury variables whose type is a type variable will be passed as System.Object. Mercury array types are mapped to CLR array types. When compiling with --no-high-level-data, all other Mercury variables are passed as System.Object[]. When compiling with --high-level-data, Mercury variables whose type is a Mercury discriminated union type will be passed as a CLR type whose type name is determined from the Mercury type name (ignoring any type parameters) followed by an underscore and then the type arity, expressed as a decimal integer. Mercury module qualifiers are converted to CLR namespace qualifiers. For example the Mercury type foo.bar.baz/1 will be passed as the CLR type foo.bar.baz_1. Note an extra namespace qualifier, mercury, will be prepended to the beginning of names residing in the Mercury standard library. Mercury variables whose type is a Mercury equivalence type will be passed as the representation of the right hand side of the equivalence type.

This mapping is subject to change and you should try to avoid writing code that relies heavily upon a particular representation of Mercury terms.

Mercury arguments declared with input modes are passed by value to the IL or C# function. For output arguments, the Mercury implementation will pass to the IL or C# function a reference to the location in which to store the result; for example, a Mercury output argument of type int would map to a C# ref int function parameter. (Note that we map to ref int, not out int; for procedures that can fail, output arguments only need to be set if the procedure succeeds.) If the Mercury procedure can fail, then its IL or C# function should return a truth value of type bool (i.e. System.Boolean) indicating success or failure: true indicates success, and false indicates failure. If the Mercury procedure is a Mercury function that cannot fail, and the function result has an output mode, then the IL function should return the Mercury function result value. Otherwise the function result is appended as an extra argument. Arguments of type io.state or store.store(_) are not passed or returned at all. (The reason for this is that these types represent mutable state, and in IL modifications to mutable state are done via side effects, rather than argument passing.)



13.2.3 Java data passing conventions

The Mercury types int, float, char, and string are mapped to the Java types int, double, char and java.lang.String respectively.

Mercury variables whose type is a type variable will be passed as java.lang.Object. Mercury array types are mapped to Java array types. When compiling with --no-high-level-data, all other Mercury variables are passed as java.lang.Object[]. When compiling with --high-level-data, Mercury variables whose type is a Mercury discriminated union type will be passed as a Java type whose type name is determined from the Mercury type name (ignoring any type parameters) followed by an underscore and then the type arity, expressed as a decimal integer. Mercury module qualifiers are converted to Java namespace qualifiers. For example the Mercury type foo.bar.baz/1 will be passed as the Java type foo.bar.baz_1. Note an extra namespace qualifier, mercury, will be prepended to the beginning of names residing in the Mercury standard library. Mercury variables whose type is a Mercury equivalence type will be passed as the representation of the right hand side of the equivalence type.

This mapping is subject to change and you should try to avoid writing code that relies heavily upon a particular representation of Mercury terms.

Mercury arguments declared with input modes are passed by value to the corresponding Java function. If the Mercury procedure is a function whose result has an input mode, then the Mercury function result is appended to the list of input parameters, so that the Mercury function result becomes the last parameter to the corresponding Java function.

The result of the Java function will be a sequence comprised of the outputs of the Mercury procedure. If the Mercury procedure's determinism indicates that it can fail, then its first output will be a boolean success indicator, with a true value denoting success and a false value denoting failure. The remaining outputs will be the subsequence of the Mercury procedure's arguments and function result (if any) that have output modes.

If the sequence has length zero then the return type of the Java function is void. If the sequence has length one then the return type of the Java function is the type of the item in question. Otherwise, the Java function will return an array of type java.lang.Object[].

Arguments of type io.state or store.store(_) are not passed or returned at all. (The reason for this is that these types represent mutable state, and in Java modifications to mutable state are done via side effects, rather than argument passing.)
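
The primitive type mappings can be seen in a small sketch such as the following. The module and function names are hypothetical, and the code assumes a grade that targets the Java backend. The comments note the Java type each variable is expected to have under the rules above.

```mercury
:- module java_types.
:- interface.

    % Hypothetical example functions, used only to show the type mapping.
:- func scale(int, float) = float.
:- func greeting(string) = string.

:- implementation.

:- pragma foreign_proc("Java",
    scale(N::in, F::in) = (R::out),
    [will_not_call_mercury, promise_pure],
"
    // N is a Java int; F and R are Java doubles.
    R = N * F;
").

:- pragma foreign_proc("Java",
    greeting(Name::in) = (S::out),
    [will_not_call_mercury, promise_pure],
"
    // Name and S are java.lang.String values.
    S = ""Hello, "" + Name;
").
```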



13.3 Using foreign types from Mercury

Types defined in a foreign language can be accessed in Mercury using a declaration of the form

     :- pragma foreign_type(Lang, MercuryTypeName, ForeignTypeDescriptor).

This defines MercuryTypeName as a synonym for type ForeignTypeDescriptor defined in the foreign language Lang. You must declare MercuryTypeName using a (possibly abstract) :- type declaration as usual. The pragma foreign_type must not have wider visibility than the type declaration (if the pragma foreign_type declaration is in the interface, the :- type declaration must be also).

ForeignTypeDescriptor defines how the Mercury type is mapped for a particular foreign language. Specific syntax is given in the language specific information below.

MercuryTypeName is treated as an abstract type at all times in Mercury code. However, if MercuryTypeName is one of the parameters of a foreign_proc for Lang, and the pragma foreign_type declaration is visible to the foreign_proc, it will be passed to that foreign_proc as specified by ForeignTypeDescriptor.
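
A minimal sketch of this arrangement for C is shown below. The module, type and predicate names are hypothetical. The type is declared abstractly in the interface, and the pragma foreign_type declaration in the implementation section gives its C representation, which is what the foreign_proc sees.

```mercury
:- module c_file.
:- interface.

    % Abstract in Mercury code; the representation is given below.
:- type c_file.

    % Succeeds if the underlying C pointer is NULL. (Hypothetical helper.)
:- pred is_null(c_file::in) is semidet.

:- implementation.

:- pragma foreign_decl("C", "#include <stdio.h>").
:- pragma foreign_type("C", c_file, "FILE *").

:- pragma foreign_proc("C",
    is_null(F::in),
    [will_not_call_mercury, promise_pure],
"
    /* Here F has the C type FILE *. */
    SUCCESS_INDICATOR = (F == NULL);
").
```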

Multiple foreign language definitions may be given for the same type — the appropriate definition will be used for the appropriate language (see the language specific information below for details). All definitions must have the same visibility. A Mercury definition, which must define a discriminated union type, may also be given. The constructors for the type will only be visible in Mercury clauses for predicates or functions with pragma foreign_proc clauses for all of the languages for which there are foreign_type declarations for the type.

You can also associate assertions about the properties of the foreign type with the foreign_type declaration, using the following syntax:

     :- pragma foreign_type(Lang, MercuryTypeName, ForeignTypeDescriptor, [ForeignTypeAssertion, ...]).

Currently, two kinds of assertions are supported. The can_pass_as_mercury_type assertion states that on the C backends, values of the given type can be passed to and from Mercury code without boxing, via simple casts, which is faster. This requires the type to be either an integer type or a pointer type, and requires its size to be the same as or less than the size of a Mercury word. Since deciding whether a C type satisfies this condition requires knowledge of the Mercury implementation, and violations are very likely to result in the generated executable silently doing the wrong thing, we do not recommend the use of assertions unless you are an implementor of the Mercury system. The stable assertion is meaningful only in the presence of the can_pass_as_mercury_type assertion. It states that either the C type is an integer type, or it is a pointer type pointing to memory that will never change. Together, these assertions are sufficient to allow tabling and the compare_representation primitive to work on values of such types.

As with discriminated union types, programmers can specify the unification and/or comparison predicates to use for values of the type using the following syntax (see User-defined equality and comparison):

     :- pragma foreign_type(Lang, MercuryTypeName, ForeignTypeDescriptor)
             where equality is EqualityPred, comparison is ComparePred.
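
The following fragment sketches how the syntax above might be used to give a C foreign type a user-defined equality predicate. The type and predicate names are hypothetical. The fragment supplies only the equality part, on the assumption that, as with discriminated union types, the comparison part may be omitted.

```mercury
    % Hypothetical foreign type with a user-defined equality predicate.
:- type long_double.
:- pragma foreign_type("C", long_double, "long double")
        where equality is long_double_equal.

    % Unification of long_double values is implemented by this predicate.
:- pred long_double_equal(long_double::in, long_double::in) is semidet.
:- pragma foreign_proc("C",
    long_double_equal(A::in, B::in),
    [will_not_call_mercury, promise_pure],
"
    SUCCESS_INDICATOR = (A == B);
").
```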

You can use Mercury foreign language interfacing declarations which specify language X to interface to types that are actually written in a different language Y provided that X and Y have compatible interface conventions. Support for this kind of compatibility is described in the language specific information below.



13.4 Adding foreign declarations

Foreign language declarations (such as type declarations, header file inclusions or macro definitions) can be included in the Mercury source file as part of a foreign_decl declaration of the form

     :- pragma foreign_decl("Lang", DeclCode).

This declaration will have effects equivalent to including the specified DeclCode in an automatically-generated source file of the specified programming language, in a place appropriate for declarations, and linking that source file with the Mercury program (after having compiled it with a compiler for the specified programming language, if appropriate).

Entities declared in pragma foreign_decl declarations are visible in pragma foreign_code, pragma foreign_type, and pragma foreign_proc declarations that specify the same foreign language and occur in the same Mercury module.

By default, the contents of pragma foreign_decl declarations are also visible in the same kinds of declarations in other modules that import the module containing the pragma foreign_decl declaration. This is because they may be required to make sense of types defined using pragma foreign_type and/or predicates defined using pragma foreign_code in the containing module, and these may be visible in other modules, especially in the presence of intermodule optimization.

If you do not want the contents of a pragma foreign_decl declaration to be visible in foreign language code in other modules, you can use the following variant of the declaration:

     :- pragma foreign_decl("Lang", local, DeclCode).

Note: currently only the C backends support this variant of the pragma foreign_decl declaration.
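
For instance, a helper macro that is needed only by one module's own foreign code could be declared with the local variant, roughly as in this sketch (the macro and function names are hypothetical):

```mercury
:- pragma foreign_decl("C", local, "
    /* Visible only to foreign code in this module. */
    #define CLAMP(x, lo, hi) ((x) < (lo) ? (lo) : ((x) > (hi) ? (hi) : (x)))
").

    % clamp(X, Lo, Hi) restricts X to the range Lo..Hi.
:- func clamp(int, int, int) = int.
:- pragma foreign_proc("C",
    clamp(X::in, Lo::in, Hi::in) = (R::out),
    [will_not_call_mercury, promise_pure],
"
    R = CLAMP(X, Lo, Hi);
").
```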

To make the declarations for Mercury predicates or functions exported to a foreign language using a pragma export declaration visible to foreign code in a pragma foreign_code or pragma foreign_proc declaration, use a declaration of the form

     :- pragma foreign_import_module("Lang", ImportedModule).

where ImportedModule is the name of the module containing the pragma export declarations.

If Lang is "C" this is equivalent to

     :- pragma foreign_decl("C", "#include ""ImportedModule.mh""").

where ImportedModule.mh is the automatically generated header file containing the C declarations for the predicates and functions exported to C.

pragma foreign_import_module should be used instead of the explicit #include because pragma foreign_import_module tells the implementation that ImportedModule.mh must be built before the object file for the module containing the pragma foreign_import_module declaration.
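
The following sketch shows the two sides of this arrangement for C. The module names, the function names and the exported C name ml_square are all hypothetical. The sketch also assumes that an exported det function with a single output is callable from C as an ordinary function returning its result.

```mercury
% File squares.m: exports a Mercury function to C.
:- module squares.
:- interface.
:- func square(int) = int.
:- implementation.
:- import_module int.
:- pragma export(square(in) = out, "ml_square").
square(X) = X * X.

% File client.m: calls the exported function from C code.
:- module client.
:- interface.
:- func square_via_c(int) = int.
:- implementation.
    % Makes the declarations in squares.mh visible, and records the
    % build dependency on that header.
:- pragma foreign_import_module("C", squares).
:- pragma foreign_proc("C",
    square_via_c(X::in) = (R::out),
    [may_call_mercury, promise_pure],
"
    R = ml_square(X);
").
```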

A cycle of pragma foreign_import_module, where the language is "MC++", "C#", or "Java", is not permitted.



13.5 Adding foreign definitions

Definitions of foreign language entities (such as functions or global variables) may be included using a declaration of the form

     :- pragma foreign_code("Lang", Code).

This declaration will have effects equivalent to including the specified Code in an automatically-generated source file of the specified programming language, in a place appropriate for definitions, and linking that source file with the Mercury program (after having compiled it with a compiler for the specified programming language, if appropriate).

Entities declared in pragma foreign_code declarations are visible in pragma foreign_proc declarations that specify the same foreign language and occur in the same Mercury module.



13.6 Language specific bindings

All Mercury implementations should support interfacing with C. The set of other languages supported is implementation-defined. A suitable compiler or assembler for the foreign language must be available on the system.

The University of Melbourne Mercury implementation supports interfacing with the following languages:

C
Use the string "C" to set the foreign language to C.
C#
Use the string "C#" to set the foreign language to C#.
IL
Use the string "IL" to set the foreign language to IL. IL (sometimes also known as CIL or MSIL) is the Intermediate Language for the .NET Common Language Runtime.
Java
Use the string "Java" to set the foreign language to Java.
Managed C++
Use the string "MC++" to set the foreign language to Managed C++. Managed C++ is C++ with Microsoft's extensions to support interfacing with and generating code for the .NET Common Language Runtime.



13.6.1 Interfacing with C


13.6.1.1 Using pragma foreign_proc for C

The input and output variables will have C types corresponding to their Mercury types, as determined by the rules specified in C data passing conventions.

The C code fragment may declare local variables, but it should not declare any labels or static variables unless there is also a Mercury pragma no_inline declaration for the procedure. The reason for this is that otherwise the Mercury implementation may inline the procedure by duplicating the C code fragment for each call. If the C code fragment declared a static variable, inlining it in this way could result in the program having multiple instances of the static variable, rather than a single shared instance. If the C code fragment declared a label, inlining it in this way could result in an error due to the same label being defined twice inside a single C function.
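
As a sketch of the safe pattern, the fragment below declares a static variable inside the C code and pairs it with a pragma no_inline declaration, so that only one copy of the fragment (and hence of the variable) should exist. The predicate name is hypothetical, and the predicate is declared impure because it has a side effect.

```mercury
    % next_id(Id) returns 1, 2, 3, ... on successive calls.
:- impure pred next_id(int::out) is det.

    % Without this, inlining could duplicate the static variable below.
:- pragma no_inline(next_id/1).

:- pragma foreign_proc("C",
    next_id(Id::out),
    [will_not_call_mercury],
"
    static int counter = 0;
    Id = ++counter;
").
```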

C code in a pragma foreign_proc declaration for any procedure whose determinism indicates that it can fail must assign a truth value to the macro SUCCESS_INDICATOR. For example:

     :- pred string.contains_char(string, character).
     :- mode string.contains_char(in, in) is semidet.
     
     :- pragma foreign_proc("C",
     	string.contains_char(Str::in, Ch::in),
             [will_not_call_mercury, promise_pure],
             "SUCCESS_INDICATOR = (strchr(Str, Ch) != NULL);").

SUCCESS_INDICATOR should not be used other than as the target of an assignment. (For example, it may be #defined to a register, so you should not try to take its address.) Procedures whose determinism indicates that they cannot fail should not access SUCCESS_INDICATOR.

Arguments whose mode is input will have their values set by the Mercury implementation on entry to the C code. If the procedure succeeds, the C code must set the values of all output arguments. If the procedure fails, the C code need only set SUCCESS_INDICATOR to false (zero).
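
The following sketch illustrates these rules for a semidet predicate with an output argument. The predicate name is hypothetical. The output is assigned only on the success path, and SUCCESS_INDICATOR is used only as the target of assignments.

```mercury
    % first_char(Str, Ch) fails if Str is empty; otherwise Ch is
    % the first character of Str.
:- pred first_char(string::in, char::out) is semidet.

:- pragma foreign_proc("C",
    first_char(Str::in, Ch::out),
    [will_not_call_mercury, promise_pure],
"
    if (Str[0] != 0) {
        Ch = Str[0];            /* output set because we succeed */
        SUCCESS_INDICATOR = 1;
    } else {
        SUCCESS_INDICATOR = 0;  /* on failure Ch may be left unset */
    }
").
```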


13.6.1.2 Using pragma foreign_decl for C

Any macros, function prototypes, or other C declarations that are used in foreign_code, foreign_type or foreign_proc pragmas must be included using a foreign_decl declaration of the form

     :- pragma foreign_decl("C", HeaderCode).

HeaderCode can be a C #include line, for example

     :- pragma foreign_decl("C", "#include <math.h>").

or

     :- pragma foreign_decl("C", "#include ""tcl.h""").

or it may contain any C declarations, for example

     :- pragma foreign_decl("C", "
             extern int errno;
             #define SIZE 200
             struct Employee {
                     char name[SIZE];
             };
             extern int bar;
             extern void foo(void);
     ").

Mercury automatically includes certain headers such as <stdlib.h>, but you should not rely on this, as the set of headers which Mercury automatically includes is subject to change.

If a Mercury predicate or function exported using a pragma export declaration is to be used within a :- pragma foreign_code or :- pragma foreign_proc declaration, the header file for the module containing the pragma export declaration should be included using a pragma foreign_import_module declaration, for example

     :- pragma foreign_import_module("C", exporting_module).


13.6.1.3 Using pragma foreign_code for C

Definitions of C functions or global variables may be included using a declaration of the form

     :- pragma foreign_code("C", Code).

For example,

     :- pragma foreign_code("C", "
             int bar = 42;
             void foo(void) {}
     ").

Such code is copied verbatim into the generated C file.
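
To show how the three pragmas fit together for C, here is a sketch in which foreign_decl declares a global variable, foreign_code defines it, and two foreign_proc declarations use it. All names are hypothetical, and the purity annotations follow the pattern of the counter examples given for the other languages.

```mercury
    % Declaration: visible to the foreign_proc code below.
:- pragma foreign_decl("C", "extern int call_count;").

    % Definition: placed in the generated C file.
:- pragma foreign_code("C", "int call_count = 0;").

:- impure pred note_call is det.
:- pragma foreign_proc("C",
    note_call,
    [will_not_call_mercury],
    "call_count++;").

:- semipure func calls_so_far = int.
:- pragma foreign_proc("C",
    calls_so_far = (N::out),
    [will_not_call_mercury, promise_semipure],
    "N = call_count;").
```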


13.6.1.4 Using pragma foreign_type for C

A C pragma foreign_type declaration has the form:

     :- pragma foreign_type("C", MercuryTypeName, "CForeignType").

For example,

     :- pragma foreign_type("C", long_double, "long double").

The CForeignType can be any C type name that obeys the following restrictions. Function types, array types, and incomplete types are not allowed. The type name must be such that, when a variable of that type is declared in C, no part of the type name is required after the variable name. (This rule prohibits, for example, function pointer types such as void (*)(void). However, it would be OK to use a typedef name which was defined as a function pointer type.)

C preprocessor directives (such as #if) may not be used in CForeignType. (You can however use a typedef name that refers to a type defined in a pragma foreign_decl declaration, and the pragma foreign_decl declaration may contain C preprocessor directives.)
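
The typedef workaround mentioned above might look like the following sketch, in which a function pointer type is given a typedef name in a pragma foreign_decl declaration and that name is then used as the CForeignType. The typedef, type and predicate names are hypothetical.

```mercury
:- pragma foreign_decl("C", "
    #include <stddef.h>
    /* A typedef name can be written entirely before a variable name. */
    typedef void (*callback_fn)(void);
").

:- type callback.
:- pragma foreign_type("C", callback, "callback_fn").

    % Succeeds if the callback is not a null function pointer.
:- pred is_set(callback::in) is semidet.
:- pragma foreign_proc("C",
    is_set(CB::in),
    [will_not_call_mercury, promise_pure],
"
    SUCCESS_INDICATOR = (CB != NULL);
").
```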

With --gc accurate, foreign_types which are C pointer types must not point to the Mercury heap.

If the MercuryTypeName is the type of a parameter of a procedure defined using pragma foreign_proc, it will be passed to the foreign_proc's foreign language code as CForeignType.

Furthermore, any Mercury procedure exported with pragma export will use CForeignType as the type for any parameters whose Mercury type is MercuryTypeName.

Also see the section on using C pointers (see Using C pointers) for information on how to use the c_pointer type with the C interface.



13.6.2 Interfacing with C#


13.6.2.1 Using pragma foreign_type for C#

There is currently no direct support for using C# types from Mercury; however, the types for IL are compatible with C#, and so the foreign_type support for IL can be used instead. See the section on using pragma foreign_type for IL (see Using pragma foreign_type for IL).


13.6.2.2 Using pragma foreign_proc for C#

The C# code from C# pragma foreign_proc declarations will be placed in the bodies of static member functions of an automatically-generated C# class. Since such C# code will become part of a static member function, it must not refer to the this keyword. It may however refer to static member variables or static member functions declared with pragma foreign_code.

The input and output variables for a C# pragma foreign_proc will have C# types corresponding to their Mercury types. The exact rules for mapping Mercury types to C# types are described in IL and C# data passing conventions.

C# code in a pragma foreign_proc declaration for any procedure whose determinism indicates that it can fail must assign a value of type bool to the variable SUCCESS_INDICATOR. For example:

     :- pred string.contains_char(string, character).
     :- mode string.contains_char(in, in) is semidet.
     
     :- pragma foreign_proc("C#",
     	string.contains_char(Str::in, Ch::in),
             [will_not_call_mercury, promise_pure],
             "SUCCESS_INDICATOR = (Str.IndexOf(Ch) != -1);").

C# code for procedures whose determinism indicates that they cannot fail should not access SUCCESS_INDICATOR.

Arguments whose mode is input will have their values set by the Mercury implementation on entry to the C# code. If the procedure succeeds, the C# code must set the values of all output arguments. If the procedure fails, the C# code need only set SUCCESS_INDICATOR to false.


13.6.2.3 Using pragma foreign_decl for C#

pragma foreign_decl declarations for C# can be used to provide any top-level C# declarations (e.g. using declarations or auxiliary class definitions) which are needed by C# code in pragma foreign_proc declarations in that module.

For example:

     :- pragma foreign_decl("C#", "
     	using System;
     ").
     :- pred hello(io.state::di, io.state::uo) is det.
     :- pragma foreign_proc("C#",
     	hello(_IO0::di, _IO::uo),
     	[will_not_call_mercury],
     "
     	// here we can refer directly to Console rather than System.Console
     	Console.WriteLine(""hello world"");
     ").


13.6.2.4 Using pragma foreign_code for C#

The C# code from pragma foreign_proc declarations for C# will be placed in the bodies of static member functions of an automatically-generated C# class. pragma foreign_code can be used to define additional members of this automatically-generated class, which can then be referenced by pragma foreign_proc declarations for C# from that module.

For example:

     :- pragma foreign_code("C#", "
     	static int counter = 0;
     ").
     
     :- impure pred incr_counter is det.
     :- pragma foreign_proc("C#", incr_counter,
     	[will_not_call_mercury], "counter++;").
     
     :- semipure func get_counter = int.
     :- pragma foreign_proc("C#",
     	get_counter = (Result::out),
     	[will_not_call_mercury, promise_semipure],
     	"Result = counter;").



13.6.3 Interfacing with IL


13.6.3.1 Using pragma foreign_type for IL

The IL pragma foreign_type declaration has the form:

     :- pragma foreign_type("IL", MercuryTypeName, DotNetForeignType).

If the MercuryTypeName is the type of a parameter of a procedure defined using pragma foreign_proc for any of the .NET CLR languages, it will be passed to the foreign_proc's foreign language code as DotNetForeignType.

Furthermore, any Mercury procedure exported with pragma export will use DotNetForeignType as the .NET CLR parameter type for parameters whose Mercury type is MercuryTypeName.

The IL assembler syntax is used to specify type names for all the .NET CLR languages (IL, C# and Managed C++) supported by Mercury's foreign_proc mechanism. This syntax is documented in the ECMA specifications for .NET. The .NET CLR backend supports reference and value types using pragma foreign_type, where DotNetForeignType is specified using the syntax "class [AssemblyName]ReferenceTypeName" for reference types and "valuetype [AssemblyName]ValueTypeName" for value types. Note that extra whitespace is not handled — there should only be a single space between the class keyword and the assembly specifier. Value types which have a special name (such as int32) can be named using their valuetype syntax or their special name; they will be marshalled using their special name (as is required by section 7.2 of Partition II of the ECMA CLI documentation).

For example:

     :- type xmldoc.
     :- pragma foreign_type("IL", xmldoc,
     	"class [System.Xml]System.Xml.XmlDocument").
     :- type int32.
     :- pragma foreign_type("IL", int32,
     	"valuetype [mscorlib]System.Int32").

ensures that on the .NET CLR backend the Mercury type xmldoc is marshalled by the backend as System.Xml.XmlDocument from assembly System.Xml, and that the Mercury type int32 will be marshalled by the backend as the CLR type int32.

The following example shows how one can use the marshalled data from C#.

     :- pred loadxml(string::in, xmldoc::di, xmldoc::uo) is det.
     
     :- pragma foreign_proc("C#", loadxml(String::in, XML0::di, XML::uo),
             [will_not_call_mercury, promise_pure],
     "
         XML0.LoadXml(String);
         XML = XML0;
     ").


13.6.3.2 Using pragma foreign_proc for IL

Variables can be accessed from IL by using ldloc (for input parameters) and stloc (for output parameters). Do not use ret or jmp instructions or tail calls within the handwritten IL code. The stack must be empty at the end of the IL code.

     :- pred add(int::in, int::in, int::out) is det.
     :- pragma foreign_proc("IL", add(X::in, Y::in, Z::out), [max_stack_size(2)], "
     	ldloc X
     	ldloc Y
     	add
     	stloc Z
     ").

IL code for procedures whose determinism indicates they can fail is currently not supported.

Arguments whose mode is input will have their values set by the Mercury implementation on entry to the IL code. If the procedure succeeds, the IL code must set the values of all output arguments.

Each of the head variables will be represented by the Common Language Runtime types as specified in IL and C# data passing conventions.


13.6.3.3 Using pragma foreign_decl for IL

pragma foreign_decl is currently not supported for IL.


13.6.3.4 Using pragma foreign_code for IL

pragma foreign_code is currently not supported for IL.



13.6.4 Interfacing with Java


13.6.4.1 Using pragma foreign_type for Java

A Java pragma foreign_type declaration has the form:

     :- pragma foreign_type("Java", MercuryTypeName, "JavaType").

The JavaType can be any accessible Java type.

The effect of this declaration is that Mercury values of type MercuryTypeName will be passed to and from Java foreign_procs as having type JavaType.


13.6.4.2 Using pragma foreign_proc for Java

The Java code from Java pragma foreign_proc declarations will be placed in the bodies of static member functions of an automatically-generated Java class. Since such Java code will become part of a static member function, it must not refer to the this keyword. It may however refer to static member variables or static member functions declared with pragma foreign_code.

The input and output variables for a Java pragma foreign_proc will have Java types corresponding to their Mercury types. The exact rules for mapping Mercury types to Java types are described in Java data passing conventions.

The Java code in a pragma foreign_proc declaration for a procedure whose determinism indicates that it can fail must assign a value of type boolean to the variable succeeded. For example:

     :- pred string.contains_char(string, character).
     :- mode string.contains_char(in, in) is semidet.
     
     :- pragma foreign_proc("Java",
             string.contains_char(Str::in, Ch::in),
             [will_not_call_mercury, promise_pure],
             "succeeded = (Str.indexOf(Ch) != -1);").

Java code for procedures whose determinism indicates that they cannot fail should not refer to the succeeded variable.

Arguments whose mode is input will have their values set by the Mercury implementation on entry to the Java code. With our current implementation, the Java code must set the values of all output variables, even if the procedure fails (i.e. sets the succeeded variable to false).
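
The requirement to set outputs even on failure can be illustrated by the following sketch of a semidet predicate with an output argument (the predicate name is hypothetical). Unlike the equivalent C or C# code, the failure branch assigns a placeholder value to the output.

```mercury
    % first_char(Str, Ch) fails if Str is empty; otherwise Ch is
    % the first character of Str.
:- pred first_char(string::in, char::out) is semidet.

:- pragma foreign_proc("Java",
    first_char(Str::in, Ch::out),
    [will_not_call_mercury, promise_pure],
"
    if (Str.length() > 0) {
        Ch = Str.charAt(0);
        succeeded = true;
    } else {
        // Placeholder value: the output must be set even on failure.
        Ch = (char) 0;
        succeeded = false;
    }
").
```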


13.6.4.3 Using pragma foreign_decl for Java

pragma foreign_decl declarations for Java can be used to provide any top-level Java declarations (e.g. import declarations or auxiliary class definitions) which are needed by Java code in pragma foreign_proc declarations in that module.

For example:

     :- pragma foreign_decl("Java", "
     import javax.swing.*;
     import java.awt.*;
     
     class MyApplet extends JApplet {
         public void init() {
             JLabel label = new JLabel(""Hello, world"");
             label.setHorizontalAlignment(JLabel.CENTER);
             getContentPane().add(label);
         }
     }
     ").
     :- pred hello(io.state::di, io.state::uo) is det.
     :- pragma foreign_proc("Java",
     	hello(_IO0::di, _IO::uo),
     	[will_not_call_mercury],
     "
     	MyApplet app = new MyApplet();
     	// ...
     ").


13.6.4.4 Using pragma foreign_code for Java

The Java code from pragma foreign_proc declarations for Java will be placed in the bodies of static member functions of an automatically-generated Java class. pragma foreign_code can be used to define additional members of this automatically-generated class, which can then be referenced by pragma foreign_proc declarations for Java from that module.

For example:

     :- pragma foreign_code("Java", "
     	static int counter = 0;
     ").
     
     :- impure pred incr_counter is det.
     :- pragma foreign_proc("Java", incr_counter,
     	[will_not_call_mercury], "counter++;").
     
     :- semipure func get_counter = int.
     :- pragma foreign_proc("Java",
     	get_counter = (Result::out),
     	[will_not_call_mercury, promise_semipure],
     	"Result = counter;").



13.6.5 Interfacing with Managed C++


13.6.5.1 Using pragma foreign_type for MC++

There is no direct support for using MC++ types from Mercury; however, the types for IL are compatible with MC++, and so the foreign_type support for IL can be used instead. See the section on using pragma foreign_type for IL (see Using pragma foreign_type for IL).


13.6.5.2 Using pragma foreign_proc for MC++

The MC++ code from MC++ pragma foreign_proc declarations will be copied into static member functions of an automatically-generated MC++ class. Since such MC++ code will become part of a static member function, it must not refer to the this keyword. It may however refer to static member variables or static member functions declared with pragma foreign_code.

The input and output variables for an MC++ pragma foreign_proc will have MC++ types corresponding to their Mercury types. The exact rules for mapping Mercury types to MC++ types are determined by the rules for mapping Mercury types to IL (see IL and C# data passing conventions) and the rules relating IL types to MC++ types.

MC++ code in a pragma foreign_proc declaration for any procedure whose determinism indicates that it can fail must assign a value of type bool to the variable SUCCESS_INDICATOR. For example:

     :- pred string.contains_char(string, character).
     :- mode string.contains_char(in, in) is semidet.
     
     :- pragma foreign_proc("MC++", string.contains_char(Str::in, Ch::in),
             [will_not_call_mercury, promise_pure],
             "SUCCESS_INDICATOR = (Str->IndexOf(Ch) != -1);").

MC++ code for procedures whose determinism indicates that they cannot fail should not access SUCCESS_INDICATOR.

Arguments whose mode is input will have their values set by the Mercury implementation on entry to the MC++ code. If the procedure succeeds, the MC++ code must set the values of all output arguments. If the procedure fails, the MC++ code need only set SUCCESS_INDICATOR to false.


13.6.5.3 Using pragma foreign_decl for MC++

pragma foreign_decl declarations for MC++ can be used to provide any top-level MC++ declarations (e.g. #include or #using) which are needed by code in pragma foreign_proc declarations for MC++.

For example:

     :- pragma foreign_decl("MC++", "
     	#include <stdio.h>
     	#using <mscorlib.dll>
     
     	#define MY_CONSTANT 42
     	// ...
     ").


13.6.5.4 Using pragma foreign_code for MC++

The MC++ code from pragma foreign_proc declarations for MC++ will be placed in the bodies of static member functions of an automatically-generated MC++ class. pragma foreign_code can be used to define additional members of this automatically-generated class, which can then be referenced by pragma foreign_proc declarations for MC++ from that module.

For example:

     :- pragma foreign_code("MC++", "
     	static int counter = 0;
     ").
     
     :- impure pred incr_counter is det.
     :- pragma foreign_proc("MC++", incr_counter,
     	[will_not_call_mercury], "counter++;").
     
     :- semipure func get_counter = int.
     :- pragma foreign_proc("MC++", get_counter = (Result::out),
     	[will_not_call_mercury, promise_semipure], "Result = counter;").



14 C interface

This chapter documents the original C interface. In the long term we are planning to phase out support for this interface in favour of the new foreign language interface documented in Foreign language interface.

The Mercury distribution includes a number of examples of the use of the C interface that show how to interface C++ with Mercury and how to set up Mmake files to automate the build process. See the samples/c_interface directory in the Mercury distribution.



14.1 Calling C code from Mercury

There are two slightly different mechanisms for calling C code from Mercury: pragma import and pragma c_code. pragma import allows you to call C functions from Mercury. pragma c_code allows you to implement Mercury procedures using arbitrary fragments of C code. pragma import is usually simpler, but pragma c_code is a bit more flexible.



14.1.1 pragma import

A declaration of the form

     :- pragma import(Pred(Mode1, Mode2, ...),
                      Attributes, "C_Name").

or

     :- pragma import(Func(Mode1, Mode2, ...) = Mode,
                      Attributes, "C_Name").

imports a C function for use by Mercury. Pred or Func must specify the name of a previously declared Mercury predicate or function, and Mode1, Mode2, ..., and (for functions) Mode must specify one of the modes of that predicate or function. There must be no clauses for the specified Mercury procedure; instead, any calls to that procedure will be executed by calling the C function named C_Name. The Attributes argument is optional; if present, it specifies properties of the given C function (see C code attributes).

For example, the following code imports the C function cos() as the Mercury function cos/1:

     :- func cos(float) = float.
     :- pragma import(cos(in) = out, [will_not_call_mercury], "cos").

The interface to the C function for a given Mercury procedure is determined as follows. Mercury types are converted to C types and passed according to the rules in Passing data to and from C.

If you use pragma import for a polymorphically typed Mercury procedure, the compiler will prepend one type_info argument to the parameters passed to the C function for each polymorphic type variable in the Mercury procedure's type signature. The values passed in these arguments will be the same as the values that would be obtained using the Mercury type_of function in the Mercury standard library module std_util. These values may be useful in case the C function wishes to in turn call another polymorphic Mercury procedure (see Calling Mercury code from C).

You may not give a pragma import declaration for a procedure with determinism nondet or multi. (It is however possible to define a nondet or multi procedure using pragma c_code. See Nondet pragma c_code.)



14.1.2 pragma c_code

A declaration of the form

     :- pragma c_code(Pred(Var1::Mode1, Var2::Mode2, ...),
             Attributes, C_Code).

or

     :- pragma c_code(Func(Var1::Mode1, Var2::Mode2, ...) = (Var::Mode),
             Attributes, C_Code).

means that any calls to the specified mode of Pred or Func will result in execution of the C code given in C_Code. The C code fragment may refer to the specified variables (Var1, Var2, ..., and Var) directly by name. These variables will have C types corresponding to their Mercury types, as determined by the rules specified in Passing data to and from C. It is an error for a variable to occur more than once in the argument list.

The C code fragment may declare local variables, but it should not declare any labels or static variables unless there is also a Mercury pragma no_inline declaration (see Inlining) for the procedure. The reason for this is that otherwise the Mercury implementation may inline the procedure by duplicating the C code fragment for each call. If the C code fragment declared a static variable, inlining it in this way could result in the program having multiple instances of the static variable, rather than a single shared instance. If the C code fragment declared a label, inlining it in this way could result in an error due to the same label being defined twice inside a single C function.

If there is a pragma import or pragma c_code declaration for a mode of a predicate or function, then there must not be any clauses for that predicate or function, and there must be a pragma c_code or pragma import declaration for every mode of the predicate or function.

For example, the following piece of code defines a Mercury function sin/1 which calls the C function sin() of the same name.

     :- func sin(float) = float.
     :- pragma c_code(sin(X::in) = (Sin::out),
             [may_call_mercury],
             "Sin = sin(X);").

If the C code does not recursively invoke Mercury code, as in the above example, then you can use will_not_call_mercury in place of may_call_mercury in the declarations above. This allows the compiler to use a slightly more efficient calling convention. (If you use this form, and the C code does invoke Mercury code, then the behaviour is undefined — your program may misbehave or crash.)

The C code in a pragma c_code declaration for any procedure whose determinism indicates that it can fail must assign a truth value to the macro SUCCESS_INDICATOR. For example:

     :- pred string.contains_char(string, character).
     :- mode string.contains_char(in, in) is semidet.
     
     :- pragma c_code(string.contains_char(Str::in, Ch::in),
             [will_not_call_mercury],
             "SUCCESS_INDICATOR = (strchr(Str, Ch) != NULL);").

SUCCESS_INDICATOR should not be used other than as the target of an assignment. (For example, it may be #defined to a register, so you should not try to take its address.) Procedures whose determinism indicates that they cannot fail should not access SUCCESS_INDICATOR.

Arguments whose mode is input will have their values set by the Mercury implementation on entry to the C code. If the procedure succeeds, the C code must set the values of all output arguments. If the procedure fails, the C code need only set SUCCESS_INDICATOR to false (zero).



14.1.3 Nondet pragma c_code

For procedures that can return more than one result on backtracking, i.e. those with determinism nondet or multi, the form of pragma c_code declaration described previously does not suffice. Instead, you should use a declaration of the form shown below:

     :- pragma c_code(Pred(Var1::Mode1, Var2::Mode2, ...),
             Attributes, local_vars(LocalVars), first_code(FirstCode),
             retry_code(RetryCode), common_code(CommonCode)).

or

     :- pragma c_code(Func(Var1::Mode1, Var2::Mode2, ...) = (Var::Mode),
             Attributes, local_vars(LocalVars), first_code(FirstCode),
             retry_code(RetryCode), common_code(CommonCode)).

Here FirstCode, RetryCode, and CommonCode are all Mercury strings containing C code. FirstCode will be executed whenever the Mercury procedure is called. RetryCode will be executed whenever a given call to the procedure is re-entered on backtracking to find subsequent solutions. The common_code(CommonCode) argument is optional; if present, CommonCode will be executed after each execution of FirstCode or RetryCode.

The code that is executed on each call or retry should finish by executing one of the three macros FAIL, SUCCEED, or SUCCEED_LAST. The FAIL macro indicates that the call has failed; the call will not be retried. The SUCCEED macro indicates that the call has succeeded, and that there may be more solutions; the call may be retried on backtracking. The SUCCEED_LAST macro indicates that the call has succeeded, but that there are no more solutions after this one; the call will not be retried.

LocalVars is a sequence of struct member declarations which are used to hold any state which needs to be preserved in case of backtracking or passed between the different C code fragments. The code fragments FirstCode, RetryCode, and CommonCode may use the macro LOCALS, which is defined to be a pointer to a struct containing the fields specified by LocalVars, to access this saved state.

Note that RetryCode and CommonCode may not access the input variables; only FirstCode should access the input variables. If RetryCode or CommonCode need to access any of the input variables, then FirstCode should copy the values needed to the LocalVars.

The following example shows how you can use a state variable to keep track of the next alternative to return.

     %
     % This example implements the equivalent of
     %     foo(X) :- X = 20 ; X = 10 ; X = 42 ; X = 99 ; fail.
     %
     :- pred foo(int).
     :- mode foo(out) is multi.
     :- pragma c_code(foo(X::out), [will_not_call_mercury, thread_safe],
             local_vars("
                     int state;
             "),
             first_code("
                     LOCALS->state = 1;
             "),
             retry_code("
                     LOCALS->state++;
             "),
             common_code("
                     switch (LOCALS->state) {
                             case 1: X = 20; SUCCEED; break;
                             case 2: X = 10; SUCCEED; break;
                             case 3: X = 42; SUCCEED; break;
                             case 4: X = 99; SUCCEED; break;
                             case 5: FAIL; break;
                     }
             ")
     ).

The next example is more realistic; it shows how you could use pragma c_code to implement the reverse mode of string.append, which returns all possible ways of splitting a string into two pieces.

     :- pred string.append(string, string, string).
     :- mode string.append(out, out, in) is multi.
     :- pragma c_code(string.append(S1::out, S2::out, S3::in),
                     [will_not_call_mercury, thread_safe],
             local_vars("
                     MR_String s;
                     size_t len;
                     size_t count;
             "),
             first_code("
                     LOCALS->s = S3;
                     LOCALS->len = strlen(S3);
                     LOCALS->count = 0;
             "),
             retry_code("
                     LOCALS->count++;
             "),
             common_code("
                     S1 = copy_substring(LOCALS->s, 0, LOCALS->count);
                     S2 = copy_substring(LOCALS->s, LOCALS->count,
                                                    LOCALS->len);
                     if (LOCALS->count < LOCALS->len) {
                             SUCCEED;
                     } else {
                             SUCCEED_LAST;
                     }
             ")
     ).
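
The order in which the three code fragments run can be pictured with a small stand-alone C model of the string-splitting example. This is only a sketch of the protocol described above, not how the Mercury implementation works. The names toy_outcome, toy_locals, toy_first, toy_retry, toy_common, and toy_all_splits are hypothetical, the three outcomes are modelled as an enum rather than as the FAIL, SUCCEED, and SUCCEED_LAST macros, and the driver loop merely stands in for a caller that keeps backtracking into the call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the FAIL, SUCCEED and SUCCEED_LAST macros. */
typedef enum { TOY_FAIL, TOY_SUCCEED, TOY_SUCCEED_LAST } toy_outcome;

/* Plays the role of the struct that LOCALS points to. */
struct toy_locals {
    const char *s;
    size_t len;
    size_t count;
};

/* first_code: the only fragment that looks at the input argument. */
static void toy_first(struct toy_locals *locals, const char *s3)
{
    locals->s = s3;
    locals->len = strlen(s3);
    locals->count = 0;
}

/* retry_code: advance the saved state to the next alternative. */
static void toy_retry(struct toy_locals *locals)
{
    locals->count++;
}

/* common_code: runs after first_code or retry_code and reports one split. */
static toy_outcome toy_common(struct toy_locals *locals, size_t *split_at)
{
    *split_at = locals->count;
    return (locals->count < locals->len) ? TOY_SUCCEED : TOY_SUCCEED_LAST;
}

/*
 * Driver standing in for a caller that re-enters the call until no
 * solutions remain. Stores each split position in splits[] and returns
 * the number of solutions produced.
 */
size_t toy_all_splits(const char *s3, size_t *splits, size_t max)
{
    struct toy_locals locals;
    size_t n = 0;
    size_t split_at;
    toy_outcome outcome;

    toy_first(&locals, s3);
    outcome = toy_common(&locals, &split_at);
    while (outcome != TOY_FAIL) {
        assert(n < max);
        splits[n++] = split_at;
        if (outcome == TOY_SUCCEED_LAST) {
            break;              /* the call will not be retried */
        }
        toy_retry(&locals);
        outcome = toy_common(&locals, &split_at);
    }
    return n;
}
```

In this model a string of length three yields four solutions, one for each split position from 0 to 3, and the last of them is reported with the SUCCEED_LAST stand-in so that the driver does not retry again.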



14.1.4 C code attributes

As described above, pragma import and pragma c_code declarations may include a list of attributes describing properties of the given C function or C code. All Mercury implementations must support the attributes listed below. They may also support additional attributes.

The attributes which must be supported by all implementations are as follows:

may_call_mercury/will_not_call_mercury
This attribute declares whether execution inside this C code may call back into Mercury. The default, in case neither is specified, is may_call_mercury. Specifying will_not_call_mercury may allow the compiler to generate more efficient code. If you specify will_not_call_mercury, but the C code does invoke Mercury code, then the behaviour is undefined.
thread_safe/not_thread_safe
This attribute declares whether or not it is safe for multiple threads to execute this C code concurrently. The default, in case neither is specified, is not_thread_safe. If the C code is declared thread_safe, then the Mercury implementation is permitted to execute the code concurrently from multiple threads without taking any special precautions. If the C code is declared not_thread_safe, then the Mercury implementation must not invoke the code concurrently from multiple threads. If the Mercury implementation does use multithreading, then it must take appropriate steps to prevent this. (The experimental multithreaded version of the current University of Melbourne Mercury implementation protects not_thread_safe code using a mutex: C code that is not thread-safe has code inserted around it to obtain and release a mutex. All non-thread-safe C code shares a single mutex.)



14.1.5 Purity and side effects

Note that procedures implemented in C using either pragma import or pragma c_code must still be “pure”, unless declared otherwise (see Impurity), and they must be type-correct and mode-correct. (Determinism-correctness is also required, but it follows from the rules already stated above.) Pure or semipure procedures may perform destructive update on their arguments only if those arguments have an appropriate unique mode declaration. Impure predicates may perform destructive update on data pointed to by C pointer arguments, even without unique modes. But they cannot destructively update the arguments themselves. Procedures may perform I/O only if their arguments include an io.state pair (see the io chapter of the Mercury Library Reference Manual), or if they are declared impure (see Impurity). The Mercury implementation is allowed to assume that these rules are followed, and to optimize accordingly. If the C code is not type-correct, mode-correct, determinism-correct, and purity-correct with respect to its Mercury declaration, then the behaviour is undefined.

For example, the following code defines a predicate c_write_string/3, which has a similar effect to the Mercury library predicate io.write_string/3:

     :- pred c_write_string(string, io.state, io.state).
     :- mode c_write_string(in, di, uo) is det.
     
     :- pragma c_code(c_write_string(S::in, IO0::di, IO::uo),
             [may_call_mercury],
             "puts(S); IO = IO0;").

In this example, the I/O is done via side effects inside the C code, but the Mercury interface includes io.state arguments to ensure that the predicate has a proper declarative semantics. If the io.state arguments were left off, then the Mercury implementation might apply undesirable optimizations (e.g. reordering, duplicate call elimination, tabling, lazy evaluation, ...) to this procedure, which could affect the behaviour of the program in unpredictable ways.

Impure C code relaxes some of these restrictions. Impure C code may perform I/O and although it cannot update its arguments directly (unless they have an appropriate unique mode, e.g. di) it may update something pointed to by its arguments. Impure C code procedures must still be type correct and mode correct.



14.2 Including C headers

Any macros, function prototypes, or other C declarations that are used in c_code pragmas must be included using a c_header_code declaration of the form

     :- pragma c_header_code(HeaderCode).

HeaderCode can be a C #include line, for example

     :- pragma c_header_code("#include <math.h>").

or

     :- pragma c_header_code("#include ""tcl.h""").

or it may contain any C declarations, for example

     :- pragma c_header_code("
             extern int errno;
             #define SIZE 200
             struct Employee {
                     char name[SIZE];
             };
             extern int bar;
             extern void foo(void);
     ").

Mercury automatically includes certain headers such as <stdlib.h>, but you should not rely on this, as the set of headers which Mercury automatically includes is subject to change.



14.3 Including C code

Definitions of C functions or global variables may be included using a declaration of the form

     :- pragma c_code(Code).

For example,

     :- pragma c_code("
             int bar = 42;
             void foo(void) {}
     ").

Such code is copied verbatim into the generated C file.



14.4 Calling Mercury code from C

It is also possible to export Mercury procedures to C, so that you can call Mercury code from C (or from other languages that can interface to C, e.g. C++).

A declaration of the form

     :- pragma export(Pred(Mode1, Mode2, ...), "C_Name_1").

or

     :- pragma export(Func(Mode1, Mode2, ...) = Mode, "C_Name_2").

exports a procedure for use by C.

For each Mercury module containing pragma export declarations, the Mercury implementation will automatically create a header file for that module which declares a C function C_Name() for each of the pragma export declarations. Each such C function is the C interface to the specified mode of the specified Mercury predicate or function.

The interface to a Mercury procedure is determined as follows. (The rules here are just the converse of the rules for pragma import.) Mercury types are converted to C types according to the rules in Passing data to and from C. Input arguments are passed by value. For output arguments, the caller must pass the address in which to store the result. If the Mercury procedure can fail, then its C interface function returns a truth value indicating success or failure. If the Mercury procedure is a Mercury function that cannot fail, and the function result has an output mode, then the C interface function will return the Mercury function result value. Otherwise the function result is appended as an extra argument. Arguments of type io.state or store.store(_) are not passed at all. (The reason for this is that these types represent mutable state, and in C modifications to mutable state are done via side effects, rather than argument passing.)

Calling polymorphically typed Mercury procedures from C is a little bit more difficult than calling ordinary (monomorphically typed) Mercury procedures. The simplest method is to just create monomorphic forwarding procedures that call the polymorphic procedures, and export them, rather than exporting the polymorphic procedures.

If you do export a polymorphically typed Mercury procedure, the compiler will prepend one type_info argument to the parameter list of the C interface function for each polymorphic type variable in the Mercury procedure's type signature. The caller must arrange to pass in appropriate type_info values corresponding to the types of the other arguments passed. These type_info arguments can be obtained using the Mercury type_of function in the Mercury standard library module std_util.

It is an error to export Mercury procedures that have a determinism of multi or nondet to C.

To use the C declarations produced for pragma export declarations in C code within a Mercury module, use a pragma c_import_module declaration, for example

     :- pragma c_import_module(imported_module).

This is equivalent to

     :- pragma c_header_code("#include ""imported_module.h""").

but it tells the implementation that the object file for the module containing the pragma c_import_module declaration should not be built before imported_module.h is built.



14.5 Linking with C object files

A Mercury implementation should allow you to link with object files or libraries that were produced by compiling C code. The exact mechanism for linking with C object files is implementation-dependent. The following text describes how it is done for the University of Melbourne Mercury implementation.

To link an existing object file into your Mercury code, set the Mmake variable MLOBJS in the Mmake file in the directory in which you are working. To link an existing library into your Mercury code, set the Mmake variable MLLIBS. For example, the following will link in the object file my_functions.o from the current directory and the library file libfancy_library.a, or perhaps its shared version libfancy_library.so, from the directory /usr/local/contrib/lib.

     MLOBJS = my_functions.o
     MLFLAGS = -R/usr/local/contrib/lib -L/usr/local/contrib/lib
     MLLIBS = -lfancy_library

As illustrated by the example, the values of the MLFLAGS and MLLIBS variables are similar to those taken by the Unix linker.

For more information, see the “Libraries” chapter of the Mercury User's Guide, and the man pages for mmc and ml.



14.6 Passing data to and from C

For each of the Mercury types int, float, char, and string, there is a C typedef for the corresponding type in C: MR_Integer, MR_Float, MR_Char, and MR_String respectively.

In the current implementation, MR_Integer is a typedef for an integral type whose size is the same size as a pointer; MR_Float is a typedef for double (unless the program and the Mercury library were compiled with -DUSE_SINGLE_PREC_FLOAT, in which case it is a typedef for float); MR_Char is a typedef for char; and MR_String is a typedef for MR_Char *.

Mercury variables of type int, float, char, or string are passed to and from C as C variables whose type is given by the corresponding typedef.

For the Mercury standard library type bool.bool, there is a corresponding C type, MR_bool. C code can refer to the boolean data constructors yes and no, as MR_YES and MR_NO respectively.

Mercury variables of a type for which there is a C pragma foreign_type declaration (see Using foreign types from Mercury) will be passed as the corresponding C type.

Mercury variables of any other type are passed as an MR_Word, which in the current implementation is a typedef for an unsigned type whose size is the same size as a pointer. (Note: it would in fact be better for each Mercury type to map to a distinct abstract type in C, since that would be more type-safe, and thus we may change this in a future release. We advise programmers who are manipulating Mercury types in C code to use typedefs for each user-defined Mercury type, and to treat each such type as an abstract data type. This is good style and it will also minimize any compatibility problems if and when we do change this.)

Mercury arguments declared with input modes are passed by value to the C function. For output arguments, the Mercury implementation will pass to the C function an address in which to store the result. If the Mercury procedure can fail, then its C function should return a truth value of type MR_Integer indicating success or failure: non-zero indicates success, and zero indicates failure. If the Mercury procedure is a Mercury function that cannot fail, and the function result has an output mode, then the C function should return the Mercury function result value. Otherwise the function result is appended as an extra argument. Arguments of type io.state or store.store(_) are not passed at all. (The reason for this is that these types represent mutable state, and in C modifications to mutable state are done via side effects, rather than argument passing.)
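
The argument-passing rules above can be illustrated with the C side of two imported procedures. This is a minimal sketch under stated assumptions: the procedures checked_div and twice are hypothetical, and the MR_Integer typedef is a stand-in (a signed integer the size of a pointer) so that the fragment compiles on its own; in a real program the typedef is supplied by the Mercury implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in typedef; the real one comes from the Mercury implementation. */
typedef intptr_t MR_Integer;

/*
 * C side of a hypothetical semidet predicate:
 *
 *     :- pred checked_div(int::in, int::in, int::out) is semidet.
 *     :- pragma import(checked_div(in, in, out),
 *             [will_not_call_mercury], "checked_div").
 *
 * Inputs arrive by value, the output is stored through the address that
 * is passed in, and the return value is non-zero for success and zero
 * for failure.
 */
MR_Integer checked_div(MR_Integer x, MR_Integer y, MR_Integer *quotient)
{
    assert(quotient != NULL);
    if (y == 0) {
        return 0;               /* failure */
    }
    *quotient = x / y;
    return 1;                   /* success */
}

/*
 * C side of a hypothetical function that cannot fail:
 *
 *     :- func twice(int) = int.
 *     :- pragma import(twice(in) = out, [will_not_call_mercury], "twice").
 *
 * The function result has an output mode, so it is the C return value.
 */
MR_Integer twice(MR_Integer x)
{
    return 2 * x;
}
```

The semidet predicate uses its return value for the success indication, which is why its output has to travel through a pointer, whereas the function that cannot fail is free to return its result directly.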

Mercury lists can be manipulated by C code using the following macros, which are defined by the Mercury implementation.

     MR_list_is_empty(list)     /* test if a list is empty */
     MR_list_head(list)         /* get the head of a list */
     MR_list_tail(list)         /* get the tail of a list */
     MR_list_empty()            /* create an empty list */
     MR_list_cons(head,tail)    /* construct a list with the given head and tail */

Note that the use of these macros is subject to some caveats (see Memory management).
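
As an illustration of the read-only macros, the following sketch sums a list of ints. Everything above the function sum_int_list is a stand-in written only so that the fragment compiles without the Mercury runtime: the typedefs, the toy_cell struct, and the three macro bodies are assumptions, and the real list representation need not look like this. Only the loop itself, which uses MR_list_is_empty, MR_list_head, and MR_list_tail as listed above, is the point of the example; the function name sum_int_list is hypothetical.

```c
#include <stdint.h>

/*
 * Stand-ins so that this sketch compiles on its own. The real typedefs
 * and macros are defined by the Mercury implementation.
 */
typedef intptr_t  MR_Integer;
typedef uintptr_t MR_Word;

struct toy_cell {
    MR_Word head;
    MR_Word tail;
};

#define MR_list_is_empty(list)  ((list) == (MR_Word) 0)
#define MR_list_head(list)      (((struct toy_cell *) (list))->head)
#define MR_list_tail(list)      (((struct toy_cell *) (list))->tail)

/* Sum the elements of a list of ints, walking it from head to tail. */
MR_Integer sum_int_list(MR_Word list)
{
    MR_Integer sum = 0;

    while (!MR_list_is_empty(list)) {
        sum += (MR_Integer) MR_list_head(list);
        list = MR_list_tail(list);
    }
    return sum;
}
```

The function only reads the list, so it does not need MR_list_empty() or MR_list_cons().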



14.7 Using C pointers

The inbuilt Mercury type c_pointer can be used to pass C pointers between C functions which are called from Mercury. For example:

     :- module pointer_example.
     
     :- interface.
     
     :- type complicated_c_structure.
     
     % Initialise the abstract C structure that we pass around in Mercury.
     :- pred initialise_complicated_structure(complicated_c_structure::uo) is det.
     
     % Perform a calculation on the C structure.
     :- pred do_calculation(int::in, complicated_c_structure::di,
             complicated_c_structure::uo) is det.
     
     :- implementation.
     
     % Our C structure is implemented as a c_pointer.
     :- type complicated_c_structure --->
             complicated_c_structure(c_pointer).
     
     :- pragma c_header_code("
             extern struct foo *init_struct(void);
             extern struct foo *perform_calculation(int, struct foo *);
     ").
     
     :- pragma c_code(initialise_complicated_structure(Structure::uo),
             [may_call_mercury],
             "Structure = init_struct();").
     
     :- pragma c_code(do_calculation(Value::in, Structure0::di, Structure::uo),
             [may_call_mercury],
             "Structure = perform_calculation(Value, Structure0);").



14.8 Memory management

Passing pointers to dynamically-allocated memory from Mercury to code written in other languages, or vice versa, is in general implementation-dependent.

The current Mercury implementation supports two different methods of memory management: conservative garbage collection, or no garbage collection. (With the latter method, heap storage is reclaimed only on backtracking.)

Conservative garbage collection makes inter-language calls simplest. When using conservative garbage collection, heap storage is reclaimed automatically. Pointers to dynamically-allocated memory can be passed to and from C without taking any special precautions.

When using no garbage collection, you must be careful not to retain pointers to memory on the Mercury heap after Mercury has backtracked to before the point where that memory was allocated. You must also avoid the use of the macros MR_list_empty() and MR_list_cons(). (The reason for this is that they may access Mercury's MR_hp register, which might not be valid in C code. Using them in the bodies of procedures defined using pragma c_code with will_not_call_mercury would probably work, but we don't advise it.) Instead, you can write Mercury functions to perform these actions and use pragma export to access them from C. This alternative method also works with conservative garbage collection.

Future Mercury implementations may use non-conservative methods of garbage collection. For such implementations, it will be necessary to explicitly register pointers passed to C with the garbage collector. The mechanism for doing this has not yet been decided on. It would be desirable to provide a single memory management interface for use when interfacing with other languages that can work for all methods of memory management, but more implementation experience is needed before we can formulate such an interface.



14.9 Trailing

In certain compilation grades (see the “Compilation model options” section of the Mercury User's Guide), the University of Melbourne Mercury implementation supports trailing. Trailing is a means of having side-effects, such as destructive updates to data structures, undone on backtracking. The basic idea is that during forward execution, whenever you perform a destructive modification to a data structure that may still be live on backtracking, you should record whatever information is necessary to restore it on a stack-like data structure called the “trail”. Then, if a computation fails, and execution backtracks to before those updates were performed, the Mercury runtime engine will traverse the trail back to the most recent choice point, undoing all those updates.

The interface used is a set of C functions (which are actually implemented as macros) and types. Typically these will be called from C code within pragma c_code declarations in Mercury code.

For examples of the use of this interface, see the modules extras/trailed_update/tr_array.m and extras/clpr/cfloat.m in the Mercury distribution.



14.9.1 Choice points

A “choice point” is a point in the computation to which execution might backtrack when a goal fails or throws an exception. The “current” choice point is the one that was most recently encountered; that is also the one to which execution will branch if the current computation fails.

When you trail an update, the Mercury engine will ensure that if execution ever backtracks to the choice point that was current at the time of trailing, then the update will be undone.

If the Mercury compiler determines that it will never need to backtrack to a particular choice point, then it will “prune” away that choice point. If a choice point is pruned, the trail entries for updates made after that choice point will not necessarily be discarded, because in general they may still be needed if execution backtracks to an earlier choice point.



14.9.2 Value trailing

The simplest form of trailing is value trailing. This allows you to trail updates to memory and have the Mercury runtime engine automatically undo them on backtracking.

MR_trail_value()
Prototype:
          void MR_trail_value(MR_Word *address, MR_Word value);
     

Ensures that if future execution backtracks to the current choice point, then value will be placed in address.

     
     

MR_trail_current_value()
Prototype:
          void MR_trail_current_value(MR_Word *address);
     

Ensures that if future execution backtracks to the current choice point, the value currently in address will be restored.

MR_trail_current_value(address) is equivalent to MR_trail_value(address, *address).



14.9.3 Function trailing

For more complicated uses of trailing, you can store the address of a C function on the trail and have the Mercury runtime call your function back whenever future execution backtracks to the current choice point or earlier, or whenever that choice point is pruned, because execution commits to never backtracking over that point, or whenever that choice point is garbage collected.

Note that the garbage collector in the current Mercury implementation does not garbage-collect the trail; this case is mentioned only so that we can cater for possible future extensions.

MR_trail_function()
Prototype:
          typedef enum {
                  MR_undo,
                  MR_exception,
                  MR_retry,
                  MR_commit,
                  MR_solve,
                  MR_gc
          } MR_untrail_reason;
          
          void MR_trail_function(
                  void (*untrail_func)(MR_Word, MR_untrail_reason),
                  void *value
          );
     

A call to MR_trail_function(untrail_func, value) adds an entry to the function trail. The Mercury implementation ensures that if future execution ever backtracks to the current choicepoint, or backtracks past the current choicepoint to some earlier choicepoint, then (*untrail_func)(value, reason) will be called, where reason will be MR_undo if the backtracking was due to a goal failing, MR_exception if the backtracking was due to a goal throwing an exception, or MR_retry if the backtracking was due to the use of the “retry” command in mdb, the Mercury debugger, or any similar user request in a debugger. The Mercury implementation also ensures that if the current choice point is pruned because execution commits to never backtracking to it, then (*untrail_func)(value, MR_commit) will be called. It also ensures that if execution requires that the current goal be solvable, then (*untrail_func)(value, MR_solve) will be called. This happens in calls to solutions/2, for example. (MR_commit is used for “hard” commits, i.e. when we commit to a solution and prune away the alternative solutions; MR_solve is used for “soft” commits, i.e. when we must commit to a solution but do not prune away all the alternatives.)

MR_gc is currently not used — it is reserved for future use.

Typically if the untrail_func is called with reason being MR_undo, MR_exception, or MR_retry, then it should undo the effects of the update(s) specified by value, and then free any resources associated with that trail entry. If it is called with reason being MR_commit or MR_solve, then it should not undo the update(s); instead, it may check for floundering (see the next section). In the MR_commit case it may, in some cases, be possible to also free resources associated with the trail entry. If it is called with anything else (such as MR_gc), then it should probably abort execution with an error message.
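
The typical shape of such a function is sketched below for a destructive update to a single int. This is an illustration under stated assumptions, not code from the Mercury distribution: the names saved_update, update_int, and untrail_int are hypothetical, the MR_Word typedef and the MR_untrail_reason enum are reproduced here as stand-ins so that the fragment compiles on its own, and the call to MR_trail_function() that would register the entry is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-ins so that the sketch compiles without the Mercury runtime. */
typedef uintptr_t MR_Word;
typedef enum {
    MR_undo, MR_exception, MR_retry, MR_commit, MR_solve, MR_gc
} MR_untrail_reason;

/* Hypothetical record describing one destructive update to an int. */
struct saved_update {
    int *target;
    int  old_value;
};

/*
 * Save the old value, then perform the update. The returned record is
 * what would be passed as the value argument of MR_trail_function().
 */
struct saved_update *update_int(int *target, int new_value)
{
    struct saved_update *saved = malloc(sizeof *saved);

    assert(saved != NULL);
    saved->target = target;
    saved->old_value = *target;
    *target = new_value;
    return saved;
}

/* Untrail function following the pattern described in the text. */
void untrail_int(MR_Word value, MR_untrail_reason reason)
{
    struct saved_update *saved = (struct saved_update *) value;

    switch (reason) {
    case MR_undo:
    case MR_exception:
    case MR_retry:
        *saved->target = saved->old_value;  /* undo the update */
        free(saved);                        /* release the entry's resources */
        break;
    case MR_commit:
    case MR_solve:
        /* Keep the update; a floundering check could go here. */
        break;
    default:
        fprintf(stderr, "untrail_int: unexpected untrail reason\n");
        abort();
    }
}
```

The record is freed only in the three backtracking cases, since after a commit the same entry may still be needed if execution later backtracks to an earlier choice point.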



14.9.4 Delayed goals and floundering

Another use for the function trail is to check for floundering in the presence of delayed goals.

Often, when implementing certain kinds of constraint solvers, it may not be possible to actually solve all of the constraints at the time they are added. Instead, it may be necessary to simply delay their execution until a later time, in the hope the constraints may become solvable when more information is available. If you do implement a constraint solver with these properties, then at certain points in the computation (for example, after executing a negated goal) it is important for the system to check that there are no outstanding delayed goals which might cause failure, before execution commits to this execution path. If there are any such delayed goals, the computation is said to “flounder”. If the check for floundering were omitted, then it could lead to unsound behaviour, such as a negation failing even though logically speaking it ought to have succeeded.

The check for floundering can be implemented using the function trail, by simply calling MR_trail_function() to add a function trail entry whenever you create a delayed goal, and putting the appropriate check for floundering in the MR_commit and MR_solve cases of your function. (The Mercury distribution includes some examples of this: see the ML_cfloat_untrail_func() function in the file extras/clpr/cfloat.m and the ML_var_untrail_func() function in the file extras/trailed_update/var.m.) If your function does detect floundering, then it should print an error message and then abort execution.



14.9.5 Avoiding redundant trailing

If a mutable data structure is updated multiple times, and each update is recorded on the trail using the functions described above, then some of this trailing may be redundant. It is generally not necessary to record enough information to recover the original state of the data structure for every update on the trail; instead, it is enough to record the original state of each updated data structure just once for each choice point occurring after the data structure is allocated, rather than once for each update.

The functions described below provide a means to avoid redundant trailing.

MR_ChoicepointId
Declaration:
          typedef ... MR_ChoicepointId;
     

The type MR_ChoicepointId is an abstract type used to hold the identity of a choice point. Values of this type can be compared using C's == operator or using MR_choicepoint_newer().

     
     

MR_current_choicepoint_id()
Prototype:
          MR_ChoicepointId MR_current_choicepoint_id(void);
     

MR_current_choicepoint_id() returns a value indicating the identity of the most recent choice point; that is, the point to which execution would backtrack if the current computation failed. The value remains meaningful if the choicepoint is pruned away by a commit, but is not meaningful after backtracking past the point where the choicepoint was created (since choicepoint ids may be reused after backtracking).

     
     

MR_null_choicepoint_id()
Prototype:
          MR_ChoicepointId MR_null_choicepoint_id(void);
     

MR_null_choicepoint_id() returns a “null” value that is distinct from any value ever returned by MR_current_choicepoint_id. (Note that MR_null_choicepoint_id() is a macro that is guaranteed to be suitable for use as a static initializer, so that it can for example be used to provide the initial value of a C global variable.)

     
     

MR_choicepoint_newer()
Prototype:
          bool MR_choicepoint_newer(MR_ChoicepointId, MR_ChoicepointId);
     

MR_choicepoint_newer(x, y) is true iff the choicepoint indicated by x is newer than (i.e. was created more recently than) the choicepoint indicated by y. The null MR_ChoicepointId is considered older than any non-null MR_ChoicepointId. If either of the choice points has been backtracked over, the behaviour is undefined.

The way these functions are generally used is as follows. When you create a mutable data structure, you should call MR_current_choicepoint_id() and save the value it returns as a prev_choicepoint field in your data structure. When you are about to modify your mutable data structure, you can then call MR_current_choicepoint_id() again and compare the result from that call with the value saved in the prev_choicepoint field in the data structure using MR_choicepoint_newer(). If the current choicepoint is newer, then you must trail the update, and update the prev_choicepoint field with the new value; furthermore, you must also trail the update to the prev_choicepoint field itself, so that its previous value is restored on backtracking. But if the value returned by MR_current_choicepoint_id() is not newer than the prev_choicepoint field, then you can safely perform the update to your data structure without trailing it.
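The effect of this protocol can be illustrated with a small simulation in plain C. This is a toy model under stated assumptions, not the Mercury runtime: the trail, the integer choice point counter and every toy_ name below are hypothetical stand-ins for MR_ChoicepointId, MR_current_choicepoint_id(), MR_choicepoint_newer() and the value trail.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT the Mercury runtime API. */
enum { TOY_TRAIL_MAX = 64 };

typedef struct { long *addr; long old; } ToyEntry;

static ToyEntry toy_trail[TOY_TRAIL_MAX];
static size_t   toy_trail_top = 0;
static long     toy_choicepoint = 0;    /* 0 plays the role of the null id */

typedef struct {
    long prev_choicepoint;  /* choice point at which data was last trailed */
    long data;
} ToyIntRef;

/* Record the current contents of *addr so backtracking can restore it. */
static void
toy_trail_value(long *addr)
{
    assert(toy_trail_top < TOY_TRAIL_MAX);
    toy_trail[toy_trail_top].addr = addr;
    toy_trail[toy_trail_top].old = *addr;
    toy_trail_top++;
}

/* Create a choice point; returns the trail height to restore to. */
static size_t
toy_push_choicepoint(void)
{
    toy_choicepoint++;
    return toy_trail_top;
}

/* Backtrack: undo entries newest first, then discard the choice point. */
static void
toy_backtrack(size_t saved_top)
{
    while (toy_trail_top > saved_top) {
        toy_trail_top--;
        *toy_trail[toy_trail_top].addr = toy_trail[toy_trail_top].old;
    }
    toy_choicepoint--;
}

/* Update x, trailing only if a newer choice point has appeared. */
static void
toy_update(ToyIntRef *x, long new_value)
{
    if (toy_choicepoint > x->prev_choicepoint) {
        toy_trail_value(&x->data);
        toy_trail_value(&x->prev_choicepoint);
        x->prev_choicepoint = toy_choicepoint;
    }
    x->data = new_value;
}

/* Two updates after one choice point cost two trail entries, not four. */
static size_t
toy_demo(void)
{
    ToyIntRef r = { 0, 10 };
    size_t mark = toy_push_choicepoint();
    size_t used;

    toy_update(&r, 11);     /* trailed: data and prev_choicepoint */
    toy_update(&r, 12);     /* same choice point: not trailed */
    used = toy_trail_top - mark;
    toy_backtrack(mark);
    assert(r.data == 10 && r.prev_choicepoint == 0);
    return used;
}
```

In this model the second update is free because prev_choicepoint already names the current choice point, and backtracking restores both fields from the two entries recorded by the first update.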

If your mutable data structure is a C global variable, then you can use MR_null_choicepoint_id() for the initial value of the prev_choicepoint field. Likewise, if your mutable data structure is created by a predicate or function that uses tabled evaluation (see Tabled evaluation), then you should use MR_null_choicepoint_id() for the initial value of the field. Doing so will ensure that the data will be reset to its initial value if execution backtracks to a point before the mutable data structure was created, which is important because this copy of the mutable data structure will be tabled and will therefore be produced again if later execution attempts to create another instance of it.

For an example of avoiding redundant trailing, see the sample module below.

Note that there is a cost to this — you have to include an extra field in your data structure for each part of the data structure which you might update, you need to perform a test for each update to decide whether or not to trail it, and if you do need to trail the update, then you have an extra field that you need to trail. Whether or not the benefits from avoiding redundant trailing outweigh these costs will depend on your application.

     :- module trailing_example.
     :- interface.
     
     :- type int_ref.
     
         % Create a new int_ref with the specified value.
     :- pred new_int_ref(int_ref::uo, int::in) is det.
     
         % update_int_ref(Ref0, Ref, OldVal, NewVal):
         % Ref0 has value OldVal and Ref has value NewVal.
     :- pred update_int_ref(int_ref::mdi, int_ref::muo,
             int::out, int::in) is det.
     
     :- implementation.
     
     :- type int_ref ---> int_ref(c_pointer).
     
     :- pragma import(new_int_ref(uo, in),
     	"new_int_ref").
     :- pragma import(update_int_ref(mdi, muo, out, in),
     	"update_int_ref").
     
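     :- pragma c_header_code("
     #include <assert.h>   /* assert(), used in update_int_ref() */
     #include <stdlib.h>   /* malloc(), used in new_int_ref() */
     typedef MR_Word Mercury_IntRef;
     void new_int_ref(Mercury_IntRef *ref, MR_Integer value);
     void update_int_ref(Mercury_IntRef ref0, Mercury_IntRef *ref,
         MR_Integer *old_value, MR_Integer new_value);
     ").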
     
     :- pragma c_code("
     typedef struct {
         MR_ChoicepointId prev_choicepoint;
         MR_Integer data;
     } C_IntRef;
     
     void
     new_int_ref(Mercury_IntRef *ref, MR_Integer value)
     {
         C_IntRef *x = malloc(sizeof(C_IntRef));
         x->prev_choicepoint = MR_current_choicepoint_id();
         x->data = value;
         *ref = (Mercury_IntRef) x;
     }
     
     void
     update_int_ref(Mercury_IntRef ref0, Mercury_IntRef *ref,
         MR_Integer *old_value, MR_Integer new_value)
     {
         C_IntRef *x = (C_IntRef *) ref0;
         *old_value = x->data;
     
         /* check whether we need to trail this update */
         if (MR_choicepoint_newer(MR_current_choicepoint_id(),
             x->prev_choicepoint))
         {
             /* trail both x->data and x->prev_choicepoint,
                since we're about to update them both */
             assert(sizeof(x->data) == sizeof(MR_Word));
             assert(sizeof(x->prev_choicepoint) == sizeof(MR_Word));
             MR_trail_current_value((MR_Word *)&x->data);
             MR_trail_current_value((MR_Word *)&x->prev_choicepoint);
     
             /* update x->prev_choicepoint to indicate that
                x->data's previous value has been trailed
                at this choice point */
             x->prev_choicepoint = MR_current_choicepoint_id();
         }
         x->data = new_value;
         *ref = ref0;
     }
     ").


15 Impurity declarations

In order to efficiently implement certain predicates, it is occasionally necessary to venture outside pure logic programming. Other predicates cannot be implemented at all within the paradigm of logic programming, for example, all solutions predicates. Such predicates are often written using the C interface. Sometimes, however, it would be more convenient, or more efficient, to write such predicates using the facilities of Mercury. For example, it is much more convenient to access arguments of compound Mercury terms in Mercury than in C, and the ability of the Mercury compiler to specialize code can make higher-order predicates written in Mercury significantly more efficient than similar C code.

One important aim of Mercury's impurity system is to make the distinction between the pure and impure code very clear. This is done by requiring every impure predicate or function to be so declared, and by requiring every call to an impure predicate or function to be flagged as such. Predicates or functions that are implemented in terms of impure predicates or functions are assumed to be impure themselves unless they are explicitly promised to be pure.

Please note that the facilities described here are needed only very rarely. The main intent is for implementing language primitives such as the all solutions predicates, or for implementing interfaces to foreign language libraries using the foreign language interface. Any other use of impure or semipure probably indicates either a weakness in the Mercury standard library, or the programmer's lack of familiarity with the standard library. Newcomers to Mercury are hence encouraged to skip this section.


15.1 Choosing the right level of purity

Mercury distinguishes three “levels” of purity:

pure
For pure procedures, the set of solutions depends only on the values of the input arguments. They do not interact with the “real” world (i.e., do any input/output) without taking an io.state (see Types) as input and returning one as output, and do not change the value of any data structure that will not be undone on backtracking (unless the data structure would be unreachable on backtracking). Note that equality axioms are important when considering the value of data structures. The declarative semantics of pure predicates is never affected by the invocation of other predicates. It is possible for the invocation of pure predicates to affect the operational behaviour of non-pure predicates and vice versa.

By default, Mercury predicates and functions are pure. Without using the foreign language interface, writing mode-specific clauses, or calling other impure predicates and functions, it is impossible to write impure code in Mercury.

semipure
Semipure predicates are just like pure predicates, except that their declarative semantics may be affected by the invocation of impure predicates. That is, they are sensitive to the state of the computation other than as reflected by their input arguments, though they do not affect the state themselves.
impure
Impure predicates may perform I/O or modify hidden state, even if these side effects alter the operational semantics of other code. However, impure predicates may not change the declarative semantics of pure code. They must be type-, mode-, determinism- and uniqueness-correct.


15.2 Purity ordering

The three levels of purity have a total ordering defined upon them (which we will simply call the purity), where pure > semipure > impure.


15.3 Semantics

It is important to the proper operation of impure and semipure code, to the flexibility of the compiler to optimize pure code, and to the semantics of the Mercury language, that a clear distinction be drawn between ordinary Mercury code and imperative code written with Mercury syntax. How this distinction is drawn will be explained below; the purpose of this section is to explain the semantics of programs with impure predicates.

A declarative semantics of impure Mercury code would be largely useless, because the declarative semantics cannot capture the intent of the programmer. Impure predicates are executed for their side-effects, which by definition are not part of their declarative semantics. Thus it is the operational semantics of impure predicates that Mercury must specify, and Mercury compilers must respect.

The operational semantics of a Mercury predicate which invokes impure code is a modified form of the strict sequential semantics (see Semantics). Impure goals may not be reordered relative to any other goals; not even “minimal” reordering as implied by the modes is permitted. If any such reordering is needed, this is a mode error. However, pure and semipure goals may be reordered as the compiler desires (within the bounds of the semantics the user has specified for the program) as long as they are not moved across an impure goal. Execution of impure goals is strict: they must be executed if they are reached, even if it can be determined that the computation cannot lead to successful termination.

Semipure goals can be given a “contextual” declarative semantics. They cannot have any side-effects, so it is expected that, given the context in which they are called (relative to any impure goals in the program), their declarative semantics fully captures the intent of the programmer. Thus a semipure goal has a perfectly consistent declarative semantics, until an impure goal is reached. After that, it has another (possibly different) declarative semantics, until the next impure goal is executed, and so on. Mercury compilers must respect this contextual nature of the semantics of semipure goals; within a single context, a compiler may treat a semipure goal as if it were pure.


15.4 Declaring impure functions and predicates

Every Mercury predicate or function has exactly two purity values associated with it. One is the declared purity of the predicate or function, which is given by the programmer. The other value is the inferred purity, which is calculated from the purity of goals in the body of the predicate or function.

A predicate is declared to be impure or semipure by preceding the word pred in its pred declaration with impure or semipure, respectively. Similarly, a function is declared impure or semipure by preceding the word func in its func declaration with impure or semipure. That is, a declaration of the form:

     :- impure pred Pred(Arguments...).
     :- semipure pred Pred(Arguments...).

or

     :- impure func Func(Arguments...) = Result.
     :- semipure func Func(Arguments...) = Result.

declares the predicate Pred or the function Func to be impure or semipure, according to the keyword used.

Type class methods may also be declared as impure or semipure by preceding the word pred or func with the appropriate purity level. An instance of the type class must provide method implementations that are at least as pure as the method declaration.


15.5 Marking a call as impure

Every call to a Mercury predicate or function also has exactly two purity values associated with it. One is the declared purity of the call, which is given by the programmer as an annotation of the call. The other value is the inferred purity, which is the purity of the predicate or function.

It is an error for the declared purity of a goal to be more pure than the inferred purity; the compiler should flag this as an error. The compiler should issue a warning if the declared purity of a goal is less pure than its inferred purity.

If a predicate is impure or semipure, all calls to it must be preceded with the word impure or semipure, respectively.

If a function is impure or semipure, it must be called as part of a simple unification with a variable, and this unification must be prefixed by the word impure or semipure, respectively.

Note that only predicate calls and unifications of variables with functions need to (and are permitted to) be prefixed with impure or semipure. Compound goals never need this. See Impurity Example for an example of this syntax.

The requirement that impure or semipure calls be marked with impure or semipure allows someone reading the code to tell which goals are not pure, making code which relies on side effects somewhat less mysterious. Furthermore, it means that if a call is not preceded by impure or semipure, then the reader can rely on the call having a proper declarative semantics, without hidden side-effects.


15.6 Promising that a predicate is pure

Predicates that are implemented in terms of impure or semipure predicates are assumed to have the lowest purity found among the goals in their body. The declared purity of a predicate must not be more pure than the inferred purity; if it is, the compiler must generate an error. If the declared purity is less pure than the inferred purity, the compiler should issue a warning (this is similar to the above case for goals). Because the inferred purity of the predicate is calculated from the declared purity of the calls it executes, the lowest purity bound is propagated up from callee to caller through the program.
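The rule can be modeled in a few lines of C. This is an illustrative sketch only, not how the Mercury compiler is implemented: the Purity and PurityCheck types and the functions infer_purity() and check_declared() are hypothetical names, and the model deliberately ignores purity promises.

```c
#include <assert.h>
#include <stddef.h>

/* Encodes the ordering pure > semipure > impure. */
typedef enum { IMPURE = 0, SEMIPURE = 1, PURE = 2 } Purity;

typedef enum { PURITY_OK, PURITY_WARNING, PURITY_ERROR } PurityCheck;

/* Inferred purity: the lowest purity among the goals in the body.
   A body with no goals is pure. */
static Purity
infer_purity(const Purity *goals, size_t n)
{
    Purity lowest = PURE;
    size_t i;

    for (i = 0; i < n; i++) {
        if (goals[i] < lowest) {
            lowest = goals[i];
        }
    }
    return lowest;
}

/* Compare the declared purity against the inferred purity. */
static PurityCheck
check_declared(Purity declared, Purity inferred)
{
    if (declared > inferred) {
        return PURITY_ERROR;    /* claims more purity than the body has */
    }
    if (declared < inferred) {
        return PURITY_WARNING;  /* declaration is needlessly pessimistic */
    }
    return PURITY_OK;
}

/* A body containing one semipure goal is inferred to be semipure. */
static void
purity_demo(void)
{
    const Purity body[] = { PURE, SEMIPURE, PURE };
    Purity inferred = infer_purity(body, 3);

    assert(inferred == SEMIPURE);
    assert(check_declared(PURE, inferred) == PURITY_ERROR);
    assert(check_declared(SEMIPURE, inferred) == PURITY_OK);
    assert(check_declared(IMPURE, inferred) == PURITY_WARNING);
}
```

Because a caller's inferred purity is computed from the declared purity of its callees, applying infer_purity() at each level is what carries the lowest bound upwards through the program.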

In some cases the impurity of a predicate's body is an implementation detail which should not be exposed to callers. These predicates are pure or semipure even though they call impure or semipure predicates. The only way for the programmer to stop the propagation of impurity is to explicitly promise that the predicate or function is pure or semipure.

Of course, the Mercury compiler cannot verify that the predicate's purity matches the promise, so it is the programmer's responsibility to ensure this. If a predicate is promised pure or semipure and is not, the behaviour of the program is undefined.

The programmer may promise that a predicate or function is pure or semipure using the promise_pure and promise_semipure pragmas:

     :- pragma promise_pure(Name/Arity).
     :- pragma promise_semipure(Name/Arity).

Programmers should be very careful about mixing code that is promised pure with impure predicates or functions that may manipulate the same hidden state (for example, the impure predicates used to implement a predicate that is promised pure); the promise_pure declaration is supposed to promise that impure code cannot change the declarative semantics of pure code. The module system can be used to minimize the possibility of making errors with such code, by keeping impure predicates or functions behind the interface where code is promised pure.


15.7 An example using impurity

The following example illustrates how a pure predicate may be implemented using impure code. Note that this code is not reentrant, and so is not useful as is. It is meant only as an example.

     :- pragma c_header_code("#include <limits.h>").
     :- pragma c_header_code("MR_Integer max;").
     
     :- impure pred init_max is det.
     :- pragma c_code(init_max,
             [will_not_call_mercury],
             "max = INT_MIN;").
     
     :- impure pred set_max(int::in) is det.
     :- pragma c_code(set_max(X::in),
             [will_not_call_mercury],
             "if (X > max) max = X;").
     
     :- semipure func get_max = (int::out) is det.
     :- pragma c_code(get_max = (X::out),
             [will_not_call_mercury],
             "X = max;").
     
     :- pragma promise_pure(max_solution/2).
     :- pred max_solution(pred(int), int).
     :- mode max_solution(pred(out) is multi, out) is det.
     
     max_solution(Generator, Max) :-
             impure init_max,
             (   Generator(X),
                 impure set_max(X),
                 fail
             ;   semipure Max = get_max
             ).


15.8 Using impurity with higher-order code

Higher-order code can manipulate impure or semipure predicates and functions, provided that explicit purity annotations are used in three places: on the higher-order types, on lambda expressions, and on higher-order calls. (There are no purity annotations on higher-order insts and modes, however.)


15.8.1 Purity annotations on higher-order types

Ordinary higher-order types, such as pred(T1, T2) and func(T1, T2) = T, represent only pure predicates or pure functions. But for each ordinary higher-order type Foo, there are two corresponding types semipure Foo and impure Foo. These types can be used for higher-order code that needs to manipulate impure or semipure procedures. For example the type impure func(int) = int represents impure functions from int to int.

There are no implicit conversions and no subtyping relationship between ordinary higher-order types and the corresponding impure or semipure higher-order types. However, a value of an ordinary higher-order type can be explicitly “converted” to a value of an impure (or semipure) higher-order type by wrapping it in an impure (or semipure) lambda expression that just calls the pure higher-order term.


15.8.2 Purity annotations on lambda expressions

Purity annotations are required on lambda expressions that call semipure or impure code. Lambda expressions can be declared as semipure or impure by including such an annotation before the pred or func identifier in the lambda expression. Such lambda expressions have the corresponding semipure or impure higher-order type. For example, the expression

     	(impure func(X) = Y :- semipure get_max(Y), impure set_max(X))

is an example of an impure function lambda expression with type (impure func(int) = int), and the expression

     	(impure pred(X::in, Y::out) is det :-
     		semipure get_max(Y),
     		impure set_max(X))

is an example of an impure predicate lambda expression with type impure pred(int, int).


15.8.3 Purity annotations on higher-order calls

Any calls to impure or semipure higher-order terms must be explicitly annotated as such. For impure or semipure higher-order predicates, the annotation is indicated by putting impure or semipure before the call. For example:

     	:- func foo(impure pred(int)) = int.
     	:- mode foo(in(pred(out) is det)) = out is det.
     
     	foo(ImpurePred) = X1 + X2 :-
     		% using higher-order syntax
     		impure ImpurePred(X1),
     		% using the call/N syntax
     		impure call(ImpurePred, X2).

For calling impure or semipure higher-order functions, the notation is different from what you might expect. In addition to using an impure or semipure operator on the unification which invokes the higher-order function application, you must also use impure_apply or semipure_apply rather than using apply or higher-order syntax. For example:

     	:- impure func map(impure func(T1) = T2, list(T1)) = list(T2).
     
     	map(_ImpureFunc, []) = [].
     	map(ImpureFunc, [X|Xs]) = [Y|Ys] :-
     		impure Y = impure_apply(ImpureFunc, X),
     		impure Ys = map(ImpureFunc, Xs).


16 Pragmas

The pragma declarations described below are a standard part of the Mercury language, as are the pragmas for controlling the C interface (see C interface) and impurity (see Impurity). As an extension, implementations may also choose to support additional pragmas with implementation-dependent semantics (see Implementation-dependent extensions).


16.1 Inlining

A declaration of the form

     :- pragma inline(Name/Arity).

is a hint to the compiler that all calls to the predicate(s) or function(s) with name Name and arity Arity should be inlined.

The current Mercury implementation is smart enough to inline simple predicates even without this hint.

A declaration of the form

     :- pragma no_inline(Name/Arity).

ensures the compiler will not inline this predicate. This may be used simply for performance concerns (inlining can cause unwanted code bloat in some cases) or to prevent possibly dangerous inlining when using low-level C code.


16.2 Type specialization

The overhead of polymorphism can in some cases be significant, especially where polymorphic predicates make heavy use of class method calls or the built-in unification and comparison routines. To avoid this, the programmer can suggest to the compiler that a specialized version of a procedure should be created for a specific set of argument types.


16.2.1 Syntax and semantics of type specialization pragmas

A declaration of the form

     :- pragma type_spec(Name/Arity, Subst).
     :- pragma type_spec(Name(Modes), Subst).

suggests to the compiler that a specialized version of predicate(s) or function(s) with name Name and arity Arity should be created with the type substitution given by Subst applied to the argument types. The second form of the declaration only suggests specialization of the specified mode of the predicate or function.

The substitution is written as a conjunction of bindings of the form TypeVar = Type, for example K = int or (K = int, V = list(int)).

The declarations

     :- pred map.lookup(map(K, V), K, V).
     :- pragma type_spec(map.lookup/3, K = int).

give a hint to the compiler that a version of map.lookup/3 should be created for integer keys.

Implementations are free to ignore pragma type_spec declarations. Implementations are also free to perform type specialization even in the absence of any pragma type_spec declarations.


16.2.2 When to use type specialization

The set of types for which a predicate or function should be specialized is best determined by profiling your application. Overuse of type specialization will result in code bloat.

Type specialization of predicates or functions which unify or compare polymorphic variables is most effective when the specialized types are built-in types such as int, float and string, or enumeration types, since their unification and comparison procedures are simple and can be inlined.

Predicates or functions which make use of type class method calls may also be candidates for specialization. Again, this is most effective when the called type class methods are simple enough to be inlined.


16.2.3 Implementation specific details

The University of Melbourne Mercury compiler performs user-requested type specializations when invoked with --user-guided-type-specialization, which is enabled at optimization level -O2 or higher. However, for the Java back-end, user-requested type specializations are ignored.


16.3 Obsolescence

A declaration of the form

     :- pragma obsolete(Name/Arity).

declares that the predicate(s) or function(s) with name Name and arity Arity are “obsolete”: it instructs the compiler to issue a warning whenever the named predicate(s) or function(s) are used.

pragma obsolete declarations are intended for use by library developers, to allow gradual (rather than abrupt) evolution of library interfaces. If a library developer changes the interface of a library predicate, they should leave the old version of that predicate in the library, but mark it as obsolete using a pragma obsolete declaration, and document how library users should modify their code to suit the new interface. The users of the library will then get a warning if they use obsolete features, and can consult the library documentation to determine how to fix their code. Eventually, when the library developer deems that users have had sufficient warning, they can remove the old version entirely.


16.4 Source file name

The source_file pragma and #line directives provide support for preprocessors and other tools that generate Mercury code. The tool can insert these directives into the generated Mercury code to allow the Mercury compiler to report diagnostics (error and warning messages) at the original source code location, rather than at the location in the automatically generated Mercury code.

A source_file pragma is a declaration of the form

     :- pragma source_file(Name).

where Name is a string that specifies the name of the source file.

For example, if a preprocessor generated a file foo.m based on an input file foo.m.in, and it copied lines 20, 30, and 31 from foo.m.in, the following directives would ensure that any errors or warnings for those lines copied from foo.m.in were reported at their original source locations in foo.m.in.

     :- module foo.
     :- pragma source_file("foo.m.in").
     #20
     % this line comes from line 20 of foo.m.in
     #30
     % this line comes from line 30 of foo.m.in
     % this line comes from line 31 of foo.m.in
     :- pragma source_file("foo.m").
     #10
     % this automatically generated line is line 10 of foo.m

Note that if a generated file contains some text which is copied from a source file, and some which is automatically generated, it is a good idea to use pragma source_file and #line directives to reset the source file name and line number to point back to the generated file for the automatically generated text, as in the above example.


17 Implementation-dependent extensions

The University of Melbourne Mercury implementation supports the following extensions to the Mercury language:


17.1 Fact tables

Large tables of facts can be compiled using a different algorithm that is more efficient and produces more efficient code.

A declaration of the form

     :- pragma fact_table(Name/Arity, FileName).

tells the compiler that the predicate or function with name Name and arity Arity is defined by a set of facts in an external file FileName. Defining large tables of facts in this way allows the compiler to use a more efficient algorithm for compiling them. This algorithm uses less memory than would normally be required to compile the facts, so much larger tables are possible.

Each mode is indexed on all its input arguments so the compiler can produce very efficient code using this technique.

In the current implementation, the table of facts is compiled into a separate C file named FileName.c. The compiler will automatically generate the correct dependencies for this file when the command mmake main_module.depend is invoked. This ensures that the C file will be compiled to FileName.o and then linked with the other object files when mmake main_module is invoked.

The main limitation of the fact_table pragma is that in the current implementation, predicates or functions defined as fact tables can only have arguments of types string, int or float.

Another limitation is that the --high-level-code back-end does not support pragma fact_table for procedures with determinism nondet or multi.


17.2 Tabled evaluation

(Note: “Tabled evaluation” has no relation to the “fact tables” described above.)

Ordinarily, the results of each procedure call are not recorded; if the same procedure is called with the same arguments, then the answer(s) must be recomputed. For some procedures, this recomputation can be very wasteful.

With tabled evaluation, the implementation keeps a table containing the previously computed results of the specified procedure; at each procedure call, the implementation will search the table to check whether the answer(s) have already been computed and if so, the answers will be returned directly from the tables rather than being recomputed. This can result in much faster execution, at the cost of additional space to record answers in the table.

The implementation can optionally also check at runtime for the situation where a procedure calls itself recursively with the same arguments, which would normally result in an infinite loop; if this situation is encountered, it can (at the programmer's option) either throw an exception, or avoid the infinite loop by computing solutions using the “minimal model” semantics.

The current Mercury implementation thus supports three different pragmas for tabling, to cover these three cases: pragma memo does no loop checking, pragma loop_check checks for loops and throws an exception if a loop is detected, while pragma minimal_model computes the “minimal model” semantics.
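The first two behaviours can be modeled by hand in plain C. The following is a toy sketch under stated assumptions, not the code the Mercury compiler generates: the table layout and the names toy_fib(), toy_chase() and toy_loop_detected are hypothetical, and the loop check sets a flag where pragma loop_check would throw an exception.

```c
#include <assert.h>

enum { TOY_TABLE_SIZE = 64 };

typedef enum { SLOT_EMPTY, SLOT_ACTIVE, SLOT_DONE } SlotState;

typedef struct { SlotState state; long answer; } Slot;

/* Memo-style tabling: each answer is computed once, then looked up. */
static Slot fib_table[TOY_TABLE_SIZE];
static long fib_body_calls = 0;     /* how often the body actually ran */

static long
toy_fib(int n)
{
    Slot *slot;

    assert(n >= 0 && n < TOY_TABLE_SIZE);
    slot = &fib_table[n];
    if (slot->state == SLOT_DONE) {
        return slot->answer;        /* answer comes from the table */
    }
    fib_body_calls++;
    slot->answer = (n < 2) ? n : toy_fib(n - 1) + toy_fib(n - 2);
    slot->state = SLOT_DONE;
    return slot->answer;
}

/* Loop-check-style tabling: detect a call with the same argument while
   an earlier call with that argument is still being evaluated. */
static const int toy_next[4] = { 1, 2, 0, -1 };  /* 0 -> 1 -> 2 -> 0 cycles */
static SlotState chase_state[4];
static int toy_loop_detected = 0;

/* Follows toy_next from node i; returns the last node, or -1 on a loop. */
static int
toy_chase(int i)
{
    int result;

    assert(i >= 0 && i < 4);
    if (chase_state[i] == SLOT_ACTIVE) {
        toy_loop_detected = 1;      /* would otherwise recurse forever */
        return -1;
    }
    chase_state[i] = SLOT_ACTIVE;
    result = (toy_next[i] < 0) ? i : toy_chase(toy_next[i]);
    chase_state[i] = SLOT_EMPTY;
    return result;
}
```

In this model toy_fib(30) runs its body once for each argument from 0 to 30 instead of an exponential number of times, which is the time-for-space trade described above, and toy_chase(0) reports a loop rather than failing to terminate.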

Warning: The current implementation of pragma minimal_model is broken: the generated code sometimes produces incorrect results. It should not be used. Also the current implementation of all three pragmas is broken for procedures with determinism nondet or multi. The pragma memo and pragma loop_check declarations should not be used on such procedures.

The syntax for each of these declarations is

     :- pragma memo(Name/Arity).
     :- pragma loop_check(Name/Arity).
     :- pragma minimal_model(Name/Arity).

where Name/Arity specifies the predicate or function to which the declaration applies. The declaration applies to all modes of the predicate and/or function named. At most one of these declarations may be specified for any given predicate or function.

Note that a pragma minimal_model declaration changes the declarative semantics of the specified predicate or function: instead of using the completion of the clauses as the basis for the semantics, as is normally the case in Mercury, the declarative semantics that is used is the “minimal model” semantics. Anything which is true or false in the completion semantics is also true or false (respectively) in the minimal model semantics, but there are goals for which the minimal model specifies that the result is true or false, whereas the completion semantics leaves the result unspecified. For these goals, the usual Mercury semantics requires the implementation to either loop or report an error message, but the minimal model semantics requires a particular answer to be returned. In particular, the minimal model semantics says that any call that is not true in all models is false.

Programmers should therefore use a pragma minimal_model declaration only in cases where their intended interpretation for a procedure coincides with the minimal model for that procedure. Fortunately, however, this is usually what programmers intend.

For more information on tabling, see K. Sagonas's PhD thesis The SLG-WAM: A Search-Efficient Engine for Well-Founded Evaluation of Normal Logic Programs. See [4]. The operational semantics of procedures with a pragma minimal_model declaration corresponds to what Sagonas calls “SLGd resolution”.

In the general case, the execution mechanism required by minimal model tabling is quite complicated, requiring the ability to delay goals and then wake them up again. The Mercury implementation uses a technique based on copying relevant parts of the stack to the heap when delaying goals, similar to the one described in CAT: the copying approach to tabling, by B. Demoen and K. Sagonas. See [5]. This ensures that code which does not use tabling does not pay any runtime overheads from the more complicated execution mechanism required by (minimal model) tabling.

Please note: the current implementation of tabling does not support all the possible compilation grades (see the “Compilation model options” section of the Mercury User's Guide) allowed by the Mercury implementation. In particular, minimal model tabling is incompatible with high level code, the use of trailing, and accurate garbage collection.

Reminder: the current implementation of pragma minimal_model is broken, and the current implementation of pragma memo and pragma loop_check is broken for procedures with determinism nondet or multi.



17.3 Termination analysis

The compiler includes a termination analyser which can be used to prove termination of predicates and functions. Details of the analysis are available in “Termination Analysis for Mercury” by Chris Speirs, Zoltan Somogyi and Harald Sondergaard. See [1].

The analysis is based around an algorithm proposed by Gerhard Groger and Lutz Plumer in their paper “Handling of mutual recursion in automatic termination proofs for logic programs.” See [2].

For an introduction to termination analysis for logic programs, please refer to “Termination Analysis for Logic Programs” by Chris Speirs. See [3].

Information about the termination properties of a predicate or function can be given to the compiler. Pragmas are also available to require the compiler to prove termination of a given predicate or function, or to give an error message if it cannot do so.

The analyser is enabled by the option --enable-termination, which can be abbreviated to --enable-term. When termination analysis is enabled, any predicates or functions with a check_termination pragma defined on them will have their termination checked, and if termination cannot be proved, the compiler will emit an error message detailing the reason that termination could not be proved.

The option --check-termination, which may be abbreviated to --check-term or --chk-term, forces the compiler to check the termination of all predicates and functions in the module. It is common for the compiler to be unable to prove termination of some predicates and functions, either because they call other predicates which could not be proved to terminate or because they use language features (such as higher order calls) which cannot be usefully analysed. For these predicates and functions, the compiler emits a warning only if the --verbose-check-termination option is enabled. For every other predicate or function whose termination the compiler cannot prove, a warning message is emitted, but compilation continues. The --check-termination option implies the --enable-termination option.

The accuracy of the termination analysis is substantially degraded if intermodule optimization is not enabled. Unless intermodule optimization is enabled, the compiler must assume that any imported predicate may not terminate.

By default, the compiler assumes that a procedure defined using the foreign language interface will terminate for all input if it does not call Mercury. If it does call Mercury then by default the compiler will assume that it may not terminate.

The foreign code attributes terminates/does_not_terminate may be used to force the compiler to treat a foreign_proc as terminating/non-terminating irrespective of whether it calls Mercury. As a matter of style, it is preferable to use foreign code attributes for foreign_procs rather than the termination pragmas described below.
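
A sketch of the attribute form follows. The predicate c_square is a hypothetical example. It is declared may_call_mercury, so by default the compiler would assume that it may not terminate; the terminates attribute overrides that assumption.

         % Hypothetical example: squares its argument in C.
         % The C code does not in fact call back into Mercury, but the
         % may_call_mercury attribute means the compiler cannot rely on that,
         % so the terminates attribute states the termination property directly.
     :- pred c_square(int::in, int::out) is det.
     :- pragma foreign_proc("C",
         c_square(X::in, Y::out),
         [may_call_mercury, promise_pure, terminates],
         "
         Y = X * X;
     ").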

The following declarations can be used to inform the compiler of the termination properties of a predicate or function.

     :- pragma terminates(Name/Arity).

This declaration may be used to inform the compiler that this predicate or function is guaranteed to terminate for any input. This is useful when the compiler cannot prove termination of some predicates or functions which are in turn preventing the compiler from proving termination of other predicates or functions. This declaration affects not only the predicate specified but also any other predicates that are mutually recursive with it.

     :- pragma does_not_terminate(Name/Arity).

This declaration may be used to inform the compiler that this predicate may not terminate. This declaration affects not only the predicate specified but also any other predicates that are mutually recursive with it.

     :- pragma check_termination(Name/Arity).

This pragma requires the compiler to prove termination of this predicate or function. If it cannot do so, the compiler will quit with an error message.
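
The following fragment sketches how two of these pragmas might be used together. It is assumed to appear in the implementation section of a module that imports int and list; len and apply_n are hypothetical names, not library functions.

         % Structurally recursive on its argument. The pragma asks the
         % compiler to prove termination and to report an error if it cannot.
     :- func len(list(T)) = int.
     :- pragma check_termination(len/1).
     
     len([]) = 0.
     len([_ | Xs]) = 1 + len(Xs).
     
         % Applies F to X repeatedly, N times. The higher-order call F(X)
         % cannot be usefully analysed, so the programmer asserts termination.
         % The assertion is only sound on the assumption that every function
         % passed as F itself terminates.
     :- func apply_n(int, func(T) = T, T) = T.
     :- pragma terminates(apply_n/3).
     
     apply_n(N, F, X) = ( if N =< 0 then X else apply_n(N - 1, F, F(X)) ).

A pragma terminates declaration is a promise made by the programmer which the compiler does not check, so it is best kept for procedures such as apply_n where the analysis is blocked by a language feature rather than by doubt about the algorithm.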



17.4 Aditi deductive database interface



17.4.1 Aditi overview

The University of Melbourne Mercury implementation includes support for compiling Mercury predicates for bottom-up evaluation using the Aditi2 deductive database system. The Aditi system is not yet publicly available, so this is currently not very useful to anyone other than the Mercury and Aditi developers. For more information see the Aditi web site at <http://www.cs.mu.oz.au/aditi>.

Evaluation by a deductive database system is useful for predicates which use large amounts of data, since the database system can use efficient join algorithms instead of backtracking. Also, some predicates which loop when executed top-down may terminate when executed bottom-up by the database; this effect can also be achieved using tabling (see Tabled evaluation). Bottom-up evaluation computes the answers to a predicate a set at a time, rather than a tuple at a time as in the normal top-down execution of a Mercury program.

There are several restrictions on predicates to be evaluated using Aditi. Argument types may not include polymorphic, higher-order or abstract types. Type classes are not supported within database predicates. The argument modes must not contain partially instantiated insts. Aditi predicates must be stratified (see Aditi glossary) and must not be mutually recursive with predicates in other modules.

Every predicate with a pragma aditi or pragma base_relation declaration must have an input argument of type aditi.state. This ensures that Aditi predicates are only called from within transactions and that updates and database calls are ordered correctly, in the same way that io.state arguments are used to ensure ordering of I/O operations. Within the clauses for predicates with a pragma aditi declaration, variables with type aditi.state may only be passed to other database predicates; they may not be packaged into terms or passed to top-down Mercury predicates. This allows the compiler to remove all instances of aditi.state variables from database predicates, and enforces the restriction that top-down Mercury code called from within the database cannot call bottom-up code, which is currently impossible for Aditi to handle.

Some useful predicates are defined in $ADITI_HOME/doc/aditi.m in the Aditi distribution.

The Aditi interface currently has the major restriction that recursive or imported top-down Mercury predicates or functions cannot be called from predicates with pragma aditi declarations. The following predicates and functions from the standard library can be called from Aditi:

builtin.compare/3,

int.'<'/2, int.'>'/2, int.'=<'/2, int.'>='/2, int.abs/2, int.max/3, int.min/3, int.to_float/2, int.pow/2, int.log2/2, int.'+'/2, int.'+'/1, int.'-'/2, int.'-'/1, int.'*'/2, int.'//'/2, int.rem/2,

float.'<'/2, float.'>'/2, float.'>='/2, float.'=<'/2, float.abs/1, float.abs/2, float.max/2, float.max/3, float.min/2, float.min/3, float.pow/2, float.log2/2, float.float/1, float.truncate_to_int/1, float.truncate_to_int/2, float.'+'/2, float.'+'/1, float.'-'/2, float.'-'/1, float.'*'/2, float.'/'/2,

math.ceiling/1, math.round/1, math.floor/1, math.sqrt/1, math.pow/2, math.exp/1, math.ln/1, math.log10/1, math.log2/1, math.sin/1, math.cos/1, math.tan/1, math.asin/1, math.acos/1, math.atan/1, math.sinh/1, math.cosh/1, math.tanh/1,

string.length/2.



17.4.2 Aditi pragma declarations

The following pragma declarations control compilation of Aditi predicates.

     :- pragma aditi(Name/Arity).

This predicate should be evaluated using the Aditi deductive database.

     :- pragma base_relation(Name/Arity).

This predicate is an Aditi base relation.

     :- pragma supp_magic(Name/Arity).
     :- pragma context(Name/Arity).

Perform either the supplementary magic sets or context transformations. One of these transformations must be performed on every Aditi predicate. supp_magic is the default. There are restrictions on predicates to which the context transformation can be applied; these are described in Right-, left-, and multi-linear rule transformations that maintain context information. See [6].

     :- pragma naive(Name/Arity).
     :- pragma psn(Name/Arity).

Specify naive or predicate semi-naive evaluation (see Aditi glossary) for the predicate. psn is the default.

     :- pragma aditi_memo(Name/Arity).
     :- pragma aditi_no_memo(Name/Arity).

The Aditi deductive database can store the results of procedures within a transaction to avoid unnecessary recomputations. This is unrelated to the type of memoing described in Tabled evaluation. aditi_no_memo is the default. Memoing is not yet implemented, so any pragma aditi_memo declarations will be ignored.

     :- pragma owner(Name/Arity, UserName).

The predicate is owned by the named user. A predicate in the database is identified by owner, module name, predicate name and arity. The owner field is used for security checks. If no pragma owner declaration is given, the owner is taken from the --aditi-user option, which defaults to the value of the environment variable USER, or “guest” if that is not set.

     :- pragma aditi_index(Name/Arity, IndexType, Attributes).

The base relation has the given B-tree index. B-tree indexes allow efficient retrieval of a tuple or range of tuples from a base relation. IndexType must be one of unique_B_tree or non_unique_B_tree. Attributes is a list of argument numbers (argument numbers are counted from one).
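
As a rough sketch of how these declarations fit together, the following fragment declares a base relation and a recursive derived relation over it. The names edge and path are hypothetical, the argument modes are modelled on the declarations used in the update examples later in this chapter, and the fragment is untested.

         % A base relation: a set of facts stored in the database,
         % with no clauses.
     :- pred edge(aditi.state::aditi_mui, int::out, int::out) is nondet.
     :- pragma base_relation(edge/3).
     
         % A derived relation: the transitive closure of edge, evaluated
         % bottom-up by Aditi. The defaults (supp_magic, psn, aditi_no_memo)
         % apply because no other pragmas are given.
     :- pred path(aditi.state::aditi_mui, int::out, int::out) is nondet.
     :- pragma aditi(path/3).
     
     path(DB, X, Y) :- edge(DB, X, Y).
     path(DB, X, Z) :- edge(DB, X, Y), path(DB, Y, Z).

Here the aditi.state variable DB is only passed on to other database predicates, as required by the restrictions described in the overview.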



17.4.3 Aditi update syntax

The Melbourne Mercury compiler provides special syntax to specify updates of Aditi base relations.

Note: Only error checking is implemented for Aditi updates — no code is generated yet.


17.4.3.1 Aditi update notes

All Aditi update goals have determinism det.

There must be a pragma base_relation declaration for any relation to be updated.

It is currently up to the application to ensure that any modifications do not violate the determinism of a base relation. If any modification does violate the determinism of a base relation, then the behaviour is undefined. However, updates of relations with unique B-tree indexes are checked to ensure that a key is not given multiple values. The transaction will abort if this occurs.

Predicate and function names in Aditi update goals may be module qualified.

The examples make use of the following declarations:

     :- pred p(aditi.state::aditi_mui, int::out, int::out) is nondet.
     :- pragma base_relation(p/3).
     
     :- func f(aditi.state::aditi_mui, int::out) = (int::out) is nondet.
     :- pragma base_relation(f/2).
     
     :- pred ancestor(aditi.state::aditi_mui, int::out, int::out) is nondet.
     :- pragma aditi(ancestor/3).


17.4.3.2 Insertion and deletion
     aditi_insert(PredName(Arg1, Arg2, ...), DB0, DB).
     
     aditi_insert(FuncName(Arg1, Arg2, ...) = RetVal, DB0, DB).

Insert the specified tuple into a relation.

     aditi_delete(PredName(Arg1, Arg2, ...), DB0, DB).
     
     aditi_delete(FuncName(Arg1, Arg2, ...) = RetVal, DB0, DB).

Delete the specified tuple from a relation.


Note that PredName(Arg1, Arg2, ...) in an aditi_insert or aditi_delete goal is not a higher-order term. Pred = p(DB0, X, Y), aditi_insert(Pred, DB0, DB) is a syntax error.

Examples:
     insert_example_1(DB0, DB) :-
             aditi_insert(p(_, 1, 2), DB0, DB).
     
     insert_example_2(DB0, DB) :-
             aditi_insert(f(_, 1) = 2, DB0, DB).
     
     delete_example_1(DB0, DB) :-
             aditi_delete(p(_, 1, 2), DB0, DB).
     
     delete_example_2(DB0, DB) :-
             aditi_delete(f(_, 1) = 2, DB0, DB).
     


17.4.3.3 Bulk insertion and deletion
     aditi_bulk_insert((PredName(Arg1, Arg2, ...) :- Goal), DB0, DB).
     
     aditi_bulk_insert((FuncName(Arg1, Arg2, ...) = RetVal :- Goal), DB0, DB).
     
     aditi_bulk_insert(PredOrFunc Name/Arity, Closure, DB0, DB).

Insert all solutions of Goal or Closure into the named relation.

     aditi_bulk_delete((PredName(Arg1, Arg2, ...) :- Goal), DB0, DB).
     
     aditi_bulk_delete((FuncName(Arg1, Arg2, ...) = RetVal :- Goal), DB0, DB).
     
     aditi_bulk_delete(PredOrFunc Name/Arity, Closure, DB0, DB).

Delete all solutions of Goal or Closure from the named relation.


The form
     aditi_bulk_insert((PredName(DB1, Arg2, ...) :- Goal), DB0, DB).

is equivalent to

     Closure = (aditi_bottom_up
             pred(DB1::aditi_mui, Arg2::out, ...) is nondet :- Goal),
     aditi_bulk_insert(PredOrFunc Name/Arity, Closure, DB0, DB).

and likewise for the function version. For instance, the examples bulk_insert_example_1, bulk_insert_example_2 and bulk_insert_example_3 below are all equivalent.

Similarly, the form

     aditi_bulk_delete((PredName(DB1, Arg2, ...) :- Goal), DB0, DB).

is equivalent to

     DeleteClosure = (aditi_bottom_up
             pred(DB1::aditi_mui, Arg2::out, ...) is nondet :-
             PredName(DB1, Arg2, ...),
             Goal
     ),
     aditi_bulk_delete(PredOrFunc Name/Arity, DeleteClosure, DB0, DB).

and likewise for the function version. For instance bulk_delete_example_1 and bulk_delete_example_2 below are equivalent.



Examples:
     bulk_insert_example_1(DB0, DB) :-
             aditi_bulk_insert((p(DB1, X, Y) :- ancestor(DB1, X, Y)), DB0, DB).
     
     bulk_insert_example_2(DB0, DB) :-
             aditi_bulk_insert(pred p/3, ancestor, DB0, DB).
     
     bulk_insert_example_3(DB0, DB) :-
             InsertP = (aditi_bottom_up
                     pred(DB1::aditi_mui, X::out, Y::out) is nondet :-
                             ancestor(DB1, X, Y)
                     ),
             aditi_bulk_insert(pred p/3, InsertP, DB0, DB).
     
     bulk_delete_example_1 -->
             aditi_bulk_delete(
                     (p(DB1, X, _) :-
                             X > 1,
                             X < 5
                     )).
     
     bulk_delete_example_2(DB0, DB) :-
             DeleteP = (aditi_bottom_up
                     pred(DB1::aditi_mui, X::out, Y::out) is nondet :-
                             p(DB1, X, Y),
                             X > 1,
                             X < 5
                     ),
             aditi_bulk_delete(pred p/3, DeleteP, DB0, DB).
     
     bulk_delete_example_3(DB0, DB) :-
             aditi_bulk_delete((f(DB1, X) = _Y :- X = 1), DB0, DB).
     
     bulk_delete_example_4(DB0, DB) :-
             DeleteF = (aditi_bottom_up
                     func(DB1::aditi_mui, X::out) = (Y::out) is nondet :-
                             f(DB1, X) = Y,
                             X > 1,
                             X < 5
                     ),
             aditi_bulk_delete(func f/2, DeleteF, DB0, DB).

The type of InsertP is aditi_bottom_up pred(aditi.state, int, int). Its inst is pred(aditi_mui, out, out) is nondet, as for a normal lambda expression.

Note that in bulk_delete_example_1 the extra set of parentheses around the goal is needed; otherwise the second goal in the conjunction in the deletion goal would be parsed as an extra argument of the aditi_bulk_delete call, resulting in a syntax error.


17.4.3.4 Modification
     aditi_bulk_modify(
             (PredName(OldArg1, OldArg2, ...) ==>
             PredName(NewArg1, NewArg2, ...) :-
                     Goal
             ),
             DB0, DB).
     
     aditi_bulk_modify(
             ((FuncName(OldArg1, OldArg2, ...) = OldRetVal) ==>
             (FuncName(NewArg1, NewArg2, ...) = NewRetVal) :-
                     Goal
             ),
             DB0, DB).
     
     aditi_bulk_modify(PredOrFunc PredName/Arity, Closure, DB0, DB).

Modify tuples for which Goal or Closure succeeds, leaving any other tuples unchanged.



The forms using ==> can be considered as syntactic sugar for the form using PredOrFunc PredName/Arity:
     aditi_bulk_modify(
             (PredName(DB1, OldArg1, OldArg2, ...) ==>
             PredName(_DB, NewArg1, NewArg2, ...) :-
                     Goal
             ),
             DB0, DB).

is equivalent to

     ModifyClosure =
             (aditi_bottom_up pred(DB1::aditi_mui, OldArg1::out, OldArg2::out, ...,
                     _DB::unused, NewArg1::out, NewArg2::out, ...) is nondet :-
                     PredName(DB1, OldArg1, OldArg2, ...),
                     Goal
             ),
     aditi_bulk_modify(pred PredName/PredArity, ModifyClosure, DB0, DB).

and likewise for the function version.

The bulk modify operation

     aditi_bulk_modify(pred p/3, Closure, DB0, DB).

is almost equivalent to a bulk delete followed by a bulk insert:

     DeleteClosure =
             (aditi_bottom_up pred(DB1::aditi_mui, X1::out, Y1::out) is nondet :-
                     Closure(DB1, X1, Y1, _, _)
             ),
     InsertClosure =
             (aditi_bottom_up pred(DB1::aditi_mui, X2::out, Y2::out) is nondet :-
                     Closure(DB1, _, _, X2, Y2)
             ),
     aditi_bulk_delete(pred p/3, DeleteClosure, DB0, DB1),
     aditi_bulk_insert(pred p/3, InsertClosure, DB1, DB).

However, they are not quite equivalent, because in the bulk modify operation InsertClosure is executed using the contents of p/3 before the deletion is applied.



Examples:
     bulk_modify_example_1(DB0, DB) :-
             aditi_bulk_modify(
                     (p(DB1, X, Y0) ==> p(_DB, X, Y) :-
                             X > 2,
                             X < 5,
                             Y = Y0 + 1
                     ), DB0, DB).
     
     bulk_modify_example_2(DB0, DB) :-
             aditi_bulk_modify(
                     (f(_DB0, X) = Y0 ==> f(_DB, X) = Y :-
                             X > 2, X < 5, Y = Y0 + 1
                     ), DB0, DB).
     
     bulk_modify_example_3(DB0, DB) :-
             ModifyP = (aditi_bottom_up pred(DB1::aditi_mui, X::out, Y0::out,
                             _::unused, X::out, Y::out) is nondet :-
                         p(DB1, X, Y0),
                         X > 2,
                         X < 5,
                         Y = Y0 + 1
                      ),
             aditi_bulk_modify(pred p/3, ModifyP, DB0, DB).
     
     bulk_modify_example_4(DB0, DB) :-
             ModifyF = (aditi_bottom_up pred(DB1::aditi_mui, X::out, Y0::out,
                             _::unused, X::out, Y::out) is nondet :-
                         f(DB1, X) = Y0,
                         X > 2, X < 5, Y = Y0 + 1
                      ),
             aditi_bulk_modify(func f/2, ModifyF, DB0, DB).
     
     bulk_modify_example_5 -->
             aditi_bulk_modify(
                     (p(_DB0, X, Y0) ==> p(_DB, X, Y) :-
                             X > 2, X < 5, Y = Y0 + 1
                     )).

Note that in bulk_modify_example_5 the extra set of parentheses around the goal is needed; otherwise the second and third goals in the conjunction in the modification goal would be parsed as extra arguments of the aditi_bulk_modify call, resulting in a syntax error.

The type of ModifyP is aditi_bottom_up pred(aditi.state, int, int, aditi.state, int, int). Its inst is pred(aditi_mui, out, out, unused, out, out) is nondet, as for a normal lambda expression.



17.4.4 Aditi glossary

Aditi-RL
Aditi Relational Language is used by the Aditi system to execute queries. The basic instructions in Aditi-RL are relational database operations such as join, select and project.
aggregate
Aggregates are used to compute a value such as a sum over all the solutions for a predicate. Aggregates can be computed over Aditi predicates using aditi.aggregate_compute_initial defined in $ADITI_HOME/doc/aditi.m in the Aditi distribution.
base relation
A base relation is a predicate consisting of a set of facts stored in a database. There must be no clauses for a base relation.
derived relation
A derived relation is an Aditi predicate for which there are clauses. Derived relations are compiled to Aditi-RL for execution by an Aditi database.
predicate semi-naive evaluation
When a recursive predicate is called, the Aditi system produces the set of all solutions using fixed point iteration. The set of solutions is initialised to those tuples which can be derived using the non-recursive rules of the predicate. In each iteration, new tuples are derived for the predicate using the recursive rules for the predicate and the tuples derived in previous iterations. Evaluation finishes when no new tuples are generated. Predicate semi-naive evaluation (see [8]) is a method of evaluating recursive predicates which uses just the new tuples in each iteration where possible. This improves efficiency by reducing the size of joins.
schema
A schema is a representation of the types of the attributes of a relation.
stratification
A program is stratified if no predicate can call itself through a negation or an aggregate.
transaction
A transaction is a database operation which is executed atomically. If part of a transaction fails, the database reverts to its original state before the transaction. For details on how transactions are implemented in Mercury, see Database transactions in a purely declarative logic programming language [7] and $ADITI_HOME/doc/aditi.m in the Aditi distribution.



18 Bibliography



[1]

Chris Speirs, Zoltan Somogyi and Harald Sondergaard, Termination Analysis for Mercury. In P. Van Hentenryck, editor, Static Analysis: Proceedings of the 4th International Symposium, Lecture Notes in Computer Science. Springer, 1997. A longer version is available for download from <http://www.cs.mu.oz.au/publications/tr_db/mu_97_09.ps.gz>.



[2]

Gerhard Groger and Lutz Plumer, Handling of mutual recursion in automatic termination proofs for logic programs. In K. Apt, editor, The Proceedings of the Joint International Conference and Symposium on Logic Programming, pages 336–350. MIT Press, 1992.



[3]

Chris Speirs, Termination Analysis for Logic Programs, Technical Report 97/23, Department of Computer Science, The University of Melbourne, Melbourne, Australia, 1997. Available from <http://www.cs.mu.oz.au/mercury/papers/mu_97_23.ps.gz>.



[4]

K. Sagonas, The SLG-WAM: A Search-Efficient Engine for Well-Founded Evaluation of Normal Logic Programs, PhD thesis, SUNY at Stony Brook, 1996. Available from <http://www.cs.kuleuven.ac.be/~kostis/Thesis/thesis.ps.gz>.



[5]

B. Demoen and K. Sagonas, CAT: the copying approach to tabling, submitted for publication, Katholieke Universiteit Leuven, 1998.



[6]

David B. Kemp, Kotagiri Ramamohanarao and Zoltan Somogyi, Right-, left-, and multi-linear rule transformations that maintain context information. In The Proceedings of the Sixteenth Conference on Very Large Databases, pages 380–391, August 1990. Available from <http://www.cs.mu.oz.au/mercury/papers/tr90-2.ps.gz>.



[7]

David B. Kemp, Thomas Conway, Evan Harris, Fergus Henderson, Kotagiri Ramamohanarao and Zoltan Somogyi, Database transactions in a purely declarative logic programming language, Technical Report 96/45, Department of Computer Science, University of Melbourne, December 1996. Available from <http://www.cs.mu.OZ.AU/publications/tr_db/mu_96_45.ps.gz>.



[8]

R. Ramakrishnan, D. Srivastava and S. Sudarshan, Rule ordering in bottom-up fixpoint evaluation of logic programs. In Proceedings of the Sixteenth International Conference on Very Large Data Bases, pages 359–371, August 1990.


Footnotes

[1] Operator term (see Terms).

[2] The type of an explicitly type qualified term may be an instance of the type specified by the qualifier. This allows explicit type qualifications to constrain the types of two data-terms to be identical, without knowing the exact types of the data-terms. It also allows type qualifications to refer to the types of the results of existentially typed predicates or functions.

[3] If equalitypred is not an equivalence relation, then the program is inconsistent: its declarative semantics contains a contradiction, because the additional axioms for the user-defined equality contradict the standard equality axioms. That implies that the implementation may compute any answer at all (see Semantics), i.e. the behaviour of the program is undefined.