Although Clean is purely functional, operations with side-effects (I/O operations, for instance) are permitted. To achieve this without violating the semantics, the classical types are supplied with so called uniqueness attributes. If an argument of a function is indicated as unique, it is guaranteed that at run-time the corresponding actual object is local, i.e. there are no other references to it. Clearly, a destructive update of such a “unique object” can be performed safely.
The uniqueness type system makes it possible to define direct interfaces with the operating system, the file system (updating persistent data) and GUI libraries; it also allows arrays, records or user-defined data structures to be created that can be updated destructively. The time and space behavior of a functional program therefore greatly benefits from uniqueness typing.
Uniqueness types are deduced automatically. Type attributes are polymorphic: attribute variables and inequalities on these variables can be used to indicate relations between and restrictions on the corresponding concrete attribute values.
Sometimes the inferred type attributes give some extra information on the run-time behavior of a function. The uniqueness type system is a transparent extension of classical typing, which means that if one is not interested in the uniqueness information one can simply ignore it.
Since uniqueness typing is a rather complex matter we explain this type system and the motivation behind it in more detail. The first section (9.1) explains the basic motivation for and ideas behind uniqueness typing. Section 9.2 focuses on the so-called uniqueness propagation property of (algebraic) type constructors. Then we show how new data structures can be defined containing unique objects (Section 9.3). Sharing may destroy locality properties of objects. In Section 9.4 we describe the effect of sharing on uniqueness types. In order to maintain referential transparency, it appears that function types have to be treated specially. The last section (9.5) describes the combination of uniqueness typing and overloading. In particular, the subsections on constructor classes and higher-order type definitions are very complex: we suggest that the reader skips these on a first reading.
The uniqueness typing is an extension of classical Milner/Mycroft typing. In the uniqueness type system uniqueness type attributes are attached to the classical types. Uniqueness type attributes appear in the type specifications of functions (see 9.4) but are also permitted in the definitions of new data types (see 9.3). A classical type can be prefixed by one of the following uniqueness type attributes:
Type          = {BrackType}+
BrackType     = [UniversalQuantVariables] [Strict] [UnqTypeAttrib] SimpleType
UnqTypeAttrib = *                     // type attribute “unique”
              | UniqueTypeVariable:   // a type attribute variable
              | .                     // an anonymous type attribute variable
The basic idea behind uniqueness typing is the following. Suppose a function, say F, has a unique argument (an argument with type *s, for some s). This attribute imposes an additional restriction on applications of F.
It is guaranteed that F will have private (“unique”) access to this particular argument (see Barendsen and Smetsers, 1993; Plasmeijer and Van Eekelen, 1993): the object will have a reference count of 1 at the moment it is inspected by the function. It is important to know that there can be more than one reference to the object before this specific access takes place. If a uniquely typed argument is not used to construct the function result it will become garbage (its reference count has dropped to zero). Due to the fact that this analysis is performed statically, the object can be garbage collected (see Chapter 1) at compile time. It is harmless to reuse the space occupied by the argument to create the function result. In other words: it is allowed to update the unique object destructively without any consequences for referential transparency.
Example: the I/O library function fwritec is used to write a character to a file yielding a new file as result. In general it is semantically not allowed to overwrite the argument file with the given character to construct the resulting file. However, by demanding the argument file to be unique by specifying
fwritec:: Char *File -> *File
it is guaranteed by the type system that fwritec has private access to the file, so that overwriting the file can be done without violating the functional semantics of the program. The resulting file is unique as well and can therefore be passed as continuation to another call of e.g. fwritec to make further writing possible.
WriteABC:: *File -> *File
WriteABC file = fwritec 'c' (fwritec 'b' (fwritec 'a' file))
Observe that a unique file is passed in a single-threaded way (as a kind of unique token) from one function to another, where each function can safely modify the file knowing that it has private access to that file.
One can make these intermediate files more visible by writing WriteABC as follows.
WriteABC file = file3
where
file1 = fwritec 'a' file
file2 = fwritec 'b' file1
file3 = fwritec 'c' file2
or, alternatively (to avoid the explicit numbering of the files),
WriteABC file
# file = fwritec 'a' file
file = fwritec 'b' file
= fwritec 'c' file
The type system makes it possible to avoid making a distinction between a Clean file and the physical file of the real world: file I/O can be treated as efficiently as in imperative languages. The uniqueness typing prevents writing while other readers/writers are active, e.g. one cannot apply fwritec to a file that is being used elsewhere.
For instance, the following expression is not approved by the type system:
(file, fwritec 'a' file)
Function arguments with no uniqueness attributes added to the classical type are considered as “non-unique”: there are no reference requirements for these arguments. The function is only allowed to have read access (as usual in a functional language) even if in some of the function applications the actual argument appears to have reference count 1.
freadc:: File -> (Char, File)
The function freadc can be applied to both a unique and a non-unique file. This is fine since the function only requires read access to the file. The type indicates that the result is always a non-unique file. Such a file can be passed for further reading, but not for further writing.
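For instance (a sketch with a hypothetical function name, reusing the fwritec and freadc types given above), the following definition is rejected: after reading with this non-unique version of freadc, the resulting file is non-unique and can no longer be passed to fwritec.
ReadThenTryWrite:: *File -> *File
ReadThenTryWrite file
# (c, file2) = freadc file		// file2 has type File (non-unique)
= fwritec c file2				// rejected: fwritec demands a *File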
To indicate that functions don’t change uniqueness properties of arguments, one can use attribute variables.
The simplest example is the identity function, which can be typed as follows:
id:: u:a -> u:a
id x = x
Here a is an ordinary type variable, whereas u is an attribute variable. If id is applied to a unique object the result is also unique (in that case u is instantiated with the concrete attribute *). Of course, if id is applied to a non-unique object, the result remains non-unique. As with ordinary type variables, attribute variables should be instantiated uniformly.
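A minimal sketch (assuming the fwritec type given earlier; the function name is only for illustration): because id is attribute-polymorphic, a unique file passed through id stays unique and can still be written afterwards.
WriteViaId:: *File -> *File
WriteViaId file = fwritec 'a' (id file)	// u is instantiated with *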
A more interesting example is the function freadc that is typed as
freadc:: u:File -> u:(Char, u:File)
Again freadc can be applied to both unique and non-unique files. In the first case the resulting file is also unique and can, for example, be used for further reading or writing. Moreover, observe that not only the resulting file is attributed, but also the tuple containing that file and the character that has been read. This is due to the so-called uniqueness propagation rule; see below.
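A small sketch of how this attributed version can be used (the function name is only for illustration): reading from a unique file leaves the file unique, so it can subsequently be written.
ReadThenWrite:: *File -> *File
ReadThenWrite file
# (c, file) = freadc file	// u is instantiated with *, so file stays unique
= fwritec c file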
To summarize, uniqueness typing makes it possible to update objects destructively within a purely functional language. For the development of real world applications (which manipulate files, windows, arrays, databases, states etc.) this is an indispensable property.
Having explained the general ideas of uniqueness typing, we can now focus on some details of this typing system.
If a unique object is stored in a data structure, the data structure itself becomes unique as well. This uniqueness propagation rule prevents unique objects from being shared indirectly via the data structure in which they are stored. To explain this form of hidden sharing, consider the following definition of the function head
head:: [*a] -> *a
head [hd:tl] = hd
The pattern causes head to have access to the “deeper” arguments hd and tl. Note that head does not have any uniqueness requirements on its direct list argument. This means that in an application of head the list might be shared, as can be seen in the following function heads
heads list = (head list, head list)
If one wants to formulate uniqueness requirements on, for instance, the hd argument of head, it is not sufficient to attribute the corresponding type variable a with *; the surrounding list itself should also become unique. One can easily see that, without this additional requirement the heads example with type
heads:: [*a] -> (*a,*a)
heads list = (head list, head list)
would still be valid although it delivers the same object twice. By demanding that the surrounding list becomes unique as well (so the type of head becomes head:: *[*a] -> *a), the function heads is rejected. In general one could say that uniqueness propagates outwards.
Some of the readers will have noticed that, by using attribute variables, one can assign a more general uniqueness type to head:
head:: u:[u:a] -> u:a
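With this more general type, applying head to a spine-unique list of unique files yields a unique file; a minimal sketch (assuming fwritec as before, function name for illustration only):
WriteFirst:: *[*File] -> *File
WriteFirst files = fwritec 'a' (head files)	// u is instantiated with *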
The above propagation rule imposes additional (implicit) restrictions on the attributes appearing in type specifications of functions.
Another, explicit way of indicating restrictions on attributes is by using coercion statements. These statements consist of attribute variable inequalities of the form u <= v. The idea is that attribute substitutions are only allowed if the resulting attribute inequalities are valid, i.e. do not result in an inequality of the form
‘non-unique ≤ unique’.
The use of coercion statements is illustrated by the next example in which the uniqueness type of the well-known append function is shown.
append:: v:[u:a] w:[u:a] -> x:[u:a], [v<=u, w<=u, x<=u,w<=x]
The first three coercion statements express the uniqueness propagation for lists: if the elements a are unique (by choosing * for u) these statements force v,w and x to be instantiated with * also. (Note that u <= * iff u = *.) The statement w <= x expresses that the spine uniqueness of append’s result depends only on the spine attribute w of the second argument.
In Clean it is permitted to omit attribute variables and attribute inequalities that arise from propagation properties; these will be added automatically by the type system. As a consequence, the following type for append is also valid.
append:: [u:a] w:[u:a] -> x:[u:a], [w<=x]
Of course, it is always allowed to specify a more specific type (by instantiating type or attribute variables). All types given below are valid types for append.
append:: [u:a] x:[u:a] -> x:[u:a],
append:: *[*Int] *[*Int] -> *[*Int],
append:: [a] *[a] -> *[a].
To make types more readable, Clean offers the possibility to use anonymous attribute variables. These can be used as a shorthand for indicating attribute variables of which the actual names are not essential. This allows us to specify the type for append as follows.
append:: [.a] w:[.a] -> x:[.a], [w<=x]
The type system of Clean will substitute real attribute variables for the anonymous ones. Each dot gives rise to a new attribute variable except for the dots attached to type variables: type variables are attributed uniformly in the sense that all occurrences of the same type variable will obtain the same attribute. In the above example this means that all dots are replaced by one and the same new attribute variable.
Although one mostly uses uniqueness attributes in type specifications of functions, they are also allowed in the definition of new data types.
AlgebraicTypeDef          = ::TypeLhs = ConstructorDef {| ConstructorDef} ;
ConstructorDef            = [ExistentialQuantVariables] ConstructorName {BrackType}
                          | [ExistentialQuantVariables] (ConstructorName) [Fix][Prec] {BrackType}
TypeLhs                   = [*]TypeConstructor {TypeVariable}
TypeConstructor           = TypeName
ExistentialQuantVariables = E.{TypeVariable}+:
UniversalQuantVariables   = A.{TypeVariable}+:
BrackType                 = [UniversalQuantVariables] [Strict] [UnqTypeAttrib] SimpleType
UnqTypeAttrib             = *
                          | UniqueTypeVariable:
                          | .
As can be inferred from the syntax, the attributes that are actually allowed in data type definitions are ‘*’ and ‘.’; attribute variables are not permitted. The (unique) * attribute can be used on any subtype, whereas the (anonymous) . attribute is restricted to non-variable positions.
If no uniqueness attributes are specified, this does not mean that one can only build non-unique instances of such a data type. Attributes not explicitly specified by the programmer are added automatically by the type system. To explain this standard uniqueness attribution mechanism, first remember that the types of data constructors are not specified by the programmer but derived from their corresponding data type definition.
For example, the (classical) definition of the List type
:: List a = Cons a (List a) | Nil
leads to the following types for its data constructors:
Cons:: a (List a) -> List a
Nil:: List a
To be able to create unique instances of data types, the standard attribution of Clean will automatically derive appropriate uniqueness variants for the types of the corresponding data constructors. Such a uniqueness variant is obtained via a consistent attribution of all types and subtypes appearing in a data type definition. Here, consistency means that such an attribution obeys the following rules (assume that we have a type definition for some type T).
1) Attributes that are explicitly specified are adopted.
2) Each (unattributed) type variable and each occurrence of T will receive an attribute variable. This is done in a uniform way: equal type variables will receive equal attributes, and all occurrences of T are also attributed equally.
3) Attribute variables are added at non-variable positions if they are required by the propagation properties of the corresponding type constructor. The attribute variable that is chosen depends on the argument types of this constructor: the attribution scheme takes the attribute variable of the first argument appearing on a propagating position (see the example below).
4) All occurrences of the . attribute are replaced by the attribute variable assigned to the occurrences of T.
Example of standard attribution for data constructors. For Cons the standard attribution leads to the type
Cons:: u:a v:(List u:a) -> v:List u:a, [v<=u]
The type of Nil becomes
Nil:: v:List u:a, [v<=u]
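A small sketch of a consequence of this attribution (names are illustrative only): building a list of unique files with Cons forces the spine of that List to be unique as well, in accordance with the coercion statement [v<=u].
UniqueFiles:: *File *File -> *List *File
UniqueFiles f1 f2 = Cons f1 (Cons f2 Nil)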
Consider the following Tree definition
:: Tree a = Node a [Tree a]
The type of the data constructor Node is
Node:: u:a v:[v:Tree u:a] -> v:Tree u:a, [v<=u]
Another Tree variant.
:: Tree *a = Node *a [Tree *a]
leading to
Node:: *a *[*Tree *a] -> *Tree *a
Note that, due to propagation, all subtypes have become unique.
Next, we will formalize the notion of uniqueness propagation. We say that an argument of a type constructor, say T, is propagating if the corresponding type variable appears on a propagating position in one of the types used in the right-hand side of T’s definition. A propagating position is characterized by the fact that it is not surrounded by an arrow type or by a type constructor with non-propagating arguments. Observe that the definition of propagation is cyclic: a general way to solve this problem is via a fixed-point construction.
Example of the propagation rule. Consider the (record) type definition for Object.
:: Object a b = { state :: a, fun :: b -> a }
The argument a is propagating. Since b does not appear on a propagating position inside this definition, Object is not propagating in its second argument.
The type inference system of Clean will derive uniqueness information after the classical Milner/Mycroft types of the functions have been inferred (see 4.3). As explained in Section 9.1, a function may require a non-unique object, a unique object or a possibly unique object. Uniqueness of the result of a function will depend on the attributes of its arguments and on how the result is constructed. Until now, we distinguished objects with reference count 1 from objects with a larger reference count: only the former might be unique (depending on the uniqueness type of the object itself). In practice, however, one can be more liberal if one takes the evaluation order into account. The idea is that multiple references to a (unique) object are harmless if one knows that only one of the references will still be present at the moment the object is accessed destructively. This is used in the following function.
AppendAorB:: *File -> *File
AppendAorB file
| fc == 'a' = fwritec 'a' file
= fwritec 'b' file
where
(fc,nf) = freadc file
When the right-hand side of AppendAorB is evaluated, the guard is determined first (so the access from freadc to file is not unique), and subsequently one of the alternatives is chosen and evaluated. Depending on the outcome of the guard, either the reference from the first fwritec application to file or that of the second application is left, and this remaining reference is therefore unique.
For this reason, the uniqueness type system uses a kind of sharing analysis. This sharing analysis is input for the uniqueness type system itself to check uniqueness type consistency (see 9.3). The analysis will label each reference in the right-hand side of a function definition as read-only (if destructive access might be dangerous) or write-permitted (otherwise). Objects accessed via a read-only reference are always non-unique. On the other hand, uniqueness of objects accessed via a reference labeled with write-permitted solely depends on the types of the objects themselves.
Before describing the labeling mechanism of Clean we mention that the “lifetime” of references is determined on a syntactical basis. For this reason we classify references to the same expression in a function definition (say for f) according to their estimated run-time use, as alternative, observing and parallel.
• Two references are alternative if they belong to different alternatives of f. Note that alternatives are distinguished by patterns (including case expressions) or by guards.
• A reference r is observing w.r.t. a reference r’ if the expression containing r’ is either (1) guarded by an expression containing r, or (2) preceded by a strict let-before expression containing r.
• Otherwise, references are in parallel.
The rules used by the sharing analysis to label each reference are the following (a small example is given after these rules).
• A reference, say r, to a certain object is labeled as read-only if there exists another reference, say r’, to the same object such that either r is observing w.r.t. r’ or r and r’ are in parallel.
• Multiple references to cyclic structures are always labeled as read-only.
• All other references are labeled with write-permitted.
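A small example of these rules (a sketch, assuming the standard function sfend:: File -> Bool from StdFile and a hypothetical function name): the reference to file in the guard is observing w.r.t. the references in the alternatives, so it is labeled read-only, while the references in the alternatives are alternative to each other and remain write-permitted, so the write below is accepted.
WriteUnlessAtEnd:: *File -> *File
WriteUnlessAtEnd file
| sfend file = file
= fwritec 'x' file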
Unfortunately, there is still a subtlety that has to be dealt with. Observing references in a strict context do not always vanish completely after the expression containing the reference has been evaluated: further analysis appears to be necessary to ensure their disappearance. More concretely, suppose e[r] denotes the expression containing r. If the type of e[r] is a basic type then, after evaluation, e[r] will be reference-free; in particular, it no longer contains the reference r. However, if the type of e[r] is not a basic type it is assumed that, after evaluation, e[r] might still refer to r. But even in the latter case a further refinement is possible. The idea is, depending on e[r], to correct the type of the object to which r refers partially, in such a way that only the parts of this object that are still shared lose their uniqueness.
Consider, for example, the following rule
f l
#! x = hd (hd l)
= (x, l)
Clearly, x and l share a common substructure; x is even part of l. But the whole “spine” of l (of type [[...]]) does not contain any new external references. Thus, if l was spine-unique originally, it remains spine unique in the result of f. Apparently, the access to l only affected part of l’s structure. More technically, the type of l itself is corrected to take the partial access on l into account. In the previous example, x, regarded as a function on l has type [[a]] -> a. In f’s definition the part of l’s type corresponding to the variable a is made non-unique. This is clearly reflected in the derived type for f, being
f:: u:[w:[a]] -> (a,v:[x:[a]]), [w <= x, u <= v]
In Clean this principle has been generalized: if the strict let expression e[r], regarded as a function on r, has type
T (... a...) -> a
then the a-part of the type of the object to which r refers becomes non-unique; the rest of the type remains unaffected. If the type of e[r] is not of the indicated form, r is not considered as an observing reference (w.r.t. some reference r’), but, instead, as being in parallel with r’.
Higher-order functions give rise to partial (often called Curried) applications, i.e. applications in which the actual number of arguments is less than the arity of the corresponding symbol. If these partial applications contain unique sub-expressions one has to be careful.
Consider, for example, the following function fwritec with type
fwritec:: *File Char -> *File
in the application
fwritec unifile
(assuming that unifile returns a unique file). Clearly, the type of this application is of the form o:(Char -> *File). The question is: what kind of attribute is o? Is it a variable, is it *, or, is it not unique? Before making a decision, one should notice that it is dangerous to allow the above application to be shared. For example, if the expression fwritec unifile is passed to a function
WriteAB write_fun = (write_fun 'a', write_fun 'b')
then the argument of fwritec is no longer unique at the moment one of the two write operations takes place. Apparently, the expression fwritec unifile is essentially unique: its reference count should never become greater than 1. To prevent such an essentially unique expression from being copied, Clean considers the -> type constructor in combination with the * attribute as special: it is not permitted to discard its uniqueness. Now the question about the attribute o can be answered: it is set to *. If WriteAB is typed as follows
WriteAB:: (Char -> u:File) -> (u:File, u:File)
WriteAB write_fun = (write_fun 'a', write_fun 'b')
the expression WriteAB (fwritec unifile) is rejected by the type system because it does not allow the argument of type *(Char -> *File) to be coerced to (Char -> u:File). One can easily see that it is impossible to type WriteAB in such a way that the expression becomes typable.
To define data structures containing Curried applications it is often convenient to use the (anonymous) . attribute. Example:
:: Object a b = { state:: a, fun::.(b -> a) }
new:: *Object *File Char
new = { state = unifile, fun = fwritec unifile }
By adding an attribute variable to the function type in the definition of Object, it is possible to store unique functions in this data structure. This is shown by the function new. Since new contains an essentially unique expression it becomes essentially unique itself. So, new can never lose its uniqueness, and hence, it can only be used in a context in which a unique object is demanded.
Determining the type of a Curried application of a function (or data constructor) is somewhat more involved if the type of that function contains attribute variables instead of concrete attributes. Mostly, these variables will result in additional coercion statements, as can be seen in the example below.
Prepend:: u:[.a] [.a] -> v:[.a], [u<=v]
Prepend a b = Append b a
PrependList:: u:[.a] -> w:([.a] -> v:[.a]), [u<=v, w<=u]
PrependList a = Prepend a
Some explanation is in place. The expression (PrependList some_list) yields a function that, when applied to another list, say other_list, delivers a new list consisting of the concatenation of other_list and some_list. Let’s call this final result new_list. If new_list should be unique (i.e. v becomes *) then, because of the coercion statement u<=v, the attribute u also becomes *. But, if u = * then also w = *, for w<=u. This implies that the (arrow) type of the original expression (PrependList some_list) becomes unique, and hence this expression cannot be shared.
As said before, offering a unique object to a function that requires a non-unique argument is safe (unless we are dealing with unique arrow types; see above). The technical tool to express this is via a coercion (subtype) relation based on the ordering
‘unique’ ≤ ‘non-unique’
on attributes. Roughly, the validity of s ≤ s’ depends subtype-wise on the validity of u ≤ u’ with u,u’ attributes in s,s’. One has, for example
u:[v:[w:Int]] ≤ u’:[v’:[w’:Int]] iff u ≤ u’, v ≤ v’, w ≤ w’.
However, a few refinements are necessary. Firstly, the uniqueness constraints expressed in terms of coercion statements (on attribute variables) have to be taken into account. Secondly, the coercion restriction on arrow types should be handled correctly. And thirdly, due to the so-called contravariance of -> in its first argument we have that
u:(s -> s’) ≤ u:(t -> t’) iff t ≤ s, s’ ≤ t’
Since -> may appear in the definitions of algebraic type constructors, these constructors may inherit the co- and contravariant subtyping behavior with respect to their arguments. We can classify the ‘sign’ of each argument of a type constructor as + (positive, covariant), - (negative, contravariant) or ⊤ (top: both positive and negative). In general this is done by analysing the (possibly mutually recursive) algebraic type definitions via a fixed-point construction, with basis sign(->) = (-,+).
Example: a has sign ⊤, b has sign + in
::FunList a b = FunCons (a, a -> b) (FunList a b)
| FunNil
This leads to the following coercion rules
5) Attributes of two corresponding type variables as well as of two corresponding arrow types must be equal.
6) The sign classification of each type constructor is obeyed. If, for instance, the sign of T’s argument is negative, then
T s ≤ T s’ iff s’ ≤ s
7) In all other cases, the validity of a coercion relation depends on the validity of u ≤ u’, where u,u’ are attributes of the two corresponding subtypes.
The presence of sharing inherently causes a (possibly unique) object to become non-unique, if it is accessed via a read-only reference. In Clean this is achieved by a type correction operation that converts each unique type S to its smallest non-unique supertype, simply by making the outermost attribute of S non-unique. Note that this operation fails if S is a function type.
An overloaded function actually stands for a collection of real functions. The types of these real functions are obtained from the type of the overloaded function by substituting the corresponding instance type for the class variable. These instance types may contain uniqueness information and, due to the propagation requirement, the above-mentioned substitution might give rise to uniqueness attributes in the overloaded type specification.
Consider, for instance, the identity class
class id a:: a -> a
If we want to define an instance of id for lists, say idL, which leaves uniqueness of the list elements intact, the (fully expanded) type of idL becomes
idL:: v:[u:a] -> v:[u:a]
However, as said before, the type specification of such an instance is not specified completely: it is derived from the overloaded type in combination with the instance type (i.e. [...] in this particular example).
In Clean we require that the type specification of an overloaded operator anticipates the attributes arising from uniqueness propagation, that is, the uniqueness attribute of the class variable should be chosen in such a way that, for any instance type, this ‘class attribute’ does not conflict with the corresponding uniqueness attribute(s) in the fully expanded type of that instance. In the above example this means that the type of id becomes
class id a:: .a -> .a
Another possibility is
class id a:: *a -> *a
However, the latter version of id will be more restrictive in its use, since it will always require that its argument is unique.
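A sketch of a corresponding list instance (under the liberal class type class id a:: .a -> .a given above); its fully expanded type is the idL type shown earlier.
instance id [a]
where
id l = l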
The combination of uniqueness typing and constructor classes (with their higher-order class variables) introduces another difficulty. Consider, for example, the overloaded map function.
class map m:: (a -> b) (m a) -> m b
Suppose we would add (distinct) attribute variables to the type variables a and b (to allow ‘unique instances’ of map)
class map m:: (.a -> .b) (m .a) -> m .b
The question that arises is: Which attributes should be added to the two applications of the class variable m? Clearly, this depends on the actual instance type filled in for m. E.g., if m is instantiated with a propagating type constructor (like []), the attributes of the applications of m are either attribute variables or the concrete attribute ‘unique’. Otherwise, one can choose anything.
Example
instance map []
where
map f l = [f x \\ x <- l]
:: T a = C (Int -> a)
instance map T
where
map f (C g) = C (f o g)
In this example, the respective expanded types of the instances are
map:: (u:a -> v:b) w:[u:a] -> x:[v:b], [w <= u, x <= v]
map:: (u:a -> v:b) (T u:a) -> T v:b
The type system of Clean requires that a possible propagation attribute is explicitly indicated in the type specification of the overloaded function. In order to obtain versions of map producing spine unique data structures, its overloaded type should be specified as follows:
class map m:: (.a -> .b) .(m .a) -> .(m .b)
This type guarantees that an application like
map inc [1,2,3]
indeed yields a spine-unique list.
Observe that if one would omit the (anonymous) attribute variable of the second argument, the input data structure could not contain unique data on propagating positions; e.g. one could not use such a version of map for mapping a destructive write operator on a list of unique files.
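For example (a sketch, assuming the fwritec type of Section 9.1 and the list instance of map shown earlier; the function name is illustrative), this fully attributed class type does allow a destructive write to be mapped over a spine-unique list of unique files:
WriteAll:: *[*File] -> *[*File]
WriteAll files = map (fwritec 'a') files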
In fact, the propagation rule is used to translate uniqueness properties of objects into uniqueness properties of the data structures in which these objects are stored. As said before, in some cases the actual data structures are unknown.
Consider the following function
DoubleMap f l = (map f l, map f l)
The type of this function is something like
DoubleMap:: (.a -> .b) (m .a) -> (.(m .b), .(m .b))
Clearly, l is duplicated. However, this does not necessarily mean that a cannot be unique anymore. If, for instance, m is instantiated with a non-propagating type constructor (like T as defined on the previous page) then uniqueness of a is still permitted. On the other hand, if m is instantiated with a propagating type constructor, a unique instantiation of a should be disapproved. In Clean, the type system ‘remembers’ sharing of objects (like l in the above example) by making the corresponding type attribute non-unique. Thus, the given type for DoubleMap is exactly the type inferred by Clean’s type system. If one tries to instantiate m with a propagating type constructor and, at the same time, a with some unique type, this will fail.
The presence of higher-order class variables not only influences propagation properties of types, but also the coercion relation between types. These type coercions depend on the sign classification of type constructors. The problem with higher-order polymorphism is that in some cases the actual type constructors substituted for the higher-order type variables are unknown, and therefore one cannot decide whether coercions in which higher-order type variables are involved are valid.
Consider the functions
double x = (x,x)
dm f l = double (map f l)
Here, map’s result (of type .(m .a)) is coerced to the non-unique supertype (m .a). However, this is only allowed if m is instantiated with type constructors that have no coercion restrictions. E.g., if one tries to substitute *WriteFun for m, where
:: WriteFun a = C .(a -> *File)
this should fail, for *WriteFun is essentially unique. The way to solve this problem is to restrict coercion properties of type variable applications (m s) to
u:(m s) ≤ u:(m t) iff s ≤ t && t ≤ s
A slightly modified version of this solution has been adopted in Clean. For convenience, we have added the following refinement: the instances of type constructor classes are restricted to type constructors with no coercion restrictions. Moreover, it is assumed that these type constructors are uniqueness propagating. This means that WriteFun cannot be used as an instance for map. Consequently, our coercion relation can be more liberal when such class variable applications are involved.
Overruling this requirement can be done by adding the anonymous attribute . to the class variable, e.g.
class map .m:: (.a -> .b) .(m .a) -> .(m .b)
Now
instance map WriteFun
where
map...
is valid, but the coercions in which (parts of) map’s type are involved are now restricted as explained above.
To see the difference between the two indicated variants of constructor variables, we slightly modify map’s type.
class map m:: (.a -> .b) *(m .a) -> .(m .b)
Without overruling the instance requirement for m, the type of dm (dm as given on the previous page) becomes
dm:: (.a -> .b) *(m .a) -> .(m b, m b)
Observe that the attribute of b disappeared due to the fact that each type constructor substituted for m is assumed to be propagating.
If one explicitly indicates that there are no instance restrictions for the class variable m (by attributing m with .), the function dm becomes untypable.
We will now describe the effect of uniqueness typing on type definitions containing higher-order type variables. As it turns out, this combination introduces a number of difficulties which would make a full description very complex. But even after skipping a lot of details we have to warn the reader that some of the remaining parts are still hard to understand.
As mentioned earlier, two properties of a newly defined type constructor are important for uniqueness typing, namely propagation and sign classification. One can probably imagine that, when dealing with higher-order types, the determination of these properties becomes more involved. Consider, for example, the following type definition.
:: T m a = C (m a)
The question whether T is propagating in its second argument cannot be decided by examining this definition only; it depends on the actual instantiation of the (higher-order) type variable m. If m is instantiated with a propagating type constructor, like [], then T becomes propagating in its second argument as well. Actually, propagation is not only a property of type constructors, but also of types themselves, particularly of ‘partial types’. For example, the partial type [] is propagating in its (only) argument (note that the number of arguments a partial type expects directly follows from the kinds of the type constructors that have been used). The type T [] is also propagating in its argument, and so is the type T ((,) Int).
The analysis in Clean that determines propagation properties of (partial) types has been split into two phases. During the first phase, new type definitions are examined in order to determine the propagation dependencies between the arguments of each new type constructor. To explain the idea, we return to our previous example.
:: T m a = C (m a)
First observe that the propagation of the type variable m is not interesting because m does not stand for ‘real data’ (which is always of kind *). We associate the propagation of m in T with the position(s) of the occurrence(s) of m’s applications. So in general, T is propagating in a higher-order variable m if one of m’s applications appears on a propagating position in the definition of T. Moreover, for each higher-order type variable, we determine the propagation properties of all first-order type variables in the following way: m is propagating in a, where m and a are a higher-order and a first-order type variable of T respectively, if a appears on a propagating position in one of m’s applications. In the above example, m is propagating in a, since a is on a propagating position in the application (m a).
During the second phase, the propagation properties of (partial) types are determined using the results of the first phase. This (roughly) proceeds as follows. Consider the type T s for some (partial) type s, and T as defined earlier. First, determine (recursively) the propagation of s. Then the type T s is propagating if (1) s is propagating, (2) T is propagating in m, and moreover (3) m is propagating in a (the second argument of the type constructor). With T as defined above, (2) and (3) are fulfilled. Thus, for example, T [] is propagating and therefore also T (T []). Now define
:: T2 a = C2 (a -> Int)
Then T T2 is not propagating.
The adjusted uniqueness propagation rule (see also...) becomes:
• Let s,t be two uniqueness types. Suppose s has attribute u. Then, if t is propagating the application (t s) should have an attribute v such that v ≤ u.
Some of the readers might have inferred that this propagation rule is a ‘higher-order’ generalization of the old ‘first-order’ propagation rule.
As to the sign classification, we restrict ourselves to the remark that the sign analysis used in Clean is adjusted in a similar way as described above for the propagation case.
Example
:: T m a = C ((m a) -> Int)
The sign classification of T is (-,^). Here ^ denotes the fact that a is used neither on a positive nor on a negative position. The sign classification of m w.r.t. a is +. The partial type T [] has sign -, which e.g. implies that
T [] Int ≤ T [] *Int
The type T T2 (with T2 as defined on the previous page) has sign +, so
T T2 Int ≥ T T2 *Int
It will be clear that combining uniqueness typing with higher-order types is far from trivial: the description given above is complex and moreover incomplete. However explaining all the details of this combination is far beyond the scope of the reference manual.
So, it is allowed to update a uniquely typed function argument (*) destructively when the argument does not reappear in the function result. The question is: when does the compiler indeed make use of this possibility?
Destructive updates take place in some predefined functions and operators which work on predefined data structures such as arrays (the & operator) and files (writing to a file). Arrays and files are intended to be updated destructively and their use can have a big influence on the space and time behavior of your application (a new node does not have to be claimed and filled, the garbage collector is invoked less often and the locality of memory references is increased).
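A minimal sketch of such a predefined destructive update (function name for illustration only): the & operator updates a unique array in place.
SetFirst:: *{#Int} -> *{#Int}
SetFirst arr = {arr & [0] = 42}	// the storage of arr is reused for the result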
Performing destructive updates is only sensible when information is stored in nodes. Arguments of basic type (Int, Real, Char or Bool) are stored on the B-stack or in registers and it therefore does not make sense to make them unique.
The Clean compiler also has an option to re-use user-defined unique data structures: the space being occupied by a function argument of unique type will under certain conditions be reused destructively to construct the function result when (part of) this result is of the same type. So, a more space and time efficient program can be obtained by turning heavily used data structures into unique data structures. This is not just a matter of changing the uniqueness type attributes (like turning a lazy data structure into a strict one). A unique data structure also has to be used in a “single threaded” way (see Chapter 4). This means that one might have to restructure parts of the program to maintain the unicity of objects.
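A sketch of a user-defined unique data structure that may profit from this reuse (names are illustrative only): the unique record below is used single-threadedly, so the compiler may overwrite the argument record when building the result.
:: *Counter = {count :: Int}

Increment:: *Counter -> *Counter
Increment c = {c & count = c.count + 1}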
The compiler will do compile-time garbage collection for user defined unique data-structures only in certain cases. In that case run-time garbage collection time is reduced. It might even drop to zero. It is also possible that you gain even more than just garbage collection time due to better cache behavior.