Chapter 2
Modules and Scopes
|
|
A Clean program is composed out of modules. Each module is stored in a file that contains Clean source code. There are implementation modules and definition modules, in the spirit of Modula-2 (Wirth, 1982). This module system is used for several reasons.
- First of all, the module structure is used to control the scope of definitions. The basic idea is that definitions only have a meaning in the implementation module they are defined in unless they are exported by the corresponding definition module.
- Having the exported definitions collected in a separate definition module has as advantage that one in addition obtains a self-contained interface document one can reach out to others. The definition module is a document that defines which functions and data types can be used by others without revealing uninteresting implementation details.
- Furthermore, the module structure enables separate compilation that heavily reduces compilation time. If the definition module stays the same, a change in an implementation module only will cause the recompilation of that implementation module. When the definition module is changed as well, only those implementation modules that are affected by this change need to be recompiled.
In this Chapter we explain the module structure of Clean and the influence it has on the scope of definitions. New scopes can also be introduced inside modules. This is further explained in the Chapters 2 and 3.
In the pictures in the subsections below nested boxes indicate nested scopes.
In Clean we distinguish the following kind of identifiers.
ModuleName |
= |
LowerCaseId |
| |
UpperCaseId |
| |
FunnyId |
FunctionName |
= |
LowerCaseId |
| |
UpperCaseId |
| |
FunnyId |
ConstructorName |
= |
|
|
UpperCaseId |
| |
FunnyId |
SelectorVariable |
= |
LowerCaseId |
|
|
|
|
Variable |
= |
LowerCaseId |
|
|
|
|
MacroName |
= |
LowerCaseId |
| |
UpperCaseId |
| |
FunnyId |
FieldName |
= |
LowerCaseId |
|
|
|
|
TypeName |
= |
|
|
UpperCaseId |
| |
FunnyId |
TypeVariable |
= |
LowerCaseId |
|
|
|
|
UniqueTypeVariable |
= |
LowerCaseId |
|
|
|
|
ClassName |
= |
LowerCaseId |
| |
UpperCaseId |
| |
FunnyId |
MemberName |
= |
LowerCaseId |
| |
UpperCaseId |
| |
FunnyId |
LowerCaseId |
= |
LowerCaseChar~{IdChar} |
UpperCaseId |
= |
UpperCaseChar~{IdChar} |
FunnyId |
= |
{SpecialChar}+ |
LowerCaseChar |
= |
a | b | c | d | e | f | g | h | i | j |
|
| |
k | l | m | n | o | p | q | r | s | t |
|
| |
u | v | w | x | y | z |
UpperCaseChar |
= |
a | b | c | d | e | f | g | h | i | j |
|
| |
k | l | m | n | o | p | q | r | s | t |
|
| |
u | v | w | x | y | z |
SpecialChar |
= |
~ | @ | # | $ | % | ^ | ? | ! |
|
| |
+ | - | * | < | > | \ | / | | | & | = |
|
| |
: |
IdChar |
= |
LowerCaseChar |
|
| |
UpperCaseChar |
|
| |
Digit |
|
| |
_ | ‘ |
Digit |
= |
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
The convention used is that variables always start with a lowercase character while constructors and types always start with an uppercase character. The other identifiers can either start with an uppercase or a lowercase character. Notice that for the identifiers names can be used consisting of a combination of lower and/or uppercase characters but one can also define identifiers constructed from special characters like +, <, etc. (see Appendix A). These two kinds of identifiers cannot be mixed. This makes it possible to leave out white space in expressions like a+1 (same as a + 1).
The scope is the program region in which definitions (e.g. function definition, class definition, macro definition, type definition) with the identifiers introduced (e.g. function name, class name, class variable, macro name, type constructor name, type variable name) have a meaning.
It must be clear from the context to which definition an identifier is referring. If all identifiers in a scope have different names than it will always be clear which definition is meant. However, one generally wants to have a free choice in naming identifiers. If identifiers belong to different name spaces no conflict can arise even if the same name is used. In Clean the following name spaces exist:
• ModuleNames form a name space;
• FunctionNames, ConstructorNames, SelectorVariables, Variables and MacroNames form a name space;
• FieldNames form a name space;
• TypeNames, TypeVariables and UniqueTypeVariables form a name space;
• ClassNames form a name space.
So, it is allowed to use the same identifier name for different purposes as long as the identifier belongs to different name spaces.
Identifiers belonging to the same name space must all have different names within the same scope. Under certain conditions it is allowed to use the same name for different functions and operators (overloading, see Chapter 6).
Reusing identifier names is possible by introducing a new scope level. Scopes can be nested: within a scope a new nested scope can be defined. Within such a nested scope new definitions can be given, new names can be introduced. As usual it is allowed in a nested scope to re-define definitions or re-use names given in a surrounding scope: When a name is re-used the old name and definition is no longer in scope and cannot be used in the new scope. A definition given or a name introduced in a (nested) scope has no meaning in surrounding scopes. It has a meaning for all scopes nested within it (unless they are redefined within such a nested scope).
A Clean program consists of a collection of definition modules and implementation module s. An implementation module and a definition module correspond to each other if the names of the two modules are the same. The basic idea is that the definitions given in an implementation module only have a meaning in the module in which they are defined unless these definitions are exported by putting them into the corresponding definition module. In that case the definitions also have a meaning in those other modules in which the definitions are imported (see 2.5).
CleanProgram |
= |
{Module}+ |
Module |
= |
DefinitionModule |
|
| |
ImplementationModule |
DefinitionModule |
= |
definition module ModuleName ; |
|
|
{DefDefinition} |
|
| |
system module ModuleName ; |
|
|
{DefDefinition} |
ImplementationModule |
= |
[implementation] module ModuleName ; |
|
|
{ImplDefinition} |
An executable Clean program consists at least of one implementation module, the main or start module, which is the top-most module (root module) of a Clean program.
Each Clean module has to be put in a separate file.
The name of a module (i.e. the module name) should be the same as the name of the file (minus the suffix) in which the module is stored.
A definition module should have .dcl as suffix; an implementation module should have .i cl as suffix.
A definition module can have at most one corresponding implementation module.
Every implementation module (except the main module, see 2.3.1) must have a corresponding definition module.
In the main module a Start rule has to be defined (see Chapter 1).
Only in the main module one can leave out the keyword implementation in the module header. In that case the implementation module does not need to have a corresponding definition module (which makes sense for a top-most module).
A very tiny but complete Clean program consisting of one implementation module.
module hello
Start = "Hello World!"
Evaluation of a Clean program consists of the evaluation of the application defined in the right-hand side of the Start rule to normal form (see Chapter 1). The right-hand side of the Start rule is regarded to be the initial expression to be computed.
It is allowed to have a Start rule in other implementation modules as well. This can be handy for testing functions defined in such a module: to evaluate such a Start rule simply generate an application with the module as root and execute it. In the Clean IDE one can specify which module has to be regarded as being the root module.
The definition of the left-hand side of the Start rule consists of the symbol Start with one optional argument (of type *World), which is the environment parameter, which is necessary to write interactive applications.
A Clean program can run in two modes.
The first mode is a console mode. It is chosen when the Start rule is defined as a nullary function.
Start:: TypeOfStartFunction
Start = … // initial expression
In the console mode, that part of the initial expression (indicated by the right-hand side of the Start rule), which is in root normal form (also called the head normal form or root stable form), is printed as soon as possible. The console mode can be used for instance to test or debug functions.
In the Clean IDE one can choose to print the result of a Start expression with or without the data constructors.
For example, the initial expression
Start:: String
Start = "Hello World!"
in mode “show data constructors” will print: "Hello World!", in mode “don’t show data constructors” it will print: Hello World!
The second mode is the world mode. It is chosen when the optional additional parameter (which is of type *World) is added to the Start rule and delivered as result.
Start:: *World -> *World
Start w = … // initial expression returning a changed world
The world which is given to the initial expression is an abstract data structure, an abstract world of type *World which models the concrete physical world as seen from the program. The abstract world can in principle contain anything what a functional program needs to interact during execution with the concrete world. The world can be seen as a state and modifications of the world can be realized via state transition functions defined on the world or a part of the world. By requiring that these state transition functions work on a unique world the modifications of the abstract world can directly be realized in the real physical world, without loss of efficiency and without losing referential transparency (see Chapter 9)
The concrete way in which the world can be handled in Clean is determined by the system programmer. One way to handle the world is by using the predefined Clean I/O library, which can be regarded as a platform independent mini operating system. It makes it possible to do file I/O, window based I/O, dynamic process creation and process communication in a pure functional language in an efficient way. The definition of the I/O library is treated in a separate document (Object IO tutorial, Achten et al., 1997).
In an implementation module the following global definitions can be specified in any order.
ImplDefinition |
= |
ImportDef |
// see 2.5 |
|
| |
FunctionDef |
|
|
| |
GraphDef |
// see 3.6 |
|
| |
MacroDef |
// see 10.3 |
|
| |
TypeDef |
|
|
| |
ClassDef |
|
|
| |
GenericsDef |
Definitions on the global level (= outermost level in the module,) have in principle the whole implementation module as scope (see Figure 2.1).
Figure 2.1 (Scope of global definitions inside an implementation module).
implementation module XXX
:: TypeName typevars = type_expression // definition of a new type
functionName:: type_of_args -> type_of_result // definition of the type of a function
functionName args = expression
// definition of a function
selector = expression // definition of a constant graph
class className = expression // definition of a class
macroName args :== expression // definition of a macro
Types can only be defined globally (see Chapter 5) and therefore always have a meaning in the whole implementation module. Type variables introduced on the left-hand side of a (algebraic, record, synonym, overload, class, instance, function, graph) type definition have the right-hand side of the type definition as scope.
Functions, the type of these functions, constants (selectors) and macro’s can be defined on the global level as well as on a local level in nested scopes. When defined globally they have a meaning in the whole implementation module. Arguments introduced on the left-hand side of a definition (formal arguments) only have a meaning in the corresponding right-hand side.
Functions, the type of these functions, constants (selectors) and macro’s can also be defined locally in a new scope. However, new scopes can only be introduced at certain points. In functional languages local definitions are by tradition defined by using let-expressions (definitions given before they are used in a certain expression, nice for a bottom-up style of programming) and where-blocks (definitions given afterwards, nice for a top-down style of programming). These constructs are explained in detail in Chapter 3.
Clean programs can be written in two modes: layout sensitive mode ‘on’ and ‘off’. The layout sensitive mode is switched off when a semi-colon is specified after the module name. In that case each definition has to be ended with a semicolon ‘;’. A new scope has to begin with ‘{’ and ends with a ‘}’. This mode is handy if Clean code is generated automatically (e.g. by a compiler).
Example of a Clean program not using the layout rule.
module primes;
import StdEnv;
primes:: [Int];
primes = sieve [2..];
where
{ sieve:: [Int] -> [Int]; sieve [pr:r] = [pr:sieve (filter pr r)];
filter:: Int [Int] -> [Int];
filter pr [n:r] | n mod pr == 0 = filter pr r;
| otherwise = [n:filter pr r];
}
Programs look a little bit old fashion C-like in this way. Functional programmers generally prefer a more mathematical style. Hence, as is common in modern functional languages, there is a layout rule in Clean. When a semicolon does not end the header of a module, a Clean program has become layout sensitive. The layout rule assumes the omission of the semi-colon (‘;’) that ends a definition and of the braces (‘{’ and ‘}’) that are used to group a list of definitions. These symbols are automatically added according to the following rules:
In layout sensitive mode the indentation of the first lexeme after the keywords let, #, #!, of, where, or with determines the indentation that the group of definitions following the keyword has to obey. Depending on the indentation of the first lexeme on a subsequent line the following happens. A new definition is assumed if the lexeme starts on the same indentation (and a semicolon is inserted). A previous definition is assumed to be continued if the lexeme is indented more. The group of definitions ends (and a close brace is inserted) if the lexeme is indented less. Global definitions are assumed to start in column 0.
We strongly advise to write programs in layout sensitive mode. For reasons of portability it is assumed that a tab space is set to 4 white spaces and that a non-proportional font is used.
Same program using the layout sensitive mode.
module primes
import StdEnv
primes:: [Int]
primes = sieve [2..]
where
sieve:: [Int] -> [Int]
sieve [pr:r] = [pr:sieve (filter pr r)]
filter:: Int [Int] -> [Int]
filter pr [n:r]
| n mod pr == 0 = filter pr r
| otherwise = [n:filter pr r]
The definitions given in an implementation module only have a meaning in the module in which they are defined. If you want to export a definition, you have to specify the definition in the corresponding definition module. Some definitions can only appear in implementation modules, not in definition modules. The idea is to hide the actual implementation from the outside world. The is good for software engineering reasons while another advantage is that an implementation module can be recompiled separately without a need to recompile other modules. Recompilation of other modules is only necessary when a definition module is changed. All modules depending on the changed module will have to be recompiled as well. Implementations of functions, graphs and class instances are therefore only allowed in implementation modules. They are exported by only specifying their type definition in the definition module. Also the right-hand side of any type definition can remain hidden. In this way an abstract data type is created (see 5.4).
In a definition module the following global definitions can be given in any order.
DefDefinition |
= |
ImportDef |
// see 2.5 |
|
| |
FunctionTypeDef |
// see 3.7 |
|
| |
MacroDef |
// see 10.3 |
|
| |
TypeDef |
|
|
| |
ClassDef |
|
|
| |
TypeClassInstanceExportDef |
// see 6.10 |
|
| |
GenericExportDef |
The definitions given in an implementation module only have a meaning in the module in which they are defined (see 2.3) unless these definitions are exported by putting them into the corresponding definition module. In that case they also have a meaning in those other modules in which the definitions are imported (see 2.5).
In the corresponding implementation module all exported definitions have to get an appropriate implementation (this holds for functions, abstract data types, class instances).
An abstract data type is exported by specifying the left-hand side of a type rule in the definition module. In the corresponding implementation module the abstract type has to be defined again but then right-hand side has to be defined as well. For such an abstract data type only the name of the type is exported but not its definition.
A function, global graph or class instance is exported by defining the type header in the definition module. For optimal efficiency it is recommended also to specify strictness annotations (see 10.1). For library functions it is recommended also to specify the uniqueness type attributes (see Chapter 9). The implementation of a function, a graph, a class instance has to be given in the corresponding implementation module.
Although it is not required anymore to repeat an exported definition in the corresponding implementation module, it is a good habit to do so to keep the implementation module readable. If a definition is repeated, the definition given in the definition module and in the implementation module should be the same (modulo variable names).
Definition module.
definition module ListOperations
::complex // abstract type definition
re:: complex -> Real // function taking the real part of complex number
im:: complex -> Real // function taking the imaginary part of complex
mkcomplex:: Real Real -> Complex // function creating a complex number
corresponding implementation module):
implementation module ListOperations
::complex :== (!Real,!Real) // a type synonym
re:: complex -> Real // type of function followed by its implementation
re (frst,_) = frst
im:: complex -> Real
im (_,scnd) = scnd
mkcomplex:: Real Real -> Complex
mkcomplex frst scnd = (frst,scnd)
Via an import statement a definition exported by a definition module (see 2.4) can be imported into any other (definition or implementation) module. There are two kinds of import statements, explicit imports and implicit imports.
ImportDef |
= |
ImplicitImportDef |
|
| |
ExplicitImportDef |
A module depends on another module if it imports something from that other module. In Clean 2.x cyclic dependencies are allowed.
Explicit imports are import statements in which the modules to import from as well as the identifiers indicating the definitions to import are explicitly specified. All identifiers explicitly being imported in a definition or implementation module will be included in the global scope level (= outermost scope, see 2.3.2) of the module that does the import.
ExplicitImportDef |
= |
from ModuleName import {Imports}-list ; |
Imports |
= |
FunctionName |
|
| |
::TypeName [ConstructorsOrFields] |
|
| |
class ClassName [Members] |
|
| |
instance ClassName {TypeName}+ |
ConstructorsOrFields |
= |
(..) |
|
| |
({ConstructorName}-list) |
|
| |
{..} |
|
| |
{{FieldName}-list} |
Members |
= |
(..) |
|
| |
({MemberName}-list) |
The syntax and semantics of explicit import statements has been completely revised in Clean 2.x in order to make it possible to discriminate between the different namespaces that exist in Clean (see 2.1.2). One can import functions or macro's, types with optionally their corresponding constructors, record types with optionally their corresponding fieldnames, classes and instances of classes.
Example of an explicit import.
implementation module XXX
from m import F,
:: T1, :: T2(..), :: T3(C1, C2), :: T4{..}, :: T5{field1, field2},
class C1, class C2(..), class C3(mem1, mem2),
instance C4 Int
With the import statement the following definition exported by module m are imported in module XXX: the function or macro F, the type T1, the algebraic type T2 with all it's constructors that are exported by m, the algebraic type T3 with it's constructors C1 and C2, the record type T4 with all it's fields that are exported by m, the record type T5 with it's fields field1 and field2, the class C1, the class C2 with all it's members that are exported by m, the class C3 with it's members mem1 and mem2, the instance of class C4 defined on integers.
Importing identifiers can cause error messages because the imported identifiers may be in conflict with other identifiers in this scope (remember that identifiers belonging to the same name space must all have different names within the same scope, see 2.1). This problem can be solved by renaming the internally defined identifiers or by renaming the imported identifiers (eg by adding an additional module layer just to rename things).
ImplicitImportDef |
= |
import {ModuleName}-list ; |
Implicit imports are import statements in which only the module name to import from is mentioned. In this case all definitions that are exported from that module are imported as well as all definitions that on their turn are imported in the indicated definition module, and so on. So, all related definitions from various modules can be imported with one single import. This opens the possibility for definition modules to serve as a kind of ‘pass-through’ module. Hence, it is meaningful to have definition modules with import statements but without any definitions and without a corresponding implementation module.
Example of an implicit import: all (arithmetic) rules which are predefined can be imported easily with one import statement.
import MyStdEnv
importing implicitly all definitions imported by the definition module ‘MyStdEnv’ which is defined below (note that definition module ‘MyStdEnv’ does not require a corresponding implementation module) :
definition module MyStdEnv
import StdBool, StdChar, StdInt, StdReal, StdString
All identifiers implicitly being imported in a definition or implementation module will be included in the global scope level (= outermost scope, see 2.3.2) of the module that does the import.
Importing identifiers can cause error messages because the imported identifiers may be in conflict with other identifiers in this scope (remember that identifiers belonging to the same name space must all have different names within the same scope, see 2.1). This problem can be solved by renaming the internally defined identifiers or by renaming the imported identifiers (eg by adding an additional module layer just to rename identifiers).
System modules are special modules. A system definition module indicates that the corresponding implementation module is a system implementation module which does not contain ordinary Clean rules. In system implementation modules it is allowed to define foreign functions: the bodies of these foreign functions are written in another language than Clean. System implementation modules make it possible to create interfaces to operating systems, to file systems or to increase execution speed of heavily used functions or complex data structures. Typically, predefined function and operators for arithmetic and File I/O are implemented as system modules.
System implementation modules may use machine code, C-code, abstract machine code (PABC-code) or code written in any other language. What exactly is allowed depends on the Clean compiler used and the platform for which code is generated. The keyword code is reserved to make it possible to write Clean programs in a foreign language. This is not treated in this reference manual.
When one writes system implementation modules one has to be very careful because the correctness of the functions can no longer be checked by the Clean compiler. Therefore, the programmer is now responsible for the following:
! The function must be correctly typed.
! When a function destructively updates one of its (sub-)arguments, the corresponding type of the arguments should have the uniqueness type attribute. Furthermore, those arguments must be strict.