AGFL alphabet translation possibilities
=======================================

AGFL lets users define how input text characters are interpreted
prior to further processing.



default character translation
------- --------- -----------

By default (i.e., when no user-defined translation is available),
the AGFL system:
    - uses full 8-bit character encoding for the lexica;
    - translates capitals in the input to lowercase letters ("A"->"a")
      prior to further processing;
    - and leaves all other characters as-is.
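
The default behaviour listed above can be pictured as a 256-entry table.
The following Python sketch is an illustration only, not AGFL code; the
function names are hypothetical, and it assumes that "capitals" means the
letters "A" to "Z" only (whether capitals with values over 127 are also
folded is not stated here).

```python
def default_translation_table():
    """Return a 256-entry list; entry i is the value character i maps to."""
    table = list(range(256))              # every character maps to itself
    for code in range(ord("A"), ord("Z") + 1):
        table[code] = code + 32           # "A" (65) -> "a" (97), assumed A-Z only
    return table


def translate(data, table):
    """Apply a translation table to a bytes object, one byte at a time."""
    return bytes(table[value] for value in data)
```

For instance, translate(b"Hello", default_translation_table()) yields
b"hello", while digits, punctuation and values over 127 pass unchanged.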



overriding the default
---------- --- -------

To override the default, the user should 
    - provide a file describing the translation required;
    - let the environment variable "AGFL_ALPHABET" point to this file.

An example of an alphabet translation file is given in this directory:
"agfl_alphabet.txt". The translation defined in this example file
is identical to the default behaviour of the system.
The AGFL user can take (a copy of) the example file and modify it
as needed.

The environment variable can be set as follows:
    SET AGFL_ALPHABET=c:\agfl_2.2\alphabet\agfl_alphabet.txt
(instead of c:\agfl_2.2\alphabet\agfl_alphabet.txt, any valid drive,
path and file name can be used).
It is suggested to include this definition in the file
agfl_2.2/bin/setagfl.bat, which already contains a commented-out
command line that defines the environment variable AGFL_ALPHABET
in such a way that it points to the example file in this directory.
Activating the setting in this command file will make sure the
alphabet translation is activated together with the definition of
all other symbols and settings required to use the AGFL system.



returning to the default
--------- -- --- -------

To return to the default, either
    - remove or comment out the setting of AGFL_ALPHABET,
      then quit and restart the DOS box; or
    - re-define the setting on the command line, either by
      clearing the definition:
          SET AGFL_ALPHABET=
      or by making the AGFL_ALPHABET variable point to the "default"
      alphabet translation file:
          SET AGFL_ALPHABET=c:\agfl_2.2\alphabet\agfl_alphabet.txt



let the run-time system report on the translation table used
--- --- -------- ------ ------ -- --- ----------- ----- ----

Once a parser is generated, the parser will report the active character
translations when started with the "-A" option.
This option will list all active translations prior to accepting input.



the alphabet translation file layout
--- -------- ----------- ---- ------

The alphabet translation file should contain exactly 256 lines:
one line for each 8-bit character. The first line
defines the translation for character value 0,
the 100th line for character value 99, and so on,
up to the 256th line, which defines the translation for character value 255.

Each line must contain the value of the character to be translated to.
This value is mandatory, even if it is equal to the character to be translated from.
No character values below 0 or above 255 are accepted.

Example: line 66 defines the translation for character value 65,
which for many language definitions corresponds to "A".
When capital "A" should be passed as-is, the 66th line should contain:

65

Non-numeric text following the character value is ignored (which means
that additional textual comments are possible).
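
As an illustration of this layout, the following Python sketch builds the
256 lines of a translation file from a table. It is not part of AGFL; the
function name and the "A to a" comment style are assumptions. Comments are
kept free of digits, since only non-numeric trailing text is described as
being ignored.

```python
import string


def build_alphabet_lines(table):
    """Return 256 text lines; line n (1-based) holds the value for character n-1."""
    if len(table) != 256:
        raise ValueError("table must have exactly 256 entries")
    lines = []
    for source, target in enumerate(table):
        if not 0 <= target <= 255:
            raise ValueError("value out of range for character %d" % source)
        line = str(target)                # the mandatory character value
        # Optional digit-free comment, added only for ASCII letter pairs.
        if (source != target
                and chr(source) in string.ascii_letters
                and chr(target) in string.ascii_letters):
            line += "  " + chr(source) + " to " + chr(target)
        lines.append(line)
    return lines


# Example table: identity, except "A".."Z" map to "a".."z".
example_table = list(range(256))
for code in range(ord("A"), ord("Z") + 1):
    example_table[code] = code + 32

example_lines = build_alphabet_lines(example_table)
```

Writing "\n".join(example_lines) plus a final newline to a file gives a
file in the layout described above; line 66 then begins with 97.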

The run-time system checks the availability of the alphabet translation file
and checks its layout, and will give appropriate error messages.
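
A translation file can also be checked before use. The Python sketch below
applies the layout rules of this section; it is not the AGFL run-time check,
and the function name, the messages and the tolerance for leading blanks
are assumptions.

```python
import re


def check_alphabet_text(text):
    """Return (table, errors) for the contents of a translation file."""
    errors = []
    lines = text.rstrip("\n").split("\n")
    if len(lines) != 256:
        errors.append("expected 256 lines, found %d" % len(lines))
    table = []
    for number, line in enumerate(lines, start=1):
        # The value is the leading run of digits; trailing text is a comment.
        match = re.match(r"\s*(\d+)", line)
        if match is None:
            errors.append("line %d: no character value" % number)
            continue
        value = int(match.group(1))
        if value > 255:
            errors.append("line %d: value %d is above 255" % (number, value))
            continue
        table.append(value)
    return table, errors
```

An empty error list means the text has 256 lines, each starting with a
value from 0 to 255; a line starting with a minus sign is reported as
having no character value.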



final remarks
----- -------

   - The character translation mechanism does not influence the way
     characters are handled by Windows.
     Example: re-definition of control-c (line 4, character value 3)
     will not prevent Windows from stopping the parser
     when the user presses control-c while the system waits for input.

   - Representation of characters in Windows depends on the language definition;
     but even then, strange effects are possible.
     For several language definitions we tested, it turned out that the
     representation of characters with values over 127 depends on the program used.
     Example: for the same value, some editors show a vowel+accent combination,
     whereas others show a double left corner character used for line drawing.

     This is unfortunate, but cannot be prevented by us.
