The computer industry has always been characterized by rapid and profound changes. Since Forth was last standardized in the early 1980s, the speed, memory size and disk capacity of affordable personal computers have increased by factors of more than 100. Eight-bit processors are now rare in PCs (although they are still widely used in embedded systems), and 32-bit processors are common. Operating systems, programming environments and user interfaces are far more sophisticated. Many recent Forth implementations, both commercial and public-domain, have attempted to address these issues.
At the time of writing (November, 1992), a Technical Committee X3J14 (of which authors Rather and Colburn are members) is nearing completion of an ANS Forth. [note] Among the 20 voting members in the TC are vendors (FORTH, Inc., Creative Solutions, Sun Microsystems and a division of NCR), some large user organizations (Ford Motor Co., NASA), and a number of smaller user organizations, consultants and experts. Starting in 1987, this group has addressed a number of problems with FORTH-79 and FORTH-83, as well as some contemporary issues. A few of the issues addressed in the draft standard follow, as they represent current areas of lively debate and technical activity among Forth users and implementors.
ANS Forth attempts to reconcile some of the divisions caused by the incompatibilities between FORTH-79 and FORTH-83. For example, it retains 0= to perform the FORTH-79 NOT function, introduces INVERT to perform the FORTH-83 NOT, and removes the word NOT. This enables application writers who depend on either version to leave their programs unchanged, and achieve compatibility by adding a simple shell in which NOT is defined as a synonym for the preferred behavior.
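A minimal sketch of such a shell follows. It assumes the application loads exactly one of the two definitions ahead of its own source; the stack comments are illustrative, and neither definition of NOT is itself part of ANS Forth.

```forth
\ Compatibility shell: load exactly ONE of these before the application.

\ For programs written to FORTH-79, where NOT negates a flag:
: NOT ( x -- flag )  0= ;

\ For programs written to FORTH-83, where NOT is a bitwise complement:
\ : NOT ( x1 -- x2 )  INVERT ;
```

With the first definition, 5 NOT yields a false flag (0). With the second, it yields the bitwise complement of 5, which is -6 on a two's-complement machine. The application source itself is untouched in either case.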
The proposed standard also removes virtually all restrictions on implementation options, provides for independence from CPU word size, and offers a number of optional extension word-sets for functions such as host OS file compatibility, dynamic memory allocation and floating point arithmetic. Some significant issues addressed by ANS Forth follow.
FORTH-79 and FORTH-83 mandated a 16-bit architecture, including stack width, addresses, flags and numbers. ANS Forth specifies sizes in terms of a "cell," whose width is implementation-defined but must be at least 16 bits. Words have been added to increment addresses transportably by a cell, a character, or an integral number of cells or characters.
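The address-arithmetic words referred to here are CELLS, CELL+, CHARS and CHAR+. The following sketch shows how they replace hard-coded offsets such as 2+ or 2*; the names SIZES, SIZE@, SECOND, TAG and 2ND-CHAR are hypothetical and chosen only for illustration.

```forth
\ A table of three cell-sized values, laid out without assuming cell width.
CREATE SIZES  10 ,  20 ,  30 ,

\ Fetch the i-th entry (0-based): CELLS scales an index to address units.
: SIZE@ ( i -- n )  CELLS SIZES + @ ;

\ Step from one entry to the next with CELL+ rather than a literal 2+ or 4+.
: SECOND ( -- n )  SIZES CELL+ @ ;

\ The character analogues: CHARS scales a count, CHAR+ steps one character.
CREATE TAG  3 CHARS ALLOT
CHAR A TAG C!   CHAR B TAG CHAR+ C!   CHAR C TAG 2 CHARS + C!
: 2ND-CHAR ( -- c )  TAG CHAR+ C@ ;
```

The same source should behave identically on a 16-bit and a 32-bit system, because no word above names the cell or character size directly.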
Amid great controversy, FORTH-83 mandated floored division. Not only was this incompatible with prior usage (which didn't specify the algorithm for handling signed division), it is also at variance with hardware multiply/divide instructions on most processors. But many people felt strongly that floored division is mathematically more appropriate, and that it was important to specify. Recognizing that there were many implementations on both sides of this issue, the TC opted to allow either floored or truncated division. The implementation must specify which default it uses, and must provide primitives supporting both methods.
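The two primitives in question are FM/MOD (floored) and SM/REM (symmetric, i.e. truncated toward zero). As a small illustration, the sketch below divides -7 by 2 both ways; the wrapper names FLOORED and TRUNCATED are hypothetical.

```forth
\ S>D widens the single-cell dividend to the double-cell form
\ that FM/MOD and SM/REM expect.
: FLOORED   ( n1 n2 -- rem quot )  >R S>D R> FM/MOD ;
: TRUNCATED ( n1 n2 -- rem quot )  >R S>D R> SM/REM ;

-7 2 FLOORED   . .   \ quotient -4, then remainder 1
-7 2 TRUNCATED . .   \ quotient -3, then remainder -1
```

The results differ only when the operands have opposite signs and the division is inexact. A program that needs one particular behavior can call the corresponding primitive rather than rely on the implementation's default for / and MOD.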
One of the unique characteristics of Forth is the degree to which its own internal tools are accessible to the application programmer. For example, there is one lexical analyzer used by the compiler, assembler, and text interpreter; it is also available for command and text parsing in applications. Similarly, the tools that implement control structures such as loops and conditionals are available for making custom structure words. In 1986 Wil Baden demonstrated [Baden, 1986] that the standard Forth structure words plus a few extensions made from these underlying tools are adequate to make any structure, including solutions to problems posed in D. E. Knuth's paper "Structured Programming with go to Statements" [Knuth, 1974].
| Structure | Description |
|---|---|
| DO … LOOP | Finite loop incrementing by 1 |
| DO … <n> +LOOP | Finite loop incrementing by <n> |
| BEGIN … <f> UNTIL | Indefinite loop terminating when <f> is 'true' |
| BEGIN … <f> WHILE … REPEAT | Indefinite loop terminating when <f> is 'false' |
| BEGIN … AGAIN | Infinite loop |
| <f> IF … ELSE … THEN | Two-branch conditional; performs words following IF if <f> is 'true' and words following ELSE if it is 'false'. THEN marks the point at which the paths merge. |
| <f> IF … THEN | Like the two-branch conditional, but with only a 'true' clause. |
Table 4. Standard control structures in Forth. ANS Forth allows programmers to form new structures by mixing the component words or using them to define new structure words.
FORTH-79 and FORTH-83 provided syntactic specifications for the common structures listed in Table 4, as well as an "experimental" collection of structure primitives. The latter were not widely adopted, however, and few implementations perform the kind of syntax checking the standards anticipated. F83 offers a limited form of syntax checking, in that it requires the stack (which is used at compile time for compiling structures) to have the same size before and after compiling a definition, the theory being that a stack imbalance would indicate an incomplete structure. Unfortunately, this technique prevents the very common practice of leaving a value on the compile-time stack which is to be compiled as a literal inside a definition.
Common practice often took advantage of knowledge about how the structure words worked at compile time to manipulate them in creative ways. The ANS Forth Technical Committee sanctioned this by providing specifications of both the compile-time and run-time behaviors of the structure words, so that they may be combined in arbitrary order. A set of structure primitives is provided in a "programming tools" wordset, and the word POSTPONE is provided to enable programmers to write new structure words that reference existing compiler directives in order to provide a portion of the desired new behavior.
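As a sketch of how POSTPONE can be used in this way, the following defines a new structure word in terms of the existing IF. UNLESS and OCCUPANCY are hypothetical names used only for illustration; they are not part of ANS Forth.

```forth
\ UNLESS compiles 0= followed by IF, so its clause runs when the flag
\ is false. IMMEDIATE makes it execute at compile time, as IF does.
: UNLESS ( compile time: -- orig ) ( run time: flag -- )
   POSTPONE 0=  POSTPONE IF ; IMMEDIATE

\ The orig left by UNLESS is resolved by the ordinary ELSE and THEN.
: OCCUPANCY ( n -- )
   UNLESS ." empty" ELSE ." in use" THEN ;

0 OCCUPANCY   \ prints: empty
7 OCCUPANCY   \ prints: in use
```

Because the standard specifies what IF leaves behind at compile time and what ELSE and THEN expect, UNLESS can be combined with them without knowing how a particular implementation represents a forward branch.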
The original Forth systems developed by Moore in the 1970s compiled source from disk into an executable form in memory. This avoided the separate compile-link-load sequences characteristic of most compiled languages, and led to a very interactive programming style in which the programmer could use the resident Forth editor to modify source and re-compile it, having it available for testing in seconds. The internal structure of a definition was as shown in Fig. 2, with all fields contiguous in memory. The FIG model and its derivatives modified the details of this structure somewhat, but preserved its essential character.
Figure 2. Diagram showing the logical components of a Forth definition. In classical implementations, these fields are contiguous in memory. The data field will hold values for data objects, addresses or tokens for procedures, and the actual code for CODE definitions.
Forth systems implemented according to this model built a high-level definition by compiling pointers to previously defined words into its parameter field; the address interpreter that executed such definitions proceeded through these routines, executing the referenced definitions in turn by performing indirect jumps through the register used to keep its place. This is generally referred to as indirect-threaded code. [note]
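The idea of an interpreter stepping through a list of references can be imitated in high-level Forth. The sketch below is only an analogy: a real address interpreter is a few machine instructions, whereas this one uses execution tokens and EXECUTE. The names DOUBLE, INCR, SQUARE, THREAD and RUN are hypothetical.

```forth
\ Three ordinary words to be "threaded" together.
: DOUBLE ( n -- 2n )   2* ;
: INCR   ( n -- n+1 )  1+ ;
: SQUARE ( n -- n*n )  DUP * ;

\ The "parameter field": a list of references to previously defined words.
CREATE THREAD  ' DOUBLE ,  ' INCR ,  ' SQUARE ,

\ RUN plays the address interpreter. The address it carries is the
\ interpreter pointer: each pass fetches one reference, executes it,
\ and advances to the next cell.
: RUN ( i*x addr count -- j*x )
   0 ?DO  DUP >R  @ EXECUTE  R> CELL+  LOOP  DROP ;

3 THREAD 3 RUN .   \ ((3*2)+1) squared = 49
```

The pointer is parked on the return stack while each referenced word runs, so that the word sees only its own arguments on the data stack.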
The need to optimize for different conditions has led to a number of variants in this basic implementation strategy, however. Some of the most interesting are:
Although the early standards assumed the classical structure, ANS Forth makes a special effort to avoid assumptions about implementation techniques, resulting in prohibitions against assuming a relationship between the head and data space of a definition or accessing the body of a data structure other than by pre-defined operators. This has generated some controversy among programmers who prefer the freedom to make such assumptions over the optimizations that are possible with alternative implementation strategies.
Forth's support for custom data types with user-defined structure as well as compile-time and run-time behaviors has, over the years, led programmers to develop object-based systems such as Moore's approach to image processing described above (Section 2.2.2, item 2). Pountain [1987] described one approach to object-oriented programming in Forth, which has been tried by a number of implementors. Several Forth vendors have taken other approaches to implementing object-based systems, and this is currently one of the most fertile areas of exploration in Forth. [note]
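The basic mechanism behind such custom data types is the pairing of CREATE with DOES>. As a minimal sketch (not any particular vendor's object system), the hypothetical defining word ARRAY below gives each of its children a compile-time behavior (reserving storage) and a run-time behavior (indexing).

```forth
\ Compile-time part: CREATE names the child and CELLS ALLOT reserves n cells.
\ Run-time part (after DOES>): the child receives its own data address
\ and turns an index into the address of that element. No bounds check.
: ARRAY ( n "name" -- )
   CREATE  CELLS ALLOT
   DOES>   ( i -- addr )  SWAP CELLS + ;

5 ARRAY SCORES     \ define a five-cell array
42 2 SCORES !      \ store 42 in element 2
2 SCORES @ .       \ prints 42
```

Object-based extensions build on the same idea, with the DOES> part dispatching to one of several behaviors rather than performing a single fixed action.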
In 1984, Charles Duff introduced an object-oriented system written in Forth called Neon [Duff, 1984 a, b]. When Duff discontinued supporting it in the late 1980s, it was taken over by Bob Lowenstein, of the University of Chicago's Yerkes Observatory, where it is available as a public-domain system under the name Yerk. More recently, Michael Hore re-implemented Neon using subroutine-threaded code; the result is available (also in the public domain) under the name MOPS. Both Yerk and MOPS are available as downloadable files on a number of Forth-oriented electronic bulletin boards listed at the end of this paper.