5 Present and Future Directions

The computer industry has always been characterized by rapid and profound changes. Since Forth was last standardized in the early 1980's, the speed, memory size and disk capacity of affordable personal computers have increased by factors of more than 100. 8-bit processors are now rare in PCs (although they are still widely used in embedded systems), and 32-bit processors are common. Operating systems, programming environments and user interfaces are far more sophisticated. Many recent Forth implementations, both commercial and public-domain, have attempted to address these issues.

5.1 Standardization Efforts

At the time of writing (November, 1992), Technical Committee X3J14 (of which authors Rather and Colburn are members) is nearing completion of an ANS Forth. [note] Among the 20 voting members in the TC are vendors (FORTH, Inc., Creative Solutions, Sun Microsystems and a division of NCR), some large user organizations (Ford Motor Co., NASA), and a number of smaller user organizations, consultants and experts. Starting in 1987, this group has addressed a number of problems with FORTH-79 and FORTH-83, as well as some contemporary issues. A few of the issues addressed in the draft standard follow, as they represent current areas of lively debate and technical activity among Forth users and implementors.

ANS Forth attempts to reconcile some of the divisions caused by the incompatibilities between FORTH-79 and FORTH-83. For example, it retains 0= to perform the FORTH-79 NOT function (logical negation of a flag), introduces INVERT to perform the FORTH-83 NOT (bitwise complement), and removes the word NOT. This enables application writers who depend on either version to leave their programs unchanged, and achieve compatibility by adding a simple shell in which NOT is defined as a synonym for the preferred behavior.
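A minimal sketch of such a shell follows. It assumes the application writer loads it ahead of the unchanged program source and enables whichever one of the two definitions matches the dialect the program was written for; the sample values are illustrative only, and the last result assumes two's-complement arithmetic.

```forth
\ Compatibility shell: enable ONE definition of NOT, matching the
\ dialect the application was written for.

: NOT ( x -- flag )  0= ;          \ FORTH-79 meaning: logical negation
\ : NOT ( x1 -- x2 )  INVERT ;     \ FORTH-83 meaning: bitwise complement

\ The two meanings agree on well-formed flags but differ on other values:
0 0= .         \ prints -1 (true)
5 0= .         \ prints 0  (false)
5 INVERT .     \ prints -6 on a two's-complement system
```

Because ANS Forth itself no longer defines NOT, either definition can be added without conflicting with the standard.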

The proposed standard also removes virtually all restrictions on implementation options, provides for independence from CPU word size, and offers a number of optional extension word-sets for functions such as host OS file compatibility, dynamic memory allocation and floating point arithmetic. Some significant issues addressed by ANS Forth follow.

5.1.1 Cell size

FORTH-79 and FORTH-83 mandated a 16-bit architecture, including stack width, addresses, flags and numbers. ANS Forth specifies sizes in terms of a "cell," whose width is implementation-defined but must be at least 16 bits. Words have been added to increment addresses transportably by a cell, a character, or an integral number of cells or characters.
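As an illustration, the following sketch uses the ANS Forth words CELLS, CELL+, CHARS and CHAR+ to build and index small tables without assuming any particular cell or character width. The table names and contents are hypothetical.

```forth
\ A three-entry table of cells.
CREATE limits  10 , 20 , 30 ,

\ CELLS scales an index to address units, whatever the cell width is.
: limit@ ( index -- x )  CELLS limits + @ ;

limits CELL+ @ .       \ prints 20: CELL+ steps one cell past the first entry
2 limit@ .             \ prints 30

\ The character-sized counterparts, applied to a small character table.
CREATE flags  1 C, 0 C, 1 C,
flags CHAR+ C@ .       \ prints 0: CHAR+ steps one character
flags 2 CHARS + C@ .   \ prints 1
```

The same source runs unchanged on 16-bit and 32-bit systems, which a hard-coded "2 +" would not.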

5.1.2 Arithmetic

Amid great controversy, FORTH-83 mandated floored division. Not only was this incompatible with prior usage (which didn't specify the algorithm for handling signed division), it is also at variance with hardware multiply/divide instructions on most processors. But many people felt strongly that floored division is mathematically more appropriate, and that it was important to specify. Recognizing that there were many implementations on both sides of this issue, the TC opted to allow either floored or truncated division. The implementation must specify which default it uses, and must provide primitives supporting both methods.
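The two primitives in question are FM/MOD (floored) and SM/REM (symmetric, that is, truncated); each divides a double-cell dividend by a single-cell divisor. The single-cell wrappers below are hypothetical names, introduced only to make the difference visible.

```forth
\ FM/MOD ( d n -- rem quot ) floors the quotient;
\ SM/REM ( d n -- rem quot ) truncates it toward zero.
\ S>D widens a single-cell number to the double-cell dividend both expect.

: floored/mod   ( n1 n2 -- rem quot )  >R S>D R> FM/MOD ;
: truncated/mod ( n1 n2 -- rem quot )  >R S>D R> SM/REM ;

-7 2 floored/mod   . .    \ prints -4 1   (quotient rounded toward minus infinity)
-7 2 truncated/mod . .    \ prints -3 -1  (quotient rounded toward zero)
 7 2 floored/mod   . .    \ prints 3 1    (the methods agree when the signs match)
```

A program that depends on one behavior can define its own division operators in this way rather than relying on the implementation's default for / and MOD.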

5.1.3 Control structures

One of the unique characteristics of Forth is the degree to which its own internal tools are accessible to the application programmer. For example, there is one lexical analyzer used by the compiler, assembler, and text interpreter; it is also available for command and text parsing in applications. Similarly, the tools that implement control structures such as loops and conditionals are available for making custom structure words. In 1986 Wil Baden demonstrated [Baden, 1986] that the standard Forth structure words plus a few extensions made from these underlying tools are adequate to make any structure, including solutions to problems posed in D. E. Knuth's paper "Structured Programming with go to Statements" [Knuth, 1974].

Structure                    Description
---------------------------  -----------------------------------------------
DO … LOOP                    Finite loop incrementing by 1
DO … <n> +LOOP               Finite loop incrementing by <n>
BEGIN … <f> UNTIL            Indefinite loop terminating when <f> is 'true'
BEGIN … <f> WHILE … REPEAT   Indefinite loop terminating when <f> is 'false'
BEGIN … AGAIN                Infinite loop
<f> IF … ELSE … THEN         Two-branch conditional; performs words following IF if <f> is 'true' and words following ELSE if it is 'false'. THEN marks the point at which the paths merge.
<f> IF … THEN                Like the two-branch conditional, but with only a 'true' clause.

Table 4. Standard control structures in Forth. ANS Forth allows programmers to form new structures by mixing the component words or using them to define new structure words.
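For readers unfamiliar with the postfix form of these structures, a few of them are shown in use below. The word names are hypothetical and the definitions are only a sketch; note that the flag or loop limits are computed before the structure word that consumes them.

```forth
: sum-to ( n -- sum )        \ DO … LOOP: add the integers 0 .. n-1 (n > 0)
   0 SWAP 0 DO  I +  LOOP ;

: evens ( n -- )             \ DO … +LOOP: print 0, 2, 4 … below n (n > 0)
   0 DO  I .  2 +LOOP ;

: countdown ( n -- )         \ BEGIN … WHILE … REPEAT: print n, n-1 … 1
   BEGIN  DUP 0 >  WHILE  DUP .  1-  REPEAT  DROP ;

: describe ( n -- )          \ IF … ELSE … THEN
   0< IF  ." negative"  ELSE  ." non-negative"  THEN ;

5 sum-to .       \ prints 10
10 evens         \ prints 0 2 4 6 8
3 countdown      \ prints 3 2 1
-4 describe      \ prints negative
```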

FORTH-79 and FORTH-83 provided syntactic specifications for the common structures listed in Table 4, as well as an "experimental" collection of structure primitives. The latter were not widely adopted, however, and few implementations perform the kind of syntax checking the standards anticipated. F83 offers a limited form of syntax checking, in that it requires the stack (which is used at compile time for compiling structures) to have the same size before and after compiling a definition, the theory being that a stack imbalance would indicate an incomplete structure. Unfortunately, this technique prevents the very common practice of leaving a value on the compile-time stack which is to be compiled as a literal inside a definition.

Common practice often took advantage of knowledge about how the structure words worked at compile time to manipulate them in creative ways. The ANS Forth Technical Committee sanctioned this by providing specifications of both the compile-time and run-time behaviors of the structure words, so that they may be combined in arbitrary order. A set of structure primitives is provided in a "programming tools" wordset, and the word POSTPONE is provided to enable programmers to write new structure words that reference existing compiler directives in order to provide a portion of the desired new behavior.
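Both techniques can be sketched briefly. In the first definition, UNLESS is a hypothetical new structure word built with POSTPONE; in the second, the standard component words are combined in an order that the earlier standards' syntactic specifications did not cover. All word names other than the standard structure words are illustrative assumptions.

```forth
\ UNLESS behaves like IF but takes the branch when the flag is false.
\ POSTPONE makes UNLESS compile 0= and then perform IF's compile-time
\ action inside whatever definition is being compiled.
: UNLESS ( C: -- orig ) ( run-time: flag -- )
   POSTPONE 0=  POSTPONE IF ; IMMEDIATE

: stock-report ( n -- )
   UNLESS  ." out of stock "  ELSE  ." in stock "  THEN ;

0 stock-report     \ prints: out of stock
3 stock-report     \ prints: in stock

\ A loop with two exits.  REPEAT resolves the second WHILE; the first
\ WHILE is resolved by the ELSE that follows.
: scan ( n -- )    \ count down from n, stopping early at a multiple of 7
   BEGIN   DUP 0 >    WHILE      \ exit 1: reached zero
           DUP 7 MOD  WHILE      \ exit 2: found a multiple of 7
           1-
   REPEAT  ." found " DUP .      \ reached only through exit 2
   ELSE    ." none "             \ reached only through exit 1
   THEN  DROP ;

10 scan    \ prints: found 7
5 scan     \ prints: none
```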

5.2 Implementation Strategies

The original Forth systems developed by Moore in the 1970's compiled source from disk into an executable form in memory. This avoided the separate compile-link-load sequences characteristic of most compiled languages, and led to a very interactive programming style in which the programmer could use the resident Forth editor to modify source and re-compile it, having it available for testing in seconds. The internal structure of a definition was as shown in Fig. 2, with all fields contiguous in memory. The FIG model and its derivatives modified the details of this structure somewhat, but preserved its essential character.

[Figure: Components of a Forth definition]

Figure 2. Diagram showing the logical components of a Forth definition. In classical implementations, these fields are contiguous in memory. The data field will hold values for data objects, addresses or tokens for procedures, and the actual code for CODE definitions.

Forth systems implemented according to this model built a high-level definition by compiling pointers to previously defined words into its parameter field; the address interpreter that executed such definitions proceeded through these routines, executing the referenced definitions in turn by performing indirect jumps through the register used to keep its place. This is generally referred to as indirect-threaded code. [note]
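The idea can be modeled in Forth itself. The sketch below is a toy, not an actual inner interpreter (which is normally a few machine instructions): execution tokens stand in for the compiled pointers, and the names square+1 and run-thread are hypothetical.

```forth
\ A "parameter field" holding three pointers to previously defined words.
CREATE square+1  ' DUP ,  ' * ,  ' 1+ ,

\ A toy address interpreter: step through the pointers one cell at a
\ time, executing the word each one refers to.
: run-thread ( i*x a-addr n -- j*x )
   CELLS OVER + SWAP ?DO  I @ EXECUTE  1 CELLS +LOOP ;

5 square+1 3 run-thread .    \ prints 26, as if "5 DUP * 1+ ." had been typed
```

Here the loop index plays the part of the register the address interpreter uses to keep its place.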

The need to optimize for different conditions has, however, led to a number of variants in this basic implementation strategy. Some of the most interesting are:

  1. Direct-threaded code. In this model, the code field contains machine code instead of a pointer to machine code. This is somewhat faster, but typically costs extra bytes for some classes of words. It is most prevalent on 32-bit systems.
  2. Subroutine-threaded code. In this model, the compiler places a jump-to-subroutine instruction with the destination address in-line. This technique costs extra bytes for each compiled reference on a 16-bit system. It is often slower than direct-threaded code, but it is an enabling technique to allow the progression to native code generation.
  3. Native code generation. Going one step beyond subroutine-threaded code, this technique generates in-line machine instructions for simple primitives such as + and jumps to other high-level routines. The result can run much faster, at some cost in size and compiler complexity. Native code can be more difficult to debug than threaded code. This technique is characteristic of optimized systems for the Forth chips such as the RTX, and on 32-bit systems where code compactness is often less critical than speed.
  4. Optimizing compilers. A variant of native code generation, these were invented for the Forth processors (discussed in Section 4) that can execute several Forth primitives in a single cycle. They looked for the patterns that could be handled in this way and automatically generated the appropriate instruction. The range of optimization was governed by the capabilities of the processor; for example, the polyFORTH compiler for the Novix and RTX processors had a four-element peephole window.
  5. Token threading. This technique compiles references to other words using a token, such as an index into a table, which is more compact than an absolute address. Token threading was used in a version of Forth for a Panasonic hand-held computer developed in the early 1980's, for example, and is a key element in MacForth.
  6. Segmented architectures. The 80x86 family supports segmented address spaces. Some Forths take advantage of this to enable a 16-bit system to support programs larger than 64K. Similarly, implementations for Harvard-architecture processors such as the 8051 and TI TMS320 series manage separate code and data spaces.
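Of these variants, token threading is the easiest to model in high-level Forth. The following is only a sketch under assumed names (token-table, token>xt, run-tokens); it shows a definition stored as one-character table indices rather than cell-wide addresses, which is where the space saving comes from.

```forth
\ Token table: token 0 = DUP, token 1 = *, token 2 = 1+
CREATE token-table  ' DUP ,  ' * ,  ' 1+ ,
: token>xt ( token -- xt )  CELLS token-table + @ ;

\ "DUP * 1+" compiled as three one-character tokens instead of three cells.
CREATE square+1-tokens  0 C,  1 C,  2 C,

\ A toy token interpreter: fetch each token, look it up, execute it.
: run-tokens ( i*x c-addr n -- j*x )
   CHARS OVER + SWAP ?DO  I C@ token>xt EXECUTE  1 CHARS +LOOP ;

5 square+1-tokens 3 run-tokens .    \ prints 26
```

The extra table lookup on every reference is the price paid for the more compact code.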

Although the early standards assumed the classical structure, ANS Forth makes a special effort to avoid assumptions about implementation techniques, resulting in prohibitions against assuming a relationship between the head and data space of a definition or accessing the body of a data structure other than by pre-defined operators. This has generated some controversy among programmers who prefer the freedom to make such assumptions over the optimizations that are possible with alternative implementation strategies.
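A small sketch of what the restriction means in practice, using a hypothetical data object named counter: the data field is reached through the word itself or through the pre-defined operator >BODY, never by arithmetic on the address of the definition's head.

```forth
CREATE counter  0 ,

\ Portable: executing the word returns its data field address.
1 counter +!
counter @ .                  \ prints 1

\ Portable: >BODY converts the execution token to the same address.
' counter >BODY @ .          \ prints 1
' counter >BODY counter = .  \ prints -1 (true)

\ Not portable under ANS Forth: adding a fixed offset to the execution
\ token, on the assumption that the data field follows the head.
```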

5.3 Object-oriented Extensions

Forth's support for custom data types with user-defined structure as well as compile-time and run-time behaviors has, over the years, led programmers to develop object-based systems such as Moore's approach to image processing described above (Section 2.2.2, item 2). Pountain [1987] described one approach to object-oriented programming in Forth, which has been tried by a number of implementors. Several Forth vendors have taken other approaches to implementing object-based systems, and this is currently one of the most fertile areas of exploration in Forth. [note]
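The underlying mechanism is the CREATE … DOES> pair. As a minimal sketch (POINT and corner are hypothetical names, and this is not any of the object systems cited here), a defining word gives every instance a structure when it is defined and a shared behavior when it is used:

```forth
\ The CREATE part runs when a point is defined and lays down two cells;
\ the DOES> part runs whenever that point is later executed.
: POINT ( x y "name" -- )
   CREATE  SWAP , ,
   DOES> ( -- x y )  DUP @  SWAP CELL+ @ ;

3 4 POINT corner
corner . .      \ prints 4 3  (y is on top of the stack)
```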

In 1984, Charles Duff introduced an object-oriented system written in Forth called Neon [Duff, 1984 a, b]. When Duff discontinued supporting it in the late 1980's, it was taken over by Bob Lowenstein, of the University of Chicago's Yerkes Observatory, where it is available as a public-domain system under the name Yerk. More recently, Michael Hore re-implemented Neon using subroutine-threaded code; the result is available (also in the public domain) under the name MOPS. Both Yerk and MOPS are available as down-loadable files on a number of Forth-oriented electronic bulletin boards listed at the end of this paper.
