NEW ANS FORTH WORDS PROPOSAL
Jabarai Zakiya jzakiya_at_mail.com
3rdeye Technology, LLC
--------------------------------
Proposed Words to CORE wordset: +GROW -GROW GET PUT
------------------------------------------------------
MOTIVATION FOR PROPOSAL
------------------------
I use Forth extensively to implement arithmetic algorithms.
I first test out my coding methodology in high level
ANS FORTH, but I want to make my code as fast as possible.
I currently use SwiftForth (V 2.00.3). Its native mode
compiles predefined Forth words as inline code for speed.
SwiftForth (and MPE's ProForth, et al) perform various
degress of code phrase optimization to speedup/compact
code. But invariably, for my purposes, I must resort to
hand assembly coding to produce the optimum results I
desire, especially for Pentium class (x86, x386) processors.
In the process of inline coding to produce Forthlike code
(versus straight assembly language coding) I've recognized
FUNDAMENTAL CONCEPTUAL OPERATIONS I frequently perform at
the assembly level that, if elevated to high level Forth
concepts, would provide a significant benefit to the
language. These concepts specifically pertain to
elementary stack manipulation operations.
To me, the defining hardware/software conceptualization
of Forth is based on its stack architecture. The current
stack manipulation words: DUP, DROP, OVER, ROT, etc,
conceptually work at the atomic level of stack manipulation.
What I mean by this is that there are sub-atomic operations
that must be performed on the stack to achieve these higher
order stack operations. What I propose with these additional
words is the elevation of these fundamental sub-atomic
stack operations to high level concepts for direct control.
Proposed Words Operations
-------------------------
At the sub-atomic level of stack manipulation you perform
two classes of operations: you "GROW" the stacks to either
lengthen or shorthen it, and you alter stack values by
putting new data in different stack elements.
The following words provide direct control of these
operations, for which all existing stack manipulation
words can be defined.
Nomenclature: TOS = STACK[0], NEXT = STACK[1], etc.
Word 1: +GROW Usage: n +GROW ( b a n -- b.._ a)
Operation: +GROW takes a value n from TOS and extends the
stack below the TOS n elements , with the TOS value
unaffected. The values for the extended stack are undefined.
This cannot be conceptually performed in ANS FORTH.
0 +GROW --> NOP
Word 2: -GROW Usage: n -GROW ( d c b...a - a c b a )
Operation: -GROW takes a value n from the TOS and shrinks the
stack below the TOS n elements, with the TOS value unaffected.
Conceptually, it can be performed as:
: -GROW ( y..x n - y x) 0 DO NIP LOOP ; 0 -GROW --> NOP
Word 3: GET Usage: n GET ( x c b a n - x c a b x )
Operation: GET takes a value n from the TOS and places STACK[n]
in TOS that existed before n was placed on the stack, without
affecting the stack length.
Conceptually, it can be performed as:
: GET ( y...w x n - y..w y) PICK NIP ; 0 GET --> NOP
Word 4: PUT Usage: n PUT ( x..b a n - a..b a )
Operation: PUT takes a value n from TOS and places in STACK[n]
the value in TOS that existed before n was placed on the stack,
without affecting the stack legnth.
Conceptually, it can be performed as:
: PUT ( y...x n - y...y) DUP >R ROLL DROP DUP R> -ROLL ;
0 PUT --> NOP
Compilation Behavior
--------------------
Each of these words will have a compile time behavior and a
run time behavior similar to that for PICK and ROLL.
When used inside a colon definition, the run time behavior
code will be compiled if no literal value is compiled
preceeding the word, such as in : FOO1 ROT GET ;
If a literal preceeds these words within a colon defined
word then the literal is compiled directly into the
instruction by the compler, e.g. in : FOO2 ROT 3 GET ;
Using SwiftForth's code : FOO PICK NIP ; compiles as:
: FOO PICK NIP ;
SEE FOO
46705F 0 [EBP] [EBX*4] EBX MOV 8B5C9D00
467063 4 # EBP ADD 83C504
467066 RET C3
Thus, FOO1 would compile to:
SEE FOO1
xxxxxx <ROT code> yyyyyyyy
xxxxxx 0 [EBP] [EBX*4] EBX MOV 8B5C9D00
xxxxxx 4 # EBP ADD 83C504
xxxxxx RET C3
and F002 should compile to:
SEE F002
xxxxxx <ROT code> yyyyyyyy
xxxxxx 8 [EBP] EBX MOV 8B5D08
xxxxxx RET C3
Presently in SwiftForth 3 PICK NIP (3 GET) compiles to:
: 3_GET 3 PICK NIP ; ok
SEE 3_GET
46709F 4 # EBP SUB 83ED04
4670A2 EBX 0 [EBP] MOV 895D00
4670A5 C [EBP] EBX MOV 8B5D0C
4670A8 4 # EBP ADD 83C504
4670AB RET C3 ok
ok
5 4 3 2 1 0 .S
5 4 3 2 1 0 <-Top ok
3_GET .S
5 4 3 2 1 3 <-Top ok
The value 'n' shall be treated as a positive interger. The
value '0' will produce no new stack affects and can be
coded as a no-operation (NOP) at run time or skipped by the
compiler at compile time.
Implementation Mechanics
------------------------
The Forth data stack is typically implemented as at least
one hardware register that holds the value for TOS (some
implementations hold NEXT and/or more stack elements in
registers, especially some FORTH engines) and the remainder
of the stack in RAM memory. Typically, the address for NEXT
is stored in a register, which is decremented and incremented
to GROW the stack in either direction.
In SwiftForth (MPE, Win32For, et al) the 32-bit register EBX
is used to hold the value of TOS for x86 class processors.
The EBP register is used to hold the address in RAM where
NEXT resides, and the ESP register holds the address for the
TOS of the RETURN stack (which typically completely resides
in RAM).
These proposed words are implemented as simple memory reads and
writes of the TOS and NEXT registers, EBX and EBP for x86 class
processors, and the equivalent registers for other cpus.
To implement *GROW would require only the incrementing or
decrementing of the EBP register for x86 cpus as follows:
5 +GROW --> 20 # EBP SUB
1 -GROW --> 4 # EBP ADD (NIP)
To implement GET and PUT would require only writing a stack
element to TOS, or wrting TOS to a stack element as follows:
4 GET --> 12 [EBP] EBX MOV
1 PUT --> EBX 0 [EBP] MOV
Currently, to acquire this level of direct stack control
requires you to work outside of ANS FORTH and utilize the
assembly language coding features of your Forth system.
Of course, this code will no longer be ANS FORTH compliant,
and will take effort to transport to different cpus. Again,
these proposed words create a standard mechanism to directly
control stack manipulation that would be conceptually
standard and portable at the ANS FORTH high level.
Examples of Usage
-----------------
: NIP ( a b c - a c )
1 -GROW ( a b 4 # EBP ADD )
;
: DROP ( a b c - a b )
1 GET ( a b b 0 [EBP] EBX MOV )
1 -GROW ( a b 4 # EBP ADD )
;
: OVER ( a b - a b a )
1 +GROW ( a _ b 4 # EBP SUB )
1 PUT ( a b b EBX 0 [EBP] MOV )
2 GET ( a b a 4 [EBP] EBX MOV )
;
: DUP ( a - a a )
1 +GROW ( _ a 4 # EBP SUB )
1 PUT ( a a EBX 0 [EBP] MOV )
;
: 2DUP ( a b - a b a b )
2 +GROW ( a _ _ b 8 # EBP SUB )
2 PUT ( a b _ b EBX 4 [EBP] MOV )
3 GET ( a b _ a 8 [EBP] EBX MOV )
1 PUT ( a b a a EBX 0 [EBP] MOV )
2 GET ( a b a b 4 [EBP] EBX MOV )
;
: 2OVER ( a b c d - a b c d a b )
2 +GROW ( a b c _ _ d 8 # EBP SUB )
2 PUT ( a b c d _ d EBX 4 [EBP] MOV )
5 GET ( a b c d _ a 16 [EBP] EBX MOV )
1 PUT ( a b c d a a EBX 0 [EBP] MOV )
4 GET ( a b c d a b 12 [EBP] EBX MOV )
;
Of course, phrases can also be interpreted:
DROP DUP --> 1 GET
5 PICK --> 1 +GROW 1 PUT 5 GET
Naming Conventions
------------------
I chose the names +GROW/-GROW because they create
conceptual conjugate symmetry, in the same manner
that ROT/-ROT and ROLL/-ROLL do. Each name
represents clear pictures of what they do to the
stack, and are only 5 characters long. They also
do not conflict with any system words in SwiftForth,
and hopefully most other systems
In the same manner, PUT succintly represents its
stack operation, and is only 3 characters long.
You 'put' the TOS in element 'n' of the stack.
It also doesn't conflict with any SwifForth words,
and hopefully the same for other systems. An
alternative name is: >STACK
In the same manner, GET nicely represents its
stack operation, and is only 3 characters long.
The TOS 'gets' element 'n' from the stack.
It, however, conflicts with an existing SwiftForth
defined word. An alternative name I like is: >TOS
Extensions of Wordsets
----------------------
These concepts can be extended to create similar
words to control the manipulation of the RETURN
stack and also to provide direct control of a cpus
register set.
Thus, +RGROW, -RGROW, RGET, RPUT conceptually perform
the same operations on the RETURN stack, relative to
TOS.
Also the words EAGET, EAPUT, ECGET, ECPUT, EDGET, EDPUT
(for 32-bit x86 regs) provide the same high level
conceptualization for accessing stack data by the
x86 general purpose registers EAX, ECX, and EDX.
These words operate conceptually the same as GET and PUT.
2 EAGET --> 4 [EBP] EAX MOV
1 EAGET --> 0 [EBP] EAX MOV
0 EAGET --> EBX EAX MOV
2 EAPUT --> EAX 4 [EBP] MOV
1 EAPUT --> EAX 0 [EBP] MOV
0 EAPUT --> EAX EBX MOV
etc
Now, the previous example of 2DUP can be written as:
: 2DUP ( a b - a b a b )
2 +GROW ( a _ _ b 8 # EBP SUB )
2 PUT ( a b _ b EBX 4 [EBP] MOV )
3 EAGET ( a b _ b 8 [EBP] EAX MOV )
2 EAPUT ( a b a b EAX 0 [EBP] MOV )
;
which is the SwiftForth definition.
This is not only shorter than the ANS version of
2DUP, but is 'Pentium optimized' code also.
Or also, inline code could look like:
ICODE 2DUP ( a b - a b a b )
2 +GROW 2 PUT 3 EAGET 2 EAPUT RET
END-CODE
Also, now you can write the following:
: R@ ( b | R:a || b a | R:a)
1 +GROW ( - b 4 # EBP SUB )
1 PUT ( b b EBX 0 [EBP] MOV )
1 RPUT ( b a 0 [ESP] EBX MOV )
;
Which is exactly the way SwiftForth compiles R@:
SEE R@
40356F 4 # EBP SUB 83ED04
403572 EBX 0 [EBP] MOV 895D00
403575 0 [ESP] EBX MOV 8B1C24
403578 RET C3 ok
Thus, now inline coding can even be done at a higher
conceptual level, without having to write out assembly
mnemonics. This makes transporting inline code between
cpus much easier, because you mostly only have to
replace the register set mnemonics, but the code order
can remain the same to provide at least functional
equivalence.
These words also now allow very smart optimizing
compilers to be written because an easier means is
provided to conceptualize code fragments at the
sub-atomic level (fundamental stack operations) versus
at the atomic or molecular code level (ANS words and
phrases).
Towards A New FORTH
------------------
When these conceptual capabilities are fully extended
to smart optimizing compiler development, a new
language paradigm based on the fundamental stack
operations is created. This will give you the ability
to "write" programs by merely compiling BEFORE and AFTER
stack pictures. A compiler could easily process
( a b c - c a b b a ) alone.
A good compiler (writer) should be able to process a
word defined as : FOO [[ (( a b c )) - (( c a b b a )) ]] ;
and produce the correct result AND optmized code. All the
compiler (writer) would have to do is assess how the
stack GROWS and where data elements go. True also for words
defined as : FOO [[ (( x y z )) - (( {x+y} {x-z} {2*z} )) ]] ;
What has been created is a true codeless (to the programmer)
VISUAL FORTH. Under this paradigm "programing" becomes the
"art" of creating a series of stack pictures as the program.
Of course, you can still code with words too.
Creating a VISUAL FORTH paradigm of problem solving would
(should) significantly increase people's productivity, and
attract more users to Forth. With this paradigm, you are
mainly limited by your imagination to what you can create!!
For these reasons, and more. I urge acceptance of these
words and concepts into ANS FORTH.
______________________________________________
FREE Personalized Email at Mail.com
Sign up at http://www.mail.com/?sr=signup
Received on Thu Nov 23 2000 - 23:16:56 PST
Subscribe to our e-mail list service. It's free for all SwiftForth and SwiftX users!
This archive was generated 07-Feb-2012. Archive updated nightly.