Forpost Language
(Peter Sovietov. Translated and augmented by Oleg Kozlovsky)
Introduction
Language
Description
Specifics
of implementation
Dictionary
reference
1. Introduction
Forpost is an embeddable, interpreted stack-based language which has
simple, compact and effective implementation in ANSI C. A programmer
using Forpost is fully capable to grasp all intricacies of quite small
language structure and modify it if need arises. Learning Forpost takes
negligible time because this language is based on very few common and
clear principals. On the other hand, despite simplicity and minimal
internal structure this language uses heterogeneous arrays and
functions as the first-class objects, which gives it a capability to
modify and create other functions on the fly (self-modifying and
self-generating code). Forpost interpreter is safe in a sense that its
crash does't bring the system down. There is a stand-alone version of
the interpreter, so that Forpost can be used as general purpose
language.
Example of Forpost program
"tools.fp" load
# load library for print and cr words
"hello world!"
print cr # push text address on the stack, print text and line feed
Forpost interpreter may be
used interactively or from a command line which makes it an excellent
debugger and interpreter in one.
There are other examples in Forpost distribution such as simple email
application using sockets and smtp and pop3 protocols.
If you come across any Forpost word in this text with
the meaning that
may not be easily understood out of context, please see Dictionary
reference section.
2. Language description
Forpost is a stack-based language, just like Forth and Postscript. In
stack languages data is consistantly and explicitely pushed on stack
and poped from stack by executable words (word in stack languages is
analogous to functions in other ones). Then results of the execution
pushed back on stack. Mathematically Forpost's syntax corresponds to
reversed polish notation (RPN).
Syntax
Forpost language has a very simple and clear syntax. When Forpost
interpreter executes a program, it splits it into literals of different
types and executable words. Literal data is placed on corresponding
stacks
(i.e. integers go to integer stack, floats - to float stack, arrays -
to stack of addresses) and executable words are executed immediately.
Numeric literals appear in Forpost in exactly same way as they would
appear in C.
Syntactic units in Forpost are separated by blanks - spaces, new lines,
tabs.
Symbols { } " #
also used as separators.
- {...}
used to define arrays
- "..."
used to define string literals
- #... used for comments.
Everything that comes after # will be
ignored to the end of the line
- "..." {...} ; defines named
(user-defined) word
Semicolon at the end is the built-in word and is mandatory when
defining new words.
Examples
1 0xFF 3.14 #
numeric literals
"hello, world"
# string
{ 1 2 3 4 5 } #
array
print cr #
executable words
"hello" {"hello
world!" print cr}; # user defined word definition
As you can see, a user-defined word is nothing but a string-literal
appended with
the array of other words and/or literals.
Datatypes and their representation
When Forpost interpreter encounters a literal in the program's body it
pushes it on a stack ot the corresponding type, that is:
- stack - integer' stack
- fstack - floats' stack
- astack - arrays' (addresses) stack
True and False are equal to -1 and 0 respectively.
Forpost has 2 additional stacks:
- cstack - compiler's stack which can hold data integer, float or
array data
- rstack - stack of return addresses of executable words which is
not available for direct manipulation by the programmer
Arrays
Arrays are the only composite datatype in Forpost. Arrays are used to
represent:
- Text strings
- Executable words
- User defined datatypes
Duplication of an array pushes another copy of its address on astack.
Arrays themselves are permanently placed in Forpost's
memory(dictionary). Arrays start at offset 0.
Fetching an element from array causes it to be executed.
We can give an array a name by placing ; at the end
of it. Exactly the same way we define new words.
Examples(we advise you to type
examples below at
Forpost's interpreter prompt)
1 2 3 4} @ + +
+ # sum of elements
{1 2 {3 4}} 2
:@ @ + # sum of elements of internal array { 3 4 }
"hello" {
"hello, world" print cr } ; # define new word
"p" { {"post"
"script"} } ;
"f" { "for"
"th" } ;
f 0 :@ p 1 :a!
p @ print print
# :-)
Execution flow control
Besides traditional if
and ifelse
Forpost has word break
for unconditional exit out off several levels of nested
arrays, and also recurse to do
recursion. When
recurse
is the last element in an array the execution simply jumps to the
beginning of the array (tail recursion).
Example(factorial calculation)
"do" # n {a},
execute n times array a
{ dup {adup
>c @ c> 1 -} {drop adrop 2 break} ifelse recurse } ;
"i" # n - 1 2
.. n, push on integer stack n elements, from 1 to n
{ 1 swap 1 max
{dup 1 +} do drop } ;
"fact" # n -
1*2*..n, factorial
{ i dup 1 - {*}
do } ;
To try example above you can copy and paste it into a text file and
save it as "fact.fp" or "fact.txt". Then you can execute it either from
command line passing it as a parameter to the forpost interpreter, or
interactively by starting interpreter and typing "fact.txt" load
and then something like 5 fact.
3. Specifics of
implementation
Core Forpost consists of next set of .c files.
- forpost.c - virtual machine
- core.c - words of Forpost kernel
- forpost.h - header file with declarions for forpost.c and core.c
- dict.c - dictionary of Forpost primitives
The following 3 functions are necessary to call Forpost interpreter
from embedding application:
- forpost_open() - initialize interpreter
- interpret(s, size) - interpret string s of length size
- forpost_close() - deallocate Forpost resources, close interpreter
Virtual machine
All cells in Forpost memory have the same structure.
typedef struct cell {
op tag;
union {
int i;
FLOAT f;
struct cell *a;
} data;
}
cell;
Cell has 2 parts:
- tag - function pointer to the interpreter function associated
with the cell
- data - union that holds data of different datatypes
Forpost code is interpreted the following way:
for(;;) (ip->tag)();
The tag field can hold address of the primitive and also addresses of
following functions
- tX - push ip+1 to rstack, execute word at address found at
ip->data.a
- tR - return from the word: pop an element from rstack and execute
it
- tJ - unconditional jump to the address from ip->data.a
- tI - push integer literal from ip->data.i to stack
- tF - push float literal from ip->data.f to fstack
- tA - push array address from ip->data.a to astack
- tS - stop execution of compile code, return to the parsing of the
source code
Forpost primitives are expected to update ip(instruction pointer)
on their own.
Each array of size
n takes
n+2 elements in Forpost memory.
The first cell holds size of the array, then it's followed by the
array's data and the last element holds pointer to function tR.
4. Dictionary reference
How to read this reference? Symbols in black on the left from the
symbols in green
represent values present on the top of corresponding stacks, i.e.
integers, arrays or floats stacks, at the time when a Forpost word
was called. Symbols in green designate a Forpost word. Symbols in black
on right from the Forpost word, if they present, show resulting state
of top of the stack after the Forpost word in green was executed. When
you come across question mark on right side of the Forpost word it
means that result of the word's
execution is boolean value - true or false.
"foo" {code} ;
define Forpost word with name foo and action code
n array {a}
create array a out
of n elements filled with zeros
{a} length n
returns length n of
the array a
{a} @
executes a
{a} i :@
executes array a 's
element at index i
n {a} i :!
a[i] = n, stores value n in
array a at index
i
{x} {a} i :a!
a[i] = {x}, stores array x in
array a at index
i
{x} {a} i :x!
a[i] = {x}, stores array x in
array a at index
i as executable word
{a} i :a? ?
true, if an element of array is array itself
n dup n n
pushes another value of n
onto stack
n drop
removes value from the stack
n1 n2 swap n2 n1
exchange places of n1 and
n2
n1 n2 over n1 n2
n1
add copy of n1 on a top of n2
n1 n2 n3 rot n2 n3
n1
move("rotate") n1 to the top
n1 n2 2dup n1 n2
n1 n2
push copies of n1 and n2 on the stack
n1 n2 2drop
remove n1 and n2 from the stack
{a} adup
{a} {a}
add copy of a on the
stack
{a} adrop
removes a from the
pointer stack
{a1} {a2} aswap {a2} {a1}
exchange places of a1 and
a2
{a1} {a2} aover {a1} {a2} {a1}
add copy of a1 on a top of a2
{a1} {a2} {a3} arot {a2} {a3} {a1}
move("rotate") a1 to the top
n >c
place n on cstack
{a} a>c
place {a} on cstack
i :c
execute element of cstack under
index i
c>
remove top element of cstack and
execute it
cdrop
remove and discard top element of cstack
с= ?
true, if both top elements poped from cstack are equal
{a} i :>c
push element of array at the index i on
cstack
{a} i :c!
pop value from cstack and insert it into array a under
the index i
{a1} {a2} n copy
copy n elements of a1 to a2
{a1} {a2} a= ?
check if addresses a1 and
a2 are equal
n {a} if
execute a if n is not equal 0
n {a1} {a2} ifelse
execute a1 if n is not equal 0, otherwise execute
a2
n break
remove from rstack n top element,.
this is unconditional exit from nested arrays
n1 n2 + n3
n3 = n1 + n2
n1 n2 - n3
n3 = n1 - n2
n1 n2 * n3
n3 = n1 * n2
n1 n2 / n3
n3 = n1 / n2
n1 n2 mod n3
n3 = n1 % n2
n1 n2 u/mod r q
unsigned division where r is
remainder and q is quotient
n negate n
n = -n
n abs n
n = abs(n)
n1 n2 < ?
n1 n2 = ?
n1 n2
> ?
n1 n2
u< ?
unsigned comparison
n not ?
returns true if n = 0
n1 n2 min n1 | n2
returns the lesser value out of two
n1 n2 max n1 | n2
returns the greater value out of two
n1 n2 and n3
n3 = n1 & n2
n1 n2 or n3
n3 = n1 | n2
n1 n2 xor n3
n3 = n1 ^ n2
n invert n
n = ~n
n1 n2 lshift n3
n3 = n1 << n2
n1 n2 rshift n3
n3 = n1 >> n2
n emit
output to stdout symbol with code n
"text" n evaluate
interpret the text of length n
"file" load
read and interpret content of the file
{a} abort
stop interpretation, clear the stacks and execute a