FORMALIZATION OF BACKTRACKING IN FORTH

M.L. Gassanenko
St.Petersburg, Russia
mlg@iias.spb.su

This work was partially supported by the grant INTAS 93-845.

In our previous work we presented a formalization of control transfers caused by return stack manipulations. In this paper we show that this formalism is powerful enough to describe a change to a different execution scheme--backtracking. We also mention some further possible execution schemes.

1. INTRODUCTION

The main idea of the techniques whose formalization we present in this paper is the following approach: first describe a problem in concepts natural to it, then implement these concepts, along with an interpreter for them. Shortly after the advent of the first computers, it was realised that the concepts used directly by the processor, and the order in which the computer must perform actions, are inappropriate for reasoning about problems. Then high-level languages and problem-oriented languages appeared; it was realised that procedures must express concepts of the problem rather than just save memory; finally, such techniques as backtracking and the data-driven approach not only map problem concepts onto program units, but use a special execution mechanism for them. (OOP and Forth may also be considered steps in the same direction.)

A classical model Forth system is well suited for studies on the mechanisms of execution, since its interpreter data structures are open and may be accessed and changed by programs. This unique feature is very important; unfortunately, it is not in the ANSI standard, probably partly because it cannot be found in other languages.

During the last 5 years, theories of stack effect checking and type inference have been developed [Poi94 (review), Poi91, Poi93, StK92, StK93, Sto93], but these formalisms do not address manipulations with executable code addresses. In [Gas95] we presented a formalism that describes how manipulations with return stack information are reflected in control flow changes. In [Gas93R,Gas94] we presented BacFORTH--a Forth extension that supports backtracking. Backtracking is implemented there via return stack manipulations, as are a number of non-standard control structures.

The use of backtracking implies some additional discipline with respect to the return stack and to the stack of continuations: success is implemented as a call of the continuation (i.e. of the residue of the caller procedure's code), failure as a return from the continuation; the return stack contains backtracking information, while the continuation stack holds continuation addresses that can no longer be kept on the return stack. The discipline implies that data must not interfere with the backtracking information (it is possible to keep data on the continuation stack instead of the return stack), and that one should take into account what happens on backtracking (especially with respect to the stack balance). Each backtrackable procedure assumes that when it receives control, the return stack top contains the address to which control must backtrack on failure; each backtrackable procedure is allowed to leave any information on the return stack, but the top value must point to the code that must receive control on backtracking.

The rest of this section briefly describes the [Gas95] paper.

1.1 THE FORMALISM: MOTIVATION AND GOALS

The fundamental difference between manipulations with data and manipulations with return addresses is that the latter may result in control transfers.
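For instance (a minimal illustration, assuming a classical threaded-code system where the return stack holds threaded code addresses and no return stack caching is done):

: SKIP-REST ( -- ) R> DROP ; \ discard the caller's continuation address
: DEMO ." before " SKIP-REST ." never printed" ;
\ SKIP-REST removes DEMO's continuation from the return stack, so its own
\ EXIT returns straight to DEMO's caller; a pure data stack word could
\ never cause such a control transfer.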
Return address manipulations have proven to be a very powerful technique, and they are an important part of Forth: they enable us to define both classical and non-standard control structures and to implement new methods of program execution, such as backtracking, self-processing data, etc. Unfortunately, return address manipulations are today often considered bad style, since they are not present in other languages, not formalised, and not included in standards. We need a calculus that provides a formal description of existing (and, probably, yet unknown) techniques that use return address manipulations to organize code execution.

Examples (for a classical system):

: BRANCH R> @ >R ; \ ( IP: dest -- ) ( ip:=dest )
: UNNEST RDROP ;   \ ( R: a -- ) ( ip:=a ) a.k.a. EXIT

\ Backtracking:
: ENTER >R ; \ ( addr -- ) call the code at addr
: 1-10 ( -> n n ) ( <- n ) \ look over the numbers 1..10
   1 BEGIN DUP R@ ENTER 1+ DUP 11 = UNTIL DROP RDROP ;
: TEST \ print the numbers from 1 to 10
   1-10 . ;

More problems: threaded code may contain both code and data. Example:

\ LIT fetches a value from the threaded code
\ and leaves it on the stack
: LIT ( -- x ) ( IP: x -- ) R@ @ R> CELL+ >R ;

Such a word is compiled before any integer that appears in a colon definition. The definition

: X 5 . ;

compiles to:

[ code field ] [ LIT ] [ 5 ] [ . ] [ EXIT ]

So, we need a criterion to distinguish between code and data.

1.2 THE FORMALISM: BASIC NOTIONS

a) TCE - Threaded Code Element: either a compiled token of a word or a data element (the latter typically follow the compiled tokens of the words that process them). TCEs are of known (possibly different) sizes.
TCF - Threaded Code Fragment: a sequence of TCEs, typically processed by the threaded code interpreter. We cannot predict the length of a TCF.
TC - an abbreviation for "threaded code".

b) Position. The difference between code and data is the way of processing. Code is processed only by the code interpreter. Data get processed by other functions and not by the code interpreter. A TCE may be in one of the following 4 positions:
Active, if it is code: it gets processed only by the code interpreter.
Passive, if it is data: it gets processed by other methods.
Mixed, if it is both code and data: it gets processed in both ways (it may be self-modifying code or code considered as data).
Indifferent, if it is never processed at all (we may call it garbage).

c) Equivalence of TCFs (functional):
DUP SWAP DROP == NOOP
OVER SWAP == >R DUP R>
Two code fragments are equivalent if each of them, in any context in active position, may be substituted for the other.

2. DEFINITION OF HIGHER-ORDER CODES

Abbreviations: TC - threaded code; TCF - threaded code fragment.

Ordinary threaded code is called 1-order threaded code.

0) The construction of higher-order threaded code starts from 1-order threaded code. The following notions are used (they are restated in code just below):
the continuation (for a procedure, when it is being invoked) is the residue of TC in the TCF that invokes this procedure (at the moment of invocation the continuation address is in IP);
success (in general) is a control transfer to the continuation (this may be done in different ways: a jump, a call, etc.);
exit (in general) is also a control transfer to the continuation (exit and success are different actions: when exit is a jump, success is a call);
the 1-return stack (1-continuation stack) is the return stack;
1-exit (1-return) is the traditional EXIT, i.e. a control transfer to the address taken off the return (1-continuation) stack;
1-call is the traditional call of a TCF: the current IP value is placed onto the 1-return stack, and IP gets set to the start of the TCF;
the 1-trace stack does not exist.
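In code, a hedged side-by-side of the two base actions (assuming a classical system, ENTER as defined in section 1.1, and the continuation address on the return stack top):

\ 1-exit:     EXIT       \ jump to the continuation, consuming its address
\ 1-success:  R@ ENTER   \ call the continuation, keeping its address;
\                        \ control comes back when the continuation returns
\                        \ (this is how 1-10 above succeeds ten times)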
(i+1)-order threaded code is built from i-order threaded code in the following way:
1) i-success is an i-call of the continuation;
2) i-return (i-backtracking) is an i-exit from the continuation;
3) i-failure (of a procedure) is an action that forces i-backtracking from the TCF that invoked the procedure;
4) one more stack is created: the (i+1)-return ((i+1)-continuation) stack;
5) the (i+1)-trace stack is the i-continuation stack;
6) an i-reversible operation is an operation that performs reversal actions on i-backtracking;
7) (i+1)-call: place (i-reversibly) the continuation address onto the i-continuation stack; on i-backtracking, remove it from there and force i-failure of the procedure;
8) the i-PRO action complements an i-call to an (i+1)-call (i.e. it moves the continuation address from the i-continuation stack to the (i+1)-continuation stack; on backtracking it discards it from there and forces i-failure);
the i-CONT action is i-success (an i-call of the continuation) using the continuation address on the (i+1)-continuation stack;
9) (i+1)-exit is an i-success such that when control i-returns from the continuation, i-backtracking is raised (i.e. it is i-CONT i-EXIT).

For example, for 2-order TC:
1) 1-success is a call of the continuation;
2) 1-return (1-backtracking) is a return from the continuation;
3) 1-failure forces a return from the TCF that invoked the failing procedure;
4) the 2-continuation stack is the L-stack (with the access words >L , L> , etc.);
5) the 2-trace stack is the return stack;
6) a reversible operation performs reversal actions on backtracking;
7) 2-call: place the continuation address onto the L-stack; on backtracking, remove the continuation address and raise backtracking again;
8) the 1-PRO action is implemented as:
: ENTER >R ;
: PRO R> R> >L ENTER L> DROP ;
the 1-CONT action is implemented as:
: CONT L> >R R@ ENTER R> >L ;
9) 2-exit is CONT EXIT .
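To make the construction concrete, here is a hedged sketch in the spirit of BacFORTH [Gas94], assuming that PRO , CONT and the L-stack behave as constructed above ( 1-3 and TEST1 are hypothetical names):

: 1-3 ( -> n ) ( <- n ) \ backtrackable: succeeds with 1, 2 and 3
   PRO 1 BEGIN DUP CONT 1+ DUP 4 = UNTIL DROP ;
: TEST1 1-3 . ; \ prints 1 2 3

\ PRO moves the continuation of 1-3 (the code [ . ; ] of TEST1) to the
\ L-stack; each CONT calls it (success), and each return from it
\ (backtracking) resumes the loop; when the numbers are exhausted,
\ 1-3 fails, which exits TEST1.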
3. BASICS OF THE FORMALISM

3.1 THE TERMS AND THE NOTATION

Following the Forth tradition, the term "stack" will by default refer to the data stack, while the other stacks will be called by their full names.

We will use the following Latin letters to denote:
t..z --- threaded code fragments (TCFs);
f,g,h --- functions (occupying exactly one threaded code element);
M,N --- integer literals;
P,Q --- properties (features);
i..n --- natural numbers.

The meanings of the separate signs:
[ and ] denote the boundaries of a TCF;
' - an address in active position;
^ - an address in passive position;
, - a single threaded code element;
r - mention of the return stack;
l - mention of the L-stack (the stack of continuations [Gas94]);
: - having a feature;
! - assertion.
These signs are the elements from which all other symbols are assembled. (In the Forth language they all have different meanings.) As in Forth, the symbols are separated by spaces.

Equivalence will be denoted by the symbol == and will be understood as the possibility to substitute one threaded code fragment for the other in active position in any context. The symbol =/= will denote inequivalence.

The =>= sign will be used in cases where substitution is possible but some information gets lost: A =>= B means that (possibly in view of some other statements) X == A => X == B (the => symbol denotes implication).

The notation x:P may be read as "x has the property P" or "x that has the property P". When some properties are consequences of other properties, we will use the symbol :! to underline that some properties belong to the consequence rather than to the premise. For example, x:S2:!S1 is a concise notation for x:S2 => x:S1 (translation: if a TCF ignores (neither affects nor depends upon) the two top data stack elements, it is also a fragment that ignores one top stack element). When necessary, we will bracket the code fragment to which the property applies. For example, [ x:R y:R ]:!R (translation: the code fragment [ x y ], composed of two code fragments that ignore the return stack, also ignores the return stack).

Now we explain the other symbols:

; - denotes EXIT .
[r x '] y (the same as 'r[ y ] x ) - a call of the threaded code fragment x with the continuation y; that is, place the address of y onto the return stack and transfer control to x. The TCF y is in active position.
[r x ^] y (the same as ^r[ y ] x ) - a call of the threaded code fragment x with the continuation y; that is, place the address of y onto the return stack and transfer control to x. The TCF y is in inactive position.
'[ x ] - place the address of the threaded code fragment x, which is in active position, onto the data stack.
^[ x ] - place the address of the threaded code fragment x, which is in inactive position, onto the data stack.
,'[ x ] - a threaded code element that contains a reference to a TCF in active position.
,^[ x ] - a threaded code element that contains a reference to a TCF in inactive position.
,[ x ] - the same as ,'[ x ] .
^,^[ x ] - the address of a threaded code element that contains a reference to the TCF x in inactive position.
^,[ x ] (the same as ^,'[ x ] ) - the address of a threaded code element that contains a reference to the TCF x in active position.
pre u x (or pre[ u ] x ) - the TCF x is located in memory immediately after the TCF u.
,f - the compilation token of the function f (exactly one TC element).
Act: x - the TCF x is in active position.
!Act: x - the assertion that x is in active position.
%A - the universal quantifier.
%E - the existential quantifier.

We may change from "^" to "'", assuming that the address is in active position. We cannot change from "'" to "^", though: in doing so we would assume that not only the interpreter but all methods of processing do not distinguish between (interpreter-)equivalent fragments, which is not true.

The use of functions in code fragments without preceding commas "," underlines that the actual number of compiled tokens is not important.

Strictly speaking, the Act: property is a property not of a code fragment but of its context. It is convenient to assign it to an object in the context, though.

Special conventions. We will consider each formula as surrounded by the special symbols "<<" and ">>", if they do not appear in it. By default, these symbols may be substituted by any code fragment. Hence x == y is a concise notation for %A << %A >> : << x >> == << y >> (which conforms with axiom 7 given below).
Example: [r x '] == 'r[ >> ] x means << [r x '] >> == << 'r[ >> ] x

For repeating fragments we will use the following notation:
( repeat_specification ){ repeating_fragment }
For example, we will write
(5){ N } instead of N N N N N
(i=1..5){ N[i] } instead of N[1] N[2] N[3] N[4] N[5]
and so on.
Along with this notation, in some cases (namely, where it is shorter) we will use the more customary one, for example: N1 .. Nm is the same as (i=1..m){ Ni } .

Note. The BacFORTH system [Gas94] includes the control structures START...EMERGE and BACK...TRACKING:
x START y EMERGE z == x [r y ; '] z
x BACK y TRACKING z == x 'r[ y ; ] z

Comments [not necessarily strict] in English are given in parentheses; they facilitate understanding, but may not always serve instead of the formal text, and they are not a part of the theory.

3.2 BASIC AXIOMS

(1) '[ t >> ] >R ; == t
(placing a TCF address onto the return stack and transferring control to it via EXIT is equivalent to execution of this TCF)

(2) [r t '] u == 'r[ u ] t
(there are two notations for placing a TCF address onto the return stack)

(3) If x is defined as : x t ; then x == [r t ; ^]

(4) ^r[ w ] RUSH> ,f == 'r[ ,f w ] ;
( RUSH> is equivalent to an exit to a threaded code fragment that starts with the code compiled after RUSH> and continues like the TCF whose address is on the return stack top; see also section 2.5)

(5) 'r[ t ] R> == '[ t ]
(the word R> moves a TCF address from the return stack to the data stack)

(6) ( %A N : N x == N y ) => x == y
(if for all integers N, N x is equivalent to N y, then x and y are equivalent)

(7) x == y <=> ( %A u,v : u x v == u y v )
(x is equivalent to y if and only if they behave the same in any context (in active position); this is the definition of equivalence)

(8) ^[ Act: x ] == '[ x ]
(the tick "'" means an active position)

(9) pre[ u ] x =>= x
(if a TCF is in active position, the interpreter does not process the addresses before it (the interpreter always "goes forwards"); we may replace pre[ u ] x by x, but we lose some information about the exact contents of memory)

(10') ^[ ,'[ u ] v ] REF@ == '[ u ]
(10^) ^[ ,^[ u ] v ] REF@ == ^[ u ]
(the word REF@ receives the address of a reference to a TCF and leaves the address of that TCF)

(11) ^[ ,[ u ] v ] REF+ == ^[ pre,[ u ] v ]
(the word REF+ receives the address of a TCF that starts with a reference (to another TCF) and leaves the address of the code that follows the reference)

(12) %A t : [ ; t ] == ;
(an EXIT makes the interpreter ignore the code that follows the EXIT)

(13) Application of data processing operations to addresses in active position is a contradiction. Examples:
'[ u ] REF@ --- contradiction
'[ u ] REF+ --- contradiction

(14) %A t == u ; : %E v,w : t == u ; == v 'r[ w ] ;
(an EXIT assumes an address in active position on the return stack top; even if the program has a bug, we may think of the return stack top as the address of code that hangs the system up; we will not examine possible equivalence of different hangups)
Note. EXIT assumes a TCF address on the return stack top; its position is either active or mixed.

3.3 DEFINITIONS I

Here we repeat the basic definitions from [Gas95]. It turns out that in static analysis of return address manipulations, one of the most important properties of a code fragment is to ignore (neither affect nor depend on) some values on the stack(s).

Def.1. S1( t ) is a TCF (threaded code fragment) such that %A N : ( N S1( t ) == t N )
( S1( t ) behaves like t , but ignores the value on the data stack top)
Examples:
S1( DROP ) == NIP (indeed, %A N : N NIP == DROP N )
S1( SWAP ) == >R SWAP R>

Def.2. S[m]( t ) is a TCF such that %A N1,...,Nm : ( N1 ... Nm S[m]( t ) == t N1 ... Nm )
We will omit the brackets when a concrete number substitutes for m, i.e. we will write S3 instead of S[3].
( S[m]( t ) behaves like t , but ignores the m top stack elements)
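These definitions can be checked mechanically on concrete instances; e.g., for the example S1( SWAP ) == >R SWAP R> above, in plain standard Forth ( T1 is a hypothetical name):

: T1 ( a b n -- b a n ) >R SWAP R> ; \ a candidate for S1( SWAP )
1 2 99 T1 .S \ prints 2 1 99 : indeed, %A N : N T1 == SWAP N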
Def.3. We will write t:S[m] if %E y : t == S[m]( y ) .
Note. We will use the signature t:S[m] in the same way as adverbs and adjectives are used in natural languages, e.g.: %A N,M : N t:S1 DROP == M t DROP
( t:S[m] -- a TCF t that ignores the m top stack elements)

Def.4. We will write t:S if t == S1( t ) .
( t:S ignores the stack depth and contents)

Def.5. S[-1]( t ) is a TCF such that S1( S[-1]( t ) ) == t
(the "inverse operation" to S1( ) )
Note. S[-1]( t ) does not exist for every t .

Def.6. S[-m]( t ) is a TCF such that S[m]( S[-m]( t ) ) == t
If t:S[m] , i.e. %E y : t == S[m]( y ) , then y == S[-m]( t ) .

In a similar manner we introduce the notions of R1, R[m], R[-m], t:R[m], t:R.

Def.7. R1( t ) is a TCF such that %A N : ( N >R R1( t ) == t N >R )
( R1( t ) behaves like t , but ignores the value on the return stack top)

Def.8. R[m]( t ) is a TCF such that %A N[1], ... N[m] : (i=1..m){ N[i] >R } R[m]( t ) == t (i=1..m){ N[i] >R }

Def.9. We will write t:R[m] if %E y : t == R[m]( y ) .

Def.10. We will write t:R if t == R1( t ) .
(t ignores the return stack; this property is very important for us)

Def.11. R[-1]( t ) is a TCF such that R1( R[-1]( t ) ) == t
(the "reversal operation" to R1( ) )

Def.12. R[-m]( t ) is a TCF such that R[m]( R[-m]( t ) ) == t

Examples.
S1( DROP ) == >R DROP R> == NIP
S[-1]( NIP ) == DROP
S1( EXIT ) == DROP EXIT
[ SWAP ]:R
[ R> DROP ]:S
[ ABORT ]:R:S

4. FORMALIZATION OF 2-ORDER (BACKTRACKABLE) THREADED CODE

In this section we describe 2nd-order threaded code. We introduce a new notion of equivalence (b-equivalence) which will be used to describe 2nd-order TC (or 2TC). The use of 2TC implies some discipline, and b-equivalence assumes the same discipline: a 2-order procedure may fail, which results in exiting the code fragment in which it is used, and therefore the return stack top must contain a TCF address; a 2nd-order procedure makes no assumptions about what is below the return stack top, but it may nevertheless assume that other procedures also follow this discipline.

4.1 PROPERTIES OF TC FRAGMENTS THAT PERFORM EXIT

Def.1. t:US1 if %A N : N t == N t DROP N
(the TCF t:US1 does not change (although it can use) the top stack value)
Examples: [ DUP ]:US1 [ SWAP 1+ SWAP ]:US1
Note. Evidently, t:S1:!US1 .

Def.2. t:US[m] if %A N1 .. Nm : N1 .. Nm t == N1 .. Nm t (m){ DROP } N1 .. Nm
(the TCF t:US[m] does not change (although it can use) the m top stack values)
Note. Evidently, t:S[m]:!US[m] .

Def.3. t:Ign if %A u : t u == t
( t:Ign ignores its continuation, i.e. it does not pass control to the code that follows it)
Note. t:Ign <=> t == t ;
Indeed, %A u : t u == t => t ; == t , and vice versa: from %A u : ; u == ; and t == t ; we conclude that %A u : t u == t .

Def.4. t:REI if t:Ign and [r t ']:R
It should be noted that the definition of REI assumes that the address on the return stack top is in active position (it is processed only by the code interpreter).
Examples: [ DROP EXIT ]:REI [ IF 1+ EXIT THEN 1- EXIT ]:REI
(For contrast, [ R> DROP EXIT ] is of type Ign but not of type REI: its call does not leave the return stack unchanged.)
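The second example, turned into executable standard Forth (a hedged illustration; ADJUST is a hypothetical name):

: ADJUST ( n flag -- n' ) IF 1+ EXIT THEN 1- ;
\ Every path through the body [ IF 1+ EXIT THEN 1- EXIT ] ends in an exit,
\ so the fragment ignores its continuation (Ign); and since it never touches
\ the return addresses, its call leaves the return stack unchanged: REI.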
Def.5. In most cases below we will assume that the TCF shown in a formula receives control when the return stack top contains the address of a TC fragment, which we will denote by r>> . The equations that need this assumption will be marked by the symbol (r). So, the formula
(r) u == v
is an abbreviation for
%A << %A >> %A r>> : << 'r[ r>> ] u >> == << 'r[ r>> ] v >>
Note. From the purely formal point of view, the (r) symbol introduces a restriction which formally enables us to use the r>> symbol. From the point of view of the underlying practice, the (r) symbol is not an additional restriction, since code that requires a TCF address on the return stack will crash if the address is not there.

Def.6. Let us introduce a similar notation for the assumption that >>:REI :
(e) u == v
is an abbreviation for
>>:REI => u >> == v >>
which, in its turn, is an abbreviation for
%A << %A >>:REI : << u >> == << v >>

Statement.1. [r 'r[ ; ] t:REI '] == [r t ']
Proof. [r 'r[ ; ] t '] == [r [r t ']:R ; '] == [r t ']

Statement.2. (r) [r t:REI '] ; == t
(r) 'r[ ; ] t:REI == t
Proof. (r) t == 'r[ r>> ] t == [r t '] r>> == [r [r t '] ; '] r>> ==
== 'r[ r>> ] [r t '] ; == [r t '] ; == 'r[ ; ] t

Property. [ t:R ; ]:!REI
(when we add an EXIT to the end of an R-type TC fragment, we get a REI-type TC fragment)

Statement.3. %A u:REI %E v:R : u == v ;
%A v:R %E u:REI : v == [r u ']
Proof. In view of the previous theorem and the definition of REI, the code fragments defined as
v == [r u ']
u == v ; == [r u '] ;
meet the statement of the theorem.
So, there is a one-to-one mapping between R-type and REI-type TC fragments. (A TCF which performs a call of a REI fragment is an R-type fragment; a REI fragment may be formed from an R-type fragment by adding an EXIT to its end.)

Def.7. (b-equivalence) (b) u == v if %A << %A r>> %A >>:REI : << 'r[ r>> ] u >> == << 'r[ r>> ] v >>
In other words, (b) u == v means that (r) (e) u == v .
Property. u == v => (b) u == v
Property. (b) u:REI == v:REI => [r u '] == [r v ']
One may note that all the mathematics in the proofs of the S[m] properties remains valid if we change equivalence to b-equivalence. Therefore we can, e.g., use the comparison theorems, although we can state only b-equivalence in the case where the TC fragments are of RE-type.

Def.8. t:RE if %A v:R : [r t v ; ']:R
(if v:R , then the call of the TCF [ t v ; ] gives a TCF of type R)
(by writing %A v:R we require that t leave nothing but a TCF address on the return stack top)
Note. Restricting ourselves to RE-type TC fragments, we exclude from consideration TC fragments that use the data stack to keep information on which the return stack effect depends (it is not difficult to construct a definition that we probably would like to classify as RE, but which does not fit this definition--e.g. a definition which moves some return stack values to the data stack and succeeds, and on backtracking restores them and fails: this definition will crash if v:R modifies the saved values).
Property. If v == NOOP , then it follows from axiom 14 that 'r[ ; ] t:RE may be represented as u 'r[ w ] .

Statement.4. t:RE <=> %A u:REI : [ t u ]:REI
(an equivalent formulation of the definition of the RE type)
Proof.
1) Let us show that t:RE => %A u:REI : [ t u ]:REI .
Let %A v:R : [r t v ; ']:R . Let u:REI ; then v == [r u ']:R and
[r t v:R ; ']:R == [r t [r u '] ; '] == [r t 'r[ ; ] u '] ==
(since 'r[ >> ] t may be represented as u 'r[ w ] )
== [r t u ']
2) Let us show that %A u:REI : [ t u ]:REI implies %A v:R : [r t v ; ']:R .
At first, [ t u ]:REI => [r t u ']:R . Then,
%A v : [ v:R ; ]:!REI => %A v:R : [ t [ v ; ]:REI ]:!REI => %A v:R : [r t v ; ']:R

Let us pass on to the properties of the RE type.
Property. t:REI:!RE ( REI is a particular case of RE )
Property. t:RE:Ign:!REI (if a TCF t:RE ignores its continuation, then t:REI )
Property. (b) u:RE == 'r[ ; ] u
The proof follows from the fact that [ u >>:REI ]:!REI .
Property. (e) u:RE 'r[ ; ] == u
(this statement may rather be assigned to the properties of >>:REI )
Property. (b) x:RE == y => y:RE
Proof. Let y:/RE , i.e.
%E v:REI : [ y v ]:/REI . But then (r) y v == x v , which contradicts [ x:RE v:REI ]:REI .

Property. If (b) u == v and (b) x:RE == y:RE , then:
1) (b) u x == v x
2) (b) u x == u y
3) (b) u x == v y
Proof.
1) Let u x =/= v x , i.e. %E << %E r>> %E t:REI : (r) [ u x t ] =/= [ v x t ] . But then, substituting >> == x t into the definition of b-equivalence, we come to a contradiction with (b) u == v .
2) By analogy: if u x =/= u y , we come to a contradiction by substituting << == u into the definition of b-equivalence.
3) (b) u x == v x == v y

Property. t:R:!RE
Property. [ t:RE u:REI ]:!REI
Property. [r t:RE u:REI ']:!R
Property. [ u:RE v:RE ]:!RE

In a similar manner we define the properties S1E, S1EI, US1E, US1EI, S[m]E and so on:
t:XE if %A u:R:X : [r t u ']:X
t:XRE if %A u:R:X : [r t u ']:X:R
t:XEI if t ; == t and [r t ']:X
t:XREI if t ; == t and [r t ']:X:R
where X stands for some property.

4.2 DEFINITIONS II

Def.1. BDROP == >R 'r[ R> ; ]
DROPB == 'r[ DROP ; ]

Def.2. S1B( t:RE ) is a TCF such that %A N : (e) S1B( t ) 'r[ N ; ] == 'r[ N ; ] t
( S1B( t ) behaves like t , but does not affect the stack top when backtracking)

Def.3. S[m]B( t:RE ) is a TCF such that %A N1, ..., Nm : (e) S[m]B( t ) 'r[ N1 ... Nm ; ] == 'r[ N1 ... Nm ; ] t

Def.4. S[-m]B( t:RE ) is a TCF such that S[m]B( S[-m]B( t ) ) == t
(a "reversal operation" for S[m]B( ) )

Def.5. t:SB if S1B( t ) == t

Def.6. BS1( t:RE ) is a TCF such that %A N : (b) N BS1( t ) >>:US1EI == 'r[ N ; ] t N DROPB
( BS1( t ) does not affect the stack top, neither in "forwards" execution nor when backtracking)

Statement.4. (an equivalent formulation of the definition of BS1( t ) )
t:RE => %A N : %A x:US1:R : N [r BS1( t:RE ) x:US1:R ; '] == [r t N x DROP ; '] N
Proof. ( => ) Let us show that the statement is true in view of the definition of BS1( t ). Let us consider arbitrary N and x:US1:R . At first, we rewrite:
'r[ r>> ] N BS1( t ) x ; == [r N BS1( t ) x ; '] r>>
Then
'r[ N ; ] t:RE N DROPB x ; ==
== [r [r t N x 'r[ DROP ; ] ; '] N ; '] r>> ==
== [r [r t N x DROP ; '] N ; '] r>> ==
== [r t N x DROP ; '] N r>>
In view of the definition of BS1( t ), these two expressions are equivalent.
( <= ) Let us show that the definition of BS1( t ) may be derived from the statement of the theorem. Let us consider arbitrary N , r>> and >>:REI:US1EI . Then
N [r BS1( t ) [r >> ']:US1:R ; '] r>> ==
== [r N BS1( t ) >> '] r>> == 'r[ r>> ] N BS1( t ) >>
[r t N [r >> ']:US1:R DROP ; '] N r>> ==
== [r t N 'r[ DROP ; ] >> '] N r>> ==
== 'r[ N r>> ] t N DROPB ==
== 'r[ r>> ] 'r[ N ; ] t N DROPB
Since we assume that the statement of the theorem is true, these two chains lead us to the original definition of BS1( t ).

BS[m]( t ) is defined by analogy:
Def.7. BS[m]( t:RE ) is a TCF such that %A N1 ... Nm : %A x:US[m]:R :
N1 ... Nm [r BS[m]( t ) x:US[m]:R ; '] == [r t N1 ... Nm x (m){ DROP } ; '] N1 ... Nm

Def.8. t:BS if BS1( t ) == t

Def.9. BS[-1]( t:RE ) is a TCF such that BS1( BS[-1]( t ) ) == t
(a "reversal operation" for BS1( ) )
Note. BS[-1]( t ) may not exist for every t .

Def.10. BS[-m]( t ) is a TCF such that BS[m]( BS[-m]( t ) ) == t
If t:BS[m] , i.e. %E y : t == BS[m]( y ) , then y == BS[-m]( t ) .
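An operational reading of Def.1, as a hedged sketch (the BacFORTH-style words are assumed to behave as constructed in section 2; KEEP1 is a hypothetical name): BDROP removes the top data stack value on the way "forwards" and restores it when backtracking passes through it; DROPB leaves the value in place and arranges for it to be dropped on backtracking. For instance, in

: KEEP1 ( x -> ) ( <- x ) PRO BDROP CONT ;

\ on success x is gone from the data stack; when KEEP1 is backtracked
\ through, the fragment [ R> ; ] planted by BDROP puts x back before
\ the failure propagates further.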
4.3 PROPERTIES OF S[m]B

Note that S1B( t ) does not necessarily exist.

Statement.5. (b) S1B( S1B( t:RE ) ) == S2B( t )
Proof. %A N, M : (e) S1B( S1B( t ) ) 'r[ N ; ] 'r[ M ; ] ==
== 'r[ N ; ] S1B( t ) 'r[ M ; ] ==
== 'r[ N ; ] 'r[ M ; ] t

The proofs of the following statements are analogous to those of the similar statements for S[m].

Statement.6. (b) t == S1B( t ) => %A m : t:S[m]B
Statement.7. %A N : (b) DROPB S1B( t ) 'r[ N ; ] == t
Statement.8. t:S1B <=> %A N : (b) S1B( DROPB t 'r[ N ; ] ) == t
Consequence. %A N : (b) S[-1]B( t ) == DROPB t 'r[ N ; ]
Statement.9. S[-1]B( S1B( t ) ) == t
Statement.10. (comparison by S[-1]B ) Let t:RE:S1B , u:RE:S1B . Then S[-1]B( t ) == S[-1]B( u ) <=> t == u
Statement.11. (comparison by S1B ) Let t:RE , u:RE . Then S1B( t ) == S1B( u ) <=> t == u
Statement.12. If S1B( t ) and S1B( u ) exist, then S1B( t ) S1B( u ) == S1B( t u )
Consequence. [ t:S1B u:S1B ]:!S1B
Statement.13. u:RE:S1B , v:RE:S1B => (b) S[-1]B( u ) S[-1]B( v ) == S[-1]B( u v )
Statement.14. (b) S[-1]B( S[-1]B( t:S2 ) ) == S[-2]B( t )
Def.1. S[0]B( t ) == t
Statement.15. For any integers m and n : S[n]B( S[m]B( t ) ) == S[n+m]B( t )
If the system contains additional stacks, one can prove similar statements for them, too.

Statement.16. (b) S[-1]B( S[-1]( t:RE ) ) == S[-1]( S[-1]B( t ) )
Proof. (b) DROPB S[-1]( t ) 'r[ N ; ] ==
== DROPB M t DROP 'r[ N ; ] ==
== M DROPB t 'r[ N ; ] DROP ==
== S[-1]( S[-1]B( t ) )

Statement.17. (b) S1( S1B( t ) ) == S1B( S1( t ) )
Proof. In view of the previous theorem,
(b) S[-1]( S[-1]B( S1B( S1( t ) ) ) ) == t == S[-1]B( S[-1]( S1( S1B( t ) ) ) ) == S[-1]( S[-1]B( S1( S1B( t ) ) ) )
Now, in view of the theorems of comparison by S[-1] and S[-1]B ,
(b) S1( S1B( t ) ) == S1B( S1( t ) )

Statement.18. t:R:!SB , i.e. (b) S1B( t:R ) == t
Proof. (b) 'r[ N ; ] t == t 'r[ N ; ]

4.4 PROPERTIES OF BS[m]

Statement.1. (b) BS1( BS1( t ) ) == BS2( t )
Proof. %A N : %A M : (b) N M BS1( BS1( t ) ) x:US2:R ; ==
== N 'r[ M ; ] BS1( t ) 'r[ DROP ; ] M x ; ==
== 'r[ M ; ] 'r[ N ; ] t 'r[ 2DROP ; ] N M x ;

Statement.2. (b) t == BS1( t ) => %A m : t:BS[m]
Proof. %A m : (b) t == BS1( t ) == BS1( BS1( t ) ) == ... == BS[m]( t )

Statement.3. %A N : (b) t:RE == N DROPB BS1( t ) BDROP
Proof. %A N : (b) N DROPB BS1( t ) BDROP ==
== 'r[ r>> ] 'r[ DROP ; ] N BS1( t ) >R 'r[ R> ; ] >>:REI ==
== [r 'r[ DROP ; ] N BS1( t ) >R 'r[ R> ; ] >> '] r>> ==
== [r [r N BS1( t ) >R 'r[ R> ; ] >> '] DROP ; '] r>> ==
== [r [r N BS1( t ) >R [r >> '] R> ; '] DROP ; '] r>> ==
== [r [r t N S1( [r >> '] ) DROP ; '] N DROP ; '] r>> ==
== [r [r t [r >> '] ; '] ; '] r>> ==
== 'r[ r>> ] [r t [r >> '] ; '] ; ==
== 'r[ ; ] t:RE 'r[ ; ] >>:REI == t

Statement.4. %A t:RE : t:BS1 <=> %A N : (b) BS1( N DROPB t BDROP ) == t
Proof. ( <= ) %E u == [ 0 DROPB t BDROP ] : (b) BS1( u ) == t => t:BS1
( => ) (b) BS1( N DROPB t:BS1 BDROP ) ==
== BS1( N DROPB BS1( BS[-1]( t ) ) BDROP ) ==
( BS[-1]( t ) exists, since t:BS1 )
== BS1( DROPB 'r[ N ; ] BS[-1]( t ) N DROPB BDROP ) ==
== BS1( BS[-1]( t ) ) == t
Consequence. From the statement of the theorem and the definition of BS[-1]( t ) it follows that %A N : (b) BS[-1]( t:RE ) == N DROPB t BDROP , and vice versa: the statement of the theorem follows from this statement.

Statement.5. BS[-1]( BS1( t:RE ) ) == t
The proof follows from the two previous theorems.

Statement.6. (comparison by BS[-1] ) Let t:RE:BS1 , u:RE:BS1 . Then (b) BS[-1]( t ) == BS[-1]( u ) <=> (b) t == u
Proof. ( <= ) Evident: by our understanding of b-equivalence of TC fragments, it cannot be otherwise.
( => ) (b) BS[-1]( t ) == BS[-1]( u ) => (b) BS1( BS[-1]( t ) ) == BS1( BS[-1]( u ) ) => (b) t == u

Statement.7. (comparison by BS1 ) (b) BS1( t ) == BS1( u ) <=> (b) t == u
Proof. ( <= ) By the understanding of TCF equivalence.
( => ) (b) t == BS[-1]( BS1( t ) ) == BS[-1]( BS1( u ) ) == u

Statement.8. If BS1( t ) and BS1( u ) exist, then BS1( t u ) == BS1( t ) BS1( u )
Proof. (b) N BS1( t ) BS1( u ) BDROP ==
== 'r[ N ; ] t N DROPB BS1( u ) BDROP ==
== 'r[ N ; ] t DROPB 'r[ N ; ] u N DROPB ==
== 'r[ N ; ] t u N DROPB == N BS1( t u )
Consequence. [ t:BS1 u:BS1 ]:!BS1

Statement.9. [r BS1( t:RE ) S1( u:R ) ; '] == S1( [r t u ; '] )
Proof. N [r BS1( t ) S1( u ) ; '] ==
== [r t N S1( u ) DROP ; '] N ==
== [r t u ; '] N == N S1( [r t u ; '] )

Statement.10. [r BS1( t:RE ) u:US1:R ; ']:!US1
Proof. %A N : N [r BS1( t ) u ; '] == [r t N u DROP ; '] N ==
== [r t N u DROP ; '] N DROP N ==
== N [r BS1( t ) u ; '] DROP N

Statement.11. u:BS1:RE , v:BS1:RE => (b) BS[-1]( u ) BS[-1]( v ) == BS[-1]( u v )
Proof. %A N : %A >>:S1EI : [r BS[-1]( u v ) >> '] r>> ==
== [r N u v [ BDROP >> ]:S1EI '] r>> ==
== [r N u [ v BDROP >> ]:S1EI '] r>> ==
== [r BS[-1]( u ) N DROPB [ v BDROP >> ]:S1EI '] N r>> ==
== [r BS[-1]( u ) [r N v BDROP >> '] DROP ; '] N r>> ==
== [r BS[-1]( u ) [r BS[-1]( v ) N DROPB BDROP >> '] N DROP ; '] N r>> ==
== [r BS[-1]( u ) BS[-1]( v ) >> '] N r>>

Statement.12. If S1B( S1( t ) ) exists, then S1B( S1( t ) ) == BS1( t )
Note. As has been shown above, S1B( S1( t ) ) == S1( S1B( t ) ) .
Proof. %A N : %A >>:US1EI : (b) N S1( S1B( t ) ) >> ==
== S1B( t ) N >> ==
== S1B( t ) N [r >> ']:US1 DROP N ; ==
== S1B( t ) 'r[ N ; ] DROPB N >> ==
== 'r[ N ; ] t DROPB N >> ==
== N BS1( t ) >>

Statement.13. BS1( t:R ) == S1( t )
Proof. S1( t:R ) == S1( S1B( t ) ) == BS1( t )

Statement.14. BS1( t:REI ) == BDROP t
Proof. The definition of BS1 is: (b) N BS1( t ) >>:US1EI == 'r[ N ; ] t N DROPB
Let us show that BDROP t satisfies it. At first, since t:REI , %A z : [ t z ] == t . Therefore
'r[ N ; ] t N DROPB == 'r[ N ; ] t
and, since BDROP == >R 'r[ R> ; ] and [r t:REI ']:R ,
(b) 'r[ N ; ] t == N BDROP t

4.5 PRACTICALLY IMPORTANT STATEMENTS ABOUT 2-ND ORDER TCFs

One more stack (the continuation stack, or L-stack) is used to implement backtracking. For brevity, we omit the definitions of L[m]( t ) , BL[m]( t ) , t:L , etc., since they are the same as the corresponding definitions for the data stack ( S[m]( t ) , BS[m]( t ) , t:S and so on).

In [Gas95] we showed that
PRO == R> >L 'r[ LDROP ; ]
where PRO is defined as
: PRO R> R> >L ENTER LDROP ;
The word PRO is used in BacFORTH [Gas94] to complement a 1st-order call (i.e. an ordinary one) to a 2nd-order call (which is used for backtrackable code). The word CONT performs a success (a call of the continuation) using the continuation address placed onto the L-stack by PRO .

Statement.1. CONT == 'r[ >> ] L@ BLDROP >R ;
Proof. By definition:
CONT == [r L> >R R@ ENTER R> >L ; '] == 'r[ >> ] L> >R R@ 'r[ R> >L ; ] >R ;
BLDROP == L> >R 'r[ R> >L ; ]
'r[ >> ] L@ BLDROP >R ; ==
== 'r[ >> ] L@ L> >R 'r[ R> >L ; ] >R ; ==
== 'r[ >> ] L> >R R@ 'r[ R> >L ; ] >R ; ==
== CONT >>

Statement.2. (b) [r PRO BL1( t ) CONT ; '] == t
Proof. (b) [r PRO BL1( t ) CONT ; '] ==
== 'r[ >> ] R> >L LDROPB BL1( t ) CONT ; ==
== 'l[ >> ] LDROPB BL1( t ) 'r[ ; ] L@ BLDROP >R ; ==
== BL[-1]( BL1( t ) ) >> == t
(a 2nd-order call of BL1( t ) (a TCF which is "almost equivalent" to t , but neither affects nor depends on the L-stack top) is b-equivalent to t substituted inline)
Consequence. (b) [r PRO t:L CONT ; '] == t
(if a TCF does not depend on the L-stack, its 2nd-order call is b-equivalent to its inline substitution)
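The consequence has a direct compiler-oriented reading, illustrated by a hedged BacFORTH-style sketch ( BDUP is a hypothetical name):

: BDUP PRO DUP CONT ; \ a 2nd-order ("backtrackable") wrapping of DUP
\ Since [ DUP ]:L , the consequence gives that the 2nd-order call of BDUP
\ is b-equivalent to the plain code DUP , so an optimizing compiler may
\ substitute DUP inline for a call of BDUP .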
At this point we finish our equivalence-based formal description of the control flow effects of return stack manipulations. We could not only describe the traditional control structures (call, EXIT, BRANCH, ?BRANCH) [Gas95], but also build an adequate description of such a nontrivial execution method as backtracking (by constructing 2nd-order threaded code, we implement backtracking). The cut statements, as well as the RP@ and RP! operators, which access the return stack pointer directly (and therefore treat the return stack not as a stack), remain unexplored. A description of 3rd-order threaded code, which has been found useful when the potentialities of classical backtracking are exhausted, also remains beyond the scope of this work. Nevertheless, what has been done is sufficient to state that our system of notions copes well with the task.

5. THE REPEATABILITY PRINCIPLE

5.1 FORMULATION

We want to present one more axiom, which has not been found necessary so far.

Axiom. (The repeatability principle in the case of no interaction with the external world)
%A << %A >> %A t %E x,z :
t == x t z DROP
z DROP == NOOP
t == x t DROP z
This means that, given the memory contents before the execution of t, any value that t leaves on the stack may be calculated one more time (since we can keep the necessary information in a location unused by other code; the principle states that for any program such a place can be found). For instance, if t is the literal 5 , one may take x == NOOP and z == [ 5 ] : indeed, 5 5 DROP == 5 and 5 DROP 5 == 5 . This axiom enables us to name values on the stack (which could not be done with locals, because each particular implementation of locals has its restrictions: scope, etc.).
Note. The context ( << and >> ) may pass control to t many times; to avoid confusion, a notation that resembles locals can be introduced.

Axiom. (The repeatability principle in the general case)
%A << %A >> %A t %E t1..tn %E x0..xn, z0..zn :
t == t1 .. tn
t == x0 t1 x1 t2 x2 .. tn xn
%A i=0..n : zi DROP == NOOP
t == x0 z0 DROP t1 x1 t2 x2 .. tn xn
t == x0 t1 z1 DROP x1 t2 x2 .. tn xn
....
t == x0 t1 x1 t2 x2 .. tn xn zn DROP
This means that, given the memory state before execution and the results of all interactions with the external world, one can predict any value that will be placed on the stack anywhere in the program.

5.2 DISCUSSION OF THE REPEATABILITY PRINCIPLE

Thus, the repeatability principle enables us to introduce notions equivalent to the notion of a value on the stack. In other words, it enables a theorist who knows the axioms, but nothing about the stack, to infer that values on the stack exist.

Although it postulates an important property of threaded code, we have been able to do without it so far. We did not use the repeatability principle intentionally: it differs significantly from the other axioms. All the other axioms describe properties of the equivalence and do not introduce new objects. Subdividing t into t1...tn and finding x0...xn and z0...zn is not a trivial operation: it requires understanding the program as a whole. The principle states a property of the program, not of a local piece of code; it is not an elementary property. Finally, the repeatability principle looks like a property very specific to executable code. Although we do not know of another object with similar properties, we would like to avoid making our system so specific.

6. DISCUSSION

The effects of return stack manipulations may be statically predicted. One of the most important properties of code fragments with respect to return addresses is to leave them unchanged. In particular, this enables us to convert TC fragments to linear form, which is often the most convenient form for analysis: indeed, it is easier to follow the course of linear execution.
However, practice has shown that this form is not well suited for most problems: it is easier to cope with programs subdivided into units which reflect our concepts of the problem, and the creation of an interpreter for these natural units pays for itself. This idea may be found behind such approaches as macroprocessing, problem-oriented languages, the use of procedures, backtracking, and the data-driven approach. So, it is important to find the most natural notions and the most natural syntax, and to be able to construct an interpreter for this syntax (for a complex problem or for a class of similar problems).

It is evident that a backtrackable procedure uses its continuation as a parameter (it calls it when it succeeds). It is less evident that any Forth procedure uses its continuation as a parameter. Indeed, EXIT ignores its continuation, BRANCH fetches a destination address from it, while most procedures pass control to it.

We have introduced the notion of 2nd-order threaded code. 2-order threaded code (i.e. backtrackable code) enables one to implement a syntax that resembles a dataflow diagram. To implement backtracking we have used the "success as continuation call" method. In the same way, via continuation calls, one can build 3rd and higher order threaded codes. A 3-order procedure's success consists in calling the continuation as a fragment of 2-order code that ends with CONT EXIT (code that performs success). A 3-order procedure is described by 8 stack states, which is 1 more than the mean psychological constant, but this is not a problem for a person who can use backtracking (the trick is to keep most of the states empty). 3-order threaded code enables one to program in terms of backtrackable (bi-directional) dataflows, i.e. it enables one to return to this convenient code arrangement when the capabilities of backtracking have already been exhausted. The only real problem where this has happened is an experimental program that reconstructs the syntax tree of a sentence in an inflective natural language (namely, Russian).

7. RESULTS

We have shown that in most cases the effects of return stack manipulations may be statically predicted. The concept of position (active, passive, etc.) has been introduced, representing the differences between code, data, self-modifying code, etc. A notation has been developed that adequately describes the processing of [threaded] code by the code interpreter and the effect of return address manipulations on it. A formal description of how return address manipulations are reflected in control flow changes has been given. The calculus enables one to analyse return address manipulations, prove their correctness, and perform equivalent transformations of code fragments. The calculus has been successfully used to describe a change to a different execution mechanism--backtracking. The notions of 2nd- and 3rd-order code have been introduced. 3-order threaded code can help in cases where the capabilities of backtracking are exhausted. The theorem of 2-order call equivalence has been proved. The theorems of 1st- and 2nd-order call equivalence provide a clue for the construction of optimizing Forth compilers.

REFERENCES

[Gas93R] Gassanenko, M.L. Novye sintaksicheskie konstruktsii i be`ktreking dlja jazyka Fort. // Problemy tehnologii programmirovanija. SPb: SPIIRAN, 1991-93, p.148-162. (New Control Structures and Backtracking for the Forth Language, in Russian.)
[Gas94] Gassanenko, M.L. BacFORTH: An Approach to New Control Structures. Proc. of the EuroForth'94 Conference, 4-6 November 1994, Royal Hotel, Winchester, UK, p.39-41.
[Gas95] Gassanenko, M.L. Formalization of Return Addresses Manipulations and Control Transfers. Proc. of the EuroForth'95 Conference, 27-29 October 1995, International Centre for Informatics, Dagstuhl Castle, Germany.
[Poi91] Poial, J. Multiple Stack-effects of Forth Programs. 1991 FORML Conference Proceedings, euroFORML'91 Conference, Oct 11-13, 1991, Marianske Lazne, Czechoslovakia. Forth Interest Group, Inc., Oakland, USA, 1992, p.400-406.
[Poi93] Poial, J. Some Ideas on Formal Specification of Forth Programs. 9th euroFORTH Conference on the FORTH Programming Language and FORTH Processors, Oct 15-18, 1993, Marianske Lazne, Czech Republic, 1993, 4 pp.
[Poi94] Poial, J. Forth and Formal Language Theory. Proc. of the EuroForth'94 Conference, 4-6 November 1994, Royal Hotel, Winchester, UK, 6 pp.
[StK92] Stoddart, B., Knaggs, P. The (Almost) Complete Theory of Forth Type Inference. Proc. of the EuroForth'92 Conference, 1992.
[StK93] Stoddart, B., Knaggs, P. Type Inference in Stack Based Languages. Formal Aspects of Computing, BCS, 1993, 5, p.289-298.
[Sto93] Stoddart, B. Self Reference and Type Inference in Forth. 9th euroFORTH Conference on the FORTH Programming Language and FORTH Processors, Oct 15-18, 1993, Marianske Lazne, Czech Republic, 1993, 9 pp.