
<!-- Crown Copyright (c) 1998 -->
<HTML>
<HEAD>
<TITLE>SORTs and TOKENs</TITLE>
</HEAD>
<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000FF" VLINK="#400080" ALINK="#FF0000">
<A NAME=S3>
<H1>TDF Guide, Issue 4.0 </H1>
<H3>January 1998</H3>
<A HREF="guide5.html"><IMG SRC="../images/next.gif" ALT="next section">
</A> <A HREF="guide1.html">
<IMG SRC="../images/prev.gif" ALT="previous section"></A>
<A HREF="guide1.html"><IMG SRC="../images/top.gif" ALT="current document"></A>
<A HREF="../index.html"><IMG SRC="../images/home.gif" ALT="TenDRA home page">
</A>
<IMG SRC="../images/no_index.gif" ALT="document index"><P>
<HR>
<DL>
<DT><A HREF="#S4"><B>2.1 </B> - Token applications and first-class
SORTs</A><DD>
<DT><A HREF="#S5"><B>2.2 </B> - Token definitions and declarations</A><DD>
<DT><A HREF="#S6"><B>2.3 </B> - A simple use of a TOKEN</A><DD>
<DT><A HREF="#S7"><B>2.4 </B> - Second class SORTs</A><DD>
</DL>
<HR>
<H1>2  SORTs and TOKENs</H1>
In the syntax of languages like C or Pascal, we find various syntactic
units like &lt;Expression&gt;, &lt;Identifier&gt;, etc. A SORT bears
the same relation to TDF as these syntactic units bear to the language;
roughly speaking, the syntactic unit &lt;Expression&gt; corresponds
to the SORT EXP and &lt;Identifier&gt; to TAG. However, instead of
using BNF to compose syntactic units from others, TDF uses explicit
constructors to compose its SORTs; each constructor uses other pieces
of TDF of specified SORTs to make a piece of its result SORT. For
example, the constructor plus uses an ERROR_TREATMENT and two EXPs
to make another EXP.<P>
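For illustration, the signature of plus can be written in the same
style as the constructor signatures given later in this guide (the
parameter names here are only illustrative):<P>
<PRE>
        <I>ov_err</I>:  ERROR_TREATMENT
        <I>arg1</I>:    EXP
        <I>arg2</I>:    EXP
                   -&gt; EXP
</PRE>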
At the moment, there are 58 different SORTs, from ACCESS to VARIETY,
given in tables 1 and 2. Some of these have familiar analogues in
standard language construction as with EXP and TAG above. Others will
be less familiar since TDF must concern itself with issues not normally
addressed in language definitions. For example, the process of linking
together TDF programs is at the root of the architecture neutrality
of TDF and so must form an integral part of its definition. On the
other hand, TDF is not meant to be a language readily accessible to
the human reader or writer; computers handle it much more easily.
Thus a great many choices have been made in the definition which would
be intolerable in a standard language definition for the human programmer
but which, paradoxically enough, make it much simpler for a computer
to produce and analyse TDF.<P>
The SORTs and constructors in effect form a multi-sorted algebra.
There were two principal reasons for choosing this algebraic form
of definition. First, it is easy to extend - a new operation on existing
constructs simply requires a new constructor. Secondly, the algebraic
form is highly amenable to the automatic construction of programs.
Large parts of both TDF producers and TDF translators have been created
by automatic transformation of the text of the specification document
itself, by extracting the algebraic signature and constructing C programs
which can read or produce TDF. To this extent, one can regard the
specification document as a formal description of the free algebra
of TDF SORTs and constructors. Of course, most of the interesting
parts of the definition of TDF lie in the equivalences of parts of
TDF, so this formality only covers the easy bit.<P>
Another distinction between the TDF definition and language syntactic
description is that TDF is to some extent conscious of its own SORTs
so that it can specify a new construction of a given SORT. The analogy
in normal languages would be that one could define a new construction
with new syntax and say that it is an example of an &lt;Expression&gt;;
I don't know of any standard language which permits this, although
those of you with a historical bent might remember Algol-N, which
made a valiant attempt at it. Of course, the algebraic method of
description makes such an extension much easier to specify than having
to provide new syntax for the new construction in a language.<P>
<A NAME=S4>
<HR><H2>2.1. Token applications and first-class SORTs</H2>
A new construction is introduced by the SORT TOKEN; the constructors
involving TOKENs allow one to give an expansion for the TOKEN in terms
of other pieces of TDF, possibly including parameters. We can encapsulate
a (possibly parameterised) fragment of TDF of a suitable SORT by giving
it a TOKEN as identification. Not all of the SORTs are available for
this kind of encapsulation - only those which have a SORTNAME constructor
(from access to variety). These are the &quot;first-class&quot; SORTs
given in table 1 on page 8. Each of these has an appropriate
_apply_token constructor (e.g. exp_apply_token) to give the expansion.
Most of these also have _cond constructors (e.g. see exp_cond in
<A HREF="guide11.html#1">section 9.1 on page 43</A>) which allow
translate-time conditional expansion of the SORT.<P>
<P>
<BR><IMG SRC="table6.gif"><BR>
<P>
Every TOKEN has a result SORT, i.e. the SORT of its resulting expansion;
before it can be expanded, one must have its parameter SORTs.
Thus, you can regard a TOKEN as having a type defined by its result
and parameter SORTs and the _apply_token as the operator which expands
the encapsulation and substitutes the parameters. <P>
However, if we look at the signature of exp_apply_token:<P>
<PRE>
        <I>token_value</I>:     TOKEN 
        <I>token_args</I>:      BITSTREAM <I>param_sorts(token_value)</I>
                   -&gt; EXP
</PRE>
we are confronted by the mysterious BITSTREAM where one might expect
to find the actual parameters of the TOKEN.<P>
To explain BITSTREAMs requires a diversion into the bit-encoding of
TDF. Constructors for a particular SORT are represented in a number
of bits depending on the number of constructors for that SORT; the
context will determine the SORT required, so no more bits are required.
Thus since there is only one constructor for UNITs, no bits are required
to represent make_unit; there are about 120 different constructors
for EXPs so 7 bits are required to cover all the EXPs. The parameters
of each constructor have known SORTs and so their representations
are just concatenated after the representation of the constructor
<A NAME=footnote70 HREF="footnote.html#70">*</A>. While this is a
very compact representation, it suffers from the defect that one must
decode it even just to skip over it. This is very irksome in some
applications, notably the TDF linker, which is not interested in detailed
expansions. Similarly, in translators there are places where one wishes
to skip over a token application without knowledge of the SORTs of
its parameters. Thus a BITSTREAM is just an encoding of some TDF,
preceded by the number of bits it occupies. Applications can then
skip over BITSTREAMs trivially. Similar considerations apply to BYTESTREAMs
used elsewhere; here the encoding is preceded by the number of bytes
in the encoding and is aligned to a byte boundary to allow fast copying.<P>
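Schematically (this is just a restatement of the description above,
not part of the formal notation), the two stream encodings look like:<P>
<PRE>
        BITSTREAM:   number of bits n, followed by n bits of encoded TDF
        BYTESTREAM:  number of bytes n, followed by n bytes of encoded TDF
                     aligned to a byte boundary
</PRE>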
<A NAME=S5>
<HR><H2>2.2. Token definitions and declarations</H2>
Thus the <I>token_args</I> parameter of exp_apply_token is just the
BITSTREAM formed from the actual parameters in the sequence described
by the definition of the <I>token_value</I> parameter. This
will be given in a TOKEN_DEFN somewhere with constructor token_definition:<P>
<PRE>
        <I>result_sort</I>:     SORTNAME
        <I>tok_params</I>:      LIST(TOKFORMALS)
        <I>body</I>:    <I>result_sort</I>
                   -&gt; TOKEN_DEFN
</PRE>
The <I>result_sort</I> is the SORT of the construction of <I>body</I>;
e.g. if <I>result_sort</I> is formed from exp then <I>body</I> would
be constructed using the EXP constructors and one would use exp_apply_token
to give the expansion. The list <I>tok_params</I> gives the formal
parameters of the definition in terms of TOKFORMALS constructed using
make_tok_formals:<P>
<PRE>
        <I>sn</I>:      SORTNAME
        <I>tk</I>:      TDFINT
                   -&gt; TOKFORMALS
</PRE>
The TDFINT <I>tk</I> will be the integer representation of the formal
parameter expressed as a TOKEN whose result sort is <I>sn</I> (see
more about name representation in <A HREF="guide5.html#2">section
3.1 on page 13</A>). To use the parameter in the body of the TOKEN_DEFN,
one simply uses the _apply_token appropriate to <I>sn</I>. Note that
<I>sn</I> may be a TOKEN but the <I>result_sort</I> may not.<P>
Hence the BITSTREAM <I>param_sorts</I>(<I>token_value</I>) in the
actual parameter of exp_apply_token above is simply formed by the
concatenation of constructions of the SORTs given by the SORTNAMEs in
the <I>tok_params</I> of the TOKEN being expanded.<P>
Usually one gives a name to a TOKEN_DEFN, forming a TOKDEF, using
make_tokdef:<P>
<PRE>
        <I>tok</I>:     TDFINT
        <I>signature</I>:       OPTION(STRING)
        <I>def</I>:     BITSTREAM TOKEN_DEFN
                   -&gt; TOKDEF
</PRE>
Here, <I>tok</I> gives the name that will be used to identify the
TOKEN whose expansion is given by <I>def</I>. Any use of this TOKEN
(e.g. in exp_apply_token) will be given by make_token(<I>tok</I>).
Once again, a BITSTREAM is used to encapsulate the TOKEN_DEFN.
<P>
The significance of the signature parameter is discussed in  
<A HREF="guide5.html#37">section 3.2.2 on page 18</A>.<P>
Often, one wishes to use a TOKEN without giving its definition - the
definition could, for example, be platform-dependent. A TOKDEC introduces
such a TOKEN using make_tokdec:<P>
<PRE>
        <I>tok</I>:     TDFINT
        <I>signature</I>:       OPTION(STRING)
        <I>s</I>:       SORTNAME
                   -&gt; TOKDEC
</PRE>
Here the SORTNAME, <I>s</I>, is given by token:<P>
<PRE>
        <I>result</I>:  SORTNAME
        <I>params</I>:  LIST(SORTNAME)
                   -&gt; SORTNAME
</PRE>
which gives the result and parameter SORTs of <I>tok</I>. <P>
<P>
<P>
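For example, a TOKDEC for a TOKEN whose expansion is an EXP and which
takes two EXP parameters might be written, in the same informal style
as the example in <A HREF="#S6">section 2.3</A> (the name C_plus merely
anticipates that example):<P>
<PRE>
make_tokdec (C_plus, empty, token(exp(), (exp(), exp())))
</PRE>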
One can also use a TOKEN_DEFN in an anonymous fashion by giving it
as an actual parameter of a TOKEN which itself demands a TOKEN parameter.
To do this one simply uses use_tokdef:<P>
<PRE>
        <I>tdef</I>:    BITSTREAM TOKEN_DEFN
                   -&gt; TOKEN
</PRE>
<A NAME=S6>
<HR><H2>2.3. A simple use of a TOKEN</H2>
<P>
The crucial use of TOKENs in TDF is to provide abstractions of APIs
(see <A HREF="guide12.html#0">section 10 on page 45</A>) but they
are also used as shorthand for commonly occurring constructions. For
example, given the TDF constructor plus, mentioned above, we could
define a plus with only two EXP parameters more suitable to C by using
the wrap constructor as the ERROR_TREATMENT:<P>
<PRE>
make_tokdef (C_plus, empty,
        token_definition(
                exp(),
                (make_tok_formals(exp(), l), make_tok_formals(exp(), r)),
                plus(wrap(), exp_apply_token(l, ()), exp_apply_token(r, ()))
        )
)
</PRE>
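An application of C_plus would then be written using exp_apply_token
and make_token; with e1 and e2 standing for the two actual EXP parameters
(which, in the encoding, form the <I>token_args</I> BITSTREAM), it
might look like:<P>
<PRE>
exp_apply_token (make_token(C_plus), (e1, e2))
</PRE>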
<A NAME=S7>
<HR><H2>2.4. Second class SORTs</H2>
Second class SORTs (given in table 2 on page 11) cannot be TOKENised.
These are the &quot;syntactic units&quot; of TDF which the user cannot
extend; he can only produce them using the constructors defined in
core-TDF.<P>
Some of these constructors are implicit. For example, there are no
explicit constructors for LIST or SLIST which are both used to form
lists of SORTs; their construction is simply part of the encoding
of TDF. However, it is foreseen that LIST constructors would be highly
desirable and there will probably be extensions to TDF to promote LIST
from a second-class SORT to a first-class one. This will not apply
to SLIST or to the other SORTs which have implicit constructions.
These include BITSTREAM, BYTESTREAM, TDFINT, TDFIDENT and TDFSTRING.<BR>
<P>
<BR><IMG SRC="table4.gif"><BR>
<P>
<P>
<HR>
<P><I>Part of the <A HREF="../index.html">TenDRA Web</A>.<BR>Crown
Copyright &copy; 1998.</I></P>
</BODY>
</HTML>