<!-- Crown Copyright (c) 1998 -->
<HTML>
<HEAD>
<TITLE>Structure of TDF</TITLE>
</HEAD>
<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000FF" VLINK="#400080" ALINK="#FF0000">
<H1><A NAME=S4>TDF Specification, Issue 4.0</A></H1>
<H3>January 1998</H3>
<A HREF="spec6.html"><IMG SRC="../images/next.gif" ALT="next section"></A>
<A HREF="spec4.html"><IMG SRC="../images/prev.gif" ALT="previous section"></A>
<A HREF="spec1.html"><IMG SRC="../images/top.gif" ALT="current document"></A>
<A HREF="../index.html"><IMG SRC="../images/home.gif" ALT="TenDRA home page">
</A>
<A HREF="spec12.html"><IMG SRC="../images/index.gif" ALT="document index"></A>
<P>
<HR>
<DL>
<DT><A HREF="#S5"><B>2.1</B> - The Overall Structure</A><DD>
<DT><A HREF="#S6"><B>2.2</B> - Tokens</A><DD>
<DT><A HREF="#S7"><B>2.3</B> - Tags</A><DD>
<DT><A HREF="#S8"><B>2.4</B> - Extending the format</A><DD>
</DL>
<HR>
<H1>2. Structure of TDF</H1>
<A NAME=M1>Each piece of TDF program is classified as being of a particular
<CODE>SORT</CODE>. Some pieces of TDF are 
<CODE>LABEL</CODE>s, some are <CODE>TAG</CODE>s, some are 
<CODE>ERROR_TREATMENT</CODE>s and so on (to list some of the more
transparently named <CODE>SORT</CODE>s).  The <CODE>SORT</CODE>s of
the arguments and result of each construct of the TDF format are specified.
For instance, <I>plus</I> is defined to have three arguments - an
<CODE>ERROR_TREATMENT</CODE> and two <CODE>EXP</CODE>s (short for
&quot;expression&quot;) - and to produce an <CODE>EXP</CODE>; 
<I>goto</I> has a single <CODE>LABEL</CODE> argument and produces
an 
<CODE>EXP</CODE>. The specification of the <CODE>SORT</CODE>s of the
arguments and results of each construct constitutes the syntax of
the TDF format. When TDF is represented as a parsed tree it is structured
according to this syntax, and it is constructed and read in terms of
this syntax as well.
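<P>
As an informal illustration only, the idea that every construct has
fixed argument and result <CODE>SORT</CODE>s can be sketched in Python.
The table layout, the function <I>sort_of</I> and the three leaf
constructs are assumptions made for this sketch; only the signatures
of <I>plus</I> and <I>goto</I> are taken from the text above.
<PRE>
```python
# Hypothetical signature table: each construct name maps to
# (argument SORTs, result SORT). Only plus and goto come from the text;
# the three leaf constructs are invented so that a tree can be built.
SIGNATURES = {
    "plus": (("ERROR_TREATMENT", "EXP", "EXP"), "EXP"),
    "goto": (("LABEL",), "EXP"),
    "some_error_treatment": ((), "ERROR_TREATMENT"),
    "some_exp": ((), "EXP"),
    "some_label": ((), "LABEL"),
}


def sort_of(node):
    """Return the result SORT of a tree node, checking its argument SORTs."""
    name, args = node[0], node[1:]
    arg_sorts, result = SIGNATURES[name]
    if len(args) != len(arg_sorts):
        raise TypeError(name + ": wrong number of arguments")
    for arg, expected in zip(args, arg_sorts):
        actual = sort_of(arg)
        if actual != expected:
            raise TypeError(name + ": expected " + expected + ", got " + actual)
    return result


# A tree that follows the syntax, and one that does not
# (goto requires a LABEL, not an EXP).
good = ("plus", ("some_error_treatment",), ("some_exp",),
        ("goto", ("some_label",)))
bad = ("goto", ("some_exp",))
```
</PRE>
Here <I>sort_of(good)</I> yields <CODE>EXP</CODE>, while <I>sort_of(bad)</I>
is rejected because the argument is of the wrong <CODE>SORT</CODE>.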
<P>
<A NAME=S5>
<H2>2.1. <A NAME=2>The Overall Structure</A></H2>
<A NAME=M3>A separable piece of TDF is called a <CODE>CAPSULE</CODE>.
A producer generates a <CODE>CAPSULE</CODE>; the TDF linker links
<CODE>CAPSULE</CODE>s together to form a <CODE>CAPSULE</CODE>; and
the final translation process turns a <CODE>CAPSULE</CODE> into an
object file. 
<P>
The structure of capsules is designed so that the process of linking
two or more capsules consists almost entirely of copying large byte-aligned
sections of the source files into the destination file, without changing
or even examining these sections. Only a small amount of interface
information has to be modified and this is made easily accessible.
The translation process only requires an extra indirection to account
for this interface information, so it is also fast. The description
of TDF at the capsule level is almost all about the organisation of
the interface information. 
<P>
There are three major kinds of entity which are used inside a capsule
to name its constituents. The first are called tags; they are used
to name the procedures, functions, values and variables which are
the components of the program. The second are called tokens; they
identify pieces of TDF which can be used for substitution - a little
like macros. The third are the alignment tags, used to name alignments
so that circular types can be described. Because these internal names
are used for linking pieces of TDF together, they are collectively
called <I>linkable entities</I>. The interface information relates
these linkable entities to each other and to the world outside the
capsule. 
<P>
The most important part of a capsule, the part which contains the
real information, consists of a sequence of groups of units. Each
group contains units of the same kind, and all the units of the same
kind are in the same group. The groups always occur in the same order,
though it is not necessary for each kind to be present. 
<CENTER>
<BR><IMG SRC="../images/capsule4.gif"><BR>
</CENTER>
<P>
<A NAME=M4>The order is as follows: 
<UL>
<LI><A NAME=M5><I>tld</I> unit. Every capsule has exactly one tld
unit. It gives information to the TDF linker about those items in
the capsule which are visible externally.<P>
<LI><I>versions</I> unit. These units contain information about the
versions of TDF used. Every capsule will have at least one such unit.<P>
<LI><A NAME=M6><I>tokdec</I> units. These units contain declarations
for tokens. They bear the same relationship to the following tokdef
units that C declarations do to C definitions. However, they are not
necessary for the translator, and the current ANSI C producer does
not provide them by default.<P>
<LI><A NAME=M7><I>tokdef</I> units. These units contain definitions
of tokens.<P>
<LI><A NAME=M8><I>aldef</I> units. These units give the definitions
of alignment tags.<P>
<LI><I>diagtype</I> units. These units give diagnostic information
about types.<P>
<LI><A NAME=M9><I>tagdec</I> units. These units contain declarations
of tags, which identify values, procedures and run-time objects in
the program. The declarations give information about the size, alignment
and other properties of the values. They bear the same relationship
to the following tagdef units that C declarations do to C definitions.<P>
<LI><I>diagdef</I> units. These units give diagnostic information
about the values and procedures defined in the capsule.<P>
<LI><A NAME=M10><I>tagdef</I> units. These units contain the definitions
of tags, and so describe the procedures and the values they manipulate.<P>
<LI><I>linkinfo</I> units. These units give information about the
linking of objects. 
</UL>
This organisation is imposed to help installers, by ensuring that
the information needed to process a unit has been provided before
that unit arrives. For example, the token definitions occur before
any tag definition, so that, during translation, the tokens may be
expanded as the tag definitions are being read. (In a capsule which
is ready for translation all tokens used must be defined, but this
need not apply to an arbitrary capsule.)
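<P>
The grouping rule can be sketched as follows. This is a minimal
Python illustration, not part of the TDF encoding: the names
<I>GROUP_ORDER</I>, <I>group_units</I> and <I>is_well_ordered</I> are
hypothetical, and a unit is modelled simply as a (kind, payload) pair.
The order of kinds is the one listed above.
<PRE>
```python
# Group order as listed in the text.
GROUP_ORDER = [
    "tld", "versions", "tokdec", "tokdef", "aldef",
    "diagtype", "tagdec", "diagdef", "tagdef", "linkinfo",
]
RANK = {kind: i for i, kind in enumerate(GROUP_ORDER)}


def group_units(units):
    """Collect (kind, payload) pairs into groups, emitted in the fixed order.

    Kinds with no units are omitted, since not every kind need be present.
    Kinds outside GROUP_ORDER are ignored by this sketch.
    """
    groups = {}
    for kind, payload in units:
        groups.setdefault(kind, []).append(payload)
    return [(kind, groups[kind]) for kind in GROUP_ORDER if kind in groups]


def is_well_ordered(kinds):
    """True if the group kinds appear in the fixed order with no repeats."""
    ranks = [RANK[k] for k in kinds]
    return ranks == sorted(set(ranks))
```
</PRE>
For instance, units arriving as <I>tagdef</I>, <I>tokdef</I>, <I>tld</I>,
<I>tagdef</I> are grouped as <I>tld</I>, <I>tokdef</I>, <I>tagdef</I>, with
both <I>tagdef</I> units in the one <I>tagdef</I> group.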
<P>
The tags and tokens in a capsule have to be related to the outside
world. For example, there might be a tag standing for <I>printf</I>,
used in the appropriate way inside the capsule. When an object file
is produced from the capsule the identifier <I>printf</I> must occur
in it, so that the system linker can associate it with the correct
library procedure. In order to do this, the capsule has a table of
tags at the capsule level, and a set of external links which provide
external names for some of these tags. 
<CENTER>
<BR><IMG SRC="../images/capsule1.gif"><BR>
</CENTER>
<P>
In just the same way, there are tables of tokens and alignment tags
at the capsule level, and external links for these as well. 
<P>
The tags used inside a unit have to be related to these capsule tags,
so that they can be properly named. A similar mechanism is used, with
a table of tags at the unit level, and links between these and the
capsule level tags. 
<CENTER>
<BR><IMG SRC="../images/capsule2.gif"><BR>
</CENTER>
<P>
Again the same technique is used for tokens and alignment tags. 
<P>
It is also necessary for a tag used in one unit to refer to the same
thing as a tag in another unit. To do this a tag at the capsule level
is used, which may or may not have an external link. 
<CENTER>
<BR><IMG SRC="../images/capsule3.gif"><BR>
</CENTER>
<P>
The same technique is used for tokens and alignment tags. 
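<P>
The two levels of tables described above might be modelled as in the
following Python sketch. The classes <I>Capsule</I> and <I>Unit</I> and
their fields are assumptions made for illustration; they say nothing
about the actual encoding, and only tags are shown (tokens and
alignment tags would be handled in the same way).
<PRE>
```python
class Capsule:
    def __init__(self, n_tags):
        # Capsule-level tag table: one slot per capsule-level tag.
        self.tag_values = [None] * n_tags
        # External links: capsule-level index mapped to an external name.
        # Only some tags have one.
        self.external_names = {}


class Unit:
    def __init__(self, capsule, links):
        self.capsule = capsule
        # Unit-level link table: the position is the unit-level tag
        # number, the entry is the capsule-level tag number.
        self.links = links

    def capsule_index(self, unit_tag):
        return self.links[unit_tag]

    def value_of(self, unit_tag):
        # One array index followed by one indirection.
        return self.capsule.tag_values[self.links[unit_tag]]


capsule = Capsule(3)
capsule.external_names[0] = "printf"      # capsule tag 0 is visible outside
unit_a = Unit(capsule, [0, 2])            # unit tag 1 is capsule tag 2
unit_b = Unit(capsule, [2, 1])            # unit tag 0 is capsule tag 2
capsule.tag_values[2] = "shared object"   # no external link is needed
```
</PRE>
Unit tag 1 of <I>unit_a</I> and unit tag 0 of <I>unit_b</I> refer to the
same thing because both are linked to capsule tag 2, which has no
external name.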
<P>
So when the TDF linker is joining two capsules, it has to perform
the following tasks: 
<UL>
<LI>It creates new sets of capsule level tags, tokens and alignment
tags by identifying those which have the same external name, and otherwise
creating different entries.<P>
<LI>It similarly joins the external links, suppressing any names which
are no longer to be external.<P>
<LI>It produces new link tables for the units, so that the entities
used inside the units are linked to the new positions in the capsule
level tables.<P>
<LI>It re-organises the units so that the correct order is achieved.
</UL>
This can be done without looking into the interior of the units (except
for the <I>tld</I> unit), simply copying the units into their new
place. 
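<P>
The four tasks can be sketched in Python under strong simplifying
assumptions: only tags are handled, the <I>tld</I> unit is ignored, and
a capsule is modelled as a dictionary whose layout (<I>n_tags</I>,
<I>externals</I>, <I>units</I>) is invented for this example. The
function <I>link</I> and its <I>hide</I> parameter are likewise
hypothetical and do not describe the real TDF linker interface.
<PRE>
```python
ORDER = ["tld", "versions", "tokdec", "tokdef", "aldef",
         "diagtype", "tagdec", "diagdef", "tagdef", "linkinfo"]


def link(capsules, hide=()):
    """Join capsules; unit bodies are copied without being examined."""
    merged_externals = {}   # external name mapped to new capsule-level index
    next_index = 0
    new_units = []
    for cap in capsules:
        # Task 1: map old capsule-level indices to new ones, identifying
        # entries with the same external name and otherwise creating new ones.
        by_old_index = {idx: name for name, idx in cap["externals"].items()}
        remap = []
        for old in range(cap["n_tags"]):
            name = by_old_index.get(old)
            if name is not None and name in merged_externals:
                remap.append(merged_externals[name])
            else:
                remap.append(next_index)
                if name is not None:
                    merged_externals[name] = next_index
                next_index += 1
        # Task 3: produce new link tables; the body is passed through as is.
        for kind, links, body in cap["units"]:
            new_units.append((kind, [remap[i] for i in links], body))
    # Task 2: suppress names which are no longer to be external.
    externals = {n: i for n, i in merged_externals.items() if n not in hide}
    # Task 4: restore the fixed group order (the sort is stable).
    new_units.sort(key=lambda u: ORDER.index(u[0]))
    return {"n_tags": next_index, "externals": externals, "units": new_units}


a = {"n_tags": 2, "externals": {"main": 0, "helper": 1},
     "units": [("tagdef", [0, 1], b"body-a")]}
b = {"n_tags": 2, "externals": {"helper": 1},
     "units": [("tokdef", [0], b"body-t"), ("tagdef", [1, 0], b"body-b")]}
out = link([a, b], hide=("helper",))
```
</PRE>
In this example the two entries named <I>helper</I> are identified with
each other and the name is then suppressed, the unnamed tag of the
second capsule receives a fresh capsule-level entry, and the unit
bodies are carried across unchanged.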
<P>
During the process of installation the values associated with the
linkable entities can be accessed by indexing into an array followed
by one indirection. In a programming language, objects of these kinds
are referred to by identifiers, which involves using hash tables for
access. This is an example of a general principle of the design of
TDF: speed is required in the linking and installing processes, if
necessary at the expense of time in the production of TDF.
<P>
<A NAME=S6>
<H2>2.2. Tokens</H2>
<A NAME=M11>Tokens are used (applied) in the TDF at the point where
substitutions are to be made. Token definitions provide the substitutions;
they usually reside on the target machine and are linked in there.
<P>
A typical token definition has parameters from various 
<CODE>SORT</CODE>s and produces a result of a given <CODE>SORT</CODE>.
As an example of a simple token definition, written here in a C-like
notation, consider the following. 
<PRE>
        <I>EXP ptr_add (EXP par0, EXP par1, SHAPE par2)
        {
            add_to_ptr(
                par0,
                offset_mult(
                    offset_pad(
                        alignment(par2),
                        shape_offset(par2)),
                    par1))
        }</I>
</PRE>
This defines the token, <I>ptr_add</I>, to produce something of 
<CODE>SORT</CODE> <CODE>EXP</CODE>. It has three parameters, of 
<CODE>SORT</CODE>s <CODE>EXP</CODE>, <CODE>EXP</CODE> and 
<CODE>SHAPE</CODE>. The <I>add_to_ptr</I>, <I>offset_mult</I>, 
<I>offset_pad</I>, <I>alignment</I> and <I>shape_offset</I>
constructions are TDF constructions producing respectively an 
<CODE>EXP</CODE>, an <CODE>EXP</CODE>, an <CODE>EXP</CODE>, an 
<CODE>ALIGNMENT</CODE> and an <CODE>EXP</CODE>. 
<P>
A typical use of this token is: 
<PRE>
        <I>ptr_add(
            obtain_tag(tag41),
            contents(integer(~signed_int), obtain_tag(tag62)),
            integer(~char))</I>
</PRE>
The effect of this use is to produce the TDF of the definition with
<I>par0</I>, <I>par1</I> and <I>par2</I> substituted by the actual
parameters. 
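<P>
The substitution can be imitated with a small Python sketch. The
representation is an assumption made for illustration: a piece of TDF
is modelled as a nested tuple (construct, argument, ...) with plain
strings as leaves, and the helper names <I>substitute</I> and
<I>expand</I> are hypothetical. The definition and the use of
<I>ptr_add</I> are transcribed from the examples above.
<PRE>
```python
# A term is either a string (a leaf) or a tuple (construct, arg, ...).
PTR_ADD = {
    "params": ("par0", "par1", "par2"),
    "body": ("add_to_ptr",
             "par0",
             ("offset_mult",
              ("offset_pad",
               ("alignment", "par2"),
               ("shape_offset", "par2")),
              "par1")),
}
TOKENS = {"ptr_add": PTR_ADD}


def substitute(term, env):
    """Replace formal parameter names in term by the actual parameters."""
    if isinstance(term, str):
        return env.get(term, term)
    return (term[0],) + tuple(substitute(arg, env) for arg in term[1:])


def expand(term):
    """Expand every application of a defined token, innermost first."""
    if isinstance(term, str):
        return term
    head = term[0]
    args = tuple(expand(arg) for arg in term[1:])
    if head in TOKENS:
        token = TOKENS[head]
        env = dict(zip(token["params"], args))
        return expand(substitute(token["body"], env))
    return (head,) + args


use = ("ptr_add",
       ("obtain_tag", "tag41"),
       ("contents", ("integer", "~signed_int"), ("obtain_tag", "tag62")),
       ("integer", "~char"))
expanded = expand(use)
```
</PRE>
The result is the body of the definition with <I>par2</I> replaced in
both places by <I>integer(~char)</I>. The leaves <I>~signed_int</I> and
<I>~char</I> are left alone in this sketch because no definitions for
them are supplied.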
<P>
There is no way of obtaining anything like a side-effect. A token
without parameters is therefore just a constant. 
<P>
Tokens can be used for various purposes. They can make the TDF
shorter by standing for commonly used constructions (<I>ptr_add</I>
is an example of this use). They can also make target dependent
substitutions (<I>~char</I> in the use of <I>ptr_add</I> is an example
of this, since <I>~char</I> may be signed or unsigned on the target).
<P>
A particularly important use is to provide definitions appropriate
to the translation of a particular language. Another is to abstract
those features which differ from one ABI to another. This kind of
use requires that sets of tokens should be standardised for these
purposes, since otherwise there will be a proliferation of such definitions.
<P>
<A NAME=S7>
<H2>2.3. Tags</H2>
Tags are used to identify the actual program components. They can
be declared or defined. A declaration gives the <CODE>SHAPE</CODE>
of a tag (a <CODE>SHAPE</CODE> is the TDF analogue of a type). A definition
gives an <CODE>EXP</CODE> for the tag (an <CODE>EXP</CODE> describes
how the value is to be made up). 
<P>
<A NAME=S8>
<H2>2.4. Extending the format</H2>
<A NAME=M12>TDF can be extended for two major reasons. 
<P>
First, as part of the evolution of TDF, new features will from time
to time be identified. It is highly desirable that these can be added
without disturbing the current encoding, so that old TDF can still
be installed by systems which recognise the new constructions. Such
changes should only be made infrequently and with great care, for
stability reasons, but nevertheless they must be allowed for in the
design. 
<P>
Second, it may be required to add extra information to TDF to permit
special processing. TDF is a way of describing programs and it clearly
may be used for purposes other than portability and distribution. In
these uses it may be necessary to add extra information which is closely
integrated with the program. Diagnostics and profiling can serve as
examples. In these cases the extra kinds of information may not have
been allowed for in the TDF encoding. 
<P>
Some extension mechanisms are described below and related to these
reasons: 
<UL>
<LI>The encoding of every <CODE>SORT</CODE> in TDF can be extended
indefinitely (except for certain auxiliary <CODE>SORT</CODE>s).  This
mechanism should only be used for extending standard TDF to the next
standard, since otherwise extensions made by different groups of people
might conflict with each other. See <A HREF="spec11.html#13">Extendable
integer encoding</A>.<P>
<LI>Basic TDF has three kinds of linkable entity and seven kinds of
unit. It also contains a mechanism for extending these so that other
information can be transmitted in a capsule and properly related to
basic TDF. The rules for linking this extra information are also laid
down. See <A HREF="spec8.html#M53">make_capsule</A>. 
<P>
If a new kind of unit is added, it can contain any information, but
if it is to refer to the tags and tokens of other units it must use
the linkable entities. Since new kinds of unit might need extra kinds
of linkable entity, a method for adding these is also provided. All
this works in a uniform way, with capsule level tables of the new
entities, and external and internal links for them. 
<P>
If new kinds of unit are added, the order of groups must be the same
in any capsules which are linked together. As an example of the use
of this kind of extension, the diagnostic information is introduced
in just this way. It uses two extra kinds of unit and one extra kind
of linkable entity. The extra units need to refer to the tags in the
other units, since these are the object of the diagnostic information.
This mechanism can be used for both purposes.<P>
<LI>The parameters of tokens are encoded in such a way that foreign
information (that is, information which cannot be expressed in the
TDF <CODE>SORT</CODE>s) can be supplied. This mechanism should only
be used for the second purpose, though it could be used to experiment
with extensions for future standards. See 
<A HREF="spec11.html#8"><CODE>BITSTREAM</CODE></A>. 
</UL>
<P>
<HR>
<P><I>Part of the <A HREF="../index.html">TenDRA Web</A>.<BR>Crown
Copyright &copy; 1998.</I></P>
</BODY>
</HTML>