6 |
7u83 |
1 |
<?xml version="1.0" standalone="no"?>
|
|
|
2 |
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
|
|
|
3 |
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
|
|
|
4 |
|
|
|
5 |
<!--
|
|
|
6 |
$Id$
|
|
|
7 |
-->
|
|
|
8 |
|
|
|
9 |
<book>
|
|
|
10 |
<bookinfo>
|
|
|
11 |
<title>A Guide to the TDF Specification, Issue 4.0</title>
|
|
|
12 |
|
|
|
13 |
<corpauthor>The TenDRA Project</corpauthor>
|
|
|
14 |
|
|
|
15 |
<author>
|
|
|
16 |
<firstname>Jeroen</firstname>
|
|
|
17 |
<surname>Ruigrok van der Werven</surname>
|
|
|
18 |
</author>
|
|
|
19 |
<authorinitials>JRvdW</authorinitials>
|
|
|
20 |
<pubdate>2005</pubdate>
|
|
|
21 |
|
|
|
22 |
<copyright>
|
|
|
23 |
<year>2004</year>
|
|
|
24 |
<year>2005</year>
|
|
|
25 |
|
|
|
26 |
<holder>The TenDRA Project</holder>
|
|
|
27 |
</copyright>
|
|
|
28 |
|
|
|
29 |
<copyright>
|
|
|
30 |
<year>1998</year>
|
|
|
31 |
|
|
|
32 |
<holder>DERA</holder>
|
|
|
33 |
</copyright>
|
|
|
34 |
</bookinfo>
|
|
|
35 |
|
|
|
36 |
<chapter id="introduction">
|
|
|
37 |
<title>Introduction</title>
|
|
|
38 |
|
|
|
39 |
<para>This memo is intended to be a fairly detailed commentary on the
|
|
|
40 |
specification of TDF, a kind of Talmud to the Torah. If it conflicts
|
|
|
41 |
with the specification document, it is wrong. The aim is elucidate the
|
|
|
42 |
various constructions of TDF, giving examples of usages both from the
|
|
|
43 |
point of view of a producer of TDF and how it is used to construct
|
|
|
44 |
programs on particular platforms using various installers or
|
|
|
45 |
translators. In addition, some attempt is made to give the reasons why
|
|
|
46 |
the particular constructions have been chosen. Most of the commentary is
|
|
|
47 |
a distillation of questions and answers raised by people trying to learn
|
|
|
48 |
TDF from the specification document.</para>
|
|
|
49 |
|
|
|
50 |
<para>Throughout this document, references like (S5.1) are headings in the
|
|
|
51 |
<ulink url="../tdf/spec1.html">TDF specification, Issue 4.0</ulink>.
|
|
|
52 |
I use the term "compiling" or "producing" to mean the production of TDF
|
|
|
53 |
from some source language and "translating" to mean making a program for
|
|
|
54 |
some specific platform from TDF.</para>
|
|
|
55 |
|
|
|
56 |
<para>I use the first person where I am expressing my own opinions or
|
|
|
57 |
preferences; these should not be taken as official opinions of DRA or
|
|
|
58 |
the TenDRA team.</para>
|
|
|
59 |
</chapter>
|
|
|
60 |
|
|
|
61 |
<chapter>
|
|
|
62 |
<sect1 id="sorts-and-tokens">
|
|
|
63 |
<title>SORTs and TOKENs</title>
|
|
|
64 |
|
|
|
65 |
<para>In the syntax of language like C or Pascal, we find various
|
|
|
66 |
syntactic units like <Expression>, <Identifier> etc. A
|
|
|
67 |
SORT bears the same relation to TDF as these syntactic units bear to
|
|
|
68 |
the language; roughly speaking, the syntactic unit <Expression>
|
|
|
69 |
corresponds to the SORT EXP and <Identifier> to TAG . However,
|
|
|
70 |
instead of using BNF to compose syntactic units from others, TDF uses
|
|
|
71 |
explicit constructors to compose its SORTs; each constructor uses
|
|
|
72 |
other pieces of TDF of specified SORTs to make a piece of its result
|
|
|
73 |
SORT. For example, the constructor plus uses an ERROR_TREATMENT and
|
|
|
74 |
two EXPs to make another EXP.</para>
|
|
|
75 |
|
|
|
76 |
<para>At the moment, there are 58 different SORTS, from ACCESS to
|
|
|
77 |
VARIETY given in tables 1 and 2. Some of these have familiar analogues
|
|
|
78 |
in standard language construction as with EXP and TAG above. Others
|
|
|
79 |
will be less familiar since TDF must concern itself with issues not
|
|
|
80 |
normally addressed in language definitions. For example, the process
|
|
|
81 |
of linking together TDF programs is at the root of the architecture
|
|
|
82 |
neutrality of TDF and so must form an integral part of its definition.
|
|
|
83 |
On the other hand, TDF is not meant to be a language readily
|
|
|
84 |
accessible to the human reader or writer; computers handle it much
|
|
|
85 |
more easily. Thus a great many choices have been made in the
|
|
|
86 |
definition which would be intolerable in a standard language
|
|
|
87 |
definition for the human programmer but which, paradoxically enough,
|
|
|
88 |
make it much simpler for a computer to produce and analyse TDF</para>
|
|
|
89 |
|
|
|
90 |
<para>The SORTs and constructors in effect form a multi-sorted algebra.
|
|
|
91 |
There were two principal reasons for choosing this algebraic form of
|
|
|
92 |
definition. First, it is easy to extend - a new operation on existing
|
|
|
93 |
constructs simply requires a new constructor. Secondly, the algebraic
|
|
|
94 |
form is highly amenable to the automatic construction of programs.
|
|
|
95 |
Large parts of both TDF producers and TDF translators have been
|
|
|
96 |
created by automatic transformation of the text of the specification
|
|
|
97 |
document itself, by extracting the algebraic signature and
|
|
|
98 |
constructing C program which can read or produce TDF. To this extent,
|
|
|
99 |
one can regard the specification document as a formal description of
|
|
|
100 |
the free algebra of TDF SORTs and constructors. Of course, most of the
|
|
|
101 |
interesting parts of the definition of TDF lies in the equivalences of
|
|
|
102 |
parts of TDF, so this formality only covers the easy bit.</para>
|
|
|
103 |
|
|
|
104 |
<para>Another distinction between the TDF definition and language
|
|
|
105 |
syntactic description is that TDF is to some extent conscious of its
|
|
|
106 |
own SORTs so that it can specify a new construction of a given SORT.
|
|
|
107 |
The analogy in normal languages would be that one could define a new
|
|
|
108 |
construction with new syntax and say this is an example of an
|
|
|
109 |
<Expression>, for example; I don't know of any standard language
|
|
|
110 |
which permits this, although those of you with a historical bent might
|
|
|
111 |
remember Algol-N which made a valiant attempt at it. Of course, the
|
|
|
112 |
algebraic method of description makes it much easier to specify,
|
|
|
113 |
rather than having to give syntax to provide the syntax for the new
|
|
|
114 |
construction in a language.</para>
|
|
|
115 |
|
|
|
116 |
<sect2>
|
|
|
117 |
<title>Token applications and first-class SORTs</title>
|
|
|
118 |
|
|
|
119 |
<para>A new construction is introduced by the SORT TOKEN; the
|
|
|
120 |
constructors involving TOKENs allow one to give an expansion for the
|
|
|
121 |
TOKEN in terms of other pieces of TDF, possibly including
|
|
|
122 |
parameters. We can encapsulate a (possibly parameterised) fragment
|
|
|
123 |
of TDF of a suitable SORT by giving it a TOKEN as identification.
|
|
|
124 |
Not all of the SORTs are available for this kind of encapsulation
|
|
|
125 |
- only those which have a SORTNAME constructor (from access to
|
|
|
126 |
variety). These are the "first-class" SORTs given in table 1. Each
|
|
|
127 |
of these have an appropriate _apply_token constructor (e.g.
|
|
|
128 |
exp_apply_token) give the expansion. Most of these also have _cond
|
|
|
129 |
constructors (e.g.see exp_cond in
|
|
|
130 |
<link linkend="cond-constructors">section 9.1</link>)
|
|
|
131 |
which allows translate time conditional expansion of the
|
|
|
132 |
SORT.</para>
|
|
|
133 |
|
|
|
134 |
<mediaobject>
|
|
|
135 |
<imageobject>
|
|
|
136 |
<imagedata fileref="images/guidetable3.png" format="PNG"/>
|
|
|
137 |
</imageobject>
|
|
|
138 |
<textobject>
|
|
|
139 |
<phrase>TBD</phrase>
|
|
|
140 |
</textobject>
|
|
|
141 |
<caption>
|
|
|
142 |
<para>TBD</para>
|
|
|
143 |
</caption>
|
|
|
144 |
</mediaobject>
|
|
|
145 |
|
|
|
146 |
<para>Every TOKEN has a result SORT, i.e. the SORT of its resulting
|
|
|
147 |
expansion and before it can be expanded, one must have its parameter
|
|
|
148 |
SORTs. Thus, you can regard a TOKEN as having a type defined by its
|
|
|
149 |
result and parameter SORTs and the _apply_token as the operator
|
|
|
150 |
which expands the encapsulation and substitutes the
|
|
|
151 |
parameters.</para>
|
|
|
152 |
|
|
|
153 |
<para>However, if we look at the signature of exp_apply_token:</para>
|
|
|
154 |
|
|
|
155 |
<programlisting>
|
|
|
156 |
token_value: TOKEN
|
|
|
157 |
token_args: BITSTREAM param_sorts(token_value)
|
|
|
158 |
-> EXP x
|
|
|
159 |
</programlisting>
|
|
|
160 |
|
|
|
161 |
<para>we are confronted by the mysterious BITSTREAM where one might
|
|
|
162 |
expect to find the actual parameters of the TOKEN.</para>
|
|
|
163 |
|
|
|
164 |
<para>To explain BITSTREAMs requires a diversion into the bit-encoding
|
|
|
165 |
of TDF. Constructors for a particular SORT are represented in a
|
|
|
166 |
number of bits depending on the number of constructors for that
|
|
|
167 |
SORT; the context will determine the SORT required, so no more bits
|
|
|
168 |
are required. Thus since there is only one constructor for UNITs, no
|
|
|
169 |
bits are required to represent make_unit; there are about 120
|
|
|
170 |
different constructors for EXPs so 7 bits are required to cover all
|
|
|
171 |
the EXPs. The parameters of each constructor have known SORTs and so
|
|
|
172 |
their representations are just concatenated after the representation
|
|
|
173 |
of the constructor.
|
|
|
174 |
|
|
|
175 |
<footnote><para>There are facilities to allow extensions to the
|
|
|
176 |
number of constructors, so it is not quite as simple as
|
|
|
177 |
this.</para></footnote>
|
|
|
178 |
|
|
|
179 |
While this is a very compact representation, it suffers from the
|
|
|
180 |
defect that one must decode it even just to skip over it. This is
|
|
|
181 |
very irksome is some applications, notably the TDF linker which is
|
|
|
182 |
not interested detailed expansions. Similarly, in translators there
|
|
|
183 |
are places where one wishes to skip over a token application without
|
|
|
184 |
knowledge of the SORTs of its parameters. Thus a BITSTREAM is just
|
|
|
185 |
an encoding of some TDF, preceded by the number of bits it occupies.
|
|
|
186 |
Applications can then skip over BITSTREAMs trivially. Similar
|
|
|
187 |
considerations apply to BYTESTREAMs used elsewhere; here the
|
|
|
188 |
encoding is preceded by the number of bytes in the encoding and is
|
|
|
189 |
aligned to a byte boundary to allow fast copying.</para>
|
|
|
190 |
</sect2>
|
|
|
191 |
|
|
|
192 |
<sect2>
|
|
|
193 |
<title>Token definitions and declarations</title>
|
|
|
194 |
|
|
|
195 |
<para>Thus the <parameter>token_args</parameter> parameter of
|
|
|
196 |
exp_apply_token is just the BITSTREAM formed from the actual
|
|
|
197 |
parameters in the sequence described by the definition of the
|
|
|
198 |
<parameter>token_value</parameter> parameter. This will be given in
|
|
|
199 |
a TOKEN_DEFN somewhere with constructor token_definition:</para>
|
|
|
200 |
|
|
|
201 |
<programlisting>
|
|
|
202 |
result_sort: SORTNAME
|
|
|
203 |
tok_params: LIST(TOKFORMALS)
|
|
|
204 |
body: result_sort
|
|
|
205 |
-> TOKEN_DEFN
|
|
|
206 |
</programlisting>
|
|
|
207 |
|
|
|
208 |
<para>The <type>result_sort</type> is the SORT of the construction of
|
|
|
209 |
<type> body</type>; e.g. if <type>result_sort</type> is formed from
|
|
|
210 |
exp then <type>body</type> would be constructed using the EXP
|
|
|
211 |
constructors and one would use exp_apply_token to give the
|
|
|
212 |
expansion. The list <type>tok_params</type> gives the formal
|
|
|
213 |
parameters of the definition in terms of TOKFORMALS constructed
|
|
|
214 |
using make_tok_formals:</para>
|
|
|
215 |
|
|
|
216 |
<programlisting>
|
|
|
217 |
sn: SORTNAME
|
|
|
218 |
tk: TDFINT
|
|
|
219 |
-> TOKFORMALS
|
|
|
220 |
</programlisting>
|
|
|
221 |
|
|
|
222 |
<para>The TDFINT <type>tk</type> will be the integer representation of
|
|
|
223 |
the formal parameter expressed as a TOKEN whose result sort is
|
|
|
224 |
<type>sn</type> (see more about name representation in
|
|
|
225 |
<link linkend="make_capsule-and-name-spaces">section 3.1</link>). To
|
|
|
226 |
use the parameter in the body of the TOKEN_DEFN, one simply uses the
|
|
|
227 |
_apply_token appropriate to <type>sn</type>.Note that sn may be a
|
|
|
228 |
TOKEN but the <type>result_sort</type> may not.</para>
|
|
|
229 |
|
|
|
230 |
<para>Hence the BITSTREAM
|
|
|
231 |
<type>param_sorts</type>(<parameter>token_value</parameter>) in the
|
|
|
232 |
actual parameter of exp_apply_token above is simply formed by the
|
|
|
233 |
catenation of constructions of the SORTs given by the SORTNAMEs in
|
|
|
234 |
the <type>tok_params</type> of the TOKEN being expanded.</para>
|
|
|
235 |
|
|
|
236 |
<para>Usually one gives a name to a TOKEN_DEFN using to form a TOKDEF
|
|
|
237 |
using make_tokdef :</para>
|
|
|
238 |
|
|
|
239 |
<programlisting>
|
|
|
240 |
tok: TDFINT
|
|
|
241 |
signature: OPTION(STRING)
|
|
|
242 |
def: BITSTREAM TOKEN_DEFN
|
|
|
243 |
-> TOKDEF
|
|
|
244 |
</programlisting>
|
|
|
245 |
|
|
|
246 |
<para>Here, <type>tok</type> gives the name that will be used to
|
|
|
247 |
identify the TOKEN whose expansion is given by <type>def</type>. Any
|
|
|
248 |
use of this TOKEN (e.g. in exp_apply_token) will be given by
|
|
|
249 |
make_token(<type>tok</type>). Once again, a BITSTREAM is used to
|
|
|
250 |
encapsulate the TOKEN_DEFN.</para>
|
|
|
251 |
|
|
|
252 |
<para>The significance of the signature parameter is discussed in
|
|
|
253 |
<link linkend="declaration-and-definition-signatures">section
|
|
|
254 |
3.2.2</link>.</para>
|
|
|
255 |
|
|
|
256 |
<para>Often, one wishes a token without giving its definition - the
|
|
|
257 |
definition could, for example, be platform-dependent. A TOKDEC
|
|
|
258 |
introduces such a token using make_tokdec:</para>
|
|
|
259 |
|
|
|
260 |
<programlisting>
|
|
|
261 |
tok: TDFINT
|
|
|
262 |
signature: OPTION(STRING)
|
|
|
263 |
s: SORTNAME
|
|
|
264 |
-> TOKDEC
|
|
|
265 |
</programlisting>
|
|
|
266 |
|
|
|
267 |
<para>Here the SORTNAME, <type>s</type>, is given by token:</para>
|
|
|
268 |
|
|
|
269 |
<programlisting>
|
|
|
270 |
result: SORTNAME
|
|
|
271 |
params: LIST(SORTNAME)
|
|
|
272 |
-> SORTNAME
|
|
|
273 |
</programlisting>
|
|
|
274 |
|
|
|
275 |
<para>which gives the result and parameter SORTs of
|
|
|
276 |
<type>tok</type>.</para>
|
|
|
277 |
|
|
|
278 |
<para>One can also use a TOKEN_DEFN in an anonymous fashion by giving
|
|
|
279 |
it as an actual parameter of a TOKEN which itself demands a TOKEN
|
|
|
280 |
parameter. To do this one simply uses use_tokdef:</para>
|
|
|
281 |
|
|
|
282 |
<programlisting>
|
|
|
283 |
tdef: BITSTREAM TOKEN_DEFN
|
|
|
284 |
-> TOKEN
|
|
|
285 |
</programlisting>
|
|
|
286 |
</sect2>
|
|
|
287 |
|
|
|
288 |
<sect2>
|
|
|
289 |
<title>A simple use of a TOKEN</title>
|
|
|
290 |
|
|
|
291 |
<para>The crucial use of TOKENs in TDF is to provide abstractions of
|
|
|
292 |
APIs (see <link linkend="tokens-and-apis">section 10</link>) but
|
|
|
293 |
they are also used as shorthand for commonly occurring
|
|
|
294 |
constructions. For example, given the TDF constructor plus,
|
|
|
295 |
mentioned above, we could define a plus with only two EXP parameters
|
|
|
296 |
more suitable to C by using the wrap constructor as the
|
|
|
297 |
ERROR_TREATMENT:</para>
|
|
|
298 |
|
|
|
299 |
<programlisting>
|
|
|
300 |
make_tokdef(C_plus, empty, token_definition(exp(), (make_tokformals(exp(), l),
|
|
|
301 |
make_tokformals(exp(), r)),
|
|
|
302 |
plus(wrap(), exp_apply_token(l, ()), exp_apply_token(r, ())))
|
|
|
303 |
</programlisting>
|
|
|
304 |
</sect2>
|
|
|
305 |
|
|
|
306 |
<sect2>
|
|
|
307 |
<title>Second class SORTs</title>
|
|
|
308 |
|
|
|
309 |
<para>Second class SORTs (given in table 2) cannot be
|
|
|
310 |
TOKENised. These are the "syntactic units" of TDF which the user
|
|
|
311 |
cannot extend; he can only produce them using the constructors
|
|
|
312 |
defined in core-TDF.</para>
|
|
|
313 |
|
|
|
314 |
<para>Some of these constructors are implicit. For example, there are
|
|
|
315 |
no explicit constructors for LIST or SLIST which are both used to
|
|
|
316 |
form lists of SORTs; their construction is simply part of the
|
|
|
317 |
encoding of TDF. However, it is forseen that LIST constructors would
|
|
|
318 |
be highly desireable and there will probably extensions to TDF to
|
|
|
319 |
promote LIST from a second-class SORT to a first-class one. This
|
|
|
320 |
will not apply to SLIST or to the other SORTs which have implicit
|
|
|
321 |
constructions. These include BITSTREAM, BYTESTREAM, TDFINT, TDFIDENT
|
|
|
322 |
and TDFSTRING.</para>
|
|
|
323 |
</sect2>
|
|
|
324 |
</sect1>
|
|
|
325 |
|
|
|
326 |
<sect1>
|
|
|
327 |
<title>CAPSULEs and UNITs</title>
|
|
|
328 |
|
|
|
329 |
<para>A CAPSULE is typically the result of a single compilation - one
|
|
|
330 |
could regard it as being the TDF analogue of a Unix .o file. Just as
|
|
|
331 |
with .o files, a set of CAPSULEs can be linked together to form
|
|
|
332 |
another. Similarly, a CAPSULE may be translated to make program for
|
|
|
333 |
some platform, provided certain conditions are met. One of these
|
|
|
334 |
conditions is obviously that a translator exists for the platform, but
|
|
|
335 |
there are others. They basically state that any names that are
|
|
|
336 |
undefined in the CAPSULE can be supplied by the system in which it is
|
|
|
337 |
to be run. For example, the translator could produce assembly code
|
|
|
338 |
with external identifiers which will be supplied by some system
|
|
|
339 |
library.</para>
|
|
|
340 |
|
|
|
341 |
<sect2 id="make_capsule-and-name-spaces">
|
|
|
342 |
<title>make_capsule and name-spaces</title>
|
|
|
343 |
|
|
|
344 |
<para>The only constructor for a CAPSULE is make_capsule. Its basic
|
|
|
345 |
function is to compose together UNITs which contain the declarations
|
|
|
346 |
and definitions of the program. The signature of make_capsule looks
|
|
|
347 |
rather daunting and is probable best represented graphically.</para>
|
|
|
348 |
|
|
|
349 |
<mediaobject>
|
|
|
350 |
<imageobject>
|
|
|
351 |
<imagedata fileref="images/guide2.png" format="PNG"/>
|
|
|
352 |
</imageobject>
|
|
|
353 |
<textobject>
|
|
|
354 |
<phrase>TBD</phrase>
|
|
|
355 |
</textobject>
|
|
|
356 |
<caption>
|
|
|
357 |
<para>TBD</para>
|
|
|
358 |
</caption>
|
|
|
359 |
</mediaobject>
|
|
|
360 |
|
|
|
361 |
<para>The diagram gives an example of a CAPSULE using the same
|
|
|
362 |
components as in the following text.</para>
|
|
|
363 |
|
|
|
364 |
<para>Each CAPSULE has its own name-space, distinct from all other
|
|
|
365 |
CAPSULEs' name-spaces and also from the name-spaces of its component
|
|
|
366 |
UNITs (see <link linkened="#11">section 3.1.2</link>). There are
|
|
|
367 |
several different kinds of names in TDF and each name-space is
|
|
|
368 |
further subdivided into one for each kind of name. The number of
|
|
|
369 |
different kinds of names is potentially unlimited but only three are
|
|
|
370 |
used in core-TDF, namely "tag", "token" and "al_tag". Those names in
|
|
|
371 |
a "tag" name-space generally correspond to identifiers in normal
|
|
|
372 |
programs and I shall use these as the paradigm for the properties of
|
|
|
373 |
them all.</para>
|
|
|
374 |
|
|
|
375 |
<para>The actual representations of a "tag" name in a given name-space
|
|
|
376 |
is an integer, described as SORT TDFINT. These integers are drawn
|
|
|
377 |
from a contiguous set starting from 0 up to some limit given by the
|
|
|
378 |
constructor which introduces the name-space. For CAPSULE
|
|
|
379 |
name-spaces, this is given by the <type>capsule_linking</type>
|
|
|
380 |
parameter of make_capsule:</para>
|
|
|
381 |
|
|
|
382 |
<programlisting>
|
|
|
383 |
capsule_linking: SLIST(CAPSULE_LINK)
|
|
|
384 |
</programlisting>
|
|
|
385 |
|
|
|
386 |
<para>In the most general case in core-TDF, there would be three
|
|
|
387 |
entries in the list introducing limits using make_capsule_link for
|
|
|
388 |
each of the "tag", "token" and "al_tag" name-spaces for the CAPSULE.
|
|
|
389 |
Thus if:</para>
|
|
|
390 |
|
|
|
391 |
<programlisting>
|
|
|
392 |
capsule_linking = (make_capsule_link("tag", 5),
|
|
|
393 |
make_capsule_link("token", 6),
|
|
|
394 |
make_capsule_link("al_tag", 7))
|
|
|
395 |
</programlisting>
|
|
|
396 |
|
|
|
397 |
<para>there are 5 CAPSULE "tag" names used within the CAPSULE, namely
|
|
|
398 |
0, 1, 2, 3 and 4; similarly there are 6 "token" names and 7 "al_tag"
|
|
|
399 |
names.</para>
|
|
|
400 |
|
|
|
401 |
<sect3 id="external-linkages">
|
|
|
402 |
<title>External linkages</title>
|
|
|
403 |
|
|
|
404 |
<para>The context of usage will always determine when and how an
|
|
|
405 |
integer is to be interpreted as a name in a particular name-space.
|
|
|
406 |
For example, a TAG in a UNIT is constructed by make_tag applied to
|
|
|
407 |
a TDFINT which will be interpreted as a name from that UNIT's
|
|
|
408 |
"tag" name-space. An integer representing a name in the CAPSULE
|
|
|
409 |
name-space would be found in a LINKEXTERN of the
|
|
|
410 |
<type>external_linkage</type> parameter of make_capsule.</para>
|
|
|
411 |
|
|
|
412 |
<programlisting>
|
|
|
413 |
external_linkage: SLIST(EXTERN_LINK)
|
|
|
414 |
</programlisting>
|
|
|
415 |
|
|
|
416 |
<para>Each EXTERN_LINK is itself formed from an SLIST of LINKEXTERNs
|
|
|
417 |
given by make_extern_link . The order of the EXTERN_LINKs
|
|
|
418 |
determines which name-space one is dealing with; they are in the
|
|
|
419 |
same order as given by the <type>extern_linkage</type> parameter.
|
|
|
420 |
Thus, with the <type>extern_linkage</type> given above, the first
|
|
|
421 |
EXTERN_LINK would deal with the "tag" name-space; Each of its
|
|
|
422 |
component LINKEXTERNs constructed by make_linkextern would be
|
|
|
423 |
identifying a tag number with some name external to the CAPSULE;
|
|
|
424 |
for example one might be:</para>
|
|
|
425 |
|
|
|
426 |
<programlisting>
|
|
|
427 |
make_linkextern(4, string_extern("printf"))
|
|
|
428 |
</programlisting>
|
|
|
429 |
|
|
|
430 |
<para>This would mean: identify the CAPSULE's "tag" 4 with an name
|
|
|
431 |
called "printf", external to the module. The name "printf" would
|
|
|
432 |
be used to linkage external to the CAPSULE; any name required
|
|
|
433 |
outside the CAPSULE would have to be linked like
|
|
|
434 |
this.</para></sect3>
|
|
|
435 |
|
|
|
436 |
<sect3 id="units">
|
|
|
437 |
<title>UNITs</title>
|
|
|
438 |
|
|
|
439 |
<para>This name "printf", of course, does not necessarily mean the C
|
|
|
440 |
procedure in the system library. This depends both on the system
|
|
|
441 |
context in which the CAPSULE is translated and also the meaning of
|
|
|
442 |
the CAPSULE "tag" name 4 given by the component UNITs of the
|
|
|
443 |
CAPSULE in the <type>groups</type> parameter of
|
|
|
444 |
make_capsule:</para>
|
|
|
445 |
|
|
|
446 |
<programlisting>
|
|
|
447 |
groups: SLIST(GROUP)
|
|
|
448 |
</programlisting>
|
|
|
449 |
|
|
|
450 |
<para>Each GROUP in the <type>groups</type> SLIST will be formed by
|
|
|
451 |
sets of UNITs of the same kind. Once again, there are a
|
|
|
452 |
potentially unlimited number of kinds of UNITs but core-TDF only
|
|
|
453 |
uses those named "tld","al_tagdefs", "tagdecs", "tagdefs",
|
|
|
454 |
"tokdecs" and "tokdefs".
|
|
|
455 |
|
|
|
456 |
<footnote><para>The "tld" UNITs gives usage information for names
|
|
|
457 |
to aid the linker, tld, to discover which namess have
|
|
|
458 |
definitions and some usage information. The C producer also
|
|
|
459 |
optionally constructs "diagnostics" UNITs (to give run-time
|
|
|
460 |
diagnostic information).</para></footnote>
|
|
|
461 |
|
|
|
462 |
These names will appear (in the same order as in
|
|
|
463 |
<type>groups</type>) in the <type>prop_names</type> parameter of
|
|
|
464 |
make_capsule, one for each kind of UNIT appearing in the
|
|
|
465 |
CAPSULE:</para>
|
|
|
466 |
|
|
|
467 |
<programlisting>
|
|
|
468 |
prop_names: SLIST(TDFIDENT)
|
|
|
469 |
</programlisting>
|
|
|
470 |
|
|
|
471 |
<para>Thus if:</para>
|
|
|
472 |
|
|
|
473 |
<programlisting>
|
|
|
474 |
prop_names = ("tagdecs", "tagdefs")
|
|
|
475 |
</programlisting>
|
|
|
476 |
|
|
|
477 |
<para>then, the first element of <type>groups</type> would contain
|
|
|
478 |
only "tagdecs" UNITs and and the second would contain only
|
|
|
479 |
"tagdefs" UNITs. A "tagdecs" UNIT contains things rather like a
|
|
|
480 |
set of global identifier declarations in C, while a "tagdefs" UNIT
|
|
|
481 |
is like a set of global definitions of identifiers.</para>
|
|
|
482 |
</sect3>
|
|
|
483 |
|
|
|
484 |
<sect3 id="make_unit">
|
|
|
485 |
<title>make_unit</title>
|
|
|
486 |
|
|
|
487 |
<para>Now we come to the construction of UNITs using make_unit, as
|
|
|
488 |
in the diagram below
|
|
|
489 |
|
|
|
490 |
<mediaobject>
|
|
|
491 |
<imageobject>
|
|
|
492 |
<imagedata fileref="images/guide1.png" format="PNG"/>
|
|
|
493 |
</imageobject>
|
|
|
494 |
<textobject>
|
|
|
495 |
<phrase>TBD</phrase>
|
|
|
496 |
</textobject>
|
|
|
497 |
<caption>
|
|
|
498 |
<para>TBD</para>
|
|
|
499 |
</caption>
|
|
|
500 |
</mediaobject>
|
|
|
501 |
</para>
|
|
|
502 |
|
|
|
503 |
<para>First we give the limits of the various name-spaces local to
|
|
|
504 |
the UNIT in the <type>local_vars</type> parameter:</para>
|
|
|
505 |
|
|
|
506 |
<programlisting>
|
|
|
507 |
local_vars: SLIST(TDFINT)
|
|
|
508 |
</programlisting>
|
|
|
509 |
|
|
|
510 |
<para>Just in the same way as with <type>external_linkage</type>,
|
|
|
511 |
the numbers in local_vars correspond (in the same order) to the
|
|
|
512 |
spaces indicated in <type>capsule_linking</type> in
|
|
|
513 |
<link linkend="make_capsule-and-name-spaces">section 3.1</link>.
|
|
|
514 |
With our example,the first element of <type>local_vars</type>
|
|
|
515 |
gives the number of "tag" names local to the UNIT, the second
|
|
|
516 |
gives the number of "token" names local to the UNIT etc. These
|
|
|
517 |
will include all the names used in the body of the UNIT. Each
|
|
|
518 |
declaration of a TAG, for example, will use a new number from the
|
|
|
519 |
"tag" name-space; there is no hiding or reuse of names within a
|
|
|
520 |
UNIT.</para>
|
|
|
521 |
</sect3>
|
|
|
522 |
|
|
|
523 |
<sect3 id="link">
|
|
|
524 |
<title>LINK</title>
|
|
|
525 |
|
|
|
526 |
<para>Connections between the CAPSULE name-spaces and the UNIT
|
|
|
527 |
name-spaces are made by LINKs in the <parameter>lks</parameter>
|
|
|
528 |
parameter of make_unit:</para>
|
|
|
529 |
|
|
|
530 |
<programlisting>
|
|
|
531 |
lks: SLIST(LINKS)
|
|
|
532 |
</programlisting>
|
|
|
533 |
|
|
|
534 |
<para>Once again, <type>lks</type> is effectively indexed by the
|
|
|
535 |
kind of name-space a. Each LINKS is an SLIST of LINKs each of
|
|
|
536 |
which which establish an identity between names in the CAPSULE
|
|
|
537 |
name-space and names in the UNIT name-space. Thus if the first
|
|
|
538 |
element of <type>lks</type> contains:</para>
|
|
|
539 |
|
|
|
540 |
<programlisting>
|
|
|
541 |
make_link(42, 4)
|
|
|
542 |
</programlisting>
|
|
|
543 |
|
|
|
544 |
<para>then, the UNIT "tag" 42 is identical to the CAPSULE "tag"
|
|
|
545 |
4.</para>
|
|
|
546 |
|
|
|
547 |
<para>Note that names from the CAPSULE name-space only arise in two
|
|
|
548 |
places, LINKs and LINK_EXTERNs. Every other use of names are
|
|
|
549 |
derived from some UNIT name-space.</para>
|
|
|
550 |
</sect3>
|
|
|
551 |
</sect2>
|
|
|
552 |
|
|
|
553 |
<sect2 id="definitions-and-declarations">
|
|
|
554 |
<title>Definitions and declarations</title>
|
|
|
555 |
|
|
|
556 |
<para>The encoding in the <type>properties</type>:BYTESTREAM parameter
|
|
|
557 |
of a UNIT is a PROPS, for which there are five constructors
|
|
|
558 |
corresponding to the kinds of UNITs in core-TDF, make_al_tagdefs,
|
|
|
559 |
make_tagdecs, make_tagdefs, make_tokdefs and make_tokdecs. Each of
|
|
|
560 |
these will declare or define names in the appropriate UNIT
|
|
|
561 |
name-space which can be used by make_link in the UNIT's
|
|
|
562 |
<type>lks</type> parameter as well as elsewhere in the
|
|
|
563 |
<type>properties</type> parameter. The distinction between
|
|
|
564 |
"declarations" and "definitions" is rather similar to C usage; a
|
|
|
565 |
declaration provides the "type" of a name, while a definition gives
|
|
|
566 |
its meaning. For tags, the "type" is the SORT SHAPE (see below). For
|
|
|
567 |
tokens, the "type" is a SORTNAME constructed from the SORTNAMEs of
|
|
|
568 |
the parameters and result of the TOKEN using token:
|
|
|
569 |
|
|
|
570 |
<programlisting>
|
|
|
571 |
params: LIST(SORTNAME)
|
|
|
572 |
result: SORTNAME
|
|
|
573 |
-> SORTNAME
|
|
|
574 |
</programlisting>
|
|
|
575 |
|
|
|
576 |
Taking make_tagdefs as a paradigm for PROPS, we have:
|
|
|
577 |
|
|
|
578 |
<programlisting>
|
|
|
579 |
no_labels: TDFINT
|
|
|
580 |
tds: SLIST(TAGDEF)
|
|
|
581 |
-> TAGDEF_PROPS
|
|
|
582 |
</programlisting>
|
|
|
583 |
|
|
|
584 |
The <type>no_labels</type> parameter introduces the size of yet
|
|
|
585 |
another name-space local to the PROPS, this time for the LABELs used
|
|
|
586 |
in the TAGDEFs. Each TAGDEF in <type>tds</type> will define a "tag"
|
|
|
587 |
name in the UNIT's name-space. The order of these TAGDEFs is
|
|
|
588 |
immaterial since the initialisations of the tags are values which
|
|
|
589 |
can be solved at translate time, load time or as unordered dynamic
|
|
|
590 |
initialisations.</para>
|
|
|
591 |
|
|
|
592 |
<para>There are three constructors for TAGDEFs, each with slightly
|
|
|
593 |
different properties. The simplest is make_id_tagdef:</para>
|
|
|
594 |
|
|
|
595 |
<programlisting>
|
|
|
596 |
t: TDFINT
|
|
|
597 |
signature: OPTION(STRING)
|
|
|
598 |
e: EXP x
|
|
|
599 |
-> TAGDEF
|
|
|
600 |
</programlisting>
|
|
|
601 |
|
|
|
602 |
<para>Here, <type>t</type> is the tag name and the evaluation of
|
|
|
603 |
<type>e</type> will be the value of SHAPE <type>x</type> of an
|
|
|
604 |
obtain_tag(<type>t</type>) in an EXP. Note that t is not a variable;
|
|
|
605 |
the value of obtain_tag(<type>t</type>) will be invariant. The
|
|
|
606 |
<type>signature</type> parameter gives a STRING (see
|
|
|
607 |
<link linkend="sort-string">section 3.2.3</link>) which may be used
|
|
|
608 |
as an name for the tag, external to TDF and also as a check
|
|
|
609 |
introduced by the producer that a tagdef and its corresponding
|
|
|
610 |
tagdec have the same notion of the language-specific type of the
|
|
|
611 |
tag.</para>
|
|
|
612 |
|
|
|
613 |
<para>The two other constructors for TAGDEF, make_var_tagdef and
|
|
|
614 |
common_tagdef both define variable tags and have the same
|
|
|
615 |
signature:</para>
|
|
|
616 |
|
|
|
617 |
<programlisting>
|
|
|
618 |
t: TDFINT
|
|
|
619 |
opt_access: OPTION(ACCESS)
|
|
|
620 |
signature: OPTION(STRING)
|
|
|
621 |
e: EXP x
|
|
|
622 |
-> TAGDEF
|
|
|
623 |
</programlisting>
|
|
|
624 |
|
|
|
625 |
<para>Once again <type>t</type> is tag name but now <type>e</type> is
|
|
|
626 |
initialisation of the variable <type>t</type>. A use of
|
|
|
627 |
obtain_tag(<type>t</type>) will give a pointer to the variable (of
|
|
|
628 |
SHAPE POINTER x), rather than its contents.
|
|
|
629 |
|
|
|
630 |
<footnote><para>There is a similar distinction between tags
|
|
|
631 |
introduced to be locals of a procedure using identify and variable
|
|
|
632 |
(see <link linkend="identify-variable">section 5.3.1</link>).</para>
|
|
|
633 |
</footnote>
|
|
|
634 |
|
|
|
635 |
There can only be one make_var_tagdef of a given tag in a program,
|
|
|
636 |
but there may be more than one common_tagdef, possibly with
|
|
|
637 |
different initialisations; however these initialisations must
|
|
|
638 |
overlap consistently just as in common blocks in FORTRAN.</para>
|
|
|
639 |
|
|
|
640 |
<para>The ACCESS parameter gives various properties required for the
|
|
|
641 |
tag being defined and is discussed in
|
|
|
642 |
<link linkend="sort-access">section 5.3.2</link>.</para>
|
|
|
643 |
|
|
|
644 |
<para>The initialisation EXPs of TAGDEFs will be evaluated before the
|
|
|
645 |
"main" program is started. An initialiation EXP must either be a
|
|
|
646 |
constant (in the sense of
|
|
|
647 |
<link linkend="constants">section 9</link>) or reduce to (either
|
|
|
648 |
directly or by token or _cond expansions) to an initial_value:
|
|
|
649 |
|
|
|
650 |
<programlisting>
|
|
|
651 |
init: EXP s
|
|
|
652 |
-> EXP s
|
|
|
653 |
</programlisting>
|
|
|
654 |
|
|
|
655 |
The translator will arrange that <type>init</type> will be evaluated
|
|
|
656 |
once only before any procedure application, other than those
|
|
|
657 |
themselves involved in initial_values, but after any constant
|
|
|
658 |
initialisations. The order of evaluation of different initial_values
|
|
|
659 |
is arbitrary.</para>
|
|
|
660 |
|
|
|
661 |
<sect3 id="scopes-and-linking">
|
|
|
662 |
<title>Scopes and linking</title>
|
|
|
663 |
|
|
|
664 |
<para>Only names introduced by AL_TAGDEFS, TAGDEFS, TAGDECs, TOKDECs
|
|
|
665 |
and TOKDEFs can be used in other UNITs (and then, only via the
|
|
|
666 |
<type>lks</type> parameters of the UNITs involved). You can regard
|
|
|
667 |
them as being similar to C global declarations. Token definitions
|
|
|
668 |
include their declarations implicitly; however this is not true of
|
|
|
669 |
tags. This means that any CAPSULE which uses or defines a tag
|
|
|
670 |
across UNITs must include a TAGDEC for that tag in its "tagdecs"
|
|
|
671 |
UNITs. A TAGDEC is constructed using either make_id_tagdec,
|
|
|
672 |
make_var_tagdec or common_tagdec, all with the same form:
|
|
|
673 |
|
|
|
674 |
<programlisting>
|
|
|
675 |
t_intro: TDFINT
|
|
|
676 |
acc: OPTION(ACCESS)
|
|
|
677 |
signature: OPTION(STRING)
|
|
|
678 |
x: SHAPE
|
|
|
679 |
-> TAGDEC
|
|
|
680 |
</programlisting>
|
|
|
681 |
|
|
|
682 |
Here the tagname is given by <type>t_intro</type>; the SHAPE
|
|
|
683 |
<type>x</type> will defined the space and alignment required for the
|
|
|
684 |
tag (this is analogous to the type in a C declaration). The
|
|
|
685 |
<type>acc</type> field will define certain properties of the tag not
|
|
|
686 |
implicit in its SHAPE; I shall return to the kinds of properties
|
|
|
687 |
envisaged in discussing local declarations in
|
|
|
688 |
<link linkend="defining-and-using-locals">section 5.3</link>.</para>
|
|
|
689 |
|
|
|
690 |
<para>Most program will appear in the "tagdefs" UNITs - they will
|
|
|
691 |
include the definitions of the procedures of the program which in
|
|
|
692 |
turn will include local definitions of tags for the locals of the
|
|
|
693 |
procedures.</para>
|
|
|
694 |
|
|
|
695 |
<para>The standard TDF linker allows one to link CAPSULEs together
|
|
|
696 |
using the name identifications given in the LINKEXTERNs, perhaps
|
|
|
697 |
hiding some of them in the final CAPSULE. It does this just by
|
|
|
698 |
generating a new CAPSULE name-space, grouping together component
|
|
|
699 |
UNITs of the same kind and replacing their
|
|
|
700 |
<parameter>lks</parameter> parameters with values derived from the
|
|
|
701 |
new CAPSULE name-space without changing the UNITs' name-spaces or
|
|
|
702 |
their <parameter>props</parameter> parameters. The operation of
|
|
|
703 |
grouping together UNITs is effectively assumed to be associative,
|
|
|
704 |
commutative and idempotent e.g. if the same tag is declared in two
|
|
|
705 |
capsules it is assumed to be the same thing . It also means that
|
|
|
706 |
there is no implied order of evaluation of UNITs or of their
|
|
|
707 |
component TAGDEFs</para>
|
|
|
708 |
|
|
|
709 |
<para>Different languages have different conventions for deciding how
|
|
|
710 |
programs are actually run. For example, C requires the presence of a
|
|
|
711 |
suitably defined "main" procedure; this is usually enforced by
|
|
|
712 |
requiring the system ld utility to bind the name "main" along with
|
|
|
713 |
the definitions of any library values required. Otherwise, the C
|
|
|
714 |
conventions are met by standard TDF linking. Other languages have
|
|
|
715 |
more stringent requirements. For example, C++ requires dynamic
|
|
|
716 |
initialisation of globals, using initial_value. As the only runnable
|
|
|
717 |
code in TDF is in procedures, C++ would probably require an
|
|
|
718 |
additional linking phase to construct a "main" procedure which calls
|
|
|
719 |
the initialisation procedures of each CAPSULE involved if the system
|
|
|
720 |
linker did not provide suitable C++ linking.</para>
|
|
|
721 |
</sect3>
|
|
|
722 |
|
|
|
723 |
<sect3 id="declaration-and-definition-signatures">
|
|
|
724 |
<title>Declaration and definition signatures</title>
|
|
|
725 |
|
|
|
726 |
<para>The <type>signature</type> arguments of TAGDEFs and TAGDECs
|
|
|
727 |
are designed to allow a measure of cross-UNIT checking when
|
|
|
728 |
linking independently compiled CAPSULEs. Suppose that we have a
|
|
|
729 |
tag, <type>t</type>, used in one CAPSULE and defined in another;
|
|
|
730 |
the first CAPSULE would have to have a TAGDEC for <type>t</type>
|
|
|
731 |
whose TAGDEF is in the second. The <type>signature</type> STRING
|
|
|
732 |
of both could be arranged to represent the language-specific type
|
|
|
733 |
of <type>t</type> as understood at compilation-time. Clearly, when
|
|
|
734 |
the CAPSULEs are linked the types must be identical and hence
|
|
|
735 |
their STRING representation must be the same - a translator will
|
|
|
736 |
reject any attempt to link definitions and declarations of the
|
|
|
737 |
same object with different signatures.</para>
|
|
|
738 |
|
|
|
739 |
<para>Similar considerations apply to TOKDEFs and TOKDECs; the
|
|
|
740 |
"type" of a TOKEN may not have any familiar analogue in most HLLs,
|
|
|
741 |
but the principle remains the same.</para>
|
|
|
742 |
</sect3>
|
|
|
743 |
|
|
|
744 |
<sect3 id="sort-string">
|
|
|
745 |
<title>STRING</title>
|
|
|
746 |
|
|
|
747 |
<para>The SORT STRING is used in various constructs other than
|
|
|
748 |
declarations and definitions. It is a first-class SORT with
|
|
|
749 |
string_apply_token and string_cond. A primitive STRING is
|
|
|
750 |
constructed from a TDFSTRING(k,n) which is an encoding of n
|
|
|
751 |
integers,each of k bits, using make_string:
|
|
|
752 |
|
|
|
753 |
<programlisting>
|
|
|
754 |
arg: TDFSTRING(k, n)
|
|
|
755 |
-> STRING(k, n)
|
|
|
756 |
</programlisting>
|
|
|
757 |
|
|
|
758 |
STRINGs may be concatenated using concat_string:
|
|
|
759 |
|
|
|
760 |
<programlisting>
|
|
|
761 |
arg1: STRING(k, n)
|
|
|
762 |
arg2: STRING(k,m)
|
|
|
763 |
-> STRING(k, n + m)
|
|
|
764 |
</programlisting>
|
|
|
765 |
|
|
|
766 |
Being able to compose strings, including token applications etc,
|
|
|
767 |
means that late-binding is possible in <type>signature</type>
|
|
|
768 |
checking in definitions and declarations. This late-binding means
|
|
|
769 |
that the representation of platform-dependent HLL types need only
|
|
|
770 |
be fully expanded at install-time and hence the types could be
|
|
|
771 |
expressed in their representational form on the specific
|
|
|
772 |
platform.</para>
|
|
|
773 |
</sect3>
|
|
|
774 |
</sect2>
|
|
|
775 |
</sect1>
|
|
|
776 |
|
|
|
777 |
<sect1>
|
|
|
778 |
<title>SHAPEs, ALIGNMENTs and OFFSETs.</title>
|
|
|
779 |
|
|
|
780 |
<para>In most languages there is some notion of the type of a value.
|
|
|
781 |
This is often an uncomfortable mix of a definition of a representation
|
|
|
782 |
for the value and a means of choosing which operators are applicable
|
|
|
783 |
to the value. The TDF analogue of the type of value is its SHAPE
|
|
|
784 |
(S3.20). A SHAPE is only concerned with the representation of a value,
|
|
|
785 |
being an abstraction of its size and alignment properties. Clearly an
|
|
|
786 |
architecture-independent representation of a program cannot say, for
|
|
|
787 |
example, that a pointer is 32 bits long; the size of pointers has to
|
|
|
788 |
be abstracted so that translations to particular architectures can
|
|
|
789 |
choose the size that is apposite for the platform.</para>
|
|
|
790 |
|
|
|
791 |
<sect2 id="S19">
|
|
|
792 |
<title>Shapes</title>
|
|
|
793 |
|
|
|
794 |
<para>There are ten different basic constructors for the SORT SHAPE
|
|
|
795 |
from bitfield to top as shown in table 3. SHAPEs arising from those
|
|
|
796 |
constructors are used as qualifiers (just using an upper case
|
|
|
797 |
version of the constructor name) to various SORTs in the definition;
|
|
|
798 |
for example, EXP TOP is an expression with top SHAPE. This is just
|
|
|
799 |
used for definitional purposes only; there is no SORT SHAPENAME as
|
|
|
800 |
one has SORTNAME.</para>
|
|
|
801 |
|
|
|
802 |
<para>In the TDF specification of EXPs, you will observe that all EXPs
|
|
|
803 |
in constructor signatures are all qualified by the SHAPE name; for
|
|
|
804 |
example, a parameter might be EXP INTEGER(v). This merely means that
|
|
|
805 |
for the construct to be meaningful the parameter must be derived
|
|
|
806 |
from a constructor defined to be an EXP INTEGER(v). You might be
|
|
|
807 |
forgiven for assuming that TDF is hence strongly-typed by its
|
|
|
808 |
SHAPEs. This is not true; the producer must get it right. There are
|
|
|
809 |
some checks in translators, but these are not exhaustive and are
|
|
|
810 |
more for the benefit of translator writers than for the user. A tool
|
|
|
811 |
for testing the SHAPE correctness of a TDF program would be useful
|
|
|
812 |
but has yet to be written.</para>
|
|
|
813 |
|
|
|
814 |
<sect3 id="S20">
|
|
|
815 |
<title>TOP, BOTTOM, LUB</title>
|
|
|
816 |
|
|
|
817 |
<para>Two of the SHAPE constructions are rather specialised; these
|
|
|
818 |
are TOP and BOTTOM. The result of any expression with a TOP shape
|
|
|
819 |
will always be discarded; examples are those produced by assign
|
|
|
820 |
and integer_test . A BOTTOM SHAPE is produced by an expression
|
|
|
821 |
which will leave the current flow of control e.g. goto . The
|
|
|
822 |
significance of these SHAPEs only really impinges on the
|
|
|
823 |
computation of the shapes of constructs which have alternative
|
|
|
824 |
expressions as results. For example, the result of conditional is
|
|
|
825 |
the result of one of its component expressions. In this case, the
|
|
|
826 |
SHAPE of the result is described as the LUB of the SHAPEs of the
|
|
|
827 |
components. This simply means that if one of the component SHAPEs
|
|
|
828 |
is TOP then the resulting SHAPE is TOP; if one is BOTTOM then the
|
|
|
829 |
resulting SHAPE is the SHAPE of the other; otherwise both
|
|
|
830 |
component SHAPEs must be equal and is the resulting SHAPE. Since
|
|
|
831 |
this operation is associative, commutative and idempotent, we can
|
|
|
832 |
speak quite unambiguously of the LUB of several SHAPEs.</para>
|
|
|
833 |
</sect3>
|
|
|
834 |
|
|
|
835 |
<sect3 id="S21">
|
|
|
836 |
<title>INTEGER</title>
|
|
|
837 |
|
|
|
838 |
<para>Integer values in TDF have shape INTEGER(v) where v is of SORT
|
|
|
839 |
VARIETY. The constructor for this SHAPE is integer with a VARIETY
|
|
|
840 |
parameter. The basic constructor for VARIETY is var_limits which
|
|
|
841 |
has a pair of signed natural numbers as parameters giving the
|
|
|
842 |
limits of possible values that the integer can attain. The SHAPE
|
|
|
843 |
required for a 32 bit signed integer would be:
|
|
|
844 |
|
|
|
845 |
<programlisting>
|
|
|
846 |
integer(var_limits(-2<superscript>31</superscript>, 2<superscript>31</superscript> - 1))
|
|
|
847 |
</programlisting>
|
|
|
848 |
|
|
|
849 |
while an unsigned char is:
|
|
|
850 |
|
|
|
851 |
<programlisting>
|
|
|
852 |
integer(var_limits(0, 255))
|
|
|
853 |
</programlisting>
|
|
|
854 |
|
|
|
855 |
A translator should represent each integer variety by an object
|
|
|
856 |
big enough (or bigger) to contain all the possible values with
|
|
|
857 |
limits of the VARIETY. That being said, I must confess that most
|
|
|
858 |
current translators do not handle integers of more than the
|
|
|
859 |
maximum given naturally by the target architecture, but this will
|
|
|
860 |
be rectified in due course.
|
|
|
861 |
|
|
|
862 |
<para>The other way of constructing a VARIETY is to specify the
|
|
|
863 |
number of bits required for its 2s-complemennt representation
|
|
|
864 |
using var_width:</para>
|
|
|
865 |
|
|
|
866 |
<programlisting>
|
|
|
867 |
signed_width: BOOL
|
|
|
868 |
width: NAT
|
|
|
869 |
-> VARIETY
|
|
|
870 |
</programlisting>
|
|
|
871 |
</para>
|
|
|
872 |
</sect3>
|
|
|
873 |
|
|
|
874 |
<sect3 id="S22">
|
|
|
875 |
<title>FLOATING and complex</title>
|
|
|
876 |
|
|
|
877 |
<para>Similarly, floating point and complex numbers have shape
|
|
|
878 |
FLOATING qualified by a FLOATING_VARIETY.</para>
|
|
|
879 |
|
|
|
880 |
<para>A FLOATING_VARIETY for a real number is constructed using
|
|
|
881 |
fvar_parms:</para>
|
|
|
882 |
|
|
|
883 |
<programlisting>
|
|
|
884 |
base: NAT
|
|
|
885 |
mantissa_digits: NAT
|
|
|
886 |
minimum_exponent: NAT
|
|
|
887 |
maximum_exponent:: NAT
|
|
|
888 |
-> FLOATING_VARIETY
|
|
|
889 |
</programlisting>
|
|
|
890 |
|
|
|
891 |
<para>A FLOATING_VARIETY specifies the base, number of mantissa
|
|
|
892 |
digits, and maximum and minimum exponent. Once again, it is
|
|
|
893 |
intended that the translator will choose a representation which
|
|
|
894 |
will contain all possible values, but in practice only those which
|
|
|
895 |
are included in IEEE float, double and extended are actually
|
|
|
896 |
implemented.</para>
|
|
|
897 |
|
|
|
898 |
<para>Complex numbers have a floating variety constructed by
|
|
|
899 |
complex_parms which has the the same signature as fvar_parms. The
|
|
|
900 |
representation of these numbers is likely to be a pair of real
|
|
|
901 |
numbers each defined as if by fvar_parms with the same arguments.
|
|
|
902 |
The real and imaginary parts of of a complex number can be
|
|
|
903 |
extracted using real_part and imaginary_part; these could have
|
|
|
904 |
been injected ito the complex number using make_complex or any of
|
|
|
905 |
the complex operations. Many translators will simply transform
|
|
|
906 |
complex numbers into COMPOUNDs consisting of two floating point
|
|
|
907 |
numbers, transforming the complex operations into floating point
|
|
|
908 |
operations on the fields.</para>
|
|
|
909 |
</sect3>
|
|
|
910 |
|
|
|
911 |
<sect3 id="S23">
|
|
|
912 |
<title>BITFIELD</title>
|
|
|
913 |
|
|
|
914 |
<para>A number of contiguous bits have shape BITFIELD, qualified by
|
|
|
915 |
a BITFIELD_VARIETY (S3.4) which gives the number of bits involved
|
|
|
916 |
and whether these bits are to be treated as signed or unsigned
|
|
|
917 |
integers. Current translators put a maximum of 32 or 64 on the
|
|
|
918 |
number of bits.</para>
|
|
|
919 |
</sect3>
|
|
|
920 |
|
|
|
921 |
<sect3 id="S24">
|
|
|
922 |
<title>PROC</title>
|
|
|
923 |
|
|
|
924 |
<para>The representational SHAPEs of procedure values is given by
|
|
|
925 |
PROC with constructor proc . I shall return to this in the
|
|
|
926 |
description of the operations which use it.</para>
|
|
|
927 |
</sect3>
|
|
|
928 |
|
|
|
929 |
<sect3 id="S25">
|
|
|
930 |
<title>Non-primitive SHAPEs</title>
|
|
|
931 |
|
|
|
932 |
<para>The construction of the other four SHAPEs involves either
|
|
|
933 |
existing SHAPEs or the alignments of existing SHAPEs. These are
|
|
|
934 |
constructed by compound, nof, offset and pointer. Before
|
|
|
935 |
describing these, we require a digression into what is meant by
|
|
|
936 |
alignments and offsets.</para>
|
|
|
937 |
</sect3>
|
|
|
938 |
</sect2>
|
|
|
939 |
|
|
|
940 |
<sect2>
|
|
|
941 |
<title>Alignments</title>
|
|
|
942 |
|
|
|
943 |
<para>In most processor architectures there are limitations on how one
|
|
|
944 |
can address particular kinds of objects in convenient ways. These
|
|
|
945 |
limitations are usually defined as part of the ABI for the
|
|
|
946 |
processor. For example, in the MIPs processor the fastest way to
|
|
|
947 |
access a 32-bit integer is to ensure that the address of the integer
|
|
|
948 |
is aligned on a 4-byte boundary in the address space; obviously one
|
|
|
949 |
can extract a mis-aligned integer but not in one machine
|
|
|
950 |
instruction. Similarly, 16-bit integers should be aligned on a
|
|
|
951 |
2-byte boundary. In principle, each primitive object could have
|
|
|
952 |
similar restrictions for efficient access and these restrictions
|
|
|
953 |
could vary from platform to platform. Hence, the notion of alignment
|
|
|
954 |
has to be abstracted to form part of the architecture independent
|
|
|
955 |
TDF - we cannot assume that any particular alignment regime will
|
|
|
956 |
hold universally.</para>
|
|
|
957 |
|
|
|
958 |
<para>The abstraction of alignments clearly has to cover compound
|
|
|
959 |
objects as well as primitive ones like integers. For example, if a
|
|
|
960 |
field of structure in C is to be accessed efficiently, then the
|
|
|
961 |
alignment of the field will influence the alignment of the structure
|
|
|
962 |
as whole; the structure itself could be a component of a larger
|
|
|
963 |
object whose alignment must then depend on the alignment of the
|
|
|
964 |
structure and so on. In general, we find that a compound alignment
|
|
|
965 |
is given by the maximum alignment of its components, regardless of
|
|
|
966 |
the form of the compound object e.g. whether it is a structure,
|
|
|
967 |
union, array or whatever.</para>
|
|
|
968 |
|
|
|
969 |
<para>This gives an immediate handle on the abstraction of the
|
|
|
970 |
alignment of a compound object - it is just the set of abstractions
|
|
|
971 |
of the alignments of its components. Since "maximum" is associative,
|
|
|
972 |
commutative and idempotent, the component sets can be combined using
|
|
|
973 |
normal set-union rules. In other words, a compound alignment is
|
|
|
974 |
abstracted as the set of alignments of the primitive objects which
|
|
|
975 |
make up the compound object. Thus the alignment abstraction of a C
|
|
|
976 |
structure with only float fields is the singleton set containing the
|
|
|
977 |
alignment of a float while that of a C union of an int and this
|
|
|
978 |
structure is a pair of the alignments of an int and a float.</para>
|
|
|
979 |
|
|
|
980 |
<sect3 id="alignement-constructors">
|
|
|
981 |
<title>ALIGNMENT constructors</title>
|
|
|
982 |
|
|
|
983 |
<para>The TDF abstraction of an alignment has SORT ALIGNMENT. The
|
|
|
984 |
constructor, unite_alignments, gives the set-union of its
|
|
|
985 |
ALIGNMENT parameters; this would correspond to taking a maximum of
|
|
|
986 |
two real alignments in the translator.</para>
|
|
|
987 |
|
|
|
988 |
<para>The constructor, alignment, gives the ALIGNMENT of a given
|
|
|
989 |
SHAPE according to the rules given in the definition. These rules
|
|
|
990 |
effectively <type>define</type> the primitive ALIGNMENTs as in the
|
|
|
991 |
ALIGNMENT column of table 3. Those for PROC, all OFFSETs and all
|
|
|
992 |
POINTERs are constants regardless of any SHAPE qualifiers. Each of
|
|
|
993 |
the INTERGER VARIETYs, each of the FLOATING VARIETYs and each of
|
|
|
994 |
the BITFIELD VARIETYs have their own ALIGNMENTs. These ALIGNMENTs
|
|
|
995 |
will be bound to values apposite to the particular platform at
|
|
|
996 |
translate-time. The ALIGNMENT of TOP is conventionally taken to be
|
|
|
997 |
the empty set of ALIGNMENTs (corresponding to the minimum
|
|
|
998 |
alignment on the platform).</para>
|
|
|
999 |
|
|
|
1000 |
<para>The alignment of a procedure parameter clearly has to include
|
|
|
1001 |
the alignment of its SHAPE; however, most ABIs will mandate a
|
|
|
1002 |
greater alignment for some SHAPEs e.g. the alignment of a byte
|
|
|
1003 |
parameter is usually defined to be on a 32-bit rather than an
|
|
|
1004 |
8-bit boundary. The constructor, parameter_alignment, gives the
|
|
|
1005 |
ALIGNMENT of a parameter of given SHAPE.</para>
|
|
|
1006 |
</sect3>
|
|
|
1007 |
|
|
|
1008 |
<sect3 id="S28">
|
|
|
1009 |
<title>Special alignments</title>
|
|
|
1010 |
|
|
|
1011 |
<para>There are several other special ALIGNMENTs.</para>
|
|
|
1012 |
|
|
|
1013 |
<para>The alignment of a code address is {<type>code</type>} given
|
|
|
1014 |
by code_alignment; this will be the alignment of a pointer given
|
|
|
1015 |
by make_local_lv giving the value of a label.</para>
|
|
|
1016 |
|
|
|
1017 |
<para>The other special ALIGNMENTs are considered to include all of
|
|
|
1018 |
the others, but remain distinct. They are all concerned with
|
|
|
1019 |
offsets and pointers relevant to procedure frames, procedure
|
|
|
1020 |
parameters and local allocations and are collectively known as
|
|
|
1021 |
frame alignments. These frame alignments differ from the normal
|
|
|
1022 |
alignments in that their mapping to a given architecture is rather
|
|
|
1023 |
more than just saying that it describes some n-bit boundary. For
|
|
|
1024 |
example, alloca_alignment describes the alignment of dynamic space
|
|
|
1025 |
produced by local_alloc (roughly the C alloca). Now, an ABI could
|
|
|
1026 |
specify that the alloca space is a stack disjoint from the normal
|
|
|
1027 |
procedure stack; thus manipulations of space at alloca_alignment
|
|
|
1028 |
may involve different code to space generated in other
|
|
|
1029 |
ways.</para>
|
|
|
1030 |
|
|
|
1031 |
<para>Similar considerations apply to the other special alignments,
|
|
|
1032 |
callees_alignment(b), callers_alignment(b) and locals_alignment.
|
|
|
1033 |
The first two give the alignments of the bases of the two
|
|
|
1034 |
different parameter spaces in procedures (q.v.) and
|
|
|
1035 |
locals_alignment gives the alignment of the base of locally
|
|
|
1036 |
declared tags within a procedure. The exact interpretation of
|
|
|
1037 |
these depends on how the frame stack is defined in the target ABI,
|
|
|
1038 |
e.g. does the stack grow downwards or upwards?</para>
|
|
|
1039 |
|
|
|
1040 |
<para>The final special alignment is var_param_alignment. This
|
|
|
1041 |
describes the alignment of a special kind of parameter to a
|
|
|
1042 |
procedure which can be of arbitrary length (see
|
|
|
1043 |
<link linkend="vartag-varparam">section 5.1.1</link>).</para>
|
|
|
1044 |
</sect3>
|
|
|
1045 |
|
|
|
1046 |
<sect3 id="al_tag-make_al_tagdef">
|
|
|
1047 |
<title>AL_TAG, make_al_tagdef</title>
|
|
|
1048 |
|
|
|
1049 |
<para>Alignments can also be named as AL_TAGs using make_al_tagdef.
|
|
|
1050 |
There is no corresponding make_al_tagdec since AL_TAGs are
|
|
|
1051 |
implicitly declared by their constructor, make_al_tag. The main
|
|
|
1052 |
reason for having names for alignments is to allow one to resolve
|
|
|
1053 |
the ALIGNMENTs of recursive data structures. If, for example, we
|
|
|
1054 |
have mutually recursive structures, their ALIGNMENTs are best
|
|
|
1055 |
named and given as a set of equations formed by AL_TAGDEFs. A
|
|
|
1056 |
translator can then solve these equations trivially by
|
|
|
1057 |
substitution; this is easy because the only significant operation
|
|
|
1058 |
is set-union.</para>
|
|
|
1059 |
</sect3>
|
|
|
1060 |
</sect2>
|
|
|
1061 |
|
|
|
1062 |
<sect2 id="pointer-and-offset-shapes">
|
|
|
1063 |
<title>Pointer and offset SHAPEs</title>
|
|
|
1064 |
|
|
|
1065 |
<para>A pointer value must have a form which reflects the alignment of
|
|
|
1066 |
the object that it points to; for example, in the MIPs processor,
|
|
|
1067 |
the bottom two bits of a pointer to an integer must be zero. The TDF
|
|
|
1068 |
SHAPE for a pointer is POINTER qualified by the ALIGNMENT of the
|
|
|
1069 |
object pointed to. The constructor pointer uses this alignment to
|
|
|
1070 |
make a POINTER SHAPE.</para>
|
|
|
1071 |
|
|
|
1072 |
<sect3 id="offset">
|
|
|
1073 |
<title>OFFSET</title>
|
|
|
1074 |
|
|
|
1075 |
<para>Expressions which give sizes or offsets in TDF have an OFFSET
|
|
|
1076 |
SHAPE. These are always described as the difference between two
|
|
|
1077 |
pointers. Since the alignments of the objects pointed to could be
|
|
|
1078 |
different, an OFFSET is qualified by these two ALIGNMENTs. Thus an
|
|
|
1079 |
EXP OFFSET(X,Y) is the difference between an EXP POINTER(X) and an
|
|
|
1080 |
EXP POINTER(Y). In order for the alignment rules to apply, the set
|
|
|
1081 |
X of alignments must include Y. The constructor offset uses two
|
|
|
1082 |
such alignments to make an OFFSET SHAPE. However, many instances
|
|
|
1083 |
of offsets will be produced implicitly by the offset arithmetic,
|
|
|
1084 |
e.g., offset_pad:
|
|
|
1085 |
|
|
|
1086 |
<programlisting>
|
|
|
1087 |
a: ALIGNMENT
|
|
|
1088 |
arg1: EXP OFFSET(z, t)
|
|
|
1089 |
-> EXP OFFSET(z xc8 a, a)
|
|
|
1090 |
</programlisting>
|
|
|
1091 |
|
|
|
1092 |
This gives the next OFFSET greater or equal to <type>arg1</type>
|
|
|
1093 |
at which an object of ALIGNMENT <parameter>a</parameter> can be
|
|
|
1094 |
placed. It should be noted that the calculation of shapes and
|
|
|
1095 |
alignments are all translate-time activities; only EXPs should
|
|
|
1096 |
produce runnable code. This code, of course, may depend on the
|
|
|
1097 |
shapes and alignments involved; for example, offset_pad might
|
|
|
1098 |
round up <type>arg1</type> to be a multiple of four bytes if
|
|
|
1099 |
<parameter>a</parameter> was an integer ALIGNMENT and
|
|
|
1100 |
<parameter>z</parameter> was a character ALIGNMENT. Translators
|
|
|
1101 |
also do extensive constant analysis, so if <type>arg1</type> was a
|
|
|
1102 |
constant offset, then the round-off would be done at
|
|
|
1103 |
translate-time to produce another constant.</para>
|
|
|
1104 |
</sect3>
|
|
|
1105 |
</sect2>
|
|
|
1106 |
|
|
|
1107 |
<sect2>
|
|
|
1108 |
<title>Compound SHAPEs</title>
|
|
|
1109 |
|
|
|
1110 |
<para>The alignments of compound SHAPEs (i.e. those arising from the
|
|
|
1111 |
constructors compound and nof) are derived from the constructions
|
|
|
1112 |
which produced the SHAPE. To take the easy one first, the
|
|
|
1113 |
constructor nof has signature:
|
|
|
1114 |
|
|
|
1115 |
<programlisting>
|
|
|
1116 |
n: NAT
|
|
|
1117 |
s: SHAPE
|
|
|
1118 |
-> SHAPE
|
|
|
1119 |
</programlisting>
|
|
|
1120 |
|
|
|
1121 |
This SHAPE describes an array of <type>n</type> values all of SHAPE
|
|
|
1122 |
<type>s</type>; note that <type>n</type> is a natural number and
|
|
|
1123 |
hence is a constant known to the producer. Throughout the definition
|
|
|
1124 |
this is referred to as the SHAPE NOF(n, s). The ALIGNMENT of such a
|
|
|
1125 |
value is alignment(s); i.e. the alignment of an array is just the
|
|
|
1126 |
alignment of its elements.</para>
|
|
|
1127 |
|
|
|
1128 |
<para>The other compound SHAPEs are produced using compound:</para>
|
|
|
1129 |
|
|
|
1130 |
<programlisting>
|
|
|
1131 |
sz: EXP OFFSET(x, y)
|
|
|
1132 |
-> SHAPE
|
|
|
1133 |
</programlisting>
|
|
|
1134 |
|
|
|
1135 |
<para>The <type>sz</type> parameter gives the minimum size which can
|
|
|
1136 |
accommodate the SHAPE.</para>
|
|
|
1137 |
|
|
|
1138 |
<sect3 id="offset-arithmetic">
|
|
|
1139 |
<title>Offset arithmetic with compound shapes</title>
|
|
|
1140 |
|
|
|
1141 |
<para>The constructors offset_add, offset_zero and shape_offset are
|
|
|
1142 |
used together with offset_pad to implement (<emphasis>inter
|
|
|
1143 |
alia</emphasis>) selection from structures represented by
|
|
|
1144 |
COMPOUND SHAPEs. Starting from the zero OFFSET given by
|
|
|
1145 |
offset_zero, one can construct an EXP which is the offset of a
|
|
|
1146 |
field by padding and adding offsets until the required field is
|
|
|
1147 |
reached. The value of the field required could then be extracted
|
|
|
1148 |
using component or add_to_ptr. Most producers would define a TOKEN
|
|
|
1149 |
for the EXP OFFSET of each field of a structure or union used in
|
|
|
1150 |
the program simply to reduce the size of the TDF.</para>
|
|
|
1151 |
|
|
|
1152 |
<para>The SHAPE of a C structure consisting of an char followed by
|
|
|
1153 |
an int would require <varname>x</varname> to be the set consisting
|
|
|
1154 |
of two INTEGER VARIETYs, one for int and one for char, and
|
|
|
1155 |
<type>sz</type> would probably have been constructed like:</para>
|
|
|
1156 |
|
|
|
1157 |
<programlisting>
|
|
|
1158 |
sz = offset_add(offset_pad(int_al, shape_offset(char)), shape_offset(int))
|
|
|
1159 |
</programlisting>
|
|
|
1160 |
|
|
|
1161 |
<para>The various rules for the ALIGNMENT qualifiers of the OFFSETs
|
|
|
1162 |
give the required SHAPE; these rules also ensure that offset
|
|
|
1163 |
arithmetic can be implemented simply using integer arithmetic for
|
|
|
1164 |
standard architectures (see
|
|
|
1165 |
<link linkend="model-for-32bit-architecture">section 13.1</link>).
|
|
|
1166 |
Note that the OFFSET computed here is the minimum size for the
|
|
|
1167 |
SHAPE. This would not in general be the same as the difference
|
|
|
1168 |
between successive elements of an array of these structures which
|
|
|
1169 |
would have SHAPE OFFSET(<parameter>x</parameter>,
|
|
|
1170 |
<parameter>x</parameter>) as produced by
|
|
|
1171 |
offset_pad(<parameter>x</parameter>, <parameter>sz</parameter>).
|
|
|
1172 |
For examples of the use of OFFSETs to access and create
|
|
|
1173 |
structures, see
|
|
|
1174 |
<link linkend="tdf-expansions-of-offsets">section 12</link>.</para>
|
|
|
1175 |
</sect3>
|
|
|
1176 |
|
|
|
1177 |
<sect3 id="S34">
|
|
|
1178 |
<title>offset_mult</title>
|
|
|
1179 |
|
|
|
1180 |
<para>In C, all structures have size known at translate-time. This
|
|
|
1181 |
means that OFFSETs for all field selections of structures and
|
|
|
1182 |
unions are translate-time constants; there is never any need to
|
|
|
1183 |
produce code to compute these sizes and offsets. Other languages
|
|
|
1184 |
(notably Ada) do have variable size structures and so sizes and
|
|
|
1185 |
offsets within these structures may have to be computed
|
|
|
1186 |
dynamically. Indexing in C will require the computation of dynamic
|
|
|
1187 |
OFFSETs; this would usually be done by using offset_mult to
|
|
|
1188 |
multiply an offset expression representing the stride by an
|
|
|
1189 |
integer expression giving the index:
|
|
|
1190 |
|
|
|
1191 |
<programlisting>
|
|
|
1192 |
arg1: EXP OFFSET(x, x)
|
|
|
1193 |
arg2: EXP INTEGER(v)
|
|
|
1194 |
-> EXP OFFSET(x, x)
|
|
|
1195 |
</programlisting>
|
|
|
1196 |
|
|
|
1197 |
and using add_to_ptr with a pointer expression giving the base of
|
|
|
1198 |
the array with the resulting OFFSET.</para>
|
|
|
1199 |
</sect3>
|
|
|
1200 |
|
|
|
1201 |
<sect3 id="S35">
|
|
|
1202 |
<title>OFFSET ordering and representation</title>
|
|
|
1203 |
|
|
|
1204 |
<para>There is an ordering defined on OFFSETs with the same
|
|
|
1205 |
alignment qualifiers, as given by offset_test and offset_max
|
|
|
1206 |
having properties like:
|
|
|
1207 |
|
|
|
1208 |
<programlisting>
|
|
|
1209 |
shape_offset(S) xb3 offset_zero(alignment(S))
|
|
|
1210 |
A xb3 B iff offset_max(A,B) = A
|
|
|
1211 |
offset_add(A, B) xb3 A where B xb3 offset_zero(some compatible alignment)
|
|
|
1212 |
</programlisting>
|
|
|
1213 |
|
|
|
1214 |
In most machines, OFFSETs would be represented as single integer
|
|
|
1215 |
values with the OFFSET ordering corresponding to simple integer
|
|
|
1216 |
ordering. The offset_add constructor just translates to simple
|
|
|
1217 |
addition with offset_zero as 0 with similar correspondences for
|
|
|
1218 |
the other offset constructors. You might well ask why TDF does not
|
|
|
1219 |
simply use integers for offsets, instead of introducing the rather
|
|
|
1220 |
complex OFFSET SHAPE. The reasons are two-fold. First, following
|
|
|
1221 |
the OFFSET arithmetic rules concerned with the ALIGNMENT
|
|
|
1222 |
qualifiers will ensure that one never extracts a value from a
|
|
|
1223 |
pointer with the wrong alignment by, for example, applying
|
|
|
1224 |
contents to an add_to_pointer. This frees TDF from having to
|
|
|
1225 |
define the effect of strange operations like forming a float by
|
|
|
1226 |
taking the contents of a pointer to a character which may be
|
|
|
1227 |
mis-aligned with respect to floats - a heavy operation on most
|
|
|
1228 |
processors. The second reason is quite simple; there are machines
|
|
|
1229 |
which cannot represent OFFSETs by a single integer value.</para>
|
|
|
1230 |
|
|
|
1231 |
<para>The iAPX-432 is a fairly extreme example of such a machine; it
|
|
|
1232 |
is a "capability" machine which must segregate pointer values and
|
|
|
1233 |
non-pointer values into different spaces. On this machine a value
|
|
|
1234 |
of SHAPE POINTER({<type>pointer</type>, int}) (e.g. a pointer to a
|
|
|
1235 |
structure containing both integers and pointers) could have two
|
|
|
1236 |
components; one referring to the pointers and another to the
|
|
|
1237 |
integers. In general, offsets from this pointer would also have
|
|
|
1238 |
two components, one to pick out any pointer values and the other
|
|
|
1239 |
the integer values. This would obviously be the case if the
|
|
|
1240 |
original POINTER referred to an array of structures containing
|
|
|
1241 |
both pointers and integers; an offset to an element of the array
|
|
|
1242 |
would have SHAPE OFFSET({<type>pointer</type>, int},
|
|
|
1243 |
{<type>pointer</type>, int}); both elements of the offset would
|
|
|
1244 |
have to be used as displacements to the corresponding elements of
|
|
|
1245 |
the pointer to extract the structure element. The OFFSET ordering
|
|
|
1246 |
is now given by the comparison of both displacements. Using this
|
|
|
1247 |
method, one finds that pointers in store to non-pointer alignments
|
|
|
1248 |
are two words in different blocks and pointers to
|
|
|
1249 |
pointer-alignments are four words, two in one block and two in
|
|
|
1250 |
another. This sounds a very unwieldy machine compared to normal
|
|
|
1251 |
machines with linear addressing. However, who knows what similar
|
|
|
1252 |
strange machines will appear in future; the basic conflicts
|
|
|
1253 |
between security, integrity and flexibility that the iAPX-432
|
|
|
1254 |
sought to resolve are still with us. For more on the modelling of
|
|
|
1255 |
pointers and offsets see
|
|
|
1256 |
<link linkend="models-of-the-tdf-algebra">section 13</link>.
|
|
|
1257 |
</para>
|
|
|
1258 |
</sect3>
|
|
|
1259 |
</sect2>
|
|
|
1260 |
|
|
|
1261 |
<sect2 id="bitfield-alignments">
|
|
|
1262 |
<title>BITFIELD alignments</title>
|
|
|
1263 |
|
|
|
1264 |
<para>Even in standard machines, one finds that the size of a pointer
|
|
|
1265 |
may depend on the alignment of the data pointed at. Most machines do
|
|
|
1266 |
not allow one to construct pointers to bits with the same facility
|
|
|
1267 |
as other alignments. This usually means that pointers in memory to
|
|
|
1268 |
BITFIELD VARIETYs must be implemented as two words with an address
|
|
|
1269 |
and bit displacement. One might imagine that a translator could
|
|
|
1270 |
implement BITFIELD alignments so that they are the same as the
|
|
|
1271 |
smallest natural alignment of the machine and avoid the bit
|
|
|
1272 |
displacement, but this is not the intention of the definition. On
|
|
|
1273 |
any machine for which it is meaningful, the alignment of a BITFIELD
|
|
|
1274 |
must be one bit; in other words successive BITFIELDs are butted
|
|
|
1275 |
together with no padding bits.
|
|
|
1276 |
|
|
|
1277 |
<footnote>
|
|
|
1278 |
<para>Note that is not generally true for C bitfields; most C ABIs
|
|
|
1279 |
have (different) rules for putting in padding bits depending on
|
|
|
1280 |
the size of the bitfield and its relation with the natural
|
|
|
1281 |
alignments. This is a fruitful source of errors in data exchange
|
|
|
1282 |
between different C ABIs For more on similar limitations of
|
|
|
1283 |
bitfields in TDF (see
|
|
|
1284 |
<link linkend="assigning-and-extracting-bitfields">Assigning and
|
|
|
1285 |
extracting bitfields</link>).</para>
|
|
|
1286 |
</footnote>
|
|
|
1287 |
|
|
|
1288 |
Within the limits of what one can extract from BITFIELDs, namely
|
|
|
1289 |
INTEGER VARIETYs, this is how one should implement non-standard
|
|
|
1290 |
alignments, perhaps in constructing data, such as protocols, for
|
|
|
1291 |
exchange between machines. One could implement some Ada
|
|
|
1292 |
representational statements in this way; certainly the most commonly
|
|
|
1293 |
used ones.</para>
|
|
|
1294 |
|
|
|
1295 |
<para>TDF Version 3.0 does not allow one to construct a pointer of
|
|
|
1296 |
SHAPE POINTER(b) where b consists entirely of bitfield alignments;
|
|
|
1297 |
this relieves the translators of the burden of doing general
|
|
|
1298 |
bit-addressing. Of course, this simply shifts the burden to the
|
|
|
1299 |
producer. If the high level language requires to construct a pointer
|
|
|
1300 |
to an arbitrary bit position, then the producer is required to
|
|
|
1301 |
represent such a pointer as a pair consisting of pointer to some
|
|
|
1302 |
alignment including the required bitfield and an offset from this
|
|
|
1303 |
alignment to the bitfield. For example, Ada may require the
|
|
|
1304 |
construction of a pointer to a boolean for use as the parameter to a
|
|
|
1305 |
procedure; the SHAPE of the rep resentation of this Ada pointer
|
|
|
1306 |
could be a COMPOUND formed from a POINTER({x,b}) and an OFFSET({x,
|
|
|
1307 |
b}, b) where b is the alignment given by a 1 bit alignment. To
|
|
|
1308 |
access the boolean, the producer would use the elements of this pair
|
|
|
1309 |
as arguements to bitfield_assign and bitfield_contents
|
|
|
1310 |
(<link linkend="assigning-and-extracting-bitfields">Assigning and
|
|
|
1311 |
extracting bitfields</link>.).</para>
|
|
|
1312 |
</sect2>
|
|
|
1313 |
</sect1>
|
|
|
1314 |
|
|
|
1315 |
<sect1>
|
|
|
1316 |
<title>Procedures and Locals</title>
|
|
|
1317 |
|
|
|
1318 |
<para>All procedures in TDF are essentially global; the only values
|
|
|
1319 |
which are accessible from the body of a procedure are those which are
|
|
|
1320 |
derived from global TAGs (introduced by TAGDEFs or TAGDECs), local
|
|
|
1321 |
TAGs defined within the procedure and parameter TAGs of the
|
|
|
1322 |
procedure.</para>
|
|
|
1323 |
|
|
|
1324 |
<para>All executable code in TDF will arise from an EXP PROC made by
|
|
|
1325 |
either make_proc or make_general_proc. They differ in their treatment
|
|
|
1326 |
of how space for the actual parameters of a call is managed; in
|
|
|
1327 |
particular, is it the caller or the callee which deallocates the
|
|
|
1328 |
parameter space?</para>
|
|
|
1329 |
|
|
|
1330 |
<para>With make_proc, this management is conceptually done by the caller
|
|
|
1331 |
at an apply_proc; i.e. the normal C situation. This suffers from the
|
|
|
1332 |
limitation that tail-calls of procedures are then only possible in
|
|
|
1333 |
restricted circumstances (e.g. the space for the parameters of the
|
|
|
1334 |
tail-call must be capable of being included in caller's parameters)
|
|
|
1335 |
and could only be implemented as an optimisation within a translator.
|
|
|
1336 |
A producer could not predict these circumstances in a machine
|
|
|
1337 |
independent manner, whether or not it knew that a tail-call was
|
|
|
1338 |
valid.</para>
|
|
|
1339 |
|
|
|
1340 |
<para>An alternative would be to make the management of parameter space
|
|
|
1341 |
the responsibility of the called procedure. Rather than do this,
|
|
|
1342 |
make_general_proc (and apply_general_proc) splits the parameters into
|
|
|
1343 |
two sets, one whose allocation is the responsibility of the caller and
|
|
|
1344 |
the other whose allocation is dealt with by the callee. This allows an
|
|
|
1345 |
explicit tail_call to be made to a procedure with new callee
|
|
|
1346 |
parameters; the caller parameters for the tail_call will be the same
|
|
|
1347 |
as (or some initial subset of) the caller parameters of the procedure
|
|
|
1348 |
containing the tail_call .</para>
|
|
|
1349 |
|
|
|
1350 |
<para>A further refinement of make_general_proc is to allow access to
|
|
|
1351 |
the caller parameter space in a postlude at the call of the procedure
|
|
|
1352 |
using an apply_general_proc. This allows simple implementations of Ada
|
|
|
1353 |
out_parameters, or more generally, multiple results of
|
|
|
1354 |
procedures.</para>
|
|
|
1355 |
|
|
|
1356 |
<sect2 id="S38">
|
|
|
1357 |
<title>make_proc and apply_proc</title>
|
|
|
1358 |
|
|
|
1359 |
<para>The make_proc constructor has signature:
|
|
|
1360 |
|
|
|
1361 |
<programlisting>
|
|
|
1362 |
result_shape: SHAPE
|
|
|
1363 |
params_intro: LIST(TAGSHACC)
|
|
|
1364 |
var_intro: OPTION(TAGACC)
|
|
|
1365 |
body: EXP BOTTOM
|
|
|
1366 |
-> EXP PROC
|
|
|
1367 |
</programlisting>
|
|
|
1368 |
|
|
|
1369 |
The <type>params_intro</type> and <type>var_intro</type> parameters
|
|
|
1370 |
introduce the formal parameters of the procedure which may be used
|
|
|
1371 |
in <type>body</type>. The procedure result will have SHAPE
|
|
|
1372 |
<type>result_shape</type> and will be usually given by some return
|
|
|
1373 |
construction within <type>body</type>. The basic model is that space
|
|
|
1374 |
will be provided to copy actual parameters (into space supplied by
|
|
|
1375 |
some apply_proc) by value into these formals and the body will treat
|
|
|
1376 |
this space effectively as local variables.</para>
|
|
|
1377 |
|
|
|
1378 |
<para>Each straightforward formal parameter is introduced by an
|
|
|
1379 |
auxiliary SORT TAGSHACC using make_tagshacc:
|
|
|
1380 |
|
|
|
1381 |
<programlisting>
|
|
|
1382 |
sha: SHAPE
|
|
|
1383 |
opt_access: OPTION(LIST(ACCESS))
|
|
|
1384 |
tg_intro: TAG POINTER(alignment(sha))
|
|
|
1385 |
-> TAGSHACC
|
|
|
1386 |
</programlisting>
|
|
|
1387 |
</para>
|
|
|
1388 |
|
|
|
1389 |
<para>Within <type>body</type>, the formal will be accessed using
|
|
|
1390 |
<type>tg_intro</type>; it is always considered to be a pointer to
|
|
|
1391 |
the space of SHAPE <type>sha</type> allocated by apply_proc, hence
|
|
|
1392 |
the pointer SHAPE.</para>
|
|
|
1393 |
|
|
|
1394 |
<para>For example, if we had a simple procedure with one integer
|
|
|
1395 |
parameter, <type>var_intro</type> would be empty and
|
|
|
1396 |
<varname>params_intro</varname> might be:
|
|
|
1397 |
|
|
|
1398 |
<programlisting>
|
|
|
1399 |
params_intro = make_tagshacc(integer(v), empty, make_tag(13))
|
|
|
1400 |
</programlisting>
|
|
|
1401 |
|
|
|
1402 |
Then, TAG 13 from the enclosing UNIT's name-space is identified with
|
|
|
1403 |
the formal parameter with SHAPE POINTER(INTEGER(v)). Any use of
|
|
|
1404 |
obtain_tag(make_tag(13)) in <type>body</type> will deliver a pointer
|
|
|
1405 |
to the integer parameter. I shall return to the meaning of
|
|
|
1406 |
<type>opt_access</type> and the ramifications of the scope and extent
|
|
|
1407 |
of TAGs involved in conjunction with local declarations in
|
|
|
1408 |
<link linkend="identify-variable">section 5.3.1</link>.</para>
|
|
|
1409 |
|
|
|
1410 |
<para>Procedures, whether defined by make_proc or make_general_proc,
|
|
|
1411 |
will usually terminate and deliver its result with a return:
|
|
|
1412 |
|
|
|
1413 |
<programlisting>
|
|
|
1414 |
arg1: EXP x
|
|
|
1415 |
-> EXP BOTTOM
|
|
|
1416 |
</programlisting>
|
|
|
1417 |
|
|
|
1418 |
Here <varname>x</varname> must be identical to the
|
|
|
1419 |
<type>result_shape</type> of the call of the procedure There may be
|
|
|
1420 |
several returns in body; and the SHAPE <varname>x</varname> in each
|
|
|
1421 |
will be the same. Some languages allow different types to be returned
|
|
|
1422 |
depending on the particular call. The producer must resolve this
|
|
|
1423 |
issue. For example, C allows one to deliver void if the resulting
|
|
|
1424 |
value is not used. In TDF a dummy value must be provided at the
|
|
|
1425 |
return; for example make_value(<type>result_shape</type>).</para>
|
|
|
1426 |
|
|
|
1427 |
<para>Note that the <type>body</type> has SHAPE bottom since all
|
|
|
1428 |
possible terminations to a procedure have SHAPE BOTTOM..</para>
|
|
|
1429 |
|
|
|
1430 |
<para>Procedures defined by make_proc are called using
|
|
|
1431 |
apply_proc:</para>
|
|
|
1432 |
|
|
|
1433 |
<programlisting>
|
|
|
1434 |
result_shape: SHAPE
|
|
|
1435 |
arg1: EXP PROC
|
|
|
1436 |
arg2: LIST(EXP)
|
|
|
1437 |
varparam: OPTION(EXP)
|
|
|
1438 |
-> EXP result_shape
|
|
|
1439 |
</programlisting>
|
|
|
1440 |
|
|
|
1441 |
<para>Here <type>arg1</type> is the procedure to be called and
|
|
|
1442 |
<type>arg2</type> gives the actual parameters. There must be at least
|
|
|
1443 |
as many actual parameters as given (with the same SHAPE) in the
|
|
|
1444 |
<type>params_intro</type> of the corresponding make_proc for arg1.
|
|
|
1445 |
|
|
|
1446 |
<footnote>
|
|
|
1447 |
<para>The vararg construction in C are implemented by giving more
|
|
|
1448 |
actuals than formals; the extra parameters are accessed by offset
|
|
|
1449 |
arithmetic with a pointer to a formal, using parameter_alignment
|
|
|
1450 |
to pad the offsets.</para>
|
|
|
1451 |
</footnote>
|
|
|
1452 |
|
|
|
1453 |
The values of <type>arg2</type> will be copied into space managed by
|
|
|
1454 |
caller.</para>
|
|
|
1455 |
|
|
|
1456 |
<para>The SHAPE of the result of the call is given by
|
|
|
1457 |
<type>result_shape</type> which must be identical to the
|
|
|
1458 |
<type>result_shape</type> of the make_proc.</para>
|
|
|
1459 |
|
|
|
1460 |
<sect3 id="vartag-varparam">
|
|
|
1461 |
<title>vartag, varparam</title>
|
|
|
1462 |
|
|
|
1463 |
<para>Use of the <type>var_intro</type> OPTION in make_proc and the
|
|
|
1464 |
corresponding <type>varparam</type> in apply_proc allows one to
|
|
|
1465 |
have a parameter of any SHAPE, possibly differing from call to
|
|
|
1466 |
call where the actual SHAPE can be deduced in some way by the
|
|
|
1467 |
<type>body</type> of the make_proc . One supplies an extra actual
|
|
|
1468 |
parameter, <type>varparam</type>, which usually would be a
|
|
|
1469 |
structure grouping some set of values. The body of the procedure
|
|
|
1470 |
can then access these values using the pointer given by the TAG
|
|
|
1471 |
<type>var_intro</type>, using add_to_ptr with some computed
|
|
|
1472 |
offsets to pick out the individual fields.</para>
|
|
|
1473 |
|
|
|
1474 |
<para>This is a slightly different method of giving a variable
|
|
|
1475 |
number of parameters to a procedure, rather than simply giving
|
|
|
1476 |
more actuals than formals. The principle difference is in the
|
|
|
1477 |
alignment of the components of <type>varparam</type>; these will
|
|
|
1478 |
be laid out according to the default padding defined by the
|
|
|
1479 |
component shapes. In most ABIs, this padding is usually different
|
|
|
1480 |
to the way parameters are laid out; for example, character
|
|
|
1481 |
parameters are generally padded out to a full word. Thus a
|
|
|
1482 |
sequence of parameters of given shape has a different layout in
|
|
|
1483 |
store to the same sequence of shapes in a structure. If one wished
|
|
|
1484 |
to pass an arbitrary structure to a procedure, one would use the
|
|
|
1485 |
<type>varparam</type> option rather passing the fields
|
|
|
1486 |
individually as extra actual parameters.</para>
|
|
|
1487 |
</sect3>
|
|
|
1488 |
</sect2>
|
|
|
1489 |
|
|
|
1490 |
<sect2 id="S40">
|
|
|
1491 |
<title>make_general_proc and apply_general_proc</title>
|
|
|
1492 |
|
|
|
1493 |
<para>A make_general_proc has signature:
|
|
|
1494 |
|
|
|
1495 |
<programlisting>
|
|
|
1496 |
result_shape: SHAPE
|
|
|
1497 |
prcprops: OPTION(PROCPROPS)
|
|
|
1498 |
caller_intro: LIST(TAGSHACC)
|
|
|
1499 |
callee_intro: LIST(TAGSHACC)
|
|
|
1500 |
body: EXP BOTTOM
|
|
|
1501 |
-> EXP PROC
|
|
|
1502 |
</programlisting>
|
|
|
1503 |
|
|
|
1504 |
Here the formal parameters are split into two sets,
|
|
|
1505 |
<type>caller_intro</type> and <type>callee_intro</type>, each given
|
|
|
1506 |
by a list of TAGSHACCs just as in make_proc. The distinction between
|
|
|
1507 |
the two sets is that the make_general_proc is responsible for
|
|
|
1508 |
de_allocating any space required for the callee parameter set; this
|
|
|
1509 |
really only becomes obvious at uses of tail_call within
|
|
|
1510 |
<type>body</type>.</para>
|
|
|
1511 |
|
|
|
1512 |
<para>The <type>result_shape</type> and <type>body</type> have the
|
|
|
1513 |
same general properties as in make_proc. In addition
|
|
|
1514 |
<type>prcprops</type> gives other information both about
|
|
|
1515 |
<type>body</type> and the way that that the procedure is called.
|
|
|
1516 |
PROCPROPS are a set drawn from check_stack, inline,
|
|
|
1517 |
no_long_jump_dest, untidy, var_callees and var_callers. The set is
|
|
|
1518 |
composed using add_procprops. The PROCPROPS no_long_jump_dest is a
|
|
|
1519 |
property of <type>body</type> only; it indicates that none of the
|
|
|
1520 |
labels within <type>body</type> will be the target of a long_jump
|
|
|
1521 |
construct. The other properties should also be given consistently at
|
|
|
1522 |
all calls of the procedure; theu are discussed in
|
|
|
1523 |
<link linkend="procprops">section 5.2.2</link>.</para>
|
|
|
1524 |
|
|
|
1525 |
<para>A procedure, <type>p</type>, constructed by make_general_proc is
|
|
|
1526 |
called using apply_general_proc:</para>
|
|
|
1527 |
|
|
|
1528 |
<programlisting>
|
|
|
1529 |
result_shape: SHAPE
|
|
|
1530 |
prcprops: OPTION(PROCPROPS)
|
|
|
1531 |
p: EXP PROC
|
|
|
1532 |
caller_params: LIST(OTAGEXP)
|
|
|
1533 |
callee_params: CALLEES
|
|
|
1534 |
postlude: EXP TOP
|
|
|
1535 |
-> EXP result_shape
|
|
|
1536 |
</programlisting>
|
|
|
1537 |
|
|
|
1538 |
<para>The actual caller parameters are given by
|
|
|
1539 |
<type>caller_params</type> as a list of OTAGEXPs constructed using
|
|
|
1540 |
make_otagexp:</para>
|
|
|
1541 |
|
|
|
1542 |
<programlisting>
|
|
|
1543 |
tgopt: OPTION(TAG x / i>)
|
|
|
1544 |
e: EXP x / i>
|
|
|
1545 |
-> OTAGEXP
|
|
|
1546 |
</programlisting>
|
|
|
1547 |
|
|
|
1548 |
<para>Here, <type>e</type> is the value of the parameter and
|
|
|
1549 |
<type>tgopt</type>, if present, is a TAG which will bound to the
|
|
|
1550 |
final value of the parameter (after <type>body</type> is evaluated)
|
|
|
1551 |
in the <type>postlude</type> expression of the apply_general_proc.
|
|
|
1552 |
|
|
|
1553 |
<footnote>
|
|
|
1554 |
<para>If a formal parameter is to be used in this way, it should
|
|
|
1555 |
be marked as having out_par ACCESS in its corresponding TAGSHACC
|
|
|
1556 |
in <type>callers_intro</type>.</para>
|
|
|
1557 |
</footnote>
|
|
|
1558 |
|
|
|
1559 |
Clearly, this allows one to use a caller parameter as an extra
|
|
|
1560 |
result of the procedure; for example, as in Ada
|
|
|
1561 |
out-parameters.</para>
|
|
|
1562 |
|
|
|
1563 |
<para>The actual <type>callee_params</type> may be constructed in
|
|
|
1564 |
three different ways. The usual method is to use make_callee_list,
|
|
|
1565 |
giving a list of actual EXP parameters, corresponding to the
|
|
|
1566 |
<type>caller_intro</type> list in the obvious way.The constructor,
|
|
|
1567 |
same_callees allows one to use the callees of the current procedure
|
|
|
1568 |
as the callees of the call; this, of course, assumes that the
|
|
|
1569 |
formals of the current procedure are compatible with the formals
|
|
|
1570 |
required for the call The final method allows one to construct a
|
|
|
1571 |
dynamically sized set of CALLEES; make_dynamic_callees takes a
|
|
|
1572 |
pointer and a size (expressed as an OFFSET) to make the CALLEES;
|
|
|
1573 |
this will be used in conjunction with a var_callees PROCPROPS (see
|
|
|
1574 |
<link linkend="procprops">section 5.2.2</link>).</para>
|
|
|
1575 |
|
|
|
1576 |
<para>Some procedures can be expressed using either make_proc or
|
|
|
1577 |
make_general_proc. For example:</para>
|
|
|
1578 |
|
|
|
1579 |
<para>make_proc(S, L, empty, B) = make_general_proc(S, var_callers, L,
|
|
|
1580 |
empty, B)</para>
|
|
|
1581 |
|
|
|
1582 |
<sect3 id="tail_call">
|
|
|
1583 |
<title>tail_call</title>
|
|
|
1584 |
|
|
|
1585 |
<para>Often the result of a procedure, <function>f</function>, is
|
|
|
1586 |
simply given by the call of another (or the same) procedure,
|
|
|
1587 |
<function>g</function>. In appropriate circumstances, the same
|
|
|
1588 |
stack space can be used for the call of <function>g</function> as
|
|
|
1589 |
the call of <function>f</function>. This can be particularly
|
|
|
1590 |
important where heavily recursive routines are involved; some
|
|
|
1591 |
languages even use tail recursion as the preferred method of
|
|
|
1592 |
looping.</para>
|
|
|
1593 |
|
|
|
1594 |
<para>One condition for such a tail call to be applicable is knowing
|
|
|
1595 |
that <function>g</function> does not require any pointers to
|
|
|
1596 |
locals of <function>f</function>; this is often implicit in the
|
|
|
1597 |
language involved. Equally important is that the action on the
|
|
|
1598 |
return from <function>f</function> is indistiguishable from the
|
|
|
1599 |
return from <function>g</function>. For example, if it were the
|
|
|
1600 |
callers responsibility to pop the the space for the parameters on
|
|
|
1601 |
return from a call, then the tail call of <function>g</function>
|
|
|
1602 |
would only work if <function>g</function> had the same parameter
|
|
|
1603 |
space as <function>f</function>.</para>
|
|
|
1604 |
|
|
|
1605 |
<para>This is the justification for splitting the parameter set of a
|
|
|
1606 |
general proc; it is (at least conceptually) the caller's
|
|
|
1607 |
responsibility for popping the caller-parameters only - the
|
|
|
1608 |
callee-parameters are dealt with by the procedure itself. Hence we
|
|
|
1609 |
can define tail_call which uses the same caller-parameters, but a
|
|
|
1610 |
different set of callee-parameters:
|
|
|
1611 |
|
|
|
1612 |
<programlisting>
|
|
|
1613 |
prcprops: OPTION(PROCPROPS)
|
|
|
1614 |
p: EXP PROC
|
|
|
1615 |
callee_params: CALLEES
|
|
|
1616 |
-> EXP BOTTOM
|
|
|
1617 |
</programlisting>
|
|
|
1618 |
|
|
|
1619 |
The procedure p will be called with the same caller parameters as
|
|
|
1620 |
the current procedure and the new <type>callee_params</type> and
|
|
|
1621 |
return to the call site of the current procedure. Semantically, if
|
|
|
1622 |
S is the return SHAPE of the current procedure, and L is its
|
|
|
1623 |
caller-parameters:</para>
|
|
|
1624 |
|
|
|
1625 |
<para>tail_call(P, p, C) = return(apply_general_proc(S, P, p, L, C,
|
|
|
1626 |
make_top()))</para>
|
|
|
1627 |
|
|
|
1628 |
<para>However an implementation is expected to conserve stack by
|
|
|
1629 |
using the same space for the call of p as the current
|
|
|
1630 |
procedure.</para>
|
|
|
1631 |
</sect3>
|
|
|
1632 |
|
|
|
1633 |
<sect3 id="procprops">
|
|
|
1634 |
<title>PROCPROPS</title>
|
|
|
1635 |
|
|
|
1636 |
<para>The presence of var_callees (or var_callers) means that the
|
|
|
1637 |
procedure can be called with more actual callee (or caller)
|
|
|
1638 |
parameters than are indicated in <type>callee_intro</type> (or
|
|
|
1639 |
<type>caller_intro</type>). These extra parameters would be
|
|
|
1640 |
accessed within body using offset calculations with respect to the
|
|
|
1641 |
named parameters. The offsets should be calculated using
|
|
|
1642 |
parameter_alignment to give the packing of the parameter
|
|
|
1643 |
packs.</para>
|
|
|
1644 |
|
|
|
1645 |
<para>The presence of untidy means that <type>body</type> may be
|
|
|
1646 |
terminated by an untidy_return. This returns the result of the
|
|
|
1647 |
procedure as in return, but the lifetime of the local space of the
|
|
|
1648 |
procedure is extended (in practice this is performed by not
|
|
|
1649 |
returning the stack to its original value at the call). A
|
|
|
1650 |
procedure containing an untidy_return is a generalisation of a
|
|
|
1651 |
local_alloc(see <link linkend="local_alloc">section 5.3.4</link>).
|
|
|
1652 |
For example the procedure could do some complicated local
|
|
|
1653 |
allocation (a triangular array, say) and untidily return a pointer
|
|
|
1654 |
to it so that the space is still valid in the calling procedure.
|
|
|
1655 |
The space will remain valid for the lifetime of the calling
|
|
|
1656 |
procedure unless some local_free is called within it, just as if
|
|
|
1657 |
the space had been generated by a local_alloc in the calling
|
|
|
1658 |
procedure.</para>
|
|
|
1659 |
|
|
|
1660 |
<para>The presence of inline is just a hint to the translator that
|
|
|
1661 |
the procedure body is a good candidate for inlining at the
|
|
|
1662 |
call.</para>
|
|
|
1663 |
|
|
|
1664 |
<para>The presence of check_stack means that the static stack
|
|
|
1665 |
requirements of the procedure will be checked on entry to see that
|
|
|
1666 |
they do not exceed the limits imposed by set_stack_limit; if they
|
|
|
1667 |
are exceeded a TDF exception with ERROR_CODE stack_overflow (see
|
|
|
1668 |
<link linkend="exceptional-flow">section 6.3</link>) will be
|
|
|
1669 |
raised.</para>
|
|
|
1670 |
</sect3>
|
|
|
1671 |
</sect2>
|
|
|
1672 |
|
|
|
1673 |
<sect2 id="defining-and-using-locals">
|
|
|
1674 |
<title>Defining and using locals</title>
|
|
|
1675 |
|
|
|
1676 |
<sect3 id="identify-variable">
|
|
|
1677 |
<title>identify, variable</title>
|
|
|
1678 |
|
|
|
1679 |
<para>Local definitions within the <type>body</type> of a procedure
|
|
|
1680 |
are given by two EXP constructors which permit one to give names
|
|
|
1681 |
to values over a scope given by the definition. Note that this is
|
|
|
1682 |
somewhat different to declarations in standard languages where the
|
|
|
1683 |
declaration is usually embedded in a larger construct which
|
|
|
1684 |
defines the scope of the name; here the scope is explicit in the
|
|
|
1685 |
definition. The reason for this will become more obvious in the
|
|
|
1686 |
discussion of TDF transformations. The simpler constructor is
|
|
|
1687 |
identify:
|
|
|
1688 |
|
|
|
1689 |
<programlisting>
|
|
|
1690 |
opt_access: OPTION(ACCESS)
|
|
|
1691 |
name_intro: TAG x
|
|
|
1692 |
definition: EXP x
|
|
|
1693 |
body: EXP y
|
|
|
1694 |
-> EXP y
|
|
|
1695 |
</programlisting>
|
|
|
1696 |
|
|
|
1697 |
The <type>definition</type> is evaluated and its result is
|
|
|
1698 |
identified with the TAG given by <type>name_intro</type> within
|
|
|
1699 |
its scope <type>body</type>. Hence the use of any
|
|
|
1700 |
obtain_tag(<type>name_intro</type>) within <type>body</type> is
|
|
|
1701 |
equivalent to using this result. Anywhere else,
|
|
|
1702 |
obtain_tag(<type>name_intro</type>) is meaningless, including in
|
|
|
1703 |
other procedures.</para>
|
|
|
1704 |
|
|
|
1705 |
<para>The other kind of local definition is variable:
|
|
|
1706 |
|
|
|
1707 |
<programlisting>
|
|
|
1708 |
opt_access: OPTION(ACCESS)
|
|
|
1709 |
name_intro: TAG x
|
|
|
1710 |
init: EXP x
|
|
|
1711 |
body: EXP y
|
|
|
1712 |
-> EXP y
|
|
|
1713 |
</programlisting>
|
|
|
1714 |
|
|
|
1715 |
Here the <type>init</type> EXP is evaluated and its result serves
|
|
|
1716 |
as an initialisation of space of SHAPE <varname>x</varname> local
|
|
|
1717 |
to the procedure. The TAG name_intro is then identified with a
|
|
|
1718 |
pointer to that SPACE within body. A use of
|
|
|
1719 |
obtain_tag(<type>name_intro</type>) within <type>body</type> is
|
|
|
1720 |
equivalent to using this pointer and is meaningless outside
|
|
|
1721 |
<type>body</type> or in other procedures. Many variable
|
|
|
1722 |
declarations in programs are uninitialised; in this case, the
|
|
|
1723 |
<type>init</type> argument could be provided by make_value which
|
|
|
1724 |
will produce some value with SHAPE given by its parameter.</para>
|
|
|
1725 |
</sect3>
|
|
|
1726 |
|
|
|
1727 |
<sect3 id="sort-access">
|
|
|
1728 |
<title>ACCESS</title>
|
|
|
1729 |
|
|
|
1730 |
<para>The ACCESS SORT given in tag declarations is a way of
|
|
|
1731 |
describing a list of properties to be associated with the tag.
|
|
|
1732 |
They are basically divided into two classes, one which describes
|
|
|
1733 |
global properties of the tag with respect to the model for locals
|
|
|
1734 |
and the other which gives "hints" on how the value will be used.
|
|
|
1735 |
Any of these can be combined using add_access.</para>
|
|
|
1736 |
|
|
|
1737 |
<sect4 id="locals-model">
|
|
|
1738 |
<title>Locals model</title>
|
|
|
1739 |
|
|
|
1740 |
<para>At the moment there are just three possibilities in the
|
|
|
1741 |
first class of ACCESS constructors. They are standard_access
|
|
|
1742 |
(the default), visible, out_par and long_jump_access.</para>
|
|
|
1743 |
|
|
|
1744 |
<para>The basic model used for the locals and parameters of a
|
|
|
1745 |
procedure is a frame within a stack of nested procedure calls.
|
|
|
1746 |
One could implement a procedure by allocating space according to
|
|
|
1747 |
SHAPEs of all of the parameter and local TAGs so that the
|
|
|
1748 |
corresponding values are at fixed offsets either from the start
|
|
|
1749 |
of the frame or some pointer within it.</para>
|
|
|
1750 |
|
|
|
1751 |
<para>Indeed, if the ACCESS <type>opt_access</type> parameter in a
|
|
|
1752 |
TAG definition is produced by visible, then a translator is
|
|
|
1753 |
almost bound to do just that for that TAG. This is because it
|
|
|
1754 |
allows for the possibility of the value to be accessed in some
|
|
|
1755 |
way other than by using obtain_tag which is the standard way of
|
|
|
1756 |
recovering the value bound to the TAG. The principal way that
|
|
|
1757 |
this could happen within TDF is by the combined use of
|
|
|
1758 |
env_offset to give the offset and current_env to give a pointer
|
|
|
1759 |
to the current frame (see
|
|
|
1760 |
<link linkend="current_env">section 5.3.3</link>).</para>
|
|
|
1761 |
|
|
|
1762 |
<para>The out_par ACCESS is only applicable to caller parameters
|
|
|
1763 |
of procedures; it indicates that the value of the TAG concerned
|
|
|
1764 |
will accessed by the postlude part of an apply_general_proc.
|
|
|
1765 |
Hence, the value of the parameter must be accessible after the
|
|
|
1766 |
call; usually this will be on the stack in the callers
|
|
|
1767 |
frame.</para>
|
|
|
1768 |
|
|
|
1769 |
<para>The long_jump_access flag is used to indicate that the tag
|
|
|
1770 |
must be available after a long_jump. In practice, if either
|
|
|
1771 |
visible or long_jump_access is set, most translators would
|
|
|
1772 |
allocate the space for the declaration on the main-store stack
|
|
|
1773 |
rather than in an available register. If it is not set, then a
|
|
|
1774 |
translator is free to use its own criteria for whether space
|
|
|
1775 |
which can fit into a register is allocated on the stack or in a
|
|
|
1776 |
register, provided there is no observable difference (other than
|
|
|
1777 |
time or program size) between the two possibilities.</para>
|
|
|
1778 |
|
|
|
1779 |
<para>Some of these criteria are rather obvious; for example, if a
|
|
|
1780 |
pointer to local variable is passed outside the procedure in an
|
|
|
1781 |
opaque manner, then it is highly unlikely that one can allocate
|
|
|
1782 |
the variable in a register. Some might be less obvious. If the
|
|
|
1783 |
only uses of a TAG t was in obtain_tag(t)s which are operands of
|
|
|
1784 |
contents or the left-hand operands of assigns, most ABIs would
|
|
|
1785 |
allow the tag to be placed in a register. We do not necessarily
|
|
|
1786 |
have to generate a pointer value if it can be subsumed by the
|
|
|
1787 |
operations available.</para>
|
|
|
1788 |
</sect4>
|
|
|
1789 |
|
|
|
1790 |
<sect4 id="access-hints">
|
|
|
1791 |
<title>Access "hints"</title>
|
|
|
1792 |
|
|
|
1793 |
<para>A variable tag with ACCESS constant is a write-once value;
|
|
|
1794 |
once it is initialised the variable will always contain the
|
|
|
1795 |
initialisation. In other words the tag is a pointer to a
|
|
|
1796 |
constant value; translators can use this information to apply
|
|
|
1797 |
various optimisations.</para>
|
|
|
1798 |
|
|
|
1799 |
<para>A POINTER tag with ACCESS no_other_read or no_other_write is
|
|
|
1800 |
asserting that there are no "aliassed" accesses to the contents
|
|
|
1801 |
of the pointer. For example, when applied to a parameter of a
|
|
|
1802 |
procedure, it is saying that the original pointer of the tag is
|
|
|
1803 |
distinct from any other tags used (reading/writing) in the
|
|
|
1804 |
lifetime of the tag. These other tags could either be further
|
|
|
1805 |
parameters of the procedure or globals. Clearly, this is useful
|
|
|
1806 |
for describing the limitations imposed by Fortran parameters,
|
|
|
1807 |
for example.</para>
|
|
|
1808 |
</sect4>
|
|
|
1809 |
</sect3>
|
|
|
1810 |
|
|
|
1811 |
<sect3 id="current_env">
|
|
|
1812 |
<title>current_env, env_offset</title>
|
|
|
1813 |
|
|
|
1814 |
<para>The constructor current_env gives a pointer to the current
|
|
|
1815 |
procedure frame of SHAPE POINTER(<type>fa</type>) where
|
|
|
1816 |
<type>fa</type> is depends on how the procedure was defined and
|
|
|
1817 |
will be some set of the special frame ALIGNMENTs. This set will
|
|
|
1818 |
always include locals_alignment - the alignment of any locals
|
|
|
1819 |
defined within the procedure. If the procedure has any caller-
|
|
|
1820 |
parameters, the set will also include callers_alignment(b) where b
|
|
|
1821 |
indicates whether there can be a variable number of them;
|
|
|
1822 |
similarly for callee-parameters.</para>
|
|
|
1823 |
|
|
|
1824 |
<para>Offsets from the current_env of a procedure to a tag declared
|
|
|
1825 |
in the procedure are constructed by env_offset:
|
|
|
1826 |
|
|
|
1827 |
<programlisting>
|
|
|
1828 |
fa: ALIGNMENT
|
|
|
1829 |
y: ALIGNMENT
|
|
|
1830 |
t: TAG x
|
|
|
1831 |
-> EXP OFFSET(fa, y)
|
|
|
1832 |
</programlisting>
|
|
|
1833 |
|
|
|
1834 |
The frame ALIGNMENT <type>fa</type> will be the appropriate one
|
|
|
1835 |
for the TAG <type>t</type>; i.e. if <type>t</type> is a local then
|
|
|
1836 |
the <type>fa</type> will be locals_alignment; if <type>t</type> is
|
|
|
1837 |
a caller parameter, <type>fa</type> will be callers_alignment(b);
|
|
|
1838 |
if <type>t</type> is a callee_parameter, <type>fa</type> will be
|
|
|
1839 |
callees_alignment(b). The alignment <type>y</type> will be the
|
|
|
1840 |
alignment of the initialisation of <type>t</type>.</para>
|
|
|
1841 |
|
|
|
1842 |
<para>The offset arithmetic operations allow one to access the
|
|
|
1843 |
values of tags non-locally using values derived from current_env
|
|
|
1844 |
and env_offset. They are effectively defined by the following
|
|
|
1845 |
identities:
|
|
|
1846 |
|
|
|
1847 |
<programlisting>
|
|
|
1848 |
If TAG t is derived from a variable definition
|
|
|
1849 |
add_to_ptr(current_env(), env_offset(locals_alignment, A, t)) = obtain_tag(t)
|
|
|
1850 |
if TAG t is derived from an identify definition:
|
|
|
1851 |
contents(S, add_to_ptr(current_env(), env_offset(locals_alignment, A, t))) = obtain_tag(t)
|
|
|
1852 |
if TAG t is derived from a caller parameter:
|
|
|
1853 |
add_to_ptr(current_env(), env_offset(callers_alignment(b), A, t)) = obtain_tag(t)
|
|
|
1854 |
if TAG t is derived from a callee parameter:
|
|
|
1855 |
add_to_ptr(current_env(), env_offset(callees_alignment(b), A, t)) = obtain_tag(t)
|
|
|
1856 |
</programlisting>
|
|
|
1857 |
|
|
|
1858 |
These identities are valid throughout the extent of t, including
|
|
|
1859 |
in inner procedure calls. In other words, one can dynamically
|
|
|
1860 |
create a pointer to the value by composing current_env and
|
|
|
1861 |
env_offset.</para>
|
|
|
1862 |
|
|
|
1863 |
<para>The importance of this is that env_offset(t) is a constant
|
|
|
1864 |
OFFSET and can be used anywhere within the enclosing UNIT, in
|
|
|
1865 |
other procedures or as part of constant TAGDEF; remember that the
|
|
|
1866 |
TDFINT underlying t is unique within the UNIT. The result of a
|
|
|
1867 |
current_env could be passed to another procedure (as a parameter,
|
|
|
1868 |
say) and this new procedure could then access a local of the
|
|
|
1869 |
original by using its env_offset. This would be the method one
|
|
|
1870 |
would use to access non-local, non-global identifiers in a
|
|
|
1871 |
language which allowed one to define procedures within procedures
|
|
|
1872 |
such as Pascal or Algol. Of course, given the stack-based model,
|
|
|
1873 |
the value given by current_env becomes meaningless once the
|
|
|
1874 |
procedure in which it is invoked is exited.</para>
|
|
|
1875 |
</sect3>
|
|
|
1876 |
|
|
|
1877 |
<sect3 id="local_alloc">
|
|
|
1878 |
<title>local_alloc, local_free_all, last_local</title>
|
|
|
1879 |
|
|
|
1880 |
<para>The size of stack frame produced by variable and identify
|
|
|
1881 |
definitions is a translate-time constant since the frame is
|
|
|
1882 |
composed of values whose SHAPEs are known. TDF also allows one to
|
|
|
1883 |
produce dynamically sized local objects which are conceptually
|
|
|
1884 |
part of the frame. These are produced by local_alloc:
|
|
|
1885 |
|
|
|
1886 |
<programlisting>
|
|
|
1887 |
arg1: EXP OFFSET(x, y)
|
|
|
1888 |
-> EXP POINTER(alloca_alignment)
|
|
|
1889 |
</programlisting>
|
|
|
1890 |
|
|
|
1891 |
The operand <token>arg1</token> gives the size of the new object
|
|
|
1892 |
required and the result is a pointer to the space for this object
|
|
|
1893 |
"on top of the stack" as part of the frame. The quotation marks
|
|
|
1894 |
indicate that a translator writer might prefer to maintain a
|
|
|
1895 |
dynamic stack as well as static one. There are some disadvantages
|
|
|
1896 |
in putting everything into one stack which may well out-weigh the
|
|
|
1897 |
trouble of maintaining another stack which is relatively
|
|
|
1898 |
infrequently used. If a frame has a known size, then all
|
|
|
1899 |
addressing of locals can be done using a stack-front register; if
|
|
|
1900 |
it is dynamically sized, then another frame-pointer register must
|
|
|
1901 |
be used - some ABIs make this easy but not all. The majority of
|
|
|
1902 |
procedures contain no local_allocs, so their addressing of locals
|
|
|
1903 |
can always be done relative to a stack-front; only the others have
|
|
|
1904 |
to use another register for a frame pointer.</para>
|
|
|
1905 |
|
|
|
1906 |
<para>The alignment of pointer result is alloca_alignment which must
|
|
|
1907 |
include all SHAPE alignments.</para>
|
|
|
1908 |
|
|
|
1909 |
<para>There are two constructors for releasing space generated by
|
|
|
1910 |
local_alloc. To release all such space generated in the current
|
|
|
1911 |
procedure one does local_free_all(); this reduces the size of the
|
|
|
1912 |
current frame to its static size.</para>
|
|
|
1913 |
|
|
|
1914 |
<para>The other constructor is local_free whch is effectively a
|
|
|
1915 |
"pop" to local_alloc's "push":
|
|
|
1916 |
|
|
|
1917 |
<programlisting>
|
|
|
1918 |
a: EXP OFFSET(x, y)
|
|
|
1919 |
p: EXP POINTER(alloca_alignment)
|
|
|
1920 |
-> EXP TOP
|
|
|
1921 |
</programlisting>
|
|
|
1922 |
|
|
|
1923 |
Here <type>p</type> must evaluate to a pointer generated either by
|
|
|
1924 |
local_alloc or last_local . The effect is to free all of the space
|
|
|
1925 |
locally allocated after p. The usual implementation (with a
|
|
|
1926 |
downward growing stack) of this is that p becomes the "top of
|
|
|
1927 |
stack" pointer.</para>
|
|
|
1928 |
|
|
|
1929 |
<para>The use of a procedure with an untidy_return is just a
|
|
|
1930 |
generalisation of the idea of local_alloc and the space made
|
|
|
1931 |
available by its use can be freed in the same way as normal local
|
|
|
1932 |
allocations. Of course, given that it could be the result of the
|
|
|
1933 |
procedure it can be structured in an arbitrarily complicated
|
|
|
1934 |
way.</para>
|
|
|
1935 |
</sect3>
|
|
|
1936 |
</sect2>
|
|
|
1937 |
|
|
|
1938 |
<sect2 id="heap-storage">
|
|
|
1939 |
<title>Heap storage</title>
|
|
|
1940 |
|
|
|
1941 |
<para>At the moment, there are no explicit constructors of creating
|
|
|
1942 |
dynamic off-stack storage in TDF. Any off-stack storage requirements
|
|
|
1943 |
must be met by the API in which the system is embedded, using the
|
|
|
1944 |
standard procedural interface. For example, the ANSI C API allows
|
|
|
1945 |
the creation of heap space using standard library procedures like
|
|
|
1946 |
malloc.</para>
|
|
|
1947 |
</sect2>
|
|
|
1948 |
</sect1>
|
|
|
1949 |
|
|
|
1950 |
<sect1>
|
|
|
1951 |
<title>Control Flow within procedures</title>
|
|
|
1952 |
|
|
|
1953 |
<sect2>
|
|
|
1954 |
<title>Unconditional flow</title>
|
|
|
1955 |
|
|
|
1956 |
<sect3 id="sequence">
|
|
|
1957 |
<title>sequence</title>
|
|
|
1958 |
|
|
|
1959 |
<para>To perform a sequential set of operations in TDF, one uses the
|
|
|
1960 |
constructor sequence:
|
|
|
1961 |
|
|
|
1962 |
<programlisting>
|
|
|
1963 |
statements: LIST(EXP)
|
|
|
1964 |
result: EXP x
|
|
|
1965 |
-> EXP x
|
|
|
1966 |
</programlisting>
|
|
|
1967 |
|
|
|
1968 |
Each of the <type>statements</type> are evaluated in order,
|
|
|
1969 |
throwing away their results. Then, <type>result</type> is
|
|
|
1970 |
evaluated and its result is the result of the sequence.</para>
|
|
|
1971 |
|
|
|
1972 |
<para>A translator is free to rearrange the order of evaluation if
|
|
|
1973 |
there is no observable difference other than in time or space.
|
|
|
1974 |
This applies anywhere I say "something is evaluated and then ...".
|
|
|
1975 |
We find this kind of statement in definitions of local variables
|
|
|
1976 |
in <link linkend="defining-and-using-locals">section 5.3</link>,
|
|
|
1977 |
and in the controlling parts of the conditional constructions
|
|
|
1978 |
below.</para>
|
|
|
1979 |
|
|
|
1980 |
<para>For a more precise discussion of allowable reorderings see
|
|
|
1981 |
(<fix>S7.14</fix>) .</para>
|
|
|
1982 |
</sect3>
|
|
|
1983 |
</sect2>
|
|
|
1984 |
|
|
|
1985 |
<sect2 id="conditional-flow">
|
|
|
1986 |
<title>Conditional flow</title>
|
|
|
1987 |
|
|
|
1988 |
<sect3 id="labelled">
|
|
|
1989 |
<title>labelled, make_label</title>
|
|
|
1990 |
|
|
|
1991 |
<para>All simple changes of flow of control within a TDF procedure
|
|
|
1992 |
are done by jumps or branches to LABELs, mirroring what actually
|
|
|
1993 |
happens in most computers. There are three constructors which
|
|
|
1994 |
introduce LABELs; the most general is labelled which allows
|
|
|
1995 |
arbitrary jumping between its component EXPs:
|
|
|
1996 |
|
|
|
1997 |
<programlisting>
|
|
|
1998 |
placelabs_intro: LIST(LABEL)
|
|
|
1999 |
starter: EXP x
|
|
|
2000 |
places: LIST(EXP)
|
|
|
2001 |
-> EXP w
|
|
|
2002 |
</programlisting>
|
|
|
2003 |
|
|
|
2004 |
Each of the EXPs in <type>places</type> is labelled by the
|
|
|
2005 |
corresponding LABEL in <type>placelabs_intro</type>; these LABELs
|
|
|
2006 |
are constructed by make_label applied to a TDFINT uniquely drawn
|
|
|
2007 |
from the LABEL name-space introduced by the enclosing PROPS. The
|
|
|
2008 |
evaluation starts by evaluating <type>starter</type>; if this runs
|
|
|
2009 |
to completion the result of the labelled is the result of
|
|
|
2010 |
<type>starter.</type> If there is some jump to a LABEL in
|
|
|
2011 |
<type>placelabs_intro</type> then control passes to the
|
|
|
2012 |
corresponding EXP in <type>places</type> and so on. If any of
|
|
|
2013 |
these EXPS runs to completion then its result is the result of the
|
|
|
2014 |
labelled; hence the SHAPE of the result, w, is the LUB of the
|
|
|
2015 |
SHAPEs of the component EXPs.</para>
|
|
|
2016 |
|
|
|
2017 |
<para>Note that control does not automatically pass from one EXP to
|
|
|
2018 |
the next; if this is required the appropriate EXP must end with an
|
|
|
2019 |
explicit goto.</para>
|
|
|
2020 |
</sect3>
|
|
|
2021 |
|
|
|
2022 |
<sect3 id="goto">
|
|
|
2023 |
<title>goto, make_local_lv, goto_local_lv, long_jump,
|
|
|
2024 |
return_to_label</title>
|
|
|
2025 |
|
|
|
2026 |
<para>The unconditional goto is the simplest method of jumping. In
|
|
|
2027 |
common with all the methods of jumping using LABELs, its LABEL
|
|
|
2028 |
parameter must have been introduced in an enclosing construction,
|
|
|
2029 |
like labelled, which scopes it.</para>
|
|
|
2030 |
|
|
|
2031 |
<para>One can also pick up a label value of SHAPE POINTER {code}
|
|
|
2032 |
(usually implemented as a program address) using make_local_lv for
|
|
|
2033 |
later use by an "indirect jump" such as goto_local_lv . Here the
|
|
|
2034 |
same prohibition holds - the construction which introduced the
|
|
|
2035 |
LABEL must still be active.</para>
|
|
|
2036 |
|
|
|
2037 |
<para>The construction goto_local_lv only permits one to jump within
|
|
|
2038 |
the current procedure; if one wished to do a jump out of a
|
|
|
2039 |
procedure into a calling one, one uses long_jump which requires a
|
|
|
2040 |
pointer to the destination frame (produced by current_env in the
|
|
|
2041 |
destination procedure) as well as the label value. If a long_jump
|
|
|
2042 |
is made to a label, only those local TAGs which have been defined
|
|
|
2043 |
with a visible ACCESS are guaranteed to have preserved their
|
|
|
2044 |
values; the translator could allocate the other TAGs in scope as
|
|
|
2045 |
registers whose values are not necessarily preserved.</para>
|
|
|
2046 |
|
|
|
2047 |
<para>A slightly "shorter" long jump is given by return_to_label.
|
|
|
2048 |
Here the destination of the jump must a label value in the calling
|
|
|
2049 |
procedure. Usually this value would be passed as parameter of the
|
|
|
2050 |
call to provide an alternative exit to the procedure.</para>
|
|
|
2051 |
</sect3>
|
|
|
2052 |
|
|
|
2053 |
<sect3 id="integer_test">
|
|
|
2054 |
<title>integer_test, NTEST</title>
|
|
|
2055 |
|
|
|
2056 |
<para>Conditional branching is provided by the various _test
|
|
|
2057 |
constructors, one for each primitive SHAPE except BITFIELD. I
|
|
|
2058 |
shall use integer_test as the paradigm for them all:
|
|
|
2059 |
|
|
|
2060 |
<programlisting>
|
|
|
2061 |
nt: NTEST
|
|
|
2062 |
dest: LABEL
|
|
|
2063 |
arg1: EXP INTEGER(v)
|
|
|
2064 |
arg2: EXP INTEGER(v)
|
|
|
2065 |
-> EXP TOP
|
|
|
2066 |
</programlisting>
|
|
|
2067 |
|
|
|
2068 |
The NTEST <type>nt</type> chooses a dyadic test (e.g. =, >=,
|
|
|
2069 |
<, etc.) that is to be applied to the results of evaluating
|
|
|
2070 |
<type>arg1</type> and <type>arg2</type>. If <type>arg1</type>
|
|
|
2071 |
<type>nt</type> <type>arg2</type> then the result is TOP;
|
|
|
2072 |
otherwise control passes to the LABEL <type>dest</type>. In other
|
|
|
2073 |
words, integer_test acts like an assertion where if the assertion
|
|
|
2074 |
is false, control passes to the LABEL instead of continuing in the
|
|
|
2075 |
normal fashion.</para>
|
|
|
2076 |
|
|
|
2077 |
<para>Some of the constructors for NTESTs are disallowed for some
|
|
|
2078 |
_tests (e.g. proc_test) while others are redundant for some
|
|
|
2079 |
_tests; for example, not_greater_than is the same as
|
|
|
2080 |
less_than_or_equal for all except possibly floating_test. where
|
|
|
2081 |
the use of NaNs (in the IEEE sense) as operands may give different
|
|
|
2082 |
results.</para>
|
|
|
2083 |
</sect3>
|
|
|
2084 |
|
|
|
2085 |
<sect3 id="case">
|
|
|
2086 |
<title>case</title>
|
|
|
2087 |
|
|
|
2088 |
<para>There are only two other ways of changing flow of control
|
|
|
2089 |
using LABELs. One arises in ERROR_TREATMENTs which will be dealt
|
|
|
2090 |
with in the arithmetic operations. The other is case:
|
|
|
2091 |
|
|
|
2092 |
<programlisting>
|
|
|
2093 |
exhaustive: BOOL
|
|
|
2094 |
control: EXP INTEGER(v)
|
|
|
2095 |
branches: LIST(CASELIM)
|
|
|
2096 |
-> EXP (exhaustive ? BOTTOM : TOP)
|
|
|
2097 |
</programlisting>
|
|
|
2098 |
|
|
|
2099 |
Each CASELIM is constructed using make_caselim:
|
|
|
2100 |
|
|
|
2101 |
<programlisting>
|
|
|
2102 |
branch: LABEL
|
|
|
2103 |
lower: SIGNED_NAT
|
|
|
2104 |
upper: SIGNED_NAT
|
|
|
2105 |
-> CASELIM
|
|
|
2106 |
</programlisting>
|
|
|
2107 |
|
|
|
2108 |
In the case construction, the <type>control</type> EXP is
|
|
|
2109 |
evaluated and tested to see whether its value lies inclusively
|
|
|
2110 |
between some <type>lower</type> and <type>upper</type> in the list
|
|
|
2111 |
of CASELIMs. If so, control passes to the corresponding
|
|
|
2112 |
<type>branch</type>. The order in which these tests are performed
|
|
|
2113 |
is undefined, so it is probably best if the tests are exclusive.
|
|
|
2114 |
The exhaustive flag being true asserts that one of the branches
|
|
|
2115 |
will be taken and so the SHAPE of the result is BOTTOM.
|
|
|
2116 |
Otherwise, if none of the branches are taken, its SHAPE is TOP and
|
|
|
2117 |
execution carries on normally.</para>
|
|
|
2118 |
</sect3>
|
|
|
2119 |
|
|
|
2120 |
<sect3 id="conditional">
|
|
|
2121 |
<title>conditional, repeat</title>
|
|
|
2122 |
|
|
|
2123 |
<para>Besides labelled, two other constructors, conditional and
|
|
|
2124 |
repeat, introduce LABELs which can be used with the various jump
|
|
|
2125 |
instructions. Both can be expressed as labelled, but have extra
|
|
|
2126 |
constraints which make assertions about the use of the LABELs
|
|
|
2127 |
introduced and could usually be translated more efficiently; hence
|
|
|
2128 |
producers are advised to use them where possible. A conditional
|
|
|
2129 |
expression or statement would usually be done using conditional:
|
|
|
2130 |
|
|
|
2131 |
<programlisting>
|
|
|
2132 |
alt_label_intro: LABEL
|
|
|
2133 |
first: EXP x
|
|
|
2134 |
alt: EXP y
|
|
|
2135 |
-> EXP(LUB(x, y))
|
|
|
2136 |
</programlisting>
|
|
|
2137 |
|
|
|
2138 |
Here <type>first</type> is evaluated; if it terminates normally,
|
|
|
2139 |
its result is the result of the conditional. If a jump to
|
|
|
2140 |
<type>alt_label_intro</type> occurs then <type>alt</type> is
|
|
|
2141 |
evaluated and its result is the result of the conditional.
|
|
|
2142 |
Clearly, this, so far, is just the same as
|
|
|
2143 |
labelled((<type>alt_label_intro</type>), <type>first</type>,
|
|
|
2144 |
(<type>alt</type>)). However, conditional imposes the constraint
|
|
|
2145 |
that <type>alt</type> cannot use <type>alt_label_intro</type>. All
|
|
|
2146 |
jumps to <type>alt_label_intro</type> are "forward jumps" - a
|
|
|
2147 |
useful property to know in a translator.</para>
|
|
|
2148 |
|
|
|
2149 |
<para>Obviously, this kind of conditional is rather different to
|
|
|
2150 |
those found in traditional high-level languages which usually have
|
|
|
2151 |
three components, a boolean expression, a then-part and an
|
|
|
2152 |
else-part. Here, the <type>first</type> component includes both
|
|
|
2153 |
the boolean expression and the then-part; usually we find that it
|
|
|
2154 |
is a sequence of the tests (branching to
|
|
|
2155 |
<type>alt_label_intro</type>) forming the boolean expression
|
|
|
2156 |
followed by the else-part. This formulation means that HLL
|
|
|
2157 |
constructions like "andif" and "orelse" do not require special
|
|
|
2158 |
constructions in TDF.</para>
|
|
|
2159 |
|
|
|
2160 |
<para>A simple loop can be expressed using repeat:
|
|
|
2161 |
|
|
|
2162 |
<programlisting>
|
|
|
2163 |
repeat_label_intro: LABEL
|
|
|
2164 |
start: EXP TOP
|
|
|
2165 |
body: EXP y
|
|
|
2166 |
-> EXP y
|
|
|
2167 |
</programlisting>
|
|
|
2168 |
|
|
|
2169 |
The EXP <type>start</type> is evaluated, followed by
|
|
|
2170 |
<type>body</type> which is labelled by
|
|
|
2171 |
<type>repeat_label_intro</type>. If a jump to
|
|
|
2172 |
<type>repeat_label_intro</type> occurs in <type>body</type>, then
|
|
|
2173 |
<type>body</type> is re-evaluated. If <type>body</type> terminates
|
|
|
2174 |
normally then its result is the result of the repeat. This is just
|
|
|
2175 |
the same as:
|
|
|
2176 |
|
|
|
2177 |
<programlisting>
|
|
|
2178 |
labelled((repeat_label_intro), sequence((start), goto(repeat_label_intro)), (body))
|
|
|
2179 |
</programlisting>
|
|
|
2180 |
|
|
|
2181 |
except that no jumps to <type>repeat_label_intro</type> are
|
|
|
2182 |
allowed in <type>start</type> - a useful place to do
|
|
|
2183 |
initialisations for the loop.</para>
|
|
|
2184 |
|
|
|
2185 |
<para>Just as with conditionals, the tests for the continuing or
|
|
|
2186 |
breaking the loop are included in <type>body</type> and require no
|
|
|
2187 |
special constructions.</para>
|
|
|
2188 |
</sect3>
|
|
|
2189 |
</sect2>
|
|
|
2190 |
|
|
|
2191 |
<sect2 id="exceptional-flow">
|
|
|
2192 |
<title>Exceptional flow</title>
|
|
|
2193 |
|
|
|
2194 |
<para>A further way of changing the flow of control in a TDF program
|
|
|
2195 |
is by means of exceptions. TDF exceptions currently arise from three
|
|
|
2196 |
sources - overflow in arithmetic operations with trap
|
|
|
2197 |
ERROR_TREATMENT(see
|
|
|
2198 |
<link linkend="error_treatment">section 8.1.1</link>), an attempt to
|
|
|
2199 |
access a value via a nil pointer using assign_with_mode,
|
|
|
2200 |
contents_with_mode or move_some(see
|
|
|
2201 |
<link linkend="transfer_mode-operations">section 7.3</link>) or a
|
|
|
2202 |
stack overflow on procedure entry with PROCPROPS check_stack(see
|
|
|
2203 |
<link linkend="procprops">section 5.2.2</link>) or a
|
|
|
2204 |
local_alloc_check.</para>
|
|
|
2205 |
|
|
|
2206 |
<para>Each of these exceptions have an ERROR_CODE ascribed to them,
|
|
|
2207 |
namely overflow, nil_access and stack_overflow. Each ERROR_CODE can
|
|
|
2208 |
be made into a distinct NAT by means of the constructor error_val;
|
|
|
2209 |
these NATs could be used, for example, to discriminate the different
|
|
|
2210 |
kinds of errors using a case construction.</para>
|
|
|
2211 |
|
|
|
2212 |
<para>When one of these exceptions is raised, the translator will
|
|
|
2213 |
arrange that a TOKEN, ~Throw, is called with the appropriate
|
|
|
2214 |
ERROR_CODE as its (sole) parameter. Given that every language has a
|
|
|
2215 |
different way of both characterising and handling exceptions, the
|
|
|
2216 |
exact expansion of ~Throw must be given by the producer for that
|
|
|
2217 |
language - usually it will involve doing a long_jump to some label
|
|
|
2218 |
specifying a signal handler and translating the ERROR_CODE into its
|
|
|
2219 |
language-specific representation.</para>
|
|
|
2220 |
|
|
|
2221 |
<para>The expansion of ~Throw forms part of the language specific
|
|
|
2222 |
environment required for the translation of TDF derived from the
|
|
|
2223 |
language, just as the exact shape of FILE must be given for the
|
|
|
2224 |
translation of C.</para>
|
|
|
2225 |
</sect2>
|
|
|
2226 |
</sect1>
|
|
|
2227 |
|
|
|
2228 |
<sect1>
|
|
|
2229 |
<title>Values, variables and assignments.</title>
|
|
|
2230 |
|
|
|
2231 |
<para>TAGs in TDF fulfil the role played by identifiers in most
|
|
|
2232 |
programming languages. One can apply obtain_tag to find the value
|
|
|
2233 |
bound to the TAG. This value is always a constant over the scope of a
|
|
|
2234 |
particular definition of the TAG. This may sound rather strange to
|
|
|
2235 |
those used to the concepts of left-hand and right-hand values in C,
|
|
|
2236 |
for example, but is quite easily explained as follows.</para>
|
|
|
2237 |
|
|
|
2238 |
<para>If a TAG, id, is introduced by an identify, then the value bound
|
|
|
2239 |
is fixed by its <type>definition</type> argument. If, on the other
|
|
|
2240 |
hand, v was a TAG introduced by a variable definition, then the value
|
|
|
2241 |
bound to v is a pointer to fixed space in the procedure frame (i.e.
|
|
|
2242 |
the left-hand value in C).</para>
|
|
|
2243 |
|
|
|
2244 |
<sect2 id="S62">
|
|
|
2245 |
<title>contents</title>
|
|
|
2246 |
|
|
|
2247 |
<para>In order to get the contents of this space (the right-hand value
|
|
|
2248 |
in C), one must apply the contents operator to the pointer:
|
|
|
2249 |
|
|
|
2250 |
<programlisting>
|
|
|
2251 |
contents(shape(v), obtain_tag(v))
|
|
|
2252 |
</programlisting>
|
|
|
2253 |
|
|
|
2254 |
In general, the contents constructor takes a SHAPE and an
|
|
|
2255 |
expression delivering pointer:
|
|
|
2256 |
|
|
|
2257 |
<programlisting>
|
|
|
2258 |
s: SHAPE
|
|
|
2259 |
arg1: EXP POINTER(x)
|
|
|
2260 |
-> EXP s
|
|
|
2261 |
</programlisting>
|
|
|
2262 |
|
|
|
2263 |
It delivers the value of SHAPE <type>s</type>, pointed at by the
|
|
|
2264 |
evaluation of <type>arg1</type>. The alignment of <type>s</type>
|
|
|
2265 |
need not be identical to <varname>x</varname>. It only needs to be
|
|
|
2266 |
included in it; this would allow one, for example, to pick out the
|
|
|
2267 |
first field of a structure from a pointer to it.</para>
|
|
|
2268 |
</sect2>
|
|
|
2269 |
|
|
|
2270 |
<sect2 id="S63">
|
|
|
2271 |
<title>assign</title>
|
|
|
2272 |
|
|
|
2273 |
<para>A simple assignment in TDF is done using assign:
|
|
|
2274 |
|
|
|
2275 |
<programlisting>
|
|
|
2276 |
arg1: EXP POINTER(x)
|
|
|
2277 |
arg2: EXP y
|
|
|
2278 |
-> EXP TOP
|
|
|
2279 |
</programlisting>
|
|
|
2280 |
|
|
|
2281 |
The EXPs <type>arg1</type> and <type>arg2</type> are evaluated (no
|
|
|
2282 |
ordering implied) and the value of SHAPE <varname>y</varname> given by
|
|
|
2283 |
<type>arg2</type> is put into the space pointed at by
|
|
|
2284 |
<type>arg1</type>. Once again, the alignment of <varname>y</varname>
|
|
|
2285 |
need only be included in <varname>x</varname>, allowing the assignment
|
|
|
2286 |
to the first field of a structure using a pointer to the structure. An
|
|
|
2287 |
assignment has no obvious result so its SHAPE is TOP.</para>
|
|
|
2288 |
|
|
|
2289 |
<para>Some languages give results to assignments. For example, C defines
|
|
|
2290 |
the result of an assignment to be its right-hand expression, so that
|
|
|
2291 |
if the result of (v = exp) was required, it would probably be
|
|
|
2292 |
expressed as:
|
|
|
2293 |
|
|
|
2294 |
<programlisting>
|
|
|
2295 |
identify(empty, newtag, exp, sequence((assign(obtain_tag(v), obtain_tag(newtag))), obtain_tag(newtag)))
|
|
|
2296 |
</programlisting>
|
|
|
2297 |
|
|
|
2298 |
From the definition of assign, the destination argument,
|
|
|
2299 |
<type>arg1</type>, must have a POINTER shape. This means that given
|
|
|
2300 |
the TAG id above, assign(obtain_tag(id), lhs) is only legal if the
|
|
|
2301 |
<type>definition</type> of its identify had a POINTER SHAPE. A trivial
|
|
|
2302 |
example would be if id was defined:
|
|
|
2303 |
|
|
|
2304 |
<programlisting>
|
|
|
2305 |
identify(empty, id, obtain_tag(v), assign(obtain_tag(id), lhs))
|
|
|
2306 |
</programlisting>
|
|
|
2307 |
|
|
|
2308 |
This identifies id with the variable v which has a POINTER SHAPE, and
|
|
|
2309 |
assigns lhs to this pointer. Given that id does not occur in lhs, this
|
|
|
2310 |
is identical to:
|
|
|
2311 |
|
|
|
2312 |
<programlisting>
|
|
|
2313 |
assign(obtain_tag(v), lhs).
|
|
|
2314 |
</programlisting>
|
|
|
2315 |
|
|
|
2316 |
Equivalences like this are widely used for transforming TDF in
|
|
|
2317 |
translators and other tools (see
|
|
|
2318 |
<link linkend= "tdf-transformations">section 11</link>).</para>
|
|
|
2319 |
</sect2>
|
|
|
2320 |
|
|
|
2321 |
<sect2 id="transfer_mode-operations">
|
|
|
2322 |
<title>TRANSFER_MODE operations</title>
|
|
|
2323 |
|
|
|
2324 |
<para>The TRANSFER_MODE operations allow one to do assignment and
|
|
|
2325 |
contents operations with various qualifiers to control how the
|
|
|
2326 |
access is done in a more detailed manner than the standard contents
|
|
|
2327 |
and assign operations.</para>
|
|
|
2328 |
|
|
|
2329 |
<para>For example, the value assigned in assign has some fixed SHAPE;
|
|
|
2330 |
its size is known at translate-time. Variable sized objects can be
|
|
|
2331 |
moved by move_some:
|
|
|
2332 |
|
|
|
2333 |
<programlisting>
|
|
|
2334 |
md: TRANSFER_MODE
|
|
|
2335 |
arg1: EXP POINTER x
|
|
|
2336 |
arg2: EXP POINTER y
|
|
|
2337 |
arg3: EXP OFFSET(z, t)
|
|
|
2338 |
-> EXP TOP
|
|
|
2339 |
</programlisting>
|
|
|
2340 |
|
|
|
2341 |
The EXP <type>arg1</type> is the destination pointer, and
|
|
|
2342 |
<type>arg2</type> is a source pointer. The amount moved is given by
|
|
|
2343 |
the OFFSET <type>arg3</type>.</para>
|
|
|
2344 |
|
|
|
2345 |
<para>The TRANSFER_MODE <type>md</type> parameter controls the way
|
|
|
2346 |
that the move will be performed. If overlap is present, then the
|
|
|
2347 |
translator will ensure that the move is equivalent to moving the
|
|
|
2348 |
source into new space and then copying it to the destination; it
|
|
|
2349 |
would probably do this by choosing a good direction in which to step
|
|
|
2350 |
through the value. The alternative, standard_transfer_mode,
|
|
|
2351 |
indicates that it does not matter.</para>
|
|
|
2352 |
|
|
|
2353 |
<para>If the TRANSFER_MODE trap_on_nil is present and
|
|
|
2354 |
<type>arg1</type> is a nil pointer, a TDF exception with ERROR_CODE
|
|
|
2355 |
nil_access is raised.</para>
|
|
|
2356 |
|
|
|
2357 |
<para>There are variants of both the contents and assign constructors.
|
|
|
2358 |
The signature of contents_with_mode is:
|
|
|
2359 |
|
|
|
2360 |
<programlisting>
|
|
|
2361 |
md: TRANSFER_MODE
|
|
|
2362 |
s: SHAPE
|
|
|
2363 |
arg1: EXP POINTER(x)
|
|
|
2364 |
-> EXP s
|
|
|
2365 |
</programlisting>
|
|
|
2366 |
|
|
|
2367 |
Here, the only significant TRANSFER_MODE constructors
|
|
|
2368 |
<type>md</type> are trap_on_nil and volatile. The latter is
|
|
|
2369 |
principally intended to implement the C volatile construction; it
|
|
|
2370 |
certainly means that the contents_with_mode operation will never be
|
|
|
2371 |
"optimised" away.</para>
|
|
|
2372 |
|
|
|
2373 |
<para>Similar considerations apply to assign_with_mode; here the
|
|
|
2374 |
overlap TRANSFER_MODE is also possible with the same meaning as in
|
|
|
2375 |
move_some.</para>
|
|
|
2376 |
</sect2>
|
|
|
2377 |
|
|
|
2378 |
<sect2 id="assigning-and-extracting-bitfields">
|
|
|
2379 |
<title>Assigning and extracting bitfields</title>
|
|
|
2380 |
|
|
|
2381 |
<para>Since pointers to bits are forbidden, two special operations are
|
|
|
2382 |
provided to extract and assign bitfields. These require the use of
|
|
|
2383 |
a pointer value and a bitfield offset from the pointer. The
|
|
|
2384 |
signature of bitfield_contents which extracts a bitfield in this
|
|
|
2385 |
manner is:
|
|
|
2386 |
|
|
|
2387 |
<programlisting>
|
|
|
2388 |
v: BITFIELD_VARIETY
|
|
|
2389 |
arg1: EXP POINTER(x)
|
|
|
2390 |
arg2: EXP OFFSET(y,z)
|
|
|
2391 |
-> EXP bitfield(v)
|
|
|
2392 |
</programlisting>
|
|
|
2393 |
|
|
|
2394 |
Here <type>arg1</type> is a pointer to an alignment
|
|
|
2395 |
<parameter>x</parameter> which includes <type>v</type>, the required
|
|
|
2396 |
bitfield alignment. In practice, <parameter>x</parameter> must
|
|
|
2397 |
include an INTEGER VARIETY whose representation can contain the
|
|
|
2398 |
entire bitfield. Thus on a standard architecture, if v is a 15 bit
|
|
|
2399 |
bitfield, x must include at least a 16 bit integer variety; a 27
|
|
|
2400 |
bitfield would require a 32 bit integer variety and so on. Indeed
|
|
|
2401 |
the constraint is stronger than this since there must be an integer
|
|
|
2402 |
variety, accessible from arg1, which entirely contains the
|
|
|
2403 |
bitfield.</para>
|
|
|
2404 |
|
|
|
2405 |
<para>This constraint means that producers cannot expect that
|
|
|
2406 |
arbitrary bitfield lengths can be accomodated without extra padding;
|
|
|
2407 |
clearly it also means that the maximum bitfield length possible is
|
|
|
2408 |
the maximum size of integer variety that can be implemented on the
|
|
|
2409 |
translator concerned (this is defined to be at least 32). On
|
|
|
2410 |
standard architectures, the producer can expect that an array of
|
|
|
2411 |
bitfields of lenth 2<emphasis>n</emphasis> will be packed without
|
|
|
2412 |
padding; this, of course, includes arrays of booleans. For
|
|
|
2413 |
structures of several different bitfields, he can be sure of no
|
|
|
2414 |
extra padding bits if the total number of bits involved is less than
|
|
|
2415 |
or equal to 32; similarly if he can subdivide the bitfields so that
|
|
|
2416 |
each of the subdivisions (except the last) is exactly equal to 32
|
|
|
2417 |
and the last is less than or equal to 32. This could be relaxed to
|
|
|
2418 |
64 bits if the translator deals with 64 bit integer varieties, but
|
|
|
2419 |
would require that conditional TDF is produced to maintain
|
|
|
2420 |
portability to 32 bit platforms - and on these platforms the
|
|
|
2421 |
assurance of close packing would be lost.</para>
|
|
|
2422 |
|
|
|
2423 |
<para>Since a producer is ignorant of the exact representational
|
|
|
2424 |
varieties used by a translator, the onus is on the translator writer
|
|
|
2425 |
to provide standard tokens which can be used by a producer to
|
|
|
2426 |
achieve both optimum packing of bitfields and minimum alignments for
|
|
|
2427 |
COMPOUNDs containing them(see
|
|
|
2428 |
<link linkend="bitfield-offsets">Bitfield offsets</link>). These
|
|
|
2429 |
tokens would allow one to construct an offset of the form OFFSET(x,
|
|
|
2430 |
b) (where b is some bitfield alignment and x is the `minimum'
|
|
|
2431 |
alignment which could contain it) in a manner analogous to the
|
|
|
2432 |
normal padding operations for offsets. This offset could then used
|
|
|
2433 |
both in the construction of a compound shape and in the extraction
|
|
|
2434 |
and assignment constructors.</para>
|
|
|
2435 |
|
|
|
2436 |
<para>The assignment of bitfields follows the same pattern with the
|
|
|
2437 |
same constraints using bitfield_assign:
|
|
|
2438 |
|
|
|
2439 |
<programlisting>
|
|
|
2440 |
arg1: EXP POINTER(x)
|
|
|
2441 |
arg2: EXP OFFSET(y,z)
|
|
|
2442 |
arg3: EXP BITFIELD_VARIETY(v)
|
|
|
2443 |
-> EXP TOP
|
|
|
2444 |
</programlisting>
|
|
|
2445 |
</para>
|
|
|
2446 |
</sect2>
|
|
|
2447 |
</sect1>
|
|
|
2448 |
|
|
|
2449 |
<sect1>
|
|
|
2450 |
<title>Operations</title>
|
|
|
2451 |
|
|
|
2452 |
<para>Most of the arithmetic operations of TDF have familiar analogues
|
|
|
2453 |
in standard languages and processors. They differ principally in how
|
|
|
2454 |
error conditions (e.g. numeric overflow) are handled. There is a wide
|
|
|
2455 |
diversity in error handling in both languages and processors, so TDF
|
|
|
2456 |
tries to reduce it to the simplest primitive level compatible with
|
|
|
2457 |
their desired operation in languages and their implementation on
|
|
|
2458 |
processors. Before delving into the details of error handling, it is
|
|
|
2459 |
worthwhile revisiting the SHAPEs and ranges in arithmetic
|
|
|
2460 |
VARIETYs.</para>
|
|
|
2461 |
|
|
|
2462 |
<sect2 id="S67">
|
|
|
2463 |
<title>VARIETY and overflow</title>
|
|
|
2464 |
|
|
|
2465 |
<para>An INTEGER VARIETY, for example, is defined by some range of
|
|
|
2466 |
signed natural numbers. A translator will fit this range into some
|
|
|
2467 |
possibly larger range which is convenient for the processor in
|
|
|
2468 |
question. For example, the integers with variety(1,10) would
|
|
|
2469 |
probably be represented as unsigned characters with range (0..255),
|
|
|
2470 |
a convenient representation for both storage and arithmetic.</para>
|
|
|
2471 |
|
|
|
2472 |
<para>The question then arises of what is meant by overflow in an
|
|
|
2473 |
operation which is meant to deliver an integer of this VARIETY - is
|
|
|
2474 |
it when the integer result is outside the range (1..10) or outside
|
|
|
2475 |
the range (0..255)? For purely pragmatic reasons, TDF chooses the
|
|
|
2476 |
latter - the result is overflowed when it is outside its
|
|
|
2477 |
representational range (0..255). If the program insists that it must
|
|
|
2478 |
be within (1..10), then it can always test for it. If the program
|
|
|
2479 |
uses the error handling mechanism and the result is outside (1..10)
|
|
|
2480 |
but still within the representational limits, then, in order for the
|
|
|
2481 |
program to be portable, then the error handling actions must in some
|
|
|
2482 |
sense be "continuous" with the normal action. This would not be the
|
|
|
2483 |
case if, for example, the value was used to index an array with
|
|
|
2484 |
bounds (1..10), but will usually be the case where the value is used
|
|
|
2485 |
in further arithmetic operations which have similar error handling.
|
|
|
2486 |
The arithmetic will continue to give the mathematically correct
|
|
|
2487 |
result provided the representational bounds are not exceeded.</para>
|
|
|
2488 |
|
|
|
2489 |
<para>The limits in a VARIETY are there to provide a guide to its
|
|
|
2490 |
representation, and not to give hard limits to its possible values.
|
|
|
2491 |
This choice is consistent with the general TDF philosophy of how
|
|
|
2492 |
exceptions are to be treated. If, for example, one wishes to do
|
|
|
2493 |
array-bound checking, then it must be done by explicit tests on the
|
|
|
2494 |
indices and jumping to some exception action if they fail.
|
|
|
2495 |
Similarly, explicit tests can be made on an integer value, provided
|
|
|
2496 |
its representational limits are not exceeded. It is unlikely that a
|
|
|
2497 |
translator could produce any more efficient code, in general, if the
|
|
|
2498 |
tests were implicit. The representational limits can be exceeded in
|
|
|
2499 |
arithmetic operations, so facilities are provided to either to
|
|
|
2500 |
ignore it, to allow one to jump to a label, or to obey a TDF
|
|
|
2501 |
exception handler if it happens.</para>
|
|
|
2502 |
|
|
|
2503 |
<sect3 id="error_treatment">
|
|
|
2504 |
<title>ERROR_TREATMENT</title>
|
|
|
2505 |
|
|
|
2506 |
<para>Taking integer addition as an example, plus has signature:
|
|
|
2507 |
|
|
|
2508 |
<programlisting>
|
|
|
2509 |
ov_err: ERROR_TREATMENT
|
|
|
2510 |
arg1: EXP INTEGER(v)
|
|
|
2511 |
arg2: EXP INTEGER(v)
|
|
|
2512 |
-> EXP INTEGER(v)
|
|
|
2513 |
</programlisting>
|
|
|
2514 |
|
|
|
2515 |
The result of the addition has the same integer VARIETY as its
|
|
|
2516 |
parameters. If the representational bounds of
|
|
|
2517 |
<parameter>v</parameter> are exceeded, then the action taken
|
|
|
2518 |
depends on the ERROR_TREATMENT <type>ov_err</type>.</para>
|
|
|
2519 |
|
|
|
2520 |
<para>The ERROR_TREATMENT, impossible, is an assertion by the
|
|
|
2521 |
producer that overflow will not occur; on its head be it if it
|
|
|
2522 |
does.</para>
|
|
|
2523 |
|
|
|
2524 |
<para>The ERROR_TREATMENTS continue and wrap give "fixup" values for
|
|
|
2525 |
the result. For continue the fixup value is undefined. For wrap,
|
|
|
2526 |
the the answer will be modulo 2 to the power of the number of bits
|
|
|
2527 |
in the representational variety.Thus, integer arithmetic with byte
|
|
|
2528 |
representational variety is done modulo 256. This just corresponds
|
|
|
2529 |
to what happens in most processors and, incidentally, the
|
|
|
2530 |
definition of C.</para>
|
|
|
2531 |
|
|
|
2532 |
<para>The ERROR_TREATMENT that one would use if one wished to jump
|
|
|
2533 |
to a label is error_jump:
|
|
|
2534 |
|
|
|
2535 |
<programlisting>
|
|
|
2536 |
lab: LABEL
|
|
|
2537 |
-> ERROR_TREATMENT
|
|
|
2538 |
</programlisting>
|
|
|
2539 |
|
|
|
2540 |
A branch to <type>lab</type> will occur if the result
|
|
|
2541 |
overflows.</para>
|
|
|
2542 |
|
|
|
2543 |
<para>The ERROR_TREATMENT, trap(overflow) will raise a TDF
|
|
|
2544 |
exception(see <link linkend="exceptional-flow">section 6.3</link>)
|
|
|
2545 |
with ERROR_CODE overflow if overflow occurs.</para>
|
|
|
2546 |
</sect3>
|
|
|
2547 |
</sect2>
|
|
|
2548 |
|
|
|
2549 |
<sect2 id="S69">
|
|
|
2550 |
<title>Division and remainder</title>
|
|
|
2551 |
|
|
|
2552 |
<para>The various constructors in involving integer division (e.g.
|
|
|
2553 |
div1, rem1) have two ERROR_TREATMENT parameters, one for overflow
|
|
|
2554 |
and one for divide-by-zero e.g. div1 is:
|
|
|
2555 |
|
|
|
2556 |
<programlisting>
|
|
|
2557 |
div_by_zero_error: ERROR_TREATMENT
|
|
|
2558 |
ov_err: ERROR_TREATMENT
|
|
|
2559 |
arg1: EXP INTEGER(v)
|
|
|
2560 |
arg2: EXP INTEGER(v)
|
|
|
2561 |
-> EXP INTEGER(v)
|
|
|
2562 |
</programlisting>
|
|
|
2563 |
|
|
|
2564 |
There are two different kinds of division operators (with
|
|
|
2565 |
corresponding remainder operators) defined. The operators div2 and
|
|
|
2566 |
rem2 are those generally implemented directly by processor
|
|
|
2567 |
instructions giving the sign of the remainder the same as the sign
|
|
|
2568 |
of the quotient. The other pair, div1 and rem1, is less commonly
|
|
|
2569 |
implemented in hardware, but have rather more consistent
|
|
|
2570 |
mathematical properties; here the sign of remainder is the same as
|
|
|
2571 |
the sign of divisor. Thus, div1(x, 2) is the same as shift_right(x,
|
|
|
2572 |
1) which is only true for div2 if x is positive. The two pairs of
|
|
|
2573 |
operations give the same results if both operands have the same
|
|
|
2574 |
sign. The constructors div0 and rem0 allow the translator to choose
|
|
|
2575 |
whichever of the two forms of division is convenient - the producer
|
|
|
2576 |
is saying that he does not care which is used, as long as they are
|
|
|
2577 |
pairwise consistent. The precise definition of the divide operations
|
|
|
2578 |
is given in (<fix>S7.4</fix>).</para>
|
|
|
2579 |
</sect2>
|
|
|
2580 |
|
|
|
2581 |
<sect2 id="change_variety">
|
|
|
2582 |
<title>change_variety</title>
|
|
|
2583 |
|
|
|
2584 |
<para>Conversions between the various INTEGER varieties are provided
|
|
|
2585 |
for by change_variety:
|
|
|
2586 |
|
|
|
2587 |
<programlisting>
|
|
|
2588 |
ov_err: ERROR_TREATMENT
|
|
|
2589 |
r: VARIETY
|
|
|
2590 |
arg1: EXP INTEGER(v)
|
|
|
2591 |
-> EXP INTEGER(r)
|
|
|
2592 |
</programlisting>
|
|
|
2593 |
|
|
|
2594 |
If the value <type>arg1</type> is outside the limits of the
|
|
|
2595 |
representational variety of <type>r</type>, then the ERROR_TREATMENT
|
|
|
2596 |
<type>ov_err</type> will be invoked.</para>
|
|
|
2597 |
</sect2>
|
|
|
2598 |
|
|
|
2599 |
<sect2 id="S71">
|
|
|
2600 |
<title>and, or, not, xor</title>
|
|
|
2601 |
|
|
|
2602 |
<para>The standard logical operations, and, not, or and xor are
|
|
|
2603 |
provided for all integer varieties. Since integer varieties are
|
|
|
2604 |
defined to be represented in twos-complement the result of these
|
|
|
2605 |
operations are well defined.</para>
|
|
|
2606 |
</sect2>
|
|
|
2607 |
|
|
|
2608 |
<sect2 id="S72">
|
|
|
2609 |
<title>Floating-point operations, ROUNDING_MODE</title>
|
|
|
2610 |
|
|
|
2611 |
<para>All of the floating-point (including complex) operations include
|
|
|
2612 |
ERROR-TREATMENTs. If the result of a floating-point operation cannot
|
|
|
2613 |
be represented in the desired FLOATING_VARIETY, the error treatment
|
|
|
2614 |
is invoked. If the ERROR_TREATMENT is wrap or impossible, the result
|
|
|
2615 |
is undefined; otherwise the jump operates in the same way as for
|
|
|
2616 |
integer operations. Both floating_plus and floating_mult are defined
|
|
|
2617 |
as n-ary operations. In general, floating addition and
|
|
|
2618 |
multiplication are not associative, but a producer may not care
|
|
|
2619 |
about the order in which they are to be performed. Making them
|
|
|
2620 |
appear as though they were associative allows the translator to
|
|
|
2621 |
choose an order which is convenient to the hardware.</para>
|
|
|
2622 |
|
|
|
2623 |
<para>Conversions from integer to floating are done by float_int and
|
|
|
2624 |
from floating to integers by round_with_mode . This latter
|
|
|
2625 |
constructor has a parameter of SORT ROUNDING_MODE which effectively
|
|
|
2626 |
gives the IEEE rounding mode to be applied to the float to produce
|
|
|
2627 |
its integer result.</para>
|
|
|
2628 |
|
|
|
2629 |
<para>One can extract the real and imaginary parts of a complex
|
|
|
2630 |
FLOATING using real_part and imaginary_part. A complex FLOATING can
|
|
|
2631 |
be constructed using make_complex. Normal complex arithmetic applies
|
|
|
2632 |
to all the other FLOATING constructors except for those explicitly
|
|
|
2633 |
excluded (eg floating_abs, floating_max etc.)</para>
|
|
|
2634 |
</sect2>
|
|
|
2635 |
|
|
|
2636 |
<sect2 id="change_bitfield_to_int">
|
|
|
2637 |
<title>change_bitfield_to_int, change_int_to_bitfield</title>
|
|
|
2638 |
|
|
|
2639 |
<para>There are two bit-field operation, change_bitfield_to_int and
|
|
|
2640 |
change_int_to_bitfield to transform between bit-fields and integers.
|
|
|
2641 |
If the varieties do not fit the result is undefined; the producer
|
|
|
2642 |
can always get it right.</para>
|
|
|
2643 |
</sect2>
|
|
|
2644 |
|
|
|
2645 |
<sect2 id="make_compound">
|
|
|
2646 |
<title>make_compound, make_nof, n_copies</title>
|
|
|
2647 |
|
|
|
2648 |
<para>There is one operation to make values of COMPOUND SHAPE,
|
|
|
2649 |
make_compound:
|
|
|
2650 |
|
|
|
2651 |
<programlisting>
|
|
|
2652 |
arg1: EXP OFFSET(base, y)
|
|
|
2653 |
arg2: LIST(EXP)
|
|
|
2654 |
-> EXP COMPOUND(sz)
|
|
|
2655 |
</programlisting>
|
|
|
2656 |
|
|
|
2657 |
The OFFSET <type>arg1</type> is evaluated as a translate-time
|
|
|
2658 |
constant to give <parameter>sz</parameter>, the size of the compound
|
|
|
2659 |
object. The EXPs of arg2 are alternately OFFSETs (also
|
|
|
2660 |
translate-time constants) and values which will be placed at those
|
|
|
2661 |
offsets. This constructor is used to construct values given by
|
|
|
2662 |
structure displays; in C, these only occur with constant
|
|
|
2663 |
<constant>val[i]</constant> in global definitions. It is also used
|
|
|
2664 |
to provide union injectors; here <parameter>sz</parameter> would be
|
|
|
2665 |
the size of the union and the list would probably two elements with
|
|
|
2666 |
the first being an offset_zero.</para>
|
|
|
2667 |
|
|
|
2668 |
<para>Constant sized array values may be constructed using make_nof,
|
|
|
2669 |
make_nof_int, and n_copies. Again, they only occur in C as constants
|
|
|
2670 |
in global definitions.</para>
|
|
|
2671 |
</sect2>
|
|
|
2672 |
</sect1>
|
|
|
2673 |
|
|
|
2674 |
<sect1 id="constants">
|
|
|
2675 |
<title>Constants</title>
|
|
|
2676 |
|
|
|
2677 |
<para>The representation of constants clearly has peculiar difficulties
|
|
|
2678 |
in any architecture neutral format. Leaving aside any problems of how
|
|
|
2679 |
numbers are to be represented, we also have the situation where a
|
|
|
2680 |
"constant" can have different values on different platforms. An
|
|
|
2681 |
obvious example would be the size of a structure which, although it is
|
|
|
2682 |
a constant of any particular run of a program, may have different
|
|
|
2683 |
values on different machines. Further, this constant is in general the
|
|
|
2684 |
result of some computation involving the sizes of its components which
|
|
|
2685 |
are not known until the platform is chosen. In TDF, sizes are always
|
|
|
2686 |
derived from some EXP OFFSET constructed using the various OFFSET
|
|
|
2687 |
arithmetic operations on primitives like shape_offset and offset_zero.
|
|
|
2688 |
Most such EXP OFFSETs produced are in fact constants of the platform;
|
|
|
2689 |
they include field displacements of structure as well as their sizes.
|
|
|
2690 |
TDF assumes that, if these EXPs can be evaluated at translate-time
|
|
|
2691 |
(i.e. when the sizes and alignments of primitive objects are known),
|
|
|
2692 |
then they must be evaluated there. An example of why this is so arises
|
|
|
2693 |
in make_compound; the SHAPE of its result EXP depends on its
|
|
|
2694 |
<type>arg1</type> EXP OFFSET parameter and all SHAPEs must be
|
|
|
2695 |
translate-time values.</para>
|
|
|
2696 |
|
|
|
2697 |
<para>An initialisation of a TAGDEF is a constant in this sense
|
|
|
2698 |
|
|
|
2699 |
<footnote>
|
|
|
2700 |
<para>However see also initial_value in
|
|
|
2701 |
<link linkend="definitions-and-declarations">section 3.2</link>.
|
|
|
2702 |
</para>
|
|
|
2703 |
</footnote>
|
|
|
2704 |
|
|
|
2705 |
; this allows one to ignore any difficulties about their order of
|
|
|
2706 |
evaluation in the UNIT and consequently the order of evaluation of
|
|
|
2707 |
UNITs. Once again all the EXPs which are initialisations must be
|
|
|
2708 |
evaluated before the program is run; this obviously includes any
|
|
|
2709 |
make_proc or make_general_proc. . The limitation on an initialisation
|
|
|
2710 |
EXP to ensure this is basically that one cannot take the contents of a
|
|
|
2711 |
variable declared outside the EXP after all tokens and conditional
|
|
|
2712 |
evaluation is taken into account. In other words, each TDF translator
|
|
|
2713 |
effectively has an TDF interpreter which can do evaluation of
|
|
|
2714 |
expressions (including conditionals etc.) involving only constants
|
|
|
2715 |
such as numbers, sizes and addresses of globals. This corresponds very
|
|
|
2716 |
roughly to the kind of initialisations of globals that are permissible
|
|
|
2717 |
in C; for a more precise definition, see (<fix>S7.3</fix>).</para>
|
|
|
2718 |
|
|
|
2719 |
<sect2 id="cond-constructors">
|
|
|
2720 |
<title>_cond constructors</title>
|
|
|
2721 |
|
|
|
2722 |
<para>Another place where translate-time evaluation of constants is
|
|
|
2723 |
mandated is in the various _cond constructors which give a kind of
|
|
|
2724 |
"conditional compilation" facility; every SORT which has a SORTNAME,
|
|
|
2725 |
other that TAG, TOKEN and LABEL, has one of these constructors e.g.
|
|
|
2726 |
exp_cond:
|
|
|
2727 |
|
|
|
2728 |
<programlisting>
|
|
|
2729 |
control: EXP INTEGER(v)
|
|
|
2730 |
e1: BITSTREAM EXP x
|
|
|
2731 |
e2: BITSTREAM EXP y
|
|
|
2732 |
-> EXP x or EXP y
|
|
|
2733 |
</programlisting>
|
|
|
2734 |
|
|
|
2735 |
The constant, <type>control</type>, is evaluated at translate time.
|
|
|
2736 |
If it is not zero the entire construction is replaced by the EXP in
|
|
|
2737 |
<type>e1</type>; otherwise it is replaced by the one in
|
|
|
2738 |
<type>e2</type>. In either case, the other BITSTREAM is totally
|
|
|
2739 |
ignored; it even does not need to be sensible TDF. This kind of
|
|
|
2740 |
construction is use extensively in C pre-processing directives e.g.:
|
|
|
2741 |
|
|
|
2742 |
<programlisting>
|
|
|
2743 |
#if (sizeof(int) == sizeof(long)) ...
|
|
|
2744 |
</programlisting>
|
|
|
2745 |
</para>
|
|
|
2746 |
</sect2>
|
|
|
2747 |
|
|
|
2748 |
<sect2 id="primitive-constant-constructors">
|
|
|
2749 |
<title>Primitive constant constructors</title>
|
|
|
2750 |
|
|
|
2751 |
<para>Integer constants are constructed using make_int:
|
|
|
2752 |
|
|
|
2753 |
<programlisting>
|
|
|
2754 |
v: VARIETY
|
|
|
2755 |
value: SIGNED_NAT
|
|
|
2756 |
-> EXP INTEGER(v)
|
|
|
2757 |
</programlisting>
|
|
|
2758 |
|
|
|
2759 |
The SIGNED_NAT <type>value</type> is an encoding of the binary value
|
|
|
2760 |
required for the integer; this value must lie within the limits
|
|
|
2761 |
given by <type>v</type>. I have been rather slip-shod in writing
|
|
|
2762 |
down examples of integer constants earlier in this document; where I
|
|
|
2763 |
have written 1 as an integer EXP, for example, I should have written
|
|
|
2764 |
make_int(v, 1) where v is some appropriate VARIETY.</para>
|
|
|
2765 |
|
|
|
2766 |
<para>Constants for both floats and strings use STRINGs. A constant
|
|
|
2767 |
string is just an particular example of make_nof_int:
|
|
|
2768 |
|
|
|
2769 |
<programlisting>
|
|
|
2770 |
v: VARIETY
|
|
|
2771 |
str: STRING(k, n)
|
|
|
2772 |
-> EXP NOF(n, INTEGER(v))
|
|
|
2773 |
</programlisting>
|
|
|
2774 |
|
|
|
2775 |
Each unsigned integer in <type>str</type> must lie in the variety
|
|
|
2776 |
<type>v</type> and the result is the constant array whose elements
|
|
|
2777 |
are the integers considered to be of VARIETY <type>v</type>. An
|
|
|
2778 |
ASCI-C constant string might have <type>v</type> = variety(-128,127)
|
|
|
2779 |
and <parameter>k</parameter> = 7; however, make_nof_int can be used
|
|
|
2780 |
to make strings of any INTEGER VARIETY; a the elements of a Unicode
|
|
|
2781 |
string would be integers of size 16 bits.</para>
|
|
|
2782 |
|
|
|
2783 |
<para>A floating constant uses a STRING which contains the ASCI
|
|
|
2784 |
characters of a expansion of the number to some base in
|
|
|
2785 |
make_floating:
|
|
|
2786 |
|
|
|
2787 |
<programlisting>
|
|
|
2788 |
f: FLOATING_VARIETY
|
|
|
2789 |
rm: ROUNDING_MODE
|
|
|
2790 |
sign: BOOL
|
|
|
2791 |
mantissa: STRING(k, n)
|
|
|
2792 |
base: NAT
|
|
|
2793 |
exponent: SIGNED_NAT
|
|
|
2794 |
-> EXP FLOATING(f)
|
|
|
2795 |
</programlisting>
|
|
|
2796 |
|
|
|
2797 |
For a normal floating point number, each integer in
|
|
|
2798 |
<type>mantissa</type> is either the ASCII `.'-symbol or the ASCII
|
|
|
2799 |
representation of a digit of the representation in the given
|
|
|
2800 |
<type>base</type>; i.e. if c is the ASCII symbol, the digit value is
|
|
|
2801 |
c-'0'. The resulting floating point number has SHAPE FLOATING(f) and
|
|
|
2802 |
value <type>mantissa</type> * <type>base</type>
|
|
|
2803 |
<type>exponent</type> rounded according to <type>rm</type>. Usually
|
|
|
2804 |
the base will be 10 (sometimes 2) and the rounding mode to_nearest.
|
|
|
2805 |
Any floating-point evaluation of expressions done at translate-time
|
|
|
2806 |
will be done to an accuracy greater that implied by the
|
|
|
2807 |
FLOATING_VARIETY involved, so that floating constants will be as
|
|
|
2808 |
accurate as the platform permits.</para>
|
|
|
2809 |
|
|
|
2810 |
<para>The make_floating construct does not apply apply to a complex
|
|
|
2811 |
FLOATING_VARIETY <type>f</type>; to construct a complex constant use
|
|
|
2812 |
make_complex with two make_floating arguments.</para>
|
|
|
2813 |
|
|
|
2814 |
<para>Constants are also provided to give unique null values for
|
|
|
2815 |
pointers, label values and procs i.e.: make_null_ptr,
|
|
|
2816 |
make_null_local_lv and make_null_proc. Any significant use of these
|
|
|
2817 |
values (e.g. taking the contents of a null pointer) is undefined,
|
|
|
2818 |
but they can be assigned and used in tests in the normal way.</para>
|
|
|
2819 |
</sect2>
|
|
|
2820 |
</sect1>
|
|
|
2821 |
|
|
|
2822 |
<sect1 id="tokens-and-apis">
|
|
|
2823 |
<title>Tokens and APIs</title>
|
|
|
2824 |
|
|
|
2825 |
<para>All of the examples of the use of TOKENs so far given have really
|
|
|
2826 |
been as abbreviations for commonly used constructs, e.g. the EXP
|
|
|
2827 |
OFFSETS for fields of structures. However, the real justification for
|
|
|
2828 |
TOKENs are their use as abstractions for things defined in libraries
|
|
|
2829 |
or application program interfaces (APIs).</para>
|
|
|
2830 |
|
|
|
2831 |
<sect2 id="S79">
|
|
|
2832 |
<title>Application programming interfaces</title>
|
|
|
2833 |
|
|
|
2834 |
<para>APIs usually do not give complete language definitions of the
|
|
|
2835 |
operations and values that they contain; generally, they are defined
|
|
|
2836 |
informally in English giving relationships between the entities
|
|
|
2837 |
within them. An API designer should allow implementors the
|
|
|
2838 |
opportunity of choosing actual definitions which fit their hardware
|
|
|
2839 |
and the possibility of changing them as better algorithms or
|
|
|
2840 |
representations become available.</para>
|
|
|
2841 |
|
|
|
2842 |
<para>The most commonly quoted example is the representation of the
|
|
|
2843 |
type FILE and its related operations in C. The ANSI C definition
|
|
|
2844 |
gives no common representation for FILE; its implementation is
|
|
|
2845 |
defined to be platform-dependent. A TDF producer can assume nothing
|
|
|
2846 |
about FILE; not even that it is a structure. The only things that
|
|
|
2847 |
can alter or create FILEs are also entities in the Ansi-C API and
|
|
|
2848 |
they will always refer to FILEs via a C pointer. Thus TDF abstracts
|
|
|
2849 |
FILE as a SHAPE TOKEN with no parameters, make_tok(T_FILE) say. Any
|
|
|
2850 |
program that uses FILE would have to include a TOKDEC introducing
|
|
|
2851 |
T_FILE:
|
|
|
2852 |
|
|
|
2853 |
<programlisting>
|
|
|
2854 |
make_tokdec(T_FILE, empty, shape())
|
|
|
2855 |
</programlisting>
|
|
|
2856 |
|
|
|
2857 |
and anywhere that it wished to refer to the SHAPE of FILE it would
|
|
|
2858 |
do:
|
|
|
2859 |
|
|
|
2860 |
<programlisting>
|
|
|
2861 |
shape_apply_token(make_tok(T_FILE), ())
|
|
|
2862 |
</programlisting>
|
|
|
2863 |
|
|
|
2864 |
Before this program is translated on a given platform, the actual
|
|
|
2865 |
SHAPE of FILE must be supplied. This would be done by linking a TDF
|
|
|
2866 |
CAPSULE which supplies the TOKDEF for the SHAPE of FILE which is
|
|
|
2867 |
particular to the target platform.</para>
|
|
|
2868 |
|
|
|
2869 |
<para>Many of the C operations which use FILEs are explicitly allowed
|
|
|
2870 |
to be expanded as either procedure calls or as macros. For example,
|
|
|
2871 |
putc(c,f) may be implemented either as a procedure call or as the
|
|
|
2872 |
expansion of macro which uses the fields of f directly. Thus, it is
|
|
|
2873 |
quite natural for putc(c, f) to be represented in TDF as an EXP
|
|
|
2874 |
TOKEN with two EXP parameters which allows it to be expanded in
|
|
|
2875 |
either way. Of course, this would be quite distinct from the use of
|
|
|
2876 |
putc as a value (as a proc parameter of a procedure for example)
|
|
|
2877 |
which would require some other representation. One such
|
|
|
2878 |
representation that comes to mind might be to simply to make a
|
|
|
2879 |
TAGDEC for the putc value, supplying its TAGDEF in the Ansi API
|
|
|
2880 |
CAPSULE for the platform. This might prove to be rather
|
|
|
2881 |
short-sighted, since it denies us the possibility that the putc
|
|
|
2882 |
value itself might be expanded from other values and hence it would
|
|
|
2883 |
be better as another parameterless TOKEN. I have not come across an
|
|
|
2884 |
actual API expansion for the putc value as other than a simple TAG;
|
|
|
2885 |
however the FILE* value stdin is sometimes expressed as:
|
|
|
2886 |
|
|
|
2887 |
<programlisting>
|
|
|
2888 |
#define stdin &_iob[0]
|
|
|
2889 |
</programlisting>
|
|
|
2890 |
|
|
|
2891 |
which illustrates the point. It is better to have all of the
|
|
|
2892 |
interface of an API expressed as TOKENs to give both generality and
|
|
|
2893 |
flexibility across different platforms.</para>
|
|
|
2894 |
</sect2>
|
|
|
2895 |
|
|
|
2896 |
<sect2 id="S80">
|
|
|
2897 |
<title>Linking to APIs</title>
|
|
|
2898 |
|
|
|
2899 |
<para>In general, each API requires platform-dependent definitions to
|
|
|
2900 |
be supplied by a combination of TDF linking and system linking for
|
|
|
2901 |
that platform. This is illustrated in the following diagram giving
|
|
|
2902 |
the various phases involved in producing a runnable program.</para>
|
|
|
2903 |
|
|
|
2904 |
<mediaobject>
|
|
|
2905 |
<imageobject>
|
|
|
2906 |
<imagedata fileref="images/guide3.png" format="PNG"/>
|
|
|
2907 |
</imageobject>
|
|
|
2908 |
<textobject>
|
|
|
2909 |
<phrase>TBD</phrase>
|
|
|
2910 |
</textobject>
|
|
|
2911 |
<caption>
|
|
|
2912 |
<para>TBD</para>
|
|
|
2913 |
</caption>
|
|
|
2914 |
</mediaobject>
|
|
|
2915 |
|
|
|
2916 |
<para>There will be CAPSULEs for each API on each platform giving the
|
|
|
2917 |
expansions for the TOKENs involved, usually as uses of identifiers
|
|
|
2918 |
which will be supplied by system linking from some libraries. These
|
|
|
2919 |
CAPSULEs would be derived from the header files on the platform for
|
|
|
2920 |
the API in question, usually using some automatic tools. For
|
|
|
2921 |
example, there will be a TDF CAPSULE (derived from <stdio.h>)
|
|
|
2922 |
which defines the TOKEN T_FILE as the SHAPE for FILE, together with
|
|
|
2923 |
definitions for the TOKENs for putc, stdin, etc., in terms of
|
|
|
2924 |
identifiers which will be found in the library libc.a.</para>
|
|
|
2925 |
|
|
|
2926 |
<sect3 id="S81">
|
|
|
2927 |
<title>Target independent headers, unique_extern</title>
|
|
|
2928 |
|
|
|
2929 |
<para>Any producer which uses an API will use system independent
|
|
|
2930 |
information to give the common interface TOKENs for this API. In
|
|
|
2931 |
the C producer, this is provided by header files using pragmas,
|
|
|
2932 |
which tell the producer which TOKENs to use for the particular
|
|
|
2933 |
constructs of the API . In any target-independent CAPSULE which
|
|
|
2934 |
uses the API, these TOKENs would be introduced as TOKDECs and made
|
|
|
2935 |
globally accessible by using make_linkextern. For a world-wide
|
|
|
2936 |
standard API, the EXTERNAL "name" for a TOKEN used by
|
|
|
2937 |
make_linkextern should be provided by an application of
|
|
|
2938 |
unique_extern on a UNIQUE drawn from a central repository of names
|
|
|
2939 |
for entities in standard APIs; this repository would form a kind
|
|
|
2940 |
of super-standard for naming conventions in all possible APIs. The
|
|
|
2941 |
mechanism for controlling this super-standard has yet to be set
|
|
|
2942 |
up, so at the moment all EXTERN names are created by
|
|
|
2943 |
string_extern.</para>
|
|
|
2944 |
|
|
|
2945 |
<para>An interesting example in the use of TOKENs comes in
|
|
|
2946 |
abstracting field names. Often, an API will say something like
|
|
|
2947 |
"the type Widget is a structure with fields alpha, beta ..."
|
|
|
2948 |
without specifying the order of the fields or whether the list of
|
|
|
2949 |
fields is complete. The field selection operations for Widget
|
|
|
2950 |
should then be expressed using EXP OFFSET TOKENs; each field would
|
|
|
2951 |
have its own TOKEN giving its offset which will be filled in when
|
|
|
2952 |
the target is known. This gives implementors on a particular
|
|
|
2953 |
platform the opportunity to reorder fields or add to them as they
|
|
|
2954 |
like; it also allows for extension of the standard in the same
|
|
|
2955 |
way.</para>
|
|
|
2956 |
|
|
|
2957 |
<para>The most common SORTs of TOKENs used for APIs are SHAPEs to
|
|
|
2958 |
represent types, and EXPs to represent values, including
|
|
|
2959 |
procedures and constants. NATs and VARIETYs are also sometimes
|
|
|
2960 |
used where the API does not specify the types of integers
|
|
|
2961 |
involved. The other SORTs are rarely used in APIs; indeed it is
|
|
|
2962 |
difficult to imagine <emphasis>any</emphasis> realistic use of
|
|
|
2963 |
TOKENs of SORT BOOL. However, the criterion for choosing which
|
|
|
2964 |
SORTs are available for TOKENisation is not their immediate
|
|
|
2965 |
utility, but that the structural integrity and simplicity of TDF
|
|
|
2966 |
is maintained. It is fairly obvious that having BOOL TOKENs will
|
|
|
2967 |
cause no problems, so we may as well allow them.</para>
|
|
|
2968 |
</sect3>
|
|
|
2969 |
</sect2>
|
|
|
2970 |
|
|
|
2971 |
<sect2 id="S82">
|
|
|
2972 |
<title>Language programming interfaces</title>
|
|
|
2973 |
|
|
|
2974 |
<para>So far, I have been speaking as though a TOKENised API could
|
|
|
2975 |
only be some library interface, built on top of some language,
|
|
|
2976 |
like xpg3, posix, X etc. on top of C. However, it is possible to
|
|
|
2977 |
consider the constructions of the language itself as ideal
|
|
|
2978 |
candidates for TOKENisation. For example, the C for-statement
|
|
|
2979 |
could be expressed as TOKEN with four parameters. This TOKEN could
|
|
|
2980 |
be expanded in TDF in several different ways, all giving the
|
|
|
2981 |
correct semantics of a for-statement. A translator (or other
|
|
|
2982 |
tools) could choose the expansion it wants depending on context
|
|
|
2983 |
and the properties of the parameters. The C producer could give a
|
|
|
2984 |
default expansion which a lazy translator writer could use, but
|
|
|
2985 |
others might use expansions which might be more advantageous. This
|
|
|
2986 |
idea could be extended to virtually all the constructions of the
|
|
|
2987 |
language, giving what is in effect a C-language API; perhaps this
|
|
|
2988 |
might be called more properly a language programming interface
|
|
|
2989 |
(LPI). Thus, we would have TOKENs for C for-statements, C
|
|
|
2990 |
conditionals, C procedure calls, C procedure definitions etc.
|
|
|
2991 |
|
|
|
2992 |
<footnote>
|
|
|
2993 |
<para>Exercise for the reader: what are the SORTs of these
|
|
|
2994 |
parameters?</para>
|
|
|
2995 |
|
|
|
2996 |
<para>The current C producer does this for some of the
|
|
|
2997 |
constructs, but not in any systematic manner; perhaps it will
|
|
|
2998 |
change.</para>
|
|
|
2999 |
</footnote>
|
|
|
3000 |
</para>
|
|
|
3001 |
|
|
|
3002 |
<para>The notion of a producer for any language working to an LPI
|
|
|
3003 |
specific to the constructs of the language is very attractive. It
|
|
|
3004 |
could use different TOKENs to reflect the subtle differences
|
|
|
3005 |
between uses of similar constructs in different languages which
|
|
|
3006 |
might be difficult or impossible to detect from their expansions,
|
|
|
3007 |
but which could allow better optimisations in the object code.
|
|
|
3008 |
For example, Fortran procedures are slightly different from C
|
|
|
3009 |
procedures in that they do not allow aliasing between parameters
|
|
|
3010 |
and globals. While application of the standard TDF procedure calls
|
|
|
3011 |
would be semantically correct, knowledge of that the non-aliasing
|
|
|
3012 |
rule applies would allow some procedures to be translated to more
|
|
|
3013 |
efficient code. A translator without knowledge of the semantics
|
|
|
3014 |
implicit in the TOKENs involved would still produce correct code,
|
|
|
3015 |
but one which knew about them could take advantage of that
|
|
|
3016 |
knowledge.</para>
|
|
|
3017 |
|
|
|
3018 |
<para>I also think that LPIs would be a very useful tool for
|
|
|
3019 |
crystalising ideas on how languages should be translated, allowing
|
|
|
3020 |
one to experiment with expansions not thought of by the producer
|
|
|
3021 |
writer. This decoupling is also an escape clause allowing the
|
|
|
3022 |
producer writer to defer the implementation of a construct
|
|
|
3023 |
completely to translate-time or link-time, as is done at the
|
|
|
3024 |
moment in C for off-stack allocation. As such it also serves as a
|
|
|
3025 |
useful test-bed for TOKEN constructions which may in future become
|
|
|
3026 |
new constructors of core TDF.</para>
|
|
|
3027 |
</sect2>
|
|
|
3028 |
</sect1>
|
|
|
3029 |
|
|
|
3030 |
<sect1 id="tdf-transformations">
|
|
|
3031 |
<title>TDF transformations</title>
|
|
|
3032 |
|
|
|
3033 |
<para>TDF to TDF transformations form the basis of most of the tools of
|
|
|
3034 |
TDF, including translators. TDF has a rich set of easily performed
|
|
|
3035 |
transformations; this is mainly due to its algebraic nature, the
|
|
|
3036 |
liberality of its sequencing rules, and the ease with which one can
|
|
|
3037 |
introduce new names over limited scopes. For example, a translator is
|
|
|
3038 |
always free to transform:
|
|
|
3039 |
|
|
|
3040 |
<programlisting>
|
|
|
3041 |
assign(e1, e2)
|
|
|
3042 |
</programlisting>
|
|
|
3043 |
|
|
|
3044 |
to:
|
|
|
3045 |
|
|
|
3046 |
<programlisting>
|
|
|
3047 |
identify(empty, new_tag, e1, assign(obtain_tag(new_tag), e2))
|
|
|
3048 |
</programlisting>
|
|
|
3049 |
|
|
|
3050 |
i.e. identify the evaluation of the left-hand side of the assignment
|
|
|
3051 |
with a new TAG and use that TAG as the left-hand operand of a new
|
|
|
3052 |
assignment in the body of the identification. Note that the reverse
|
|
|
3053 |
transformation is only valid if the evaluation of e1 does not
|
|
|
3054 |
side-effect the evaluation of e2. A producer would have had to use the
|
|
|
3055 |
second form if it wished to evaluate e1 before e2. The definition of
|
|
|
3056 |
assign allows its operands to be evaluated in any order while identify
|
|
|
3057 |
insists that the evaluation of its <type>definition</type> is
|
|
|
3058 |
conceptually complete before starting on its <type>body</type>.</para>
|
|
|
3059 |
|
|
|
3060 |
<para>Why would a translator wish to make the more complicated form from
|
|
|
3061 |
the simpler one? This would usually depend on the particular forms of
|
|
|
3062 |
e1 and e2 as well as the machine idioms available for implementing the
|
|
|
3063 |
assignment. If, for example, the joint evaluation of e1 and e2 used
|
|
|
3064 |
more evaluation registers than is available, the transformation is
|
|
|
3065 |
probably a good idea. It would not necessarily commit one to putting
|
|
|
3066 |
the new tag value into the stack; some other more global criteria
|
|
|
3067 |
might lead one to allocate it into a register disjoint from the
|
|
|
3068 |
evaluation registers. In general, this kind of transformation is used
|
|
|
3069 |
to modify the operands of TDF constructions so that the
|
|
|
3070 |
code-production phase of the translator can just "churn the handle"
|
|
|
3071 |
knowing that the operands are already in the correct form for the
|
|
|
3072 |
machine idioms.</para>
|
|
|
3073 |
|
|
|
3074 |
<para>Transformations like this are also used to give optimisations
|
|
|
3075 |
which are largely independent of the target architecture. In general,
|
|
|
3076 |
provided that the sequencing rules are not violated, any EXP
|
|
|
3077 |
construction, F(X), say, where X is some inner EXP, can be replaced
|
|
|
3078 |
by:
|
|
|
3079 |
|
|
|
3080 |
<programlisting>
|
|
|
3081 |
identify(empty, new_tag, X, F(obtain_tag(new_tag))).
|
|
|
3082 |
</programlisting>
|
|
|
3083 |
|
|
|
3084 |
This includes the extraction of expressions which are constant over a
|
|
|
3085 |
loop; if F was some repeat construction and one can show that the EXP
|
|
|
3086 |
X is invariant over the repeat, the transformation does the constant
|
|
|
3087 |
extraction.</para>
|
|
|
3088 |
|
|
|
3089 |
<para>Most of the transformations performed by translators are of the
|
|
|
3090 |
above form, but there are many others. Particular machine idioms might
|
|
|
3091 |
lead to transformations like changing a test (i>=1) to (i>0)
|
|
|
3092 |
because the test against zero is faster; replacing multiplication by a
|
|
|
3093 |
constant integer by shifts and adds because multiplication is slow;
|
|
|
3094 |
and so on. Target independent transformations include things like
|
|
|
3095 |
procedure inlining and loop unrolling. Often these target independent
|
|
|
3096 |
transformations can be profitably done in terms of the TOKENs of an
|
|
|
3097 |
LPI; loop unrolling is an obvious example.</para>
|
|
|
3098 |
|
|
|
3099 |
<sect2 id="transformations-as-definitions">
|
|
|
3100 |
<title>Transformations as definitions</title>
|
|
|
3101 |
|
|
|
3102 |
<para>As well being a vehicle for expressing optimisation, TDF
|
|
|
3103 |
transformations can be used as the basis for defining TDF. In
|
|
|
3104 |
principle, if we were to define all of the allowable transformations
|
|
|
3105 |
of the TDF constructions, we would have a complete definition of TDF
|
|
|
3106 |
as the initial model of the TDF algebra. This would be a fairly
|
|
|
3107 |
impracticable project, since the totality of the rules including all
|
|
|
3108 |
the simple constructs would be very unwieldy, difficult to check for
|
|
|
3109 |
inconsistencies and would not add much illumination to the
|
|
|
3110 |
definition. However, knowledge of allowable transformations of TDF
|
|
|
3111 |
can often answer questions of the meaning of diverse constructs by
|
|
|
3112 |
relating them to a single construct. What follows is an alphabet of
|
|
|
3113 |
generic transformations which can often help to answer knotty
|
|
|
3114 |
questions. Here, E[X \ Y] denotes an EXP E with all internal
|
|
|
3115 |
occurrences of X replaced by Y.</para>
|
|
|
3116 |
|
|
|
3117 |
<para>If F is any non order-specifying
|
|
|
3118 |
|
|
|
3119 |
<footnote>
|
|
|
3120 |
<para>The order-specifying constructors are conditional,
|
|
|
3121 |
identify, repeat, labelled, sequence and
|
|
|
3122 |
variable.</para>
|
|
|
3123 |
</footnote>
|
|
|
3124 |
|
|
|
3125 |
EXP constructor and E is one of the EXP operands of F, then:
|
|
|
3126 |
|
|
|
3127 |
<programlisting>
|
|
|
3128 |
F(..., E, ...) xde identify(empty, newtag, E, F(..., obtain_tag(newtag), ...))
|
|
|
3129 |
</programlisting>
|
|
|
3130 |
</para>
|
|
|
3131 |
|
|
|
3132 |
<para>If E is a non side-effecting
|
|
|
3133 |
<footnote>
|
|
|
3134 |
<para>A sufficient condition for not side-effecting in this sense
|
|
|
3135 |
is that there are no apply_procs or local_allocs in E; that any
|
|
|
3136 |
assignments in E are to variables defined in E; and that any
|
|
|
3137 |
branches in E are to labels defined in conditionals in
|
|
|
3138 |
E.</para>
|
|
|
3139 |
</footnote>
|
|
|
3140 |
|
|
|
3141 |
EXP and none of the variables used in E are assigned to in B:
|
|
|
3142 |
identify(v, tag, E, B) xde B[obtain_tag(tag) \ E]</para>
|
|
|
3143 |
|
|
|
3144 |
<para>If all uses of tg in B are of the form contents(shape(E),
|
|
|
3145 |
obtain_tag(tg)):
|
|
|
3146 |
|
|
|
3147 |
<programlisting>
|
|
|
3148 |
variable(v, tg, E, B) xde identify(v, nt, E, B[contents(shape(E), obtain_tag(tg)) \ obtain_tag(nt)])
|
|
|
3149 |
sequence((S<subscript>1</subscript>, ..., S<subscript>n</subscript>),
|
|
|
3150 |
sequence((P<subscript>1</subscript>, ..., P<subscript>m</subscript>), R) xdb
|
|
|
3151 |
sequence((S<subscript>1</subscript>, ..., S<subscript>n</subscript>, P<subscript>1</subscript>, ..., P<subscript>m</subscript>), R)
|
|
|
3152 |
</programlisting>
|
|
|
3153 |
</para>
|
|
|
3154 |
|
|
|
3155 |
<para>If S<subscript>i</subscript> =
|
|
|
3156 |
sequence((P<subscript>1</subscript>, ...,
|
|
|
3157 |
P<subscript>m</subscript>), R):
|
|
|
3158 |
|
|
|
3159 |
<programlisting>
|
|
|
3160 |
sequence((S<subscript>1</subscript>, ..., S<subscript>n</subscript>), T) xdb
|
|
|
3161 |
sequence((S<subscript>1</subscript>, ...,
|
|
|
3162 |
S<subscript>i-1</subscript>, P<subscript>1</subscript>, ..., P<subscript>m</subscript>, R,
|
|
|
3163 |
S<subscript>i+1</subscript>, ..., S<subscript>n</subscript>), T) E xdb sequence((), E)
|
|
|
3164 |
</programlisting>
|
|
|
3165 |
</para>
|
|
|
3166 |
|
|
|
3167 |
<para>If D is either identify or variable:
|
|
|
3168 |
|
|
|
3169 |
<programlisting>
|
|
|
3170 |
D(v, tag, sequence((S<subscript>1</subscript>, ..., S<subscript>n</subscript>), R), B) xde
|
|
|
3171 |
sequence((S<subscript>1</subscript>, ..., S<subscript>n</subscript>), D(v, tag, R, B))
|
|
|
3172 |
</programlisting>
|
|
|
3173 |
</para>
|
|
|
3174 |
|
|
|
3175 |
<para>If S<subscript>i</subscript> is an EXP BOTTOM, then:
|
|
|
3176 |
|
|
|
3177 |
<programlisting>
|
|
|
3178 |
sequence((S<subscript>1</subscript>, S<subscript>2</subscript>, ... S<subscript>n</subscript>), R) xde
|
|
|
3179 |
sequence((S<subscript>1</subscript>, ... S<subscript>i - 1</subscript>), S<subscript>i</subscript>)
|
|
|
3180 |
</programlisting>
|
|
|
3181 |
</para>
|
|
|
3182 |
|
|
|
3183 |
|
|
|
3184 |
<para>If E is an EXP BOTTOM, and if D is either identify or variable:
|
|
|
3185 |
|
|
|
3186 |
<programlisting>
|
|
|
3187 |
D(v, tag, E, B) xde E
|
|
|
3188 |
</programlisting>
|
|
|
3189 |
</para>
|
|
|
3190 |
|
|
|
3191 |
<para>If S<subscript>i</subscript> is make_top(), then:
|
|
|
3192 |
|
|
|
3193 |
<programlisting>
|
|
|
3194 |
sequence((S<subscript>1</subscript>,
|
|
|
3195 |
S<subscript>2</subscript>, ... S<subscript>n</subscript>), R) xdb
|
|
|
3196 |
sequence((S<subscript>1</subscript>, ... S<subscript>i - 1</subscript>,
|
|
|
3197 |
S<subscript>i + 1</subscript>, ...S<subscript>n</subscript>), R)
|
|
|
3198 |
</programlisting>
|
|
|
3199 |
</para>
|
|
|
3200 |
|
|
|
3201 |
<para>If S<subscript>n</subscript> is an EXP TOP:
|
|
|
3202 |
<programlisting>
|
|
|
3203 |
sequence((S<subscript>1</subscript>, ...
|
|
|
3204 |
S<subscript>n</subscript>), make_top()) xdb sequence((S<subscript>1</subscript>, ...,
|
|
|
3205 |
S<subscript>n - 1</subscript>), S<subscript>n</subscript>)
|
|
|
3206 |
</programlisting>
|
|
|
3207 |
</para>
|
|
|
3208 |
|
|
|
3209 |
<para>If E is an EXP TOP and E is not side-effecting then:
|
|
|
3210 |
<programlisting>
|
|
|
3211 |
E xde make_top()
|
|
|
3212 |
</programlisting>
|
|
|
3213 |
</para>
|
|
|
3214 |
|
|
|
3215 |
<para>If C is some non order-specifying and non side-effecting
|
|
|
3216 |
constructor, and S<subscript>i</subscript> is C(P<subscript>1</subscript>,...,
|
|
|
3217 |
P<subscript>m</subscript>) where P<subscript>1..m</subscript> are the EXP operands of C:
|
|
|
3218 |
<programlisting>
|
|
|
3219 |
sequence((S<subscript>1</subscript>, ..., S<subscript>n</subscript>), R) xde
|
|
|
3220 |
sequence((S<subscript>1</subscript>, ..., S<subscript>i - 1</subscript>, P<subscript>1</subscript>,
|
|
|
3221 |
..., P<subscript>m</subscript>, S<subscript>i + 1</subscript>, ..., S<subscript>n</subscript>), R)
|
|
|
3222 |
</programlisting>
|
|
|
3223 |
</para>
|
|
|
3224 |
|
|
|
3225 |
<para>If none of the S<subscript>i</subscript> use the label L:
|
|
|
3226 |
<programlisting>
|
|
|
3227 |
conditional(L,
|
|
|
3228 |
sequence((S<subscript>1</subscript>, ..., S<subscript>n</subscript>), R), A) xde
|
|
|
3229 |
sequence((S<subscript>1</subscript>, ..., S<subscript>n</subscript>), conditional(L, R, A))
|
|
|
3230 |
</programlisting>
|
|
|
3231 |
</para>
|
|
|
3232 |
|
|
|
3233 |
<para>If there are no uses of L in X
|
|
|
3234 |
|
|
|
3235 |
<footnote>
|
|
|
3236 |
<para>There are analogous rules for labelled and repeat with
|
|
|
3237 |
unused LABELs.</para>
|
|
|
3238 |
</footnote>
|
|
|
3239 |
|
|
|
3240 |
:
|
|
|
3241 |
<programlisting>
|
|
|
3242 |
conditional(L, X, Y) xde X conditional(L, E, goto(Z)) xde E[L \ Z]
|
|
|
3243 |
</programlisting>
|
|
|
3244 |
</para>
|
|
|
3245 |
|
|
|
3246 |
<para>If EXP X contains no use of the LABEL L:
|
|
|
3247 |
<programlisting>
|
|
|
3248 |
conditional(L, conditional(M, X, Y), Z) xde conditional(M, X, conditional(L, Y, Z))
|
|
|
3249 |
repeat(L, I, E) xde sequence((I), repeat(L, make_top(), E))
|
|
|
3250 |
repeat(L, make_top(), E) xde conditional(Z, E[L \ Z], repeat(L, make_top(), E))
|
|
|
3251 |
</programlisting>
|
|
|
3252 |
</para>
|
|
|
3253 |
|
|
|
3254 |
<para>If there are no uses of L in E:
|
|
|
3255 |
<programlisting>
|
|
|
3256 |
repeat(L, make_top(), sequence((S, E), make_top())) xde
|
|
|
3257 |
conditional(Z, S[L \ Z], repeat(L, make_top(), sequence((E, S), make_top())))
|
|
|
3258 |
</programlisting>
|
|
|
3259 |
</para>
|
|
|
3260 |
|
|
|
3261 |
<para>If f is a procedure defined
|
|
|
3262 |
|
|
|
3263 |
<footnote>
|
|
|
3264 |
<para>This has to be modified if B contains any uses of
|
|
|
3265 |
local_free_all or last_local.</para>
|
|
|
3266 |
</footnote>
|
|
|
3267 |
|
|
|
3268 |
as:
|
|
|
3269 |
|
|
|
3270 |
<programlisting>
|
|
|
3271 |
make_proc(rshape, formal<subscript>1 .. n</subscript>, vtg,
|
|
|
3272 |
B(return R<subscript>1</subscript>, ..., return R<subscript>m</subscript>))
|
|
|
3273 |
</programlisting>
|
|
|
3274 |
|
|
|
3275 |
where:
|
|
|
3276 |
|
|
|
3277 |
<programlisting>
|
|
|
3278 |
formal<subscript>i</subscript> = make_tagshacc(s<subscript>i</subscript>, vi, tgi)
|
|
|
3279 |
</programlisting>
|
|
|
3280 |
|
|
|
3281 |
and B is an EXP with all of its internal return constructors indicated
|
|
|
3282 |
parametrically then:
|
|
|
3283 |
|
|
|
3284 |
<programlisting>
|
|
|
3285 |
if Ai has SHAPE si apply_proc(rshape, f, (A1, ..., An), V) xde
|
|
|
3286 |
variable(empty, newtag, make_value((rshape=BOTTOM) ? TOP : rshape),
|
|
|
3287 |
labelled((L), variable(v1, tg1, A1, ...,
|
|
|
3288 |
variable(vn, tgn, An,
|
|
|
3289 |
variable(empty, vtg, V,
|
|
|
3290 |
B(sequence(assign(obtain_tag(newtag), R1), goto(L)),
|
|
|
3291 |
...,
|
|
|
3292 |
sequence(assign(obtain_tag(newtag), Rm), goto(L))))),
|
|
|
3293 |
contents(rshape, obtain_tag(newtag))))
|
|
|
3294 |
assign(E, make_top()) xde sequence((E), make_top())
|
|
|
3295 |
contents(TOP, E) xde sequence((E), make_top())
|
|
|
3296 |
make_value(TOP) xde make_top()
|
|
|
3297 |
component(s, contents(COMPOUND(S), E), D) xde contents(s, add_to_ptr(E, D))
|
|
|
3298 |
make_compound(S, ((E1, D1), ..., (En, Dn))) xde variable(empty, nt, make_value(COMPOUND(S)),
|
|
|
3299 |
sequence((assign(add_to_ptr(obtain_tag(nt), D1), E1), ...,
|
|
|
3300 |
assign(add_to_ptr(obtain_tag(nt), Dn), En)),
|
|
|
3301 |
contents(S, obtain_tag(nt))))
|
|
|
3302 |
</programlisting>
|
|
|
3303 |
</para>
|
|
|
3304 |
|
|
|
3305 |
<sect3 id="examples-of-transformations">
|
|
|
3306 |
<title>Examples of transformations</title>
|
|
|
3307 |
|
|
|
3308 |
<para>Any of these transformations may be performed by the TDF
|
|
|
3309 |
translators. The most important is probably {A} which allows one
|
|
|
3310 |
to reduce all of the EXP operands of suitable constructors to
|
|
|
3311 |
obtain_tags. The expansion rules for identification, {G}, {H} and
|
|
|
3312 |
{I}, gives definition to complicated operands as well as strangely
|
|
|
3313 |
formed ones, e.g. return(... return(X)...). Rule {A} also
|
|
|
3314 |
illustrates neatly the lack of ordering constraints on the
|
|
|
3315 |
evaluation of operands. For example, mult(et, exp1, exp2) could be
|
|
|
3316 |
expanded by applications of {A} to either:
|
|
|
3317 |
|
|
|
3318 |
<programlisting>
|
|
|
3319 |
identify(empty, t1, exp1, identify(empty, t2, exp2, mult(et, obtain_tag(t1), obtain_tag(t2))))
|
|
|
3320 |
</programlisting>
|
|
|
3321 |
|
|
|
3322 |
or:
|
|
|
3323 |
|
|
|
3324 |
<programlisting>
|
|
|
3325 |
identify(empty, t2, exp2, identify(empty, t1, exp1, mult(et, obtain_tag(t1), obtain_tag(t2))))
|
|
|
3326 |
</programlisting>
|
|
|
3327 |
|
|
|
3328 |
Both orderings of the evaluations of exp1 and exp2 are acceptable,
|
|
|
3329 |
regardless of any side-effects in them. There is no requirement
|
|
|
3330 |
that both expansions should produce the same answer for the
|
|
|
3331 |
multiplications; the only person who can say whether either result
|
|
|
3332 |
is "wrong" is the person who specified the program.</para>
|
|
|
3333 |
|
|
|
3334 |
<para>Many of these transformations often only come into play when
|
|
|
3335 |
some previous transformation reveals some otherwise hidden
|
|
|
3336 |
information. For example, after procedure in-lining given by {U}
|
|
|
3337 |
or loop un-rolling given by {S}, a translator can often deduce the
|
|
|
3338 |
behaviour of a _test constructor, replacing it by either a
|
|
|
3339 |
make_top or a goto. This may allow one to apply either {J} or {H}
|
|
|
3340 |
to eliminate dead code in sequences and in turn {N} or {P} to
|
|
|
3341 |
eliminate entire conditions and so on.</para>
|
|
|
3342 |
|
|
|
3343 |
<para>Application of transformations can also give expansions which
|
|
|
3344 |
are rather pathological in the sense that a producer is very
|
|
|
3345 |
unlikely to form them. For example, a procedure which returns no
|
|
|
3346 |
result would have result statements of the form
|
|
|
3347 |
return(make_top()). In-lining such a procedure by {U} would have a
|
|
|
3348 |
form like:
|
|
|
3349 |
|
|
|
3350 |
<programlisting>
|
|
|
3351 |
variable(empty, nt, make_shape(TOP),
|
|
|
3352 |
labelled((L),
|
|
|
3353 |
... sequence((assign(obtain_tag(nt), make_top())),
|
|
|
3354 |
goto(L)) ...
|
|
|
3355 |
contents(TOP, obtain_tag(nt))
|
|
|
3356 |
)
|
|
|
3357 |
)
|
|
|
3358 |
</programlisting>
|
|
|
3359 |
|
|
|
3360 |
The rules {V}, {W} and {X} allow this to be replaced by:
|
|
|
3361 |
|
|
|
3362 |
<programlisting>
|
|
|
3363 |
variable(empty, nt, make_top(),
|
|
|
3364 |
labelled((L),
|
|
|
3365 |
... sequence((obtain_tag(nt)), goto(L)) ...
|
|
|
3366 |
sequence((obtain_tag(nt)), make_top())
|
|
|
3367 |
)
|
|
|
3368 |
)
|
|
|
3369 |
</programlisting>
|
|
|
3370 |
|
|
|
3371 |
The obtain_tags can be eliminated by rule {M} and then the
|
|
|
3372 |
sequences by {F}. Sucessive applications of {C} and {B} then give:
|
|
|
3373 |
|
|
|
3374 |
<programlisting>
|
|
|
3375 |
labelled((L),
|
|
|
3376 |
... goto(L) ...
|
|
|
3377 |
make_top()
|
|
|
3378 |
)
|
|
|
3379 |
</programlisting>
|
|
|
3380 |
</para>
|
|
|
3381 |
</sect3>
|
|
|
3382 |
|
|
|
3383 |
<sect3 id="S86">
|
|
|
3384 |
<title>Programs with undefined values</title>
|
|
|
3385 |
|
|
|
3386 |
<para>The definitions of most of the constructors in the TDF
|
|
|
3387 |
specification are predicated by some conditions; if these
|
|
|
3388 |
conditions are not met the effect and result of the constructor is
|
|
|
3389 |
not defined for all possible platforms.
|
|
|
3390 |
|
|
|
3391 |
<footnote>
|
|
|
3392 |
<para>However, we may find that the mapping of a constraint
|
|
|
3393 |
allows extra relationships for a class of architectures which
|
|
|
3394 |
do not hold in all generality; this may mean that some
|
|
|
3395 |
constructions are defined on this class while still being
|
|
|
3396 |
undefined in others (see
|
|
|
3397 |
<link linkend="models-of-the-tdf-algebra">section 13</link>).
|
|
|
3398 |
</para>
|
|
|
3399 |
</footnote>
|
|
|
3400 |
|
|
|
3401 |
Any value which is dependent on the effect or result of an
|
|
|
3402 |
undefined construction is also undefined. This is not the same as
|
|
|
3403 |
saying that a program is undefined if it can construct an
|
|
|
3404 |
undefined value - the dynamics of the program might be such that
|
|
|
3405 |
the offending construction is never obeyed.</para>
|
|
|
3406 |
</sect3>
|
|
|
3407 |
</sect2>
|
|
|
3408 |
</sect1>
|
|
|
3409 |
|
|
|
3410 |
<sect1 id="tdf-expansions-of-offsets">
|
|
|
3411 |
<title>TDF expansions of offsets</title>
|
|
|
3412 |
|
|
|
3413 |
<para>Consider the C structure defined by:
|
|
|
3414 |
|
|
|
3415 |
<programlisting>
|
|
|
3416 |
typedef struct{ int i; double d; char c;} mystruct;
|
|
|
3417 |
</programlisting>
|
|
|
3418 |
|
|
|
3419 |
Given that sh_int, sh_char and sh_double are the SHAPEs for int, char
|
|
|
3420 |
and double, the SHAPE of <type>mystruct</type> is constructed by:
|
|
|
3421 |
|
|
|
3422 |
<programlisting>
|
|
|
3423 |
SH_mystruct = compound(S_c)
|
|
|
3424 |
</programlisting>
|
|
|
3425 |
|
|
|
3426 |
where:
|
|
|
3427 |
|
|
|
3428 |
<programlisting>
|
|
|
3429 |
S_c = offset_add(O_c, shape_offset(sh_char))
|
|
|
3430 |
</programlisting>
|
|
|
3431 |
|
|
|
3432 |
where:
|
|
|
3433 |
|
|
|
3434 |
<programlisting>
|
|
|
3435 |
O_c = offset_pad(alignment(sh_char), S_d)
|
|
|
3436 |
</programlisting>
|
|
|
3437 |
|
|
|
3438 |
where:
|
|
|
3439 |
|
|
|
3440 |
<programlisting>
|
|
|
3441 |
S_d = offset_add(O_d, shape_offset(sh_double))
|
|
|
3442 |
</programlisting>
|
|
|
3443 |
|
|
|
3444 |
where:
|
|
|
3445 |
|
|
|
3446 |
<programlisting>
|
|
|
3447 |
O_d = offset_pad(alignment(sh_double), S_i)
|
|
|
3448 |
</programlisting>
|
|
|
3449 |
|
|
|
3450 |
where
|
|
|
3451 |
|
|
|
3452 |
<footnote>
|
|
|
3453 |
<para>I could equally have given simply shape_offset(sh_int) for
|
|
|
3454 |
S_i, but the above formulation is more uniform with respect to
|
|
|
3455 |
selection OFFSETs.</para>
|
|
|
3456 |
</footnote>:
|
|
|
3457 |
|
|
|
3458 |
<programlisting>
|
|
|
3459 |
S_i = offset_add(O_i, shape_offset(sh_int))
|
|
|
3460 |
</programlisting>
|
|
|
3461 |
|
|
|
3462 |
and:
|
|
|
3463 |
|
|
|
3464 |
<programlisting>
|
|
|
3465 |
O_i = offset_zero(alignment(sh_int))
|
|
|
3466 |
</programlisting>
|
|
|
3467 |
|
|
|
3468 |
Each of S_c, S_d and S_i gives the minimum "size" of the space
|
|
|
3469 |
required to upto and including the field c, d and i respectively. Each
|
|
|
3470 |
of O_c, O_d and O_i gives the OFFSET "displacement" from a pointer to
|
|
|
3471 |
a <type>mystruct</type> required to select the fields c, d and i
|
|
|
3472 |
respectively. The C program fragment:
|
|
|
3473 |
|
|
|
3474 |
<programlisting>
|
|
|
3475 |
mystruct s;
|
|
|
3476 |
.... s.d = 1.0; ...
|
|
|
3477 |
</programlisting>
|
|
|
3478 |
|
|
|
3479 |
would translate to something like:
|
|
|
3480 |
|
|
|
3481 |
<programlisting>
|
|
|
3482 |
variable(empty, tag_s, make_value(compound(S_c)),
|
|
|
3483 |
sequence(...,
|
|
|
3484 |
assign(add_to_ptr(obtain_tag(tag_s), O_d), 1.0),
|
|
|
3485 |
...
|
|
|
3486 |
)
|
|
|
3487 |
)
|
|
|
3488 |
</programlisting>
|
|
|
3489 |
|
|
|
3490 |
Each of the OFFSET expressions above are ideal candidates for
|
|
|
3491 |
tokenisation; a producer would probably define tokens for each of them
|
|
|
3492 |
and use exp_apply_token to expand them at each of their uses.</para>
|
|
|
3493 |
|
|
|
3494 |
<para>From the definition, we find that:
|
|
|
3495 |
|
|
|
3496 |
<programlisting>
|
|
|
3497 |
S_c = shape_offset(SH_mystruct)
|
|
|
3498 |
i.e. an OFFSET(alignment(sh_int) xc8 alignment(sh_char) xc8 alignment(sh_double), {})
|
|
|
3499 |
</programlisting>
|
|
|
3500 |
|
|
|
3501 |
This would not be the OFFSET required to describe
|
|
|
3502 |
<type>sizeof(mystruct)</type> in C, since this is defined to be the
|
|
|
3503 |
difference between successive elements an array of
|
|
|
3504 |
<type>mystruct</type>s. The <type>sizeof</type> OFFSET would have to
|
|
|
3505 |
pad S_c to the alignment of SH_mystruct:
|
|
|
3506 |
|
|
|
3507 |
<programlisting>
|
|
|
3508 |
offset_pad(alignment(SH_mystruct), S_c)
|
|
|
3509 |
</programlisting>
|
|
|
3510 |
|
|
|
3511 |
This is the OFFSET that one would use to compute the displacement of
|
|
|
3512 |
an element of an array of <type>mystruct</type>s using offset_mult
|
|
|
3513 |
with the index.</para>
|
|
|
3514 |
|
|
|
3515 |
<para>The most common use of OFFSETs is in add_to_ptr to compute the
|
|
|
3516 |
address of a structure or array element. Looking again at its
|
|
|
3517 |
signature in a slightly different form:
|
|
|
3518 |
|
|
|
3519 |
<programlisting>
|
|
|
3520 |
arg1: EXP POINTER(y xc8 A)
|
|
|
3521 |
arg2: EXP OFFSET(y, z)
|
|
|
3522 |
-> EXP POINTER(z)
|
|
|
3523 |
... for any ALIGNMENT A
|
|
|
3524 |
</programlisting>
|
|
|
3525 |
|
|
|
3526 |
one sees that <type>arg2</type> can measure an OFFSET from a value of
|
|
|
3527 |
a "smaller" alignment than the value pointed at by <type>arg1</type>.
|
|
|
3528 |
If <type>arg2</type> were O_d, for example, then <type>arg1</type>
|
|
|
3529 |
could be a pointer to any structure of the form struct {int i, double
|
|
|
3530 |
d,...} not just <type>mystruct</type>. The general principle is that
|
|
|
3531 |
an OFFSET to a field constructed in this manner is independent of any
|
|
|
3532 |
fields after it, corresponding to normal usage in both languages and
|
|
|
3533 |
machines. A producer for a language which conflicts with this would
|
|
|
3534 |
have to produce less obvious TDF, perhaps by re-ordering the fields,
|
|
|
3535 |
padding the offsets by later alignments or taking maxima of the sizes
|
|
|
3536 |
of the fields.</para>
|
|
|
3537 |
|
|
|
3538 |
<sect2 id="bitfield-offsets">
|
|
|
3539 |
<title>Bitfield offsets</title>
|
|
|
3540 |
|
|
|
3541 |
<para>Bitfield offsets are governed by rather stricter rules. In order
|
|
|
3542 |
to extract or assign a bitfield, we have to find an integer variety
|
|
|
3543 |
which entirely contains the bitfield. Suppose that we wished to
|
|
|
3544 |
extract a bitfield by:
|
|
|
3545 |
|
|
|
3546 |
<programlisting>
|
|
|
3547 |
bitfield_contents(v, p:POINTER(X), b:OFFSET(Y, B))
|
|
|
3548 |
</programlisting>
|
|
|
3549 |
|
|
|
3550 |
Y must be an alignment(I) where I is some integer SHAPE, contained
|
|
|
3551 |
in X. Further, this has to be equivalent to:
|
|
|
3552 |
|
|
|
3553 |
<programlisting>
|
|
|
3554 |
bitfield_contents(v, add_ptr(p, d:OFFSET(Y,Y)), b':OFFSET(Y, B))
|
|
|
3555 |
</programlisting>
|
|
|
3556 |
|
|
|
3557 |
for some d and b' such that:
|
|
|
3558 |
|
|
|
3559 |
<programlisting>
|
|
|
3560 |
offset_pad(v, shape_offset(I)) >= b'
|
|
|
3561 |
</programlisting>
|
|
|
3562 |
|
|
|
3563 |
and
|
|
|
3564 |
|
|
|
3565 |
<programlisting>
|
|
|
3566 |
offset_add(offset_pad(v, offset_mult(d, sizeof(I)), b') = b
|
|
|
3567 |
</programlisting>
|
|
|
3568 |
|
|
|
3569 |
Clearly, we have a limitation on the length of bitfields to the
|
|
|
3570 |
maximum integer variety available; in addition, we cannot have a
|
|
|
3571 |
bitfield which overlaps two such varieties.</para>
|
|
|
3572 |
|
|
|
3573 |
<para>The difficulties inherent in this may be illustrated by
|
|
|
3574 |
attempting to construct an array of bitfields using the nof
|
|
|
3575 |
constructor. Assuming a standard architecture, we find that we
|
|
|
3576 |
cannot usefully define an object of SHAPE nof(N,
|
|
|
3577 |
bitfield(bfvar_bits(b, M))) without padding this shape out to some
|
|
|
3578 |
integer variety which can contain M bits. In addition, they can only
|
|
|
3579 |
be usefully indexed (using bitfield_contents)either if M is some
|
|
|
3580 |
power of 2 or M*N is less than the length of the maximum integer
|
|
|
3581 |
variety. Thus a producer must be sure that these conditions hold if
|
|
|
3582 |
he is to generate and index this object simply. Even here he is in
|
|
|
3583 |
some dificulty, since he does not know the representational
|
|
|
3584 |
varieties of the integers available to him; also it is difficult for
|
|
|
3585 |
him to ensure that the alignment of the entire array is in some
|
|
|
3586 |
sense minimal. Similar difficulties occur with bitfields in
|
|
|
3587 |
structures - they are not restricted to arrays.</para>
|
|
|
3588 |
|
|
|
3589 |
<para>The solution to this conundrum in its full generality requires
|
|
|
3590 |
knowledge of the available representational varieties. Particular
|
|
|
3591 |
languages have restrictions which means that sub-optimal solutions
|
|
|
3592 |
will satisfy its specification on the use of bitfields. For
|
|
|
3593 |
example, C is satisfied with bitfields of maximum length 32 and
|
|
|
3594 |
simple alignment constraints. However, for the general optimal
|
|
|
3595 |
solution, I can see no reasonable alternative to the installer
|
|
|
3596 |
defining some tokens to produce bitfield offsets which are
|
|
|
3597 |
guaranteed to obey the alignment rules and also give optimal packing
|
|
|
3598 |
of fields and alignments of the total object for the platform in
|
|
|
3599 |
question. I believe that three tokens are sufficient to do this;
|
|
|
3600 |
these are analogous to the constructors offset_zero, offset_pad and
|
|
|
3601 |
offset_mult with ordinary alignments and their signatures could be:
|
|
|
3602 |
|
|
|
3603 |
<programlisting>
|
|
|
3604 |
~Bitfield_offset_zero:
|
|
|
3605 |
n: NAT
|
|
|
3606 |
issigned: BOOL
|
|
|
3607 |
-> EXP OFFSET(A, bfvar_bits(issigned, n))
|
|
|
3608 |
</programlisting>
|
|
|
3609 |
|
|
|
3610 |
Here the result is a zero offset to the bitfield with `minimum'
|
|
|
3611 |
integer variety alignment A.
|
|
|
3612 |
|
|
|
3613 |
<programlisting>
|
|
|
3614 |
~Bitfield_offset_pad:
|
|
|
3615 |
n: NAT
|
|
|
3616 |
issigned: BOOL
|
|
|
3617 |
sh: SHAPE
|
|
|
3618 |
-> EXP OFFSET(alignment(sh) xc8 A, bfvar_bits(issigned, n))
|
|
|
3619 |
</programlisting>
|
|
|
3620 |
|
|
|
3621 |
Here the result is the shape_offset of <type>sh</type> padded with
|
|
|
3622 |
the `minimum' alignment A so that it can accomodate the bitfield.
|
|
|
3623 |
Note that this may involve padding <type>sh</type> with the
|
|
|
3624 |
alignment of the maximum integer variety if there are not enough
|
|
|
3625 |
bits left at the end of <type>sh.</type>
|
|
|
3626 |
|
|
|
3627 |
<programlisting>
|
|
|
3628 |
~Bitfield_offset_mult:
|
|
|
3629 |
n: NAT
|
|
|
3630 |
issigned: BOOL
|
|
|
3631 |
ind: EXP INTEGER(v)
|
|
|
3632 |
-> EXP OFFSET(A, bfvar_bits(issigned, n))
|
|
|
3633 |
</programlisting>
|
|
|
3634 |
|
|
|
3635 |
Here the result is an offset which gives the displacement of
|
|
|
3636 |
<type>ind</type>th element of an array of <type>n</type>-bit
|
|
|
3637 |
bitfields with `minimum' alignment A. Note that this will correspond
|
|
|
3638 |
to a normal multiplication only if <type>n</type> is a power of 2 or
|
|
|
3639 |
<type>ind</type> * <type>n</type> <= length of the maximum
|
|
|
3640 |
integer variety.</para>
|
|
|
3641 |
|
|
|
3642 |
<para>These tokens can be expressed in TDF if the lengths of the
|
|
|
3643 |
available varieties are known, i.e., they are installer defined.
|
|
|
3644 |
|
|
|
3645 |
<footnote>
|
|
|
3646 |
<para>For most architectures, these definition are dependent only
|
|
|
3647 |
on a few constants such as the maximum length of bitfield.,
|
|
|
3648 |
expessed as tokens for the target. The precise specification of
|
|
|
3649 |
such target dependent tokens is of current interest outside the
|
|
|
3650 |
scope of this document.</para>
|
|
|
3651 |
</footnote>
|
|
|
3652 |
|
|
|
3653 |
They ought to be used in place of offset_zero, offset_pad and
|
|
|
3654 |
offset_mult whereever the alignment or shape (required to construct
|
|
|
3655 |
a SHAPE or an argument to the bitfield constructs) is a pure
|
|
|
3656 |
bitfield. The constructor nof should never be used on a pure
|
|
|
3657 |
bitfield; instead it should be replaced by:
|
|
|
3658 |
|
|
|
3659 |
<programlisting>
|
|
|
3660 |
S = compound(~Bitfield_offset_mult(M, b, N))
|
|
|
3661 |
</programlisting>
|
|
|
3662 |
|
|
|
3663 |
to give a shape, S, representing an array of N M-bit bitfields. This
|
|
|
3664 |
may not be just N*M bits; for example ~Bitfield_offset_mult may be
|
|
|
3665 |
implemented to pack an array of 3-bit bitfields as 10 fields to a
|
|
|
3666 |
32-bit word. In any case, one would expect that normal rules for
|
|
|
3667 |
offset arithmetic are preserved, e.g.
|
|
|
3668 |
|
|
|
3669 |
<programlisting>
|
|
|
3670 |
offset_add(~Bitfield_offset_pad(M, b, S), size(bitfield(bfvar_bits(b, N)))) =
|
|
|
3671 |
~Bitfield_offset_mult(M, b, N + 1)
|
|
|
3672 |
|
|
|
3673 |
where size(X) = offset_pad(alignment(X), shape_offset(X))
|
|
|
3674 |
</programlisting>
|
|
|
3675 |
</para>
|
|
|
3676 |
</sect2>
|
|
|
3677 |
</sect1>
|
|
|
3678 |
|
|
|
3679 |
<sect1 id="models-of-the-tdf-algebra">
|
|
|
3680 |
<title>Models of the TDF algebra</title>
|
|
|
3681 |
|
|
|
3682 |
<para>TDF is a multi-sorted abstract algebra. Any implementation of TDF
|
|
|
3683 |
is a model of this algebra, formed by a mapping of the algebra into a
|
|
|
3684 |
concrete machine. An algebraic mapping gives a concrete representation
|
|
|
3685 |
to each of the SORTs in such a way that the representation of any
|
|
|
3686 |
construction of TDF is independent of context; it is a homomorphism.
|
|
|
3687 |
In other words if we define the mapping of a TDF constructor, C, as
|
|
|
3688 |
MAP[C] and the representation of a SORT, S, as REPR[S] then:
|
|
|
3689 |
|
|
|
3690 |
<programlisting>
|
|
|
3691 |
REPR[C(P1, ..., Pn)] = MAP[C](REPR(P1), ..., REPR(Pn))
|
|
|
3692 |
</programlisting>
|
|
|
3693 |
|
|
|
3694 |
Any mapping has to preserve the equivalences of the abstract algebra,
|
|
|
3695 |
such as those exemplified by the transformations {A} - {Z} in
|
|
|
3696 |
<link linkend="transformations-as-definitions">section 11.1</link>.
|
|
|
3697 |
Similarly, the mappings of any predicates on the constructions, such
|
|
|
3698 |
as those giving "well-formed" conditions, must be satisfied in terms
|
|
|
3699 |
of the mapped representations.</para>
|
|
|
3700 |
|
|
|
3701 |
<para>In common with most homomorphisms, the mappings of constructions
|
|
|
3702 |
can exhibit more equivalences than are given by the abstract algebra.
|
|
|
3703 |
The use of these extra equivalences is the basis of most of the
|
|
|
3704 |
target-dependent optimisations in a TDF translator; it can make use of
|
|
|
3705 |
"idioms" of the target architecture to produce equivalent
|
|
|
3706 |
constructions which may work faster than the "obvious" translation. In
|
|
|
3707 |
addition, we may find that may find that more predicates are satisfied
|
|
|
3708 |
in a mapping than would be in the abstract algebra. a particular
|
|
|
3709 |
concrete mapping might allow more constructions to be well-formed than
|
|
|
3710 |
are permitted in the abstract; a producer can use this fact to target
|
|
|
3711 |
its output to a particular class of architectures. in this case, the
|
|
|
3712 |
producer should produce tdf so that any translator not targeted to
|
|
|
3713 |
this class can fail gracefully.</para>
|
|
|
3714 |
|
|
|
3715 |
<para>giving a complete mapping for a particular architecture here is
|
|
|
3716 |
tantamount to writing a complete translator. however, the mappings for
|
|
|
3717 |
the small but important sub-algebra concerned with offsets and
|
|
|
3718 |
alignments illustrates many of the main principles. what follows is
|
|
|
3719 |
two sets of mappings for disparate architectures; the first gives a
|
|
|
3720 |
more or less standard meaning to alignments but the second may be less
|
|
|
3721 |
familiar.</para>
|
|
|
3722 |
|
|
|
3723 |
<sect2 id="model-for-32bit-architecture">
|
|
|
3724 |
<title>Model for a 32-bit standard architecture</title>
|
|
|
3725 |
|
|
|
3726 |
<para>Almost all current architectures use a "flat-store" model of
|
|
|
3727 |
memory. There is no enforced segregation of one kind of data from
|
|
|
3728 |
another - in general, one can access one unit of memory as a linear
|
|
|
3729 |
offset from any other. Here, TDF ALIGNMENTs are a reflection of
|
|
|
3730 |
constraints for the efficient access of different kinds of data
|
|
|
3731 |
objects - usually one finds that 32-bit integers are most
|
|
|
3732 |
efficiently accessed if they start at 32 bit boundaries and so
|
|
|
3733 |
on.</para>
|
|
|
3734 |
|
|
|
3735 |
<sect3 id="S91">
|
|
|
3736 |
<title>Alignment model</title>
|
|
|
3737 |
|
|
|
3738 |
<para>The representation of ALIGNMENT in a typical standard
|
|
|
3739 |
architecture is a single integer where:
|
|
|
3740 |
|
|
|
3741 |
<programlisting>
|
|
|
3742 |
REPR[{ }] = 1
|
|
|
3743 |
REPR[{bitfield}] = 1
|
|
|
3744 |
REPR[{char_variety}] = 8
|
|
|
3745 |
REPR[{short_variety}] = 16
|
|
|
3746 |
</programlisting>
|
|
|
3747 |
|
|
|
3748 |
Otherwise, for all other primitive ALIGNMENTS a:
|
|
|
3749 |
|
|
|
3750 |
<programlisting>
|
|
|
3751 |
REPR[{a}] = 32
|
|
|
3752 |
</programlisting>
|
|
|
3753 |
|
|
|
3754 |
The representation of a compound ALIGNMENT is given by:
|
|
|
3755 |
|
|
|
3756 |
<programlisting>
|
|
|
3757 |
REPR[A xc8 B] = Max(REPR[A], REPR[B])
|
|
|
3758 |
i.e. MAP[unite_alignment] = Max
|
|
|
3759 |
</programlisting>
|
|
|
3760 |
|
|
|
3761 |
while the ALIGNMENT inclusion predicate is given by:
|
|
|
3762 |
|
|
|
3763 |
<programlisting>
|
|
|
3764 |
REPR[A ... B]= REPR[A] xb3 REPR[B]
|
|
|
3765 |
</programlisting>
|
|
|
3766 |
|
|
|
3767 |
All the constructions which make ALIGNMENTs are represented here
|
|
|
3768 |
and they will always reduce to an integer known at translate-time.
|
|
|
3769 |
Note that the mappings for xc8 and ... must preserve the basic
|
|
|
3770 |
algebraic properties derived from sets; for example the mapping of
|
|
|
3771 |
xc8 must be idempotent, commutative and associative, which is true
|
|
|
3772 |
for Max.</para>
|
|
|
3773 |
</sect3>
|
|
|
3774 |
|
|
|
3775 |
<sect3 id="offset-and-pointer-model">
|
|
|
3776 |
<title>Offset and pointer model</title>
|
|
|
3777 |
|
|
|
3778 |
<para>Most standard architectures use byte addressing; to address
|
|
|
3779 |
bits requires more complication. Hence, a value with SHAPE
|
|
|
3780 |
POINTER(A) where REPR[A)]xb9 1 is represented by a 32-bit byte
|
|
|
3781 |
address.</para>
|
|
|
3782 |
|
|
|
3783 |
<para>We are not allowed to construct pointers where REPR[A] = 1,
|
|
|
3784 |
but we still have offsets whose second alignment is a bitfield.
|
|
|
3785 |
Thus a offsets to bitfield are represented differently to offsets
|
|
|
3786 |
to other alignments:</para>
|
|
|
3787 |
|
|
|
3788 |
<para>A value with SHAPE OFFSET(A, B) where REPR(B) xb9 1 is
|
|
|
3789 |
represented by a 32-bit byte-offset.</para>
|
|
|
3790 |
|
|
|
3791 |
<para>A value with SHAPE OFFSET(A, B) where REPR(B) = 1 is
|
|
|
3792 |
represented by a 32-bit bit-offset.</para>
|
|
|
3793 |
</sect3>
|
|
|
3794 |
|
|
|
3795 |
<sect3 id="size-model">
|
|
|
3796 |
<title>Size model</title>
|
|
|
3797 |
|
|
|
3798 |
<para>In principle, the representation of a SHAPE is a pair of an
|
|
|
3799 |
ALIGNMENT and a size, given by shape_offset applied to the SHAPE.
|
|
|
3800 |
This pair is constant which can be evaluated at translate time.
|
|
|
3801 |
The construction, shape_offset(S), has SHAPE OFFSET(alignment(s),
|
|
|
3802 |
{ }) and hence is represented by a bit-offset:
|
|
|
3803 |
|
|
|
3804 |
<programlisting>
|
|
|
3805 |
REPR[shape_offset(top())] = 0
|
|
|
3806 |
REPR[shape_offset(integer(char_variety))] = 8
|
|
|
3807 |
REPR[shape_offset(integer(short_variety))] = 16
|
|
|
3808 |
.... etc. for other numeric varieties
|
|
|
3809 |
REPR[shape_offset(pointer(A))]= 32
|
|
|
3810 |
REPR[shape_offset(compound(E))] = REPR[E]
|
|
|
3811 |
REPR[shape_offset(bitfield(bfvar_bits(b, N)))] = N
|
|
|
3812 |
REPR[shape_offset(nof(N, S))] = N * REPR[offset_pad(alignment(S), shape_offset(S))]
|
|
|
3813 |
where S is not a bitfield shape
|
|
|
3814 |
</programlisting>
|
|
|
3815 |
|
|
|
3816 |
Similar considerations apply to the other offset-arithmetic
|
|
|
3817 |
constructors. In general, we have:
|
|
|
3818 |
|
|
|
3819 |
<programlisting>
|
|
|
3820 |
REPR[offset_zero(A)] = 0 for all A
|
|
|
3821 |
|
|
|
3822 |
REPR[offset_pad(A, X:OFFSET(C, D)) =
|
|
|
3823 |
((REPR[X] + REPR[A] - 1) / (REPR[A])) * REPR[A] / 8
|
|
|
3824 |
if REPR[A] xb9 1 xd9 REPR[D] =1
|
|
|
3825 |
</programlisting>
|
|
|
3826 |
|
|
|
3827 |
Otherwise:
|
|
|
3828 |
|
|
|
3829 |
<programlisting>
|
|
|
3830 |
REPR[offset_pad(A, X:OFFSET(C, D)) =
|
|
|
3831 |
((REPR[X] + REPR[A] - 1) / (REPR[A])) * REPR[A]
|
|
|
3832 |
|
|
|
3833 |
REPR[offset_add(X:OFFSET(A, B), Y:OFFSET(C, D))] = REPR[X] * 8 + REPR[Y]
|
|
|
3834 |
if REPR[B] xb9 1 xd9 REPR[D] = 1
|
|
|
3835 |
</programlisting>
|
|
|
3836 |
|
|
|
3837 |
Otherwise:
|
|
|
3838 |
|
|
|
3839 |
<programlisting>
|
|
|
3840 |
REPR[offset_add(X, Y)] = REPR[X] + REPR[Y]
|
|
|
3841 |
|
|
|
3842 |
REPR[offset_max(X:OFFSET(A, B), Y:OFFSET(C, D))] = Max(REPR[X], 8 * REPR[Y]
|
|
|
3843 |
if REPR[B] = 1 xd9 REPR[D] xb9 1
|
|
|
3844 |
REPR[offset_max(X:OFFSET(A, B), Y:OFFSET(C, D))] = Max(8 * REPR[X], REPR[Y]
|
|
|
3845 |
if REPR[D] = 1 xd9 REPR[B] xb9 1
|
|
|
3846 |
</programlisting>
|
|
|
3847 |
|
|
|
3848 |
Otherwise:
|
|
|
3849 |
|
|
|
3850 |
<programlisting>
|
|
|
3851 |
REPR[offset_max(X, Y)] = Max(REPR[X], REPR[Y])
|
|
|
3852 |
|
|
|
3853 |
REPR[offset_mult(X, E)] = REPR[X] * REPR[E]
|
|
|
3854 |
</programlisting>
|
|
|
3855 |
|
|
|
3856 |
IA translator working to this model maps ALIGNMENTs into the
|
|
|
3857 |
integers and their inclusion constraints into numerical
|
|
|
3858 |
comparisons. As a result, it will correctly allow many OFFSETs
|
|
|
3859 |
which are disallowed in general; for example, OFFSET({pointer},
|
|
|
3860 |
{char_variety}) is allowed since REPR[{pointer}] xb3
|
|
|
3861 |
REPR[{char_variety}]. Rather fewer of these extra relationships
|
|
|
3862 |
are allowed in the next model considered.</para>
|
|
|
3863 |
</sect3>
|
|
|
3864 |
</sect2>
|
|
|
3865 |
|
|
|
3866 |
<sect2>
|
|
|
3867 |
<title>Model for machines like the iAPX-432</title>
|
|
|
3868 |
|
|
|
3869 |
<para>The iAPX-432 does not have a linear model of store. The address
|
|
|
3870 |
of a word in store is a pair consisting of a block-address and a
|
|
|
3871 |
displacement within that block. In order to take full advantage of
|
|
|
3872 |
the protection facilities of the machine, block-addresses are
|
|
|
3873 |
strictly segregated from scalar data like integers, floats,
|
|
|
3874 |
displacements etc. There are at least two different kind of blocks,
|
|
|
3875 |
one which can only contain block-addresses and the other which
|
|
|
3876 |
contains only scalar data. There are clearly difficulties here in
|
|
|
3877 |
describing data-structures which contain both pointers and scalar
|
|
|
3878 |
data.</para>
|
|
|
3879 |
|
|
|
3880 |
<para>Let us assume that the machine has bit-addressing to avoid the
|
|
|
3881 |
bit complications already covered in the first model. Also assume
|
|
|
3882 |
that instruction blocks are just scalar blocks and that block
|
|
|
3883 |
addresses are aligned on 32-bit boundaries.</para>
|
|
|
3884 |
|
|
|
3885 |
<sect3 id="S95">
|
|
|
3886 |
<title>Alignment model</title>
|
|
|
3887 |
|
|
|
3888 |
<para>An ALIGNMENT is represented by a pair consisting of an
|
|
|
3889 |
integer, giving the natural alignments for scalar data, and
|
|
|
3890 |
boolean to indicate the presence of a block-address. Denote this
|
|
|
3891 |
by:
|
|
|
3892 |
|
|
|
3893 |
<programlisting>
|
|
|
3894 |
(s:alignment_of_scalars, b:has_blocks)
|
|
|
3895 |
</programlisting>
|
|
|
3896 |
|
|
|
3897 |
We then have:
|
|
|
3898 |
|
|
|
3899 |
<programlisting>
|
|
|
3900 |
REPR[alignment({ })] = (s:1, b:FALSE)
|
|
|
3901 |
REPR[alignment({char_variety}) = (s:8, b:FALSE)
|
|
|
3902 |
... etc. for other numerical and bitfield varieties.
|
|
|
3903 |
REPR[alignment({pointer})] = (s:32, b:TRUE)
|
|
|
3904 |
REPR[alignment({proc})] = (s:32, b:TRUE)
|
|
|
3905 |
REPR[alignment({local_label_value})] = (s:32, b:TRUE)
|
|
|
3906 |
</programlisting>
|
|
|
3907 |
|
|
|
3908 |
The representation of a compound ALIGNMENT is given by:
|
|
|
3909 |
|
|
|
3910 |
<programlisting>
|
|
|
3911 |
REPR[A xc8 B] = (s:Max(REPR[A].s, REPR[B].s), b:REPR[A].b xda REPR[B].b)
|
|
|
3912 |
</programlisting>
|
|
|
3913 |
|
|
|
3914 |
and their inclusion relationship is given by:
|
|
|
3915 |
|
|
|
3916 |
<programlisting>
|
|
|
3917 |
REPR[A ... B] = (REPR[A].s xb3 REPR[B].s) xd9 (REPR[A].b xda REPR[B].b)
|
|
|
3918 |
</programlisting>
|
|
|
3919 |
</para>
|
|
|
3920 |
</sect3>
|
|
|
3921 |
|
|
|
3922 |
<sect3 id="S96">
|
|
|
3923 |
<title>Offset and pointer model</title>
|
|
|
3924 |
|
|
|
3925 |
<para>A value with SHAPE POINTER A where REPR[A].b is represented
|
|
|
3926 |
by a pair consisting of a block-address of a scalar block and an
|
|
|
3927 |
integer bit-displacement within that block. Denote this by:
|
|
|
3928 |
|
|
|
3929 |
<programlisting>
|
|
|
3930 |
(sb:scalar_block_address, sd:bit_displacement)
|
|
|
3931 |
</programlisting>
|
|
|
3932 |
|
|
|
3933 |
A value with SHAPE POINTER A where REPR[A].b is represented by a
|
|
|
3934 |
quad word consisting of two block-addresses and two
|
|
|
3935 |
bit-displacements within these blocks. One of these block
|
|
|
3936 |
addresses will contain the scalar information pointed at by one of
|
|
|
3937 |
the bit-displacements; similarly, the other pair will point at the
|
|
|
3938 |
block addresses in the data are held. Denote this by:
|
|
|
3939 |
|
|
|
3940 |
<programlisting>
|
|
|
3941 |
(sb:scalar_block_address, ab:address_block_address,
|
|
|
3942 |
sd:scalar_displacement, ad:address_displacement)
|
|
|
3943 |
</programlisting>
|
|
|
3944 |
</para>
|
|
|
3945 |
|
|
|
3946 |
<para>A value with SHAPE OFFSET(A, B) where REPR[A].b is represented
|
|
|
3947 |
by an integer bit-displacement.</para>
|
|
|
3948 |
|
|
|
3949 |
<para>A value with SHAPE OFFSET(A, B) where REPR[A].b is represented
|
|
|
3950 |
by a pair of bit-displacements, one relative to a scalar-block and
|
|
|
3951 |
the other to an address-block. Denote this by:
|
|
|
3952 |
|
|
|
3953 |
<programlisting>
|
|
|
3954 |
(sd:scalar_displacement, ad:address_displacement)
|
|
|
3955 |
</programlisting>
|
|
|
3956 |
</para>
|
|
|
3957 |
</sect3>
|
|
|
3958 |
|
|
|
3959 |
<sect3 id="S97">
|
|
|
3960 |
<title>Size model</title>
|
|
|
3961 |
|
|
|
3962 |
<para>The sizes given by shape_offset are now:
|
|
|
3963 |
|
|
|
3964 |
<programlisting>
|
|
|
3965 |
REPR[shape_offset(integer(char_variety))] = 8
|
|
|
3966 |
... etc. for other numerical and bitfield varieties.
|
|
|
3967 |
REPR[shape_offset(pointer(A))] = (REPR[A].b) ? (sd:64, ad:64) : (sd:32, ad:32)
|
|
|
3968 |
REPR[shape_offset(offset(A, B))] = (REPR[A].b) ? 64 : 32)
|
|
|
3969 |
REPR[shape_offset(proc)] = (sd:32, ad:32)
|
|
|
3970 |
REPR[shape_offset(compound(E))] = REPR[E]
|
|
|
3971 |
REPR[shape_offset(nof(N, S))] = N * REPR[offset_pad(alignment(S)), shape_offset(S))]
|
|
|
3972 |
REPR[shape_offset(top)] = 0
|
|
|
3973 |
</programlisting>
|
|
|
3974 |
</para>
|
|
|
3975 |
</sect3>
|
|
|
3976 |
|
|
|
3977 |
<sect3 id="S98">
|
|
|
3978 |
<title>Offset arithmetic</title>
|
|
|
3979 |
|
|
|
3980 |
<para>The other OFFSET constructors are given by:</para>
|
|
|
3981 |
|
|
|
3982 |
<programlisting>
|
|
|
3983 |
REPR[offset_zero(A)] = 0 if REPR[A].b
|
|
|
3984 |
REPR[offset_zero(A)] = (sd: 0, ad: 0) if REPR[A].b
|
|
|
3985 |
|
|
|
3986 |
REPR[offset_add(X:OFFSET(A, B), Y:OFFSET(C, D))] = REPR[X] + REPR[Y]
|
|
|
3987 |
if REPR[A].b xd9 REPR[C].b
|
|
|
3988 |
REPR[offset_add(X:OFFSET(A, B), Y:OFFSET(C, D))] =
|
|
|
3989 |
(sd:REPR[X].sd + REPR[Y].sd, ad: REPR[X].ad + REPR[Y].ad)
|
|
|
3990 |
if REPR[A].b xd9 REPR[C].b
|
|
|
3991 |
REPR[offset_add(X:OFFSET(A, B), Y:OFFSET(C, D))] =
|
|
|
3992 |
(sd:REPR[X].sd + REPR[Y], ad:REPR[X].ad)
|
|
|
3993 |
if REPR[A].b xd9 REPR[C].b
|
|
|
3994 |
|
|
|
3995 |
REPR[offset_pad(A, Y:OFFSET(C, D))] = (REPR[Y] + REPR[A].s - 1) / REPR[A].s
|
|
|
3996 |
if REPR[A].b xd9 REPR[C].b
|
|
|
3997 |
REPR[offset_pad(A, Y:OFFSET(C, D))] =
|
|
|
3998 |
(sd:(REPR[Y] + REPR[A].s - 1) / REPR[A].s, ad:REPR[Y].ad)
|
|
|
3999 |
if REPR[C].b
|
|
|
4000 |
REPR[offset_pad(A, Y: OFFSET(C, D))] =
|
|
|
4001 |
(sd:(REPR[Y] + REPR[A].s - 1) / REPR[A].s, ad:0)
|
|
|
4002 |
if REPR[A].b xd9 REPR[C].b
|
|
|
4003 |
REPR[offset_max(X:OFFSET(A, B), Y:OFFSET(C, D))] = Max(REPR[X], REPR[Y])
|
|
|
4004 |
if REPR[A].b xd9 REPR[C].b
|
|
|
4005 |
REPR[offset_max(X:OFFSET(A, B), Y:OFFSET(C, D))] =
|
|
|
4006 |
(sd:Max(REPR[X].sd, REPR[Y].sd), ad:Max(REPR[X].a, REPR[Y].ad))
|
|
|
4007 |
if REPR[A].b xd9 REPR[C].b
|
|
|
4008 |
REPR[offset_max(X:OFFSET(A, B), Y:OFFSET(C, D))] =
|
|
|
4009 |
(sd:Max(REPR[X].sd, REPR[Y]), ad:REPR[X].ad)
|
|
|
4010 |
if REPR[A].b xd9 REPR[C].b
|
|
|
4011 |
REPR[offset_max(X:OFFSET(A, B), Y:OFFSET(C, D))] =
|
|
|
4012 |
(sd:Max(REPR[Y].sd, REPR[X]), ad: REPR[Y].ad)
|
|
|
4013 |
if REPR[C].b xd9 REPR[A].b
|
|
|
4014 |
|
|
|
4015 |
REPR[offset_subtract(X:OFFSET(A, B), Y:OFFSET(C, D))] = REPR[X] - REPR[Y]
|
|
|
4016 |
if REPR[A].b xd9 REPR[C].b
|
|
|
4017 |
REPR[offset_subtract(X:OFFSET(A,B), Y:OFFSET(C, D))] =
|
|
|
4018 |
(sd:REPR[X].sd - REPR[Y].sd, ad:REPR[X].ad - REPR[Y].ad)
|
|
|
4019 |
if REPR[A].b xd9 REPR[C].b
|
|
|
4020 |
REPR[offset_add(X:OFFSET(A, B), Y:OFFSET(C, D))] = REPR[X].sd - REPR[Y]
|
|
|
4021 |
if REPR[A].b xd9 REPR[C].b
|
|
|
4022 |
.... and so on.
|
|
|
4023 |
</programlisting>
|
|
|
4024 |
|
|
|
4025 |
<para>Unlike the previous one, this model of ALIGNMENTs would reject
|
|
|
4026 |
OFFSETs such as OFFSET({long_variety}, {pointer}) but not
|
|
|
4027 |
OFFSET({pointer}, {long_variety}) since:</para>
|
|
|
4028 |
|
|
|
4029 |
<programlisting>
|
|
|
4030 |
REPR[{long_variety} ... {pointer}] = FALSE
|
|
|
4031 |
</programlisting>
|
|
|
4032 |
|
|
|
4033 |
<para>but:</para>
|
|
|
4034 |
|
|
|
4035 |
<programlisting>
|
|
|
4036 |
REPR[{pointer} ... {long_variety}] = TRUE
|
|
|
4037 |
</programlisting>
|
|
|
4038 |
|
|
|
4039 |
<para>This just reflects the fact that there is no way that one can
|
|
|
4040 |
extract a block-address necessary for a pointer from a
|
|
|
4041 |
scalar-block, but since the representation of a pointer includes a
|
|
|
4042 |
scalar displacement, one can always retrieve a scalar from a
|
|
|
4043 |
pointer to a pointer.</para>
|
|
|
4044 |
</sect3>
|
|
|
4045 |
</sect2>
|
|
|
4046 |
</sect1>
|
|
|
4047 |
|
|
|
4048 |
<sect1>
|
|
|
4049 |
<title>Conclusion</title>
|
|
|
4050 |
|
|
|
4051 |
<para>This commentary is not complete. I have tended to go into
|
|
|
4052 |
considerable detail into aspects which I consider might be unfamiliar
|
|
|
4053 |
and skip over others which occur in most compiling systems. I also
|
|
|
4054 |
have a tendency to say things more than once, albeit in different
|
|
|
4055 |
words; however if something is worth saying, it is worth saying
|
|
|
4056 |
twice.</para>
|
|
|
4057 |
|
|
|
4058 |
<para>I shall continue tracking further revisions of the TDF
|
|
|
4059 |
specification in later releases or appendices to this document.</para>
|
|
|
4060 |
</sect1>
|
|
|
4061 |
</chapter>
|
|
|
4062 |
</book>
|