<!-- Crown Copyright (c) 1998 -->
<HTML>
<HEAD>
<TITLE>
C++ Producer Guide: Implementation 
</TITLE>
</HEAD>
<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000FF" VLINK="#400080" ALINK="#FF0000">

<H1>C++ Producer Guide</H1>
<H3>March 1998</H3>
<A HREF="std.html"><IMG SRC="../images/next.gif" ALT="next section"></A>
<A HREF="link.html"><IMG SRC="../images/prev.gif" ALT="previous section"></A>
<A HREF="index.html"><IMG SRC="../images/top.gif" ALT="current document"></A>
<A HREF="../index.html"><IMG SRC="../images/home.gif" ALT="TenDRA home page">
</A>
<IMG SRC="../images/no_index.gif" ALT="document index"><P>
<HR>

<DL>
<DT><A HREF="#arith"><B>2.6.1</B> - Arithmetic types</A><DD>
<DT><A HREF="#literal"><B>2.6.2</B> - Integer literal types</A><DD>
<DT><A HREF="#bitfield"><B>2.6.3</B> - Bitfield types</A><DD>
<DT><A HREF="#pointer"><B>2.6.4</B> - Generic pointers</A><DD>
<DT><A HREF="#conv"><B>2.6.5</B> - Undefined conversions</A><DD>
<DT><A HREF="#div"><B>2.6.6</B> - Integer division</A><DD>
<DT><A HREF="#call"><B>2.6.7</B> - Calling conventions</A><DD>
<DT><A HREF="#ptr_mem"><B>2.6.8</B> - Pointers to data members</A><DD>
<DT><A HREF="#ptr_mem_func"><B>2.6.9</B> - Pointers to function members</A><DD>
<DT><A HREF="#class"><B>2.6.10</B> - Class layout</A><DD>
<DT><A HREF="#derive"><B>2.6.11</B> - Derived class layout</A><DD>
<DT><A HREF="#constr"><B>2.6.12</B> - Constructors and destructors</A><DD>
<DT><A HREF="#vtable"><B>2.6.13</B> - Virtual function tables</A><DD>
<DT><A HREF="#rtti"><B>2.6.14</B> - Run-time type information</A><DD>
<DT><A HREF="#init"><B>2.6.15</B> - Dynamic initialisation</A><DD>
<DT><A HREF="#except"><B>2.6.16</B> - Exception handling</A><DD>
<DT><A HREF="#mangle"><B>2.6.17</B> - Mangled identifier names</A><DD>
</DL>
<HR>

<H2>2.6. Implementation details</H2>
<P>
This section describes various implementation details of the TDF
output of the C++ producer.  In particular, it describes the standard
TDF tokens used to represent the target dependent aspects of the language
and to provide links into the run-time system.  Many of these tokens
are common to the C and C++ producers.  Those which are unique to
the C++ producer have names of the form <CODE>~cpp.*</CODE>.  Note
that the description is in terms of TDF tokens, not the internal tokens
introduced by the 
<A HREF="token.html"><CODE>#pragma token</CODE> syntax</A>. 
</P>
<P>
There are two levels of implementation in the run-time system.  The
actual interface between the producer and the run-time system is given
by the standard tokens.  The provided implementation defines these
tokens in a way appropriate to itself.  An alternative implementation
would have to define the tokens differently.  It is intended that
the standard tokens are sufficiently generic to allow a variety of
implementations to hook into the producer output in the manner they
require. 
</P>

<HR>
<H3><A NAME="arith">2.6.1. Arithmetic types</A></H3>
<P>
The representations of the basic arithmetic types are target dependent,
so, for example, an <CODE>int</CODE> may contain 16, 32, 64 or some
other number of bits.  Thus it is necessary to introduce a token to
stand for each of the built-in arithmetic types (including the 
<A HREF="pragma.html#longlong"><CODE>long long</CODE> types</A>).
Each integral type is represented by a <CODE>VARIETY</CODE> token
as follows: 
</P>
<CENTER>
<TABLE BORDER>
<TR><TH>Type</TH>
<TH>Token</TH>
<TH>Encoding</TH>
<TR><TD ALIGN=CENTER>char</TD>
<TD ALIGN=CENTER>~char</TD>
<TD ALIGN=CENTER>0</TD>
<TR><TD ALIGN=CENTER>signed char</TD>
<TD ALIGN=CENTER>~signed_char</TD>
<TD ALIGN=CENTER>0 | 4 = 4</TD>
<TR><TD ALIGN=CENTER>unsigned char</TD>
<TD ALIGN=CENTER>~unsigned_char</TD>
<TD ALIGN=CENTER>0 | 8 = 8</TD>
<TR><TD ALIGN=CENTER>signed short</TD>
<TD ALIGN=CENTER>~signed_short</TD>
<TD ALIGN=CENTER>1 | 4 = 5</TD>
<TR><TD ALIGN=CENTER>unsigned short</TD>
<TD ALIGN=CENTER>~unsigned_short</TD>
<TD ALIGN=CENTER>1 | 8 = 9</TD>
<TR><TD ALIGN=CENTER>signed int</TD>
<TD ALIGN=CENTER>~signed_int</TD>
<TD ALIGN=CENTER>2 | 4 = 6</TD>
<TR><TD ALIGN=CENTER>unsigned int</TD>
<TD ALIGN=CENTER>~unsigned_int</TD>
<TD ALIGN=CENTER>2 | 8 = 10</TD>
<TR><TD ALIGN=CENTER>signed long</TD>
<TD ALIGN=CENTER>~signed_long</TD>
<TD ALIGN=CENTER>3 | 4 = 7</TD>
<TR><TD ALIGN=CENTER>unsigned long</TD>
<TD ALIGN=CENTER>~unsigned_long</TD>
<TD ALIGN=CENTER>3 | 8 = 11</TD>
<TR><TD ALIGN=CENTER>signed long long</TD>
<TD ALIGN=CENTER>~signed_longlong</TD>
<TD ALIGN=CENTER>3 | 4 | 16 = 23</TD>
<TR><TD ALIGN=CENTER>unsigned long long</TD>
<TD ALIGN=CENTER>~unsigned_longlong</TD>
<TD ALIGN=CENTER>3 | 8 | 16 = 27</TD>
</TABLE>
</CENTER>
<P>
Similarly, each floating point type is represented by a 
<CODE>FLOATING_VARIETY</CODE> token: 
</P>
<CENTER>
<TABLE BORDER>
<TR><TH>Type</TH>   <TH>Token</TH>
<TR><TD ALIGN=CENTER>float</TD>  <TD ALIGN=CENTER>~float</TD>
<TR><TD ALIGN=CENTER>double</TD> <TD ALIGN=CENTER>~double</TD>
<TR><TD ALIGN=CENTER>long double</TD> <TD ALIGN=CENTER>~long_double</TD>
</TABLE>
</CENTER>
<P>
Each integral type also has an encoding as a <CODE>SIGNED_NAT</CODE>
as shown above.  This number is a bit pattern built up from the following
values: 
</P>
<CENTER>
<TABLE BORDER>
<TR><TH>Type</TH>   <TH>Encoding</TH>
<TR><TD ALIGN=CENTER>char</TD>  <TD ALIGN=CENTER>0</TD>
<TR><TD ALIGN=CENTER>short</TD>  <TD ALIGN=CENTER>1</TD>
<TR><TD ALIGN=CENTER>int</TD>  <TD ALIGN=CENTER>2</TD>
<TR><TD ALIGN=CENTER>long</TD>  <TD ALIGN=CENTER>3</TD>
<TR><TD ALIGN=CENTER>signed</TD> <TD ALIGN=CENTER>4</TD>
<TR><TD ALIGN=CENTER>unsigned</TD> <TD ALIGN=CENTER>8</TD>
<TR><TD ALIGN=CENTER>long long</TD> <TD ALIGN=CENTER>16</TD>
</TABLE>
</CENTER>
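<P>
The bit pattern in the table above can be checked with a short C++ sketch.
The enumerator names and the helper <CODE>encode</CODE> are hypothetical and
are not part of the producer; only the numeric values come from the tables.
</P>

```cpp
#include <cstdint>

// Component values taken from the encoding table.
enum : std::int32_t {
    kChar = 0, kShort = 1, kInt = 2, kLong = 3,
    kSigned = 4, kUnsigned = 8, kLongLong = 16
};

// Hypothetical helper: OR together a base type, a sign and the
// optional "long long" marker to form the SIGNED_NAT encoding.
constexpr std::int32_t encode(std::int32_t base, std::int32_t sign,
                              bool long_long = false) {
    return base | sign | (long_long ? kLongLong : 0);
}

// Spot checks against the first table in this section.
static_assert(encode(kInt, kSigned) == 6, "signed int");
static_assert(encode(kShort, kUnsigned) == 9, "unsigned short");
static_assert(encode(kLong, kUnsigned, true) == 27, "unsigned long long");
```

<P>
Note that the <CODE>long long</CODE> types reuse the <CODE>long</CODE> base
value and add 16, which is why their encodings are 23 and 27.
</P>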
<P>
Any target dependent integral type can be represented by a 
<CODE>SIGNED_NAT</CODE> token using this encoding.  This representation,
rather than one based on <CODE>VARIETY</CODE>s, is used for ease of
manipulation.  The token: 
<PRE>
	~convert : ( SIGNED_NAT ) -&gt; VARIETY
</PRE>
gives the mapping from the integral encoding to the representing variety.
For example, it will map <CODE>6</CODE> to <CODE>~signed_int</CODE>.
</P>
<P>
The token: 
<PRE>
	~promote : ( SIGNED_NAT ) -&gt; SIGNED_NAT
</PRE>
describes how to form the promotion of an integral type according
to the ISO C/C++ value preserving rules, and is used by the producer
to represent target dependent promotion types.  For example, the promotion
of <CODE>unsigned short</CODE> may be <CODE>int</CODE> or
<CODE>unsigned int</CODE> depending on the representation of these types; that is
to say, <CODE>~promote ( 9 )</CODE> will be <CODE>6</CODE> on some
machines and <CODE>10</CODE> on others.  Although <CODE>~promote</CODE>
is used by default, a program may specify another token with the same
sort signature to be used in its place by means of the directive:
<PRE>
	#pragma TenDRA compute promote <I>identifier</I>
</PRE>
For example, a standard token <CODE>~sign_promote</CODE> is defined
which gives the older C sign preserving promotion rules.  In addition,
the promotion of an individual type can be specified using: 
<PRE>
	#pragma TenDRA promoted <I>type-id</I> : <I>promoted-type-id</I>
</PRE>
</P>
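<P>
As an illustration of why <CODE>~promote ( 9 )</CODE> is target dependent,
the following sketch evaluates the value preserving rule for
<CODE>unsigned short</CODE> on a hypothetical target described only by its
bit widths.  The <CODE>Target</CODE> structure and the function name are
assumptions made for this example; a real definition of the token is supplied
by the target's token library.
</P>

```cpp
// Hypothetical description of a target: bit widths of two types.
struct Target {
    int short_bits;
    int int_bits;
};

// Encodings from the arithmetic types table.
constexpr int kSignedInt = 6;
constexpr int kUnsignedInt = 10;

// Value preserving promotion of unsigned short: the result is int if
// int can represent every unsigned short value, else unsigned int.
// A signed int of N bits holds values up to 2^(N-1) - 1, so it covers
// an unsigned short of S bits exactly when N - 1 >= S.
constexpr int promote_unsigned_short(Target t) {
    return (t.int_bits - 1 >= t.short_bits) ? kSignedInt : kUnsignedInt;
}
```

<P>
With 16 bit shorts the sketch yields 6 when <CODE>int</CODE> has 32 bits and
10 when <CODE>int</CODE> has 16 bits.
</P>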
<P>
The token: 
<PRE>
	~arith_type : ( SIGNED_NAT, SIGNED_NAT ) -&gt; SIGNED_NAT
</PRE>
similarly describes how to form the usual arithmetic result type from
two promoted integral operand types.  For example, the arithmetic
type of <CODE>long</CODE> and <CODE>unsigned int</CODE> may be 
<CODE>long</CODE> or <CODE>unsigned long</CODE> depending on the representation
of these types; that is to say, 
<CODE>~arith_type ( 7, 10 )</CODE> will be <CODE>7</CODE> on some
machines and <CODE>11</CODE> on others. 
</P>
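<P>
The <CODE>~arith_type ( 7, 10 )</CODE> example can be sketched in the same
way.  The <CODE>Target</CODE> structure and function name below are
hypothetical; the rule applied is the usual arithmetic conversion for
<CODE>long</CODE> and <CODE>unsigned int</CODE> operands.
</P>

```cpp
// Hypothetical description of a target: bit widths of two types.
struct Target {
    int int_bits;
    int long_bits;
};

// Encodings from the arithmetic types table.
constexpr int kSignedLong = 7;
constexpr int kUnsignedLong = 11;

// Arithmetic type of long and unsigned int: long if long can represent
// every unsigned int value, otherwise unsigned long.
constexpr int arith_long_uint(Target t) {
    return (t.long_bits - 1 >= t.int_bits) ? kSignedLong : kUnsignedLong;
}
```

<P>
A target with 32 bit <CODE>int</CODE> and 64 bit <CODE>long</CODE> gives 7,
while one on which both types have 32 bits gives 11.
</P>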
<P>
Any tokenised type declared using: 
<PRE>
	#pragma token VARIETY v # tv
</PRE>
will be represented by a <CODE>SIGNED_NAT</CODE> token with external
name 
<CODE>tv</CODE> corresponding to the encoding of <CODE>v</CODE>. 
Special cases of this are the implementation dependent integral types
which arise naturally within the language.  The external token names
for these types are given below: 
</P>
<CENTER>
<TABLE BORDER>
<TR><TH>Type</TH>   <TH>Token</TH>
<TR><TD ALIGN=CENTER>bool</TD>  <TD ALIGN=CENTER>~cpp.bool</TD>
<TR><TD ALIGN=CENTER>ptrdiff_t</TD> <TD ALIGN=CENTER>ptrdiff_t</TD>
<TR><TD ALIGN=CENTER>size_t</TD> <TD ALIGN=CENTER>size_t</TD>
<TR><TD ALIGN=CENTER>wchar_t</TD> <TD ALIGN=CENTER>wchar_t</TD>
</TABLE>
</CENTER>
<P>
So, for example, a <CODE>sizeof</CODE> expression has shape 
<CODE>~convert ( size_t )</CODE>.  The token <CODE>~cpp.bool</CODE>
is defined in the default implementation, but the other tokens are
defined according to their definitions on the target machine in the
normal API library building mechanism. 
</P>

<HR>
<H3><A NAME="literal">2.6.2. Integer literal types</A></H3>
<P>
The <A HREF="pragma.html#int">type of an integer literal</A> is defined
in terms of a list of possible integral types: the first type in the
list in which the literal value can be represented gives the type
of the literal.  For small literals it is possible to work out the
type exactly; for larger literals, however, the result is target dependent.
For example, the literal <CODE>50000</CODE> will have type <CODE>int</CODE>
on machines in which <CODE>50000</CODE> fits into an <CODE>int</CODE>,
and 
<CODE>long</CODE> otherwise.  This target dependent mapping is given
by a series of tokens of the form: 
<PRE>
	~lit_* : ( SIGNED_NAT ) -&gt; SIGNED_NAT
</PRE>
which map a literal value to the representation of an integral type.
The token used depends on the list of possible types, which in turn
depends on the base used to represent the literal and the integer
suffix used, as given in the following table: 
</P>
<CENTER>
<TABLE BORDER>
<TR><TH>Base</TH>
<TH>Suffix</TH>
<TH>Token</TH>
<TH>Types</TH>
<TR><TD ALIGN=CENTER>decimal</TD>
<TD ALIGN=CENTER>none</TD>
<TD ALIGN=CENTER>~lit_int</TD>
<TD ALIGN=CENTER>int, long, unsigned long</TD>
<TR><TD ALIGN=CENTER>octal</TD>
<TD ALIGN=CENTER>none</TD>
<TD ALIGN=CENTER>~lit_hex</TD>
<TD ALIGN=CENTER>int, unsigned int, long, unsigned long</TD>
<TR><TD ALIGN=CENTER>hexadecimal</TD>
<TD ALIGN=CENTER>none</TD>
<TD ALIGN=CENTER>~lit_hex</TD>
<TD ALIGN=CENTER>int, unsigned int, long, unsigned long</TD>
<TR><TD ALIGN=CENTER>any</TD>
<TD ALIGN=CENTER>U</TD>
<TD ALIGN=CENTER>~lit_unsigned</TD>
<TD ALIGN=CENTER>unsigned int, unsigned long</TD>
<TR><TD ALIGN=CENTER>any</TD>
<TD ALIGN=CENTER>L</TD>
<TD ALIGN=CENTER>~lit_long</TD>
<TD ALIGN=CENTER>long, unsigned long</TD>
<TR><TD ALIGN=CENTER>any</TD>
<TD ALIGN=CENTER>UL</TD>
<TD ALIGN=CENTER>~lit_ulong</TD>
<TD ALIGN=CENTER>unsigned long</TD>
<TR><TD ALIGN=CENTER>any</TD>
<TD ALIGN=CENTER>LL</TD>
<TD ALIGN=CENTER>~lit_longlong</TD>
<TD ALIGN=CENTER>long long, unsigned long long</TD>
<TR><TD ALIGN=CENTER>any</TD>
<TD ALIGN=CENTER>ULL</TD>
<TD ALIGN=CENTER>~lit_ulonglong</TD>
<TD ALIGN=CENTER>unsigned long long</TD>
</TABLE>
</CENTER>
<P>
Thus, for example, the shape of the integer literal 50000 is: 
<PRE>
	~convert ( ~lit_int ( 50000 ) )
</PRE>
</P>
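<P>
A minimal sketch of what a definition of <CODE>~lit_int</CODE> has to compute
is given below, assuming a hypothetical target described by the bit widths of
<CODE>int</CODE> and <CODE>long</CODE>.  The names <CODE>Target</CODE>,
<CODE>signed_max</CODE> and <CODE>lit_int</CODE> are illustrative only; the
returned numbers are the type encodings from section 2.6.1.
</P>

```cpp
#include <cstdint>

// Hypothetical description of a target: bit widths of int and long.
struct Target {
    int int_bits;
    int long_bits;
};

// Largest value of a signed type with the given width (1 <= bits <= 64).
constexpr std::uint64_t signed_max(int bits) {
    return (std::uint64_t{1} << (bits - 1)) - 1;
}

// Walk the list "int, long, unsigned long" for an unsuffixed decimal
// literal and return the encoding of the first type that can hold it.
// Values too large even for unsigned long are not diagnosed here.
constexpr int lit_int(std::uint64_t value, Target t) {
    return value <= signed_max(t.int_bits)  ? 6    // signed int
         : value <= signed_max(t.long_bits) ? 7    // signed long
                                            : 11;  // unsigned long
}
```

<P>
For the literal 50000 this gives 6 on a target with 32 bit
<CODE>int</CODE> and 7 on a target with 16 bit <CODE>int</CODE> and 32 bit
<CODE>long</CODE>, matching the example in the text.
</P>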

<HR>
<H3><A NAME="bitfield">2.6.3. Bitfield types</A></H3>
<P>
The sign of a plain bitfield type, declared without using 
<CODE>signed</CODE> or <CODE>unsigned</CODE>, is left unspecified
in C and C++.  The token: 
<PRE>
	~cpp.bitf_sign : ( SIGNED_NAT ) -&gt; BOOL
</PRE>
is used to give a mapping from integral types to the sign of a plain
bitfield of that type, in a form suitable for use in the TDF 
<CODE>bfvar_bits</CODE> construct.  (Note that <CODE>~cpp.bitf_sign</CODE>
should have been a standard C token but was omitted.) 
</P>

<HR>
<H3><A NAME="pointer">2.6.4. Generic pointers</A></H3>
<P>
TDF has no concept of a generic pointer type, so tokens are used to
defer the representation of <CODE>void *</CODE> and the basic operations
on it to the target machine.  The fundamental token is: 
<PRE>
	~ptr_void : () -&gt; SHAPE
</PRE>
which gives the representation of <CODE>void *</CODE>.  This shape
will be denoted by <CODE>pv</CODE> in the description of the following
tokens.  It is not guaranteed that <CODE>pv</CODE> is a TDF <CODE>pointer</CODE>
shape, although normally it will be implemented as a pointer to a
suitable alignment. 
</P>
<P>
The token: 
<PRE>
	~null_pv : () -&gt; EXP pv
</PRE>
gives the value of a null pointer of type <CODE>void *</CODE>.  Generic
pointers can also be converted to and from other pointers.  These
conversions are represented by the tokens: 
<PRE>
	~to_ptr_void : ( ALIGNMENT a, EXP POINTER a ) -&gt; EXP pv
	~from_ptr_void : ( ALIGNMENT a, EXP pv ) -&gt; EXP POINTER a
</PRE>
where the given alignment describes the destination or source pointer
type.  Finally a generic pointer may be tested against the null pointer
or two generic pointers may be compared.  These operations are represented
by the tokens: 
<PRE>
	~pv_test : ( EXP pv, LABEL, NTEST ) -&gt; EXP TOP
	~cpp.pv_compare : ( EXP pv, EXP pv, LABEL, NTEST ) -&gt; EXP TOP
</PRE>
where the given <CODE>NTEST</CODE> gives the comparison to be applied
and the given label gives the destination to jump to if the test fails.
(Note that <CODE>~cpp.pv_compare</CODE> should have been a standard
C token but was omitted.) 
</P>

<HR>
<H3><A NAME="conv">2.6.5. Undefined conversions</A></H3>
<P>
Several conversions in C and C++ can only be represented by undefined
TDF.  For example, converting a pointer to an integer can only be
represented in TDF by forming a union of the pointer and integer shapes,
putting the pointer into the union and pulling the integer out.  Such
conversions are tokenised.  Undefined conversions not mentioned below
may be performed by combining those given with the standard, well-defined,
conversions. 
</P>
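<P>
The "put the pointer in, pull the integer out" idea can be illustrated at
the source level.  The sketch below is only an analogy to the TDF union
construction: it copies the bytes of a pointer into an integer with
<CODE>memcpy</CODE>, which is the well-defined C++ way to reinterpret a
representation.  The function name is hypothetical, and the availability of
<CODE>std::uintptr_t</CODE> is assumed.
</P>

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret the stored bytes of a pointer as an unsigned integer.
// The numeric result is target dependent, which is exactly why the
// producer defers such conversions to tokens.
inline std::uintptr_t pointer_bits(const void *p) {
    static_assert(sizeof(std::uintptr_t) >= sizeof(void *),
                  "integer type too small to hold a pointer");
    std::uintptr_t bits = 0;
    std::memcpy(&bits, &p, sizeof p);
    return bits;
}
```

<P>
Nothing about the resulting value is portable beyond the fact that the same
pointer yields the same bits on a given target.
</P>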
<P>
The token: 
<PRE>
	~ptr_to_ptr : ( ALIGNMENT a, ALIGNMENT b, EXP POINTER a ) -&gt; EXP POINTER b
</PRE>
is used to convert between two incompatible pointer types.  The first
alignment describes the source pointer shape while the second describes
the destination pointer shape.  Note that if the destination alignment
is greater than the source alignment then the source pointer can be
used in most TDF constructs in place of the destination pointer, so
the use of <CODE>~ptr_to_ptr</CODE> can be omitted (the exception
is 
<CODE>pointer_test</CODE> which requires equal alignments).  Base
class pointer conversions are examples of these well-behaved, alignment
preserving conversions. 
</P>
<P>
The tokens: 
<PRE>
	~f_to_pv : ( EXP PROC ) -&gt; EXP pv
	~pv_to_f : ( EXP pv ) -&gt; EXP PROC
</PRE>
are used to convert pointers to functions to and from <CODE>void *</CODE>
(these conversions are not allowed in ISO C/C++ but are in older dialects).
</P>
<P>
The tokens: 
<PRE>
	~i_to_p : ( VARIETY v, ALIGNMENT a, EXP INTEGER v ) -&gt; EXP POINTER a
	~p_to_i : ( ALIGNMENT a, VARIETY v, EXP POINTER a ) -&gt; EXP INTEGER v
	~i_to_pv : ( VARIETY v, EXP INTEGER v ) -&gt; EXP pv
	~pv_to_i : ( VARIETY v, EXP pv ) -&gt; EXP INTEGER v
</PRE>
are used to convert integers to and from <CODE>void *</CODE> and other
pointers. 
</P>

<HR>
<H3><A NAME="div">2.6.6. Integer division</A></H3>
<P>
The precise form of the integer division and remainder operations
in C and C++ is left unspecified with respect to the sign of the result
if either operand is negative.  The tokens: 
<PRE>
	~div : ( EXP INTEGER v, EXP INTEGER v ) -&gt; EXP INTEGER v
	~rem : ( EXP INTEGER v, EXP INTEGER v ) -&gt; EXP INTEGER v
</PRE>
are used to represent integer division and remainder.  They will map
onto one of the pairs of TDF constructs, <CODE>div0</CODE> and <CODE>rem0</CODE>,
<CODE>div1</CODE> and <CODE>rem1</CODE> or <CODE>div2</CODE> and 
<CODE>rem2</CODE>. 
</P>
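<P>
The two conventions at stake can be sketched in C++.  The sketch assumes,
and this should be checked against the TDF specification, that
<CODE>div1</CODE> and <CODE>rem1</CODE> round towards negative infinity,
that <CODE>div2</CODE> and <CODE>rem2</CODE> round towards zero, and that
<CODE>div0</CODE> and <CODE>rem0</CODE> leave the choice to the installer.
The function names are hypothetical.  The truncating versions rely on the
behaviour of <CODE>/</CODE> and <CODE>%</CODE> as fixed by C++11 and later.
</P>

```cpp
// Truncating division and remainder (round towards zero): the
// remainder takes the sign of the dividend.
constexpr int trunc_div(int a, int b) { return a / b; }
constexpr int trunc_rem(int a, int b) { return a % b; }

// Flooring division (round towards negative infinity): step the
// truncated quotient down by one when the division is inexact and the
// operands have opposite signs.
constexpr int floor_div(int a, int b) {
    return a / b - ((a % b != 0 && ((a % b < 0) != (b < 0))) ? 1 : 0);
}

// Matching remainder: takes the sign of the divisor.
constexpr int floor_rem(int a, int b) {
    return a - b * floor_div(a, b);
}
```

<P>
The two pairs agree whenever both operands have the same sign or the
division is exact; they differ only in the mixed-sign, inexact case.
</P>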

<HR>
<H3><A NAME="call">2.6.7. Calling conventions</A></H3>
<P>
The function calling conventions used by the C++ producer are essentially
the same as those used by the C producer with one exception.  That
is to say, all types except arrays are passed by value (note that
individual installers may modify these conventions to conform to their
own ABIs). 
</P>
<P>
The exception concerns classes with a non-trivial constructor, destructor
or assignment operator.  These classes are passed as function arguments
by taking a reference to a copy of the object (although it is often
possible to eliminate the copy and pass a reference to the object
directly).  They are passed as function return values by adding an
extra parameter to the start of the function parameters giving a reference
to a location into which the return value should be copied. 
</P>

<H4>Member functions</H4>
<P>
Non-static member functions are implemented in the obvious fashion,
by passing a pointer to the object the method is being applied to
as the first argument (or the second argument if the method has an
extra argument for its return value). 
</P>
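<P>
The parameter order described above can be pictured by writing the lowered
form of a member function by hand.  This is a source-level sketch only: the
class <CODE>Name</CODE> and the function
<CODE>Name_upper_copy_lowered</CODE> are hypothetical, and the producer
emits TDF rather than C++.
</P>

```cpp
#include <string>

// A class with a non-trivial copy constructor and destructor
// (inherited from its std::string member).
struct Name {
    std::string text;
    Name upper_copy() const;  // source-level member function
};

Name Name::upper_copy() const {
    Name result = *this;
    for (char &c : result.text) {
        if (c >= 'a' && c <= 'z') c = static_cast<char>(c - 'a' + 'A');
    }
    return result;
}

// Hand-written picture of the lowered signature: the location for the
// return value comes first, the object pointer ("this") comes second.
void Name_upper_copy_lowered(Name *result, const Name *self) {
    *result = self->upper_copy();
}
```

<P>
For a member function returning a type without a non-trivial constructor,
destructor or assignment operator, there is no extra return parameter and
the object pointer is the first argument.
</P>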

<H4><A NAME="ellipsis">Ellipsis functions</A></H4>
<P>
Calls to functions declared with ellipses are made via the 
<CODE>apply_proc</CODE> TDF construct, with all the arguments being
treated as non-variable.  However, the definition of such a function
uses the <CODE>make_proc</CODE> construct with a variable parameter.
This parameter can be referred to within the program using the 
<A HREF="pragma.html#ellipsis"><CODE>...</CODE> expression</A>.  The
type of this expression is given by the built-in token: 
<PRE>
	~__va_t : () -&gt; SHAPE
</PRE>
The <CODE>va_start</CODE> macro declared in the 
<CODE>&lt;stdarg.h&gt;</CODE> header then describes how the variable
parameter (expressed as <CODE>...</CODE>) can be converted to an expression
of type <CODE>va_list</CODE> suitable for use in the 
<CODE>va_arg</CODE> macro. 
</P>
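<P>
For reference, the source-level view of an ellipsis function looks as
follows.  The example uses only the standard <CODE>va_start</CODE>,
<CODE>va_arg</CODE> and <CODE>va_end</CODE> macros with the last named
parameter, not the TenDRA <CODE>...</CODE> expression; the function
<CODE>sum_ints</CODE> is a hypothetical example.
</P>

```cpp
#include <cstdarg>

// Sum 'count' int arguments passed through the ellipsis.
int sum_ints(int count, ...) {
    va_list args;
    va_start(args, count);          // locate the first optional argument
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int); // step through the optional arguments
    }
    va_end(args);
    return total;
}
```

<P>
The call <CODE>sum_ints ( 3, 1, 2, 3 )</CODE> passes four ordinary
arguments; only the definition treats the last three as variable.
</P>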
<P>
Note that the variable parameter is in effect only being used to determine
where the first optional parameter is defined.  The assumption is
that all such parameters are located contiguously on the stack; however,
the fact that calls to such functions do not use the variable parameter
mechanism means that this is not automatically the case.  Strictly
speaking, this means that the implementation of ellipsis functions
uses undefined behaviour in TDF; however, given the non-type-safe function
calling rules in C, this is unavoidable and installers need to make
provision for such calls (by dumping any parameters from registers
to the stack if necessary).  Given the theoretically type-safe nature
of C++ it would be possible to avoid such undefined behaviour, but
the need for C-compatible calling conventions prevents this. 
</P>

<HR>
<H3><A NAME="ptr_mem">2.6.8. Pointers to data members</A></H3>
<P>
Pointers to data members, and the operations on them, are represented
by tokens to allow for a variety of implementations.
It is assumed that all pointers to data members (as opposed to pointers
to function members) are represented by the same shape: 
<PRE>
	~cpp.pm.type : () -&gt; SHAPE
</PRE>
This shape will be denoted by <CODE>pm</CODE> in the description of
the following tokens. 
</P>
<P>
There are two basic methods of constructing a pointer to a data member.
The first is to take the address of a data member of a class.  A data
member is represented in TDF by an expression which gives the offset
of the member from the start of its enclosing <CODE>compound</CODE>
shape (note that it is not possible to take the address of a member
of a virtual base). The mapping from this offset to a pointer to a
data member is given by: 
<PRE>
	~cpp.pm.make : ( EXP OFFSET ) -&gt; EXP pm
</PRE>
The second way of constructing a pointer to a data member is to use
a null pointer to member: 
<PRE>
	~cpp.pm.null : () -&gt; EXP pm
</PRE>
The other fundamental operation on a pointer to data member is to
turn it back into an offset expression which can be added to a pointer
to a class to access a member of that class in a <CODE>.*</CODE> or
<CODE>-&gt;*</CODE>
operation.  This is done by the token: 
<PRE>
	~cpp.pm.offset : ( EXP pm, ALIGNMENT a ) -&gt; EXP OFFSET ( a, a )
</PRE>
Note that it is necessary to specify an alignment in order to describe
the shape of the result.  The value of this token is undefined if
the given expression is a null pointer to data member. 
</P>
<P>
A pointer to a data member of a non-virtual base class can be converted
to a pointer to a data member of a derived class.  The reverse conversion
is also possible using <CODE>static_cast</CODE>.  If the base is a
<A HREF="#primary">primary base class</A> then these conversions are
trivial and have no effect.  Otherwise null pointers to data members
are converted to null pointers to data members, and the non-null cases
are handled by the tokens: 
<PRE>
	~cpp.pm.cast : ( EXP pm, EXP OFFSET ) -&gt; EXP pm
	~cpp.pm.uncast : ( EXP pm, EXP OFFSET ) -&gt; EXP pm
</PRE>
where the given offset is the offset of the base class within the
derived class.  It is also possible to convert between any two pointers
to data members using <CODE>reinterpret_cast</CODE>.  This conversion
is implied by the equality of representation between any two pointers
to data members and has no effect. 
</P>
<P>
The only remaining operations on pointers to data members are to test
one against the null pointer to data member and to compare two pointers
to data members.  These are represented by the tokens: 
<PRE>
	~cpp.pm.test : ( EXP pm, LABEL, NTEST ) -&gt; EXP TOP
	~cpp.pm.compare : ( EXP pm, EXP pm, LABEL, NTEST ) -&gt; EXP TOP
</PRE>
where the given <CODE>NTEST</CODE> gives the comparison to be applied
and the given label gives the destination to jump to if the test fails.
</P>
<P>
In the default implementation, pointers to data members are implemented
as <CODE>int</CODE>.  The null pointer to member is represented by
a reserved value, and any other pointer to member is derived from the offset
of the member (in bytes).  Casting to and from a derived class then
corresponds to adding or subtracting the base class offset (in bytes),
and pointer to member comparisons correspond to integer comparisons.
</P>
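<P>
One integer scheme consistent with this description is sketched below.  The
concrete values are assumptions made for the example and are not confirmed
by the text above: the null pointer to member is taken to be 0 and a non-null
pointer to member is taken to be the byte offset plus one.  The function
names mirror the roles of the <CODE>~cpp.pm.*</CODE> tokens but are
otherwise hypothetical.
</P>

```cpp
#include <cassert>

// Assumed encoding: 0 is null, otherwise byte offset + 1.
using PtrMem = int;

constexpr PtrMem pm_null() { return 0; }
constexpr PtrMem pm_make(int offset) { return offset + 1; }
constexpr bool pm_is_null(PtrMem pm) { return pm == pm_null(); }

// Recover the byte offset; undefined for the null pointer to member,
// as for ~cpp.pm.offset, so the sketch asserts instead.
inline int pm_offset(PtrMem pm) {
    assert(!pm_is_null(pm));
    return pm - 1;
}

// Base-to-derived and derived-to-base casts: add or subtract the base
// class offset.  Null is kept as null here for convenience.
constexpr PtrMem pm_cast(PtrMem pm, int base_offset) {
    return pm_is_null(pm) ? pm : pm + base_offset;
}
constexpr PtrMem pm_uncast(PtrMem pm, int base_offset) {
    return pm_is_null(pm) ? pm : pm - base_offset;
}
```

<P>
With this encoding a member at offset zero is still distinguishable from the
null pointer to member, and comparisons reduce to integer comparisons.
</P>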

<HR>
<H3><A NAME="ptr_mem_func">2.6.9. Pointers to function members</A></H3>
<P>
As with pointers to data members, pointers to function members and
the operations on them are represented by tokens to allow for a range
of implementations.  All pointers to function members are represented
by the same shape: 
<PRE>
	~cpp.pmf.type : () -&gt; SHAPE
</PRE>
This shape will be denoted by <CODE>pmf</CODE> in the description
of the following tokens.  Many of the tokens take an expression which
has a shape which is a pointer to the alignment of <CODE>pmf</CODE>.
This will be denoted by <CODE>ppmf</CODE>. 
</P>
<P>
There are two basic methods for constructing a pointer to a function
member.  The first is to take the address of a non-static member function
of a class.  There are two cases, depending on whether or not the
member function is virtual.  The non-virtual case is given by the
token: 
<PRE>
	~cpp.pmf.make : ( EXP PROC, EXP OFFSET, EXP OFFSET ) -&gt; EXP pmf
</PRE>
where the first argument is the address of the corresponding function,
the second argument gives any base class offset which is to be added
when calling this function (to deal with inherited member functions),
and the third argument is a zero offset. 
</P>
<P>
For virtual functions, a pointer to function member of the form above
is entered in the <A HREF="#vtable">virtual function table</A> for
the corresponding class.  The actual pointer to the virtual function
member then gives a reference into the virtual function table as follows:
<PRE>
	~cpp.pmf.vmake : ( SIGNED_NAT, EXP OFFSET, EXP, EXP ) -&gt; EXP pmf
</PRE>
where the first argument gives the index of the function within the
virtual function table, the second argument gives the offset of the
<I>vptr</I> field within the class, and the third and fourth arguments
are zero offsets. 
</P>
<P>
The second way of constructing a pointer to a function member is to
use a null pointer to function member: 
<PRE>
	~cpp.pmf.null : () -&gt; EXP pmf
	~cpp.pmf.null2 : () -&gt; EXP pmf
</PRE>
For technical reasons there are two versions of this token, although
they have the same value.  The first token is used in static initialisers;
the second token is used in other expressions. 
</P>
<P>
The cast operations on pointers to function members are more complex
than those on pointers to data members.  The value to be cast is copied
into a temporary and one of the tokens: 
<PRE>
	~cpp.pmf.cast : ( EXP ppmf, EXP OFFSET, EXP, EXP OFFSET ) -&gt; EXP TOP
	~cpp.pmf.uncast : ( EXP ppmf, EXP OFFSET, EXP, EXP OFFSET ) -&gt; EXP TOP
</PRE>
is applied to modify the value of the temporary according to the given
cast.  The first argument gives the address of the temporary, the
second gives the base class offset to be added or subtracted, the
third gives the number to be added or subtracted to convert virtual
function indexes for the base class into virtual function indexes
for the derived class, and the fourth gives the offset of the <I>vptr</I>
field within the class.  Again, the ability to use <CODE>reinterpret_cast</CODE>
to convert between any two pointer to function member types arises
because of the uniform representation of these types. 
</P>
<P>
As with pointers to data members, there are tokens implementing comparisons
on pointers to function members: 
<PRE>
	~cpp.pmf.test : ( EXP ppmf, LABEL, NTEST ) -&gt; EXP TOP
	~cpp.pmf.compare : ( EXP ppmf, EXP ppmf, LABEL, NTEST ) -&gt; EXP TOP
</PRE>
Note, however, that the arguments are passed by reference. 
</P>
<P>
The most important, and most complex, operation is calling a function
through a pointer to function member.  The first step is to copy the
pointer to function member into a temporary.  The token: 
<PRE>
	~cpp.pmf.virt : ( EXP ppmf, EXP, ALIGNMENT ) -&gt; EXP TOP
</PRE>
is then applied to the temporary to convert a pointer to a virtual
function member to a normal pointer to function member by looking
it up in the corresponding virtual function table.  The first argument
gives the address of the temporary, the second gives the object to
which the function is to be applied, and the third gives the alignment
of the corresponding class.  Now the base class conversion to be applied
to the object can be determined by applying the token: 
<PRE>
	~cpp.pmf.delta : ( ALIGNMENT a, EXP ppmf ) -&gt; EXP OFFSET ( a, a )
</PRE>
to the temporary to find the offset to be added.  Finally the function
to be called can be extracted from the temporary using the token:
<PRE>
	~cpp.pmf.func : ( EXP ppmf ) -&gt; EXP PROC
</PRE>
The function call then proceeds as normal. 
</P>
<P>
The default implementation is that described in the ARM, where each
pointer to function member is represented in the form: 
<PRE>
	struct PTR_MEM_FUNC {
	    short delta ;
	    short index ;
	    union {
		void ( *func ) () ;
		short off ;
	    } u ;
	} ;
</PRE>
The <CODE>delta</CODE> field gives the base class offset (in bytes)
to be added before applying the function.  The <CODE>index</CODE>
field is 0 for null pointers, -1 for non-virtual function pointers
and the index into the virtual function table for virtual function
pointers (as described below these indexes start from 1).  For non-virtual
function pointers the function itself is given by the <CODE>u.func</CODE>
field. For virtual function pointers the offset of the <I>vptr</I>
field within the class is given by the <CODE>u.off</CODE> field. 
</P>
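<P>
The three cases distinguished by the <CODE>index</CODE> field can be made
explicit with a small sketch built on the structure above.  The type and
function names (<CODE>PtrMemFunc</CODE>, <CODE>classify</CODE>,
<CODE>make_nonvirtual</CODE>, <CODE>make_virtual</CODE>) are hypothetical;
they only mirror the roles of <CODE>~cpp.pmf.make</CODE> and
<CODE>~cpp.pmf.vmake</CODE> and do not perform the virtual function table
lookup.
</P>

```cpp
// Same layout as PTR_MEM_FUNC in the text.
struct PtrMemFunc {
    short delta;
    short index;
    union {
        void (*func)();
        short off;
    } u;
};

enum class PmfKind { Null, NonVirtual, Virtual };

// Decide which case a pointer to function member falls into.
inline PmfKind classify(const PtrMemFunc &p) {
    if (p.index == 0) return PmfKind::Null;
    if (p.index < 0) return PmfKind::NonVirtual;  // -1 in the description
    return PmfKind::Virtual;                      // table index, from 1
}

// Non-virtual case: the function itself is stored in u.func.
inline PtrMemFunc make_nonvirtual(void (*f)(), short delta) {
    PtrMemFunc p = PtrMemFunc();
    p.delta = delta;
    p.index = -1;
    p.u.func = f;
    return p;
}

// Virtual case: the table index plus the offset of the vptr field.
inline PtrMemFunc make_virtual(short index, short vptr_off) {
    PtrMemFunc p = PtrMemFunc();
    p.index = index;
    p.u.off = vptr_off;
    return p;
}
```

<P>
A call through such a value would first resolve the virtual case via the
table, then add <CODE>delta</CODE> to the object pointer, and only then call
the function, in the order given for the tokens above.
</P>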

<HR>
<H3><A NAME="class">2.6.10. Class layout</A></H3>
<P>
Consider a class with no base classes: 
<PRE>
	class A {
	    // A's members
	} ;
</PRE>
Each object of class <I>A</I> needs its own copy of the non-static
data members of <I>A</I> and, for polymorphic types, a means of referencing
the virtual function table and run-time type information for <I>A</I>.
This is accomplished using a layout of the form: 
<CENTER>
<IMG SRC="../images/class.gif" ALT="class A">
</CENTER>
where the <I>A</I> component consists of the non-static data members
and 
<I>vptr A</I> is a pointer to the virtual function table for <I>A</I>.
For non-polymorphic classes the <I>vptr A</I> field is omitted; otherwise
space for <I>vptr A</I> needs to be allocated within the class and
the pointer needs to be initialised in each constructor for <I>A</I>.
The precise layout of the <A HREF="#vtable">virtual function table</A>
and the <A HREF="#rtti">run-time type information</A> is given below.
</P>
<P>
Two alternative ways of laying out the non-static data members within
the class are implemented.  The first, which is the default, gives them
in the order in which they are declared in the class definition. 
The second lays out the <CODE>public</CODE>, the <CODE>protected</CODE>,
and the <CODE>private</CODE> members in three distinct sections, the
members within each section being given in the order in which they
are declared. The latter can be enabled using the <CODE>-jo</CODE>
command-line option. 
</P>
<P>
The offset of each member within the class (including <I>vptr A</I>)
can be calculated in terms of the offset of the previous member. 
The first member has offset zero.  The offset of any other member
is given by the offset of the previous member plus the size of the
previous member, rounded up to the alignment of the current member.
The overall size of the class is given by the offset of the last member
plus the size of the last member, rounded up using the token: 
<PRE>
	~comp_off : ( EXP OFFSET ) -&gt; EXP OFFSET
</PRE>
which allows for any target dependent padding at the end of the class.
The shape of the class is then a <CODE>compound</CODE> shape with
this offset. 
</P>
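<P>
As an illustration only, the offset rule above can be sketched in C++.
The names <CODE>Member</CODE>, <CODE>round_up</CODE> and <CODE>layout</CODE>
are hypothetical, and the final rounding to the strictest member alignment
merely stands in for the target dependent <CODE>~comp_off</CODE> token;
the producer itself emits TDF offset arithmetic rather than code of this kind.
</P>

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of one non-static data member.
struct Member {
    std::size_t size ;
    std::size_t align ;    // must be nonzero
} ;

// Round off up to the next multiple of align.
inline std::size_t round_up ( std::size_t off, std::size_t align )
{
    return ( off + align - 1 ) / align * align ;
}

// Compute the offset of each member and return the overall class size.
inline std::size_t layout ( const std::vector<Member> &ms,
                            std::vector<std::size_t> &offs )
{
    std::size_t off = 0 ;          // the first member has offset zero
    std::size_t max_align = 1 ;
    offs.clear () ;
    for ( const Member &m : ms ) {
        // End of the previous member, rounded up to this member's alignment.
        off = round_up ( off, m.align ) ;
        offs.push_back ( off ) ;
        off += m.size ;
        if ( m.align > max_align ) max_align = m.align ;
    }
    // Assumed stand-in for ~comp_off: pad the end of the class.
    return round_up ( off, max_align ) ;
}
```

<P>
For members of sizes 1, 4 and 2 with matching alignments this sketch gives
offsets 0, 4 and 8 and an overall size of 12.
</P>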
<P>
Classes with no members need to be treated slightly differently. 
The shape of such a class is given by the token: 
<PRE>
	~cpp.empty.shape : () -&gt; SHAPE
</PRE>
(recall that an empty class still has a nonzero size).  The token:
<PRE>
	~cpp.empty.offset : () -&gt; EXP OFFSET
</PRE>
is used to represent the offset required for an empty class when it
is used as a base class.  This may be a zero offset. 
</P>
<P>
Bitfield members provide a slight complication to the picture above.
The offset of a bitfield is additionally padded using the token: 
<PRE>
	~pad : ( EXP OFFSET, SHAPE, SHAPE ) -&gt; EXP OFFSET
</PRE>
where the two shapes give the type underlying the bitfield and the
bitfield itself. 
</P>
<P>
The layout of unions is similar to that of classes except that all
members have zero offset, and the size of the union is the maximum
of the sizes of its members, suitably padded.  Of course unions cannot
be polymorphic and cannot have base classes. 
</P>
<P>
Pointers to incomplete classes are represented by means of the alignment:
<PRE>
	~cpp.empty.align : () -&gt; ALIGNMENT
</PRE>
This token is also used for the alignment of a complete class if that
class is never used in the generated TDF in a manner which requires
it to be complete.  This can lead to savings in the size of the generated
code by avoiding the need to define all the member offset tokens
in order to find the shape of the class. 
</P>
 
<HR>
<H3><A NAME="derive">2.6.11. Derived class layout</A></H3>
<P>
The description of the implementation of derived classes will be given
in terms of the example class hierarchy given by: 
<PRE>
	class A {
	    // A's members
	} ;

	class B : public A {
	    // B's members
	} ;

	class C : public A {
	    // C's members
	} ;

	class D : public B, public C {
	    // D's members
	} ;
</PRE>
or, as a directed acyclic graph: 
<CENTER>
<IMG SRC="../images/graph.gif" ALT="class D">
</CENTER>
 
<H4>Single inheritance</H4>
<P>
The layout of class <I>A</I> is given by: 
<CENTER>
<IMG SRC="../images/classA.gif" ALT="class A">
</CENTER>
as above.  Class <I>B</I> inherits all the members of class <I>A</I>
plus those members explicitly declared within class <I>B</I>.  In
addition, class <I>B</I> inherits all the virtual member functions
of <I>A</I>, some of which may be overridden in <I>B</I>, extended
by any additional virtual functions declared in <I>B</I>.  This may
be represented as follows: 
<CENTER>
<IMG SRC="../images/classB.gif" ALT="class B">
</CENTER>
where <I>A</I> denotes those members inherited from the base class
and 
<I>B</I> denotes those members added in the derived class.  Note that
an object of class <I>B</I> contains a sub-object of class <I>A</I>.
The fact that this sub-object is located at the start of <I>B</I>
means that the base class conversion from <I>B</I> to <I>A</I> is
trivial.  Any base class with this property is called a 
<A NAME="primary">primary base class</A>. 
</P>
<P>
Note that in theory two virtual function tables are required, the
normal virtual function table for <I>B</I>, denoted by <I>vtbl B</I>,
and a modified virtual function table for <I>A</I>, denoted by <I>vtbl
B::A</I>, taking into account any overriding virtual functions within
<I>B</I>, and pointing to <I>B</I>'s run-time type information.  This
latter means that the dynamic type information for the <I>A</I> sub-object
relates to 
<I>B</I> rather than <I>A</I>.  However these two tables can usually
be combined: if the virtual functions added in <I>B</I> are listed
in the virtual function table after those inherited from <I>A</I>
and the form of the overriding is <A HREF="#override">suitably well
behaved</A>
(in the sense defined below) then <I>vtbl B::A</I> is an initial segment
of <I>vtbl B</I>.  It is also possible to remove the <I>vptr B</I>
field and use <I>vptr B::A</I> in its place in this case (it has to
be this way round to preserve the <I>A</I> sub-object).  Thus the
items shaded in the diagram can be removed. 
</P>
<P>
The class <I>C</I> is similarly given by: 
<CENTER>
<IMG SRC="../images/classC.gif" ALT="class C">
</CENTER>
</P>
 
<H4>Multiple inheritance</H4>
<P>
Class <I>D</I> is more complex because of the presence of multiple
inheritance.  <I>D</I> inherits all the members of <I>B</I>, including
those which <I>B</I> inherits from <I>A</I>, plus all the members
of 
<I>C</I>, including those which <I>C</I> inherits from <I>A</I>. 
It also inherits all of the virtual member functions from <I>B</I>
and 
<I>C</I>, some of which may be overridden in <I>D</I>, extended by
any additional virtual functions declared in <I>D</I>.  This may be
represented as follows: 
<CENTER>
<IMG SRC="../images/classD.gif" ALT="class D">
</CENTER>
Note that there are two copies of <I>A</I> in <I>D</I> because virtual
inheritance has not been used. 
</P>
<P>
The <I>B</I> base class of <I>D</I> is essentially similar to the
single inheritance case already discussed; the <I>C</I> base class
is different however.  Note firstly that the <I>C</I> sub-object of
<I>D</I> is located at a non-zero offset, <I>delta D::C</I>, from
the start of the object. This means that the base class conversion
from <I>D</I> to <I>C</I>
consists of adding this offset (for pointer conversions things are
further complicated by the need to allow for null pointers).  Also
<I>vtbl D::C</I> is not an initial segment of <I>vtbl D</I> because
<I>vtbl D</I> contains the virtual functions inherited from <I>B</I> first,
followed by those inherited from <I>C</I>, followed by those first
declared in <I>D</I> (there are <A HREF="#override">other reasons</A>
as well).  Thus <I>vtbl D::C</I> cannot be eliminated. 
</P>
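<P>
The pointer conversion from <I>D</I> to <I>C</I> described above can be
written out by hand as a rough sketch.  The helper names
<CODE>d_to_c</CODE> and <CODE>delta_d_c</CODE> are hypothetical, the
classes are given one data member each purely for illustration, and the
offset is measured from the host compiler's own layout rather than taken
from the producer.
</P>

```cpp
#include <cstddef>

struct A { int a ; } ;
struct B : A { int b ; } ;
struct C : A { int c ; } ;
struct D : B, C { int d ; } ;

// Hand-written equivalent of the D* to C* conversion: add delta D::C,
// except that a null pointer must remain a null pointer.
inline C *d_to_c ( D *pd, std::ptrdiff_t delta )
{
    if ( pd == nullptr ) return nullptr ;
    return reinterpret_cast<C *> ( reinterpret_cast<char *> ( pd ) + delta ) ;
}

// Measure delta D::C for the host compiler using a real object.
inline std::ptrdiff_t delta_d_c ()
{
    D d = D () ;
    char *whole = reinterpret_cast<char *> ( &d ) ;
    char *part = reinterpret_cast<char *> ( static_cast<C *> ( &d ) ) ;
    return part - whole ;
}
```

<P>
Without the null pointer test, converting a null <I>D</I> pointer would
give a non-null <I>C</I> pointer whenever the offset is nonzero.
</P>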
<H4>Virtual inheritance</H4>
<P>
Virtual inheritance introduces a further complication.  Now consider
the class hierarchy given by: 
<PRE>
	class A {
	    // A's members
	} ;

	class B : virtual public A {
	    // B's members
	} ;

	class C : virtual public A {
	    // C's members
	} ;

	class D : public B, public C {
	    // D's members
	} ;
</PRE>
or, as a <A NAME="diamond">directed acyclic graph</A>: 
<CENTER>
<IMG SRC="../images/diamond.gif" ALT="class D">
</CENTER>
As before <I>A</I> is given by: 
<CENTER>
<IMG SRC="../images/classA.gif" ALT="class A">
</CENTER>
but now <I>B</I> is given by: 
<CENTER>
<IMG SRC="../images/virtualB.gif" ALT="class B">
</CENTER>
Rather than having the sub-object of class <I>A</I> directly as part
of 
<I>B</I>, the class now contains a pointer, <I>ptr A</I>, to this
sub-object.  The virtual sub-objects are always located at the end
of a class layout; their offset may therefore vary for different objects,
although the offset of <I>ptr A</I> is always fixed.  The <I>ptr A</I>
field is initialised in each constructor for <I>B</I>.  In order to
perform the base class conversion from <I>B</I> to <I>A</I>, the contents
of <I>ptr A</I> are taken (again provision needs to be made for null
pointers in pointer conversions).  In cases when the dynamic type
of the <I>B</I> object can be determined statically it is possible
to access the <I>A</I> sub-object directly by adding a suitable offset.
Because this conversion is non-trivial (see <A HREF="#override">below</A>)
the virtual function table <I>vtbl B::A</I> is not an initial segment
of 
<I>vtbl B</I> and cannot be eliminated. 
</P>
<P>
The class <I>C</I> is similarly given by: 
<CENTER>
<IMG SRC="../images/virtualC.gif" ALT="class C">
</CENTER>
Now the class <I>D</I> is given by: 
<CENTER>
<IMG SRC="../images/virtualD.gif" ALT="class D">
</CENTER>
Note that there is a single <I>A</I> sub-object of <I>D</I> referenced
by the <I>ptr A</I> fields in both the <I>B</I> and <I>C</I> sub-objects.
The elimination of <I>vtbl D::B</I> is as above. 
</P>
 
<HR>
<H3><A NAME="constr">2.6.12. Constructors and destructors</A></H3>
<P>
The implementation of constructors and destructors, whether explicitly
or implicitly defined, is slightly more complex than that of other
member functions.  For example, the constructors need to set up the
internal <I>vptr</I> and <I>ptr</I> fields mentioned above. 
</P>
<P>
The order of initialisation in a constructor is as follows: 
<OL>
<LI>The internal <I>ptr</I> fields giving the locations of the virtual
base classes are initialised. 
<LI>The constructors for the virtual base classes are called. 
<LI>The constructors for the non-virtual direct base classes are called.
<LI>The internal <I>vptr</I> fields giving the locations of the virtual
function tables are initialised. 
<LI>The constructors for the members of the class are called. 
<LI>The main constructor body is executed. 
</OL>
To ensure that each virtual base is only initialised once, if a class
has a virtual base class then all its constructors have an implicit
extra parameter of type <CODE>int</CODE>.  The first two steps above
are then only applied if this flag is nonzero.  In normal applications
of the constructor this argument will be 1, however in base class
initialisations such as those in the second and third steps above,
it will be 0. 
</P>
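<P>
The effect of the extra constructor argument can be modelled by hand for
the diamond lattice above.  This is only a sketch: the functions
<CODE>init_A</CODE> to <CODE>init_D</CODE> and the <CODE>Model</CODE>
counter are hypothetical stand-ins for the generated constructors, and
only the handling of the virtual base <I>A</I> is shown.
</P>

```cpp
// Hypothetical record of how often the virtual base A is constructed.
struct Model {
    int a_inits = 0 ;
} ;

inline void init_A ( Model &m )
{
    m.a_inits++ ;
}

// B has the virtual base A, so its constructor takes the extra flag.
inline void init_B ( Model &m, int virt )
{
    if ( virt ) {
        // Steps 1 and 2: set up ptr A and construct the virtual base.
        init_A ( m ) ;
    }
    // The remaining steps for B would follow here.
}

inline void init_C ( Model &m, int virt )
{
    if ( virt ) init_A ( m ) ;
}

// D constructs A itself, then calls its direct bases with flag 0.
inline void init_D ( Model &m, int virt )
{
    if ( virt ) init_A ( m ) ;
    init_B ( m, 0 ) ;
    init_C ( m, 0 ) ;
}
```

<P>
Calling <CODE>init_D</CODE> with flag 1 constructs <I>A</I> exactly once,
whereas passing 1 to the base class calls as well would construct it
three times.
</P>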
<P>
Note that similar steps to protect virtual base classes are not taken
in an implicitly declared <CODE>operator=</CODE> function.  The order
of assignment in this case is as follows: 
<OL>
<LI>The assignment operators for the direct base classes (both virtual
and non-virtual) are called. 
<LI>The assignment operators for the members of the class are called.
<LI>A reference to the object assigned to (i.e. <CODE>*this</CODE>)
is returned. 
</OL>
</P>
<P>
The order of destruction in a destructor is essentially the reverse
of the order of construction: 
<OL>
<LI>The main destructor body is executed. 
<LI>The destructors for the members of the class are called. 
<LI>The internal <I>vptr</I> fields giving the locations of the virtual
function tables are re-initialised. 
<LI>The destructors for the non-virtual direct base classes are called.
<LI>The destructors for the virtual base classes are called. 
<LI>If necessary the space occupied by the object is deallocated.
</OL>
All destructors have an extra parameter of type <CODE>int</CODE>.
The virtual base classes are only destroyed if this flag is nonzero
when and-ed with 2.  The space occupied by the object is only deallocated
if this flag is nonzero when and-ed with 1.  This deallocation is
equivalent to inserting:  
<PRE>
	delete this ;
</PRE>
in the destructor.  The <CODE>operator delete</CODE> function is called
via the destructor in this way in order to implement the pseudo-virtual
nature of these deallocation functions.  Thus for normal destructor
calls the extra argument is 2, for base class destructor calls it
is 0, and for calls arising from a <CODE>delete</CODE> expression
it is 3. 
</P>
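<P>
The decoding of the extra destructor argument can be sketched as follows.
The enumerator names and the <CODE>destructor_steps</CODE> function are
hypothetical; the function only records which steps a given flag value
would select and does not destroy anything.
</P>

```cpp
#include <string>

// Bits of the extra destructor argument, as described in the text.
// The enumerator names are assumptions made for this sketch.
enum {
    DESTR_DEALLOC = 1,    // deallocate the space occupied by the object
    DESTR_VIRTUAL = 2     // destroy the virtual base classes
} ;

// List the destructor steps selected by a given flag value.
inline std::string destructor_steps ( int flag )
{
    // Steps 1 to 4 are always performed.
    std::string log = "body,members,vptr,bases" ;
    if ( flag & DESTR_VIRTUAL ) log += ",virtual-bases" ;
    if ( flag & DESTR_DEALLOC ) log += ",delete" ;
    return log ;
}
```

<P>
Flag values 0, 2 and 3 then correspond to a base class destructor call,
a normal destructor call and a <CODE>delete</CODE> expression respectively.
</P>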
<P>
The virtual function tables are initialised after the base class
constructors have been called, and re-initialised in the destructor
before the base class destructors are called, in order to ensure that
virtual functions called from base class
initialisers are handled correctly (see ISO C++ 12.7). 
</P>
<P>
A further complication arises from the need to destroy 
<A NAME="partial">partially constructed objects</A> if an exception
is thrown in a constructor.  A count is maintained of the number of
base classes and members constructed within a constructor.  If an
exception is thrown then it is caught in the constructor, the constructed
base classes and members are destroyed, and the exception is re-thrown.
The count variable is used to determine which bases and members need
to be destroyed. 
</P>
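<P>
The counting scheme for partially constructed objects might look roughly
like the following hand-written model.  The <CODE>Part</CODE> type, the
event log and <CODE>construct_whole</CODE> are hypothetical, and destroying
the constructed parts in reverse order is an assumption based on the usual
C++ rules rather than a statement about the generated code.
</P>

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical base class or member with explicit construct and destroy steps.
struct Part {
    int id = 0 ;
    void construct ( int n, bool fail, std::vector<int> &log )
    {
        if ( fail ) throw std::runtime_error ( "part failed" ) ;
        id = n ;
        log.push_back ( n ) ;     // record construction as +n
    }
    void destroy ( std::vector<int> &log )
    {
        log.push_back ( -id ) ;   // record destruction as -n
    }
} ;

// Model of a constructor for an object with three parts.  fail_at gives
// the number of the part whose construction throws (0 for none).
inline void construct_whole ( Part ( &parts ) [3], int fail_at,
                              std::vector<int> &log )
{
    int count = 0 ;               // number of parts constructed so far
    try {
        for ( int i = 0 ; i < 3 ; i++ ) {
            parts [i].construct ( i + 1, fail_at == i + 1, log ) ;
            count++ ;
        }
    } catch ( ... ) {
        // Destroy only the parts counted so far, then re-throw.
        while ( count > 0 ) parts [ --count ].destroy ( log ) ;
        throw ;
    }
}
```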
<P>
<IMG SRC="../images/warn.gif" ALT="warning"> These partial destructors
currently do not interact correctly with any exception specification
on the constructor.  Exceptions thrown within destructors are not
correctly handled either. 
</P>
 
<HR>
<H3><A NAME="vtable">2.6.13. Virtual function tables</A></H3>
<P>
The virtual functions in a polymorphic class are given in its virtual
function table in the following order: firstly those virtual functions
inherited from its direct base classes (which may be overridden in
the derived class) followed by those first declared in the derived
class in the order in which they are declared.  Note that this can
result in virtual functions inherited from virtual base classes appearing
more than once.  The virtual functions are numbered from 1 (this is
slightly more convenient than numbering from 0 in the default implementation).
</P>
<P>
The virtual function table for this class has shape: 
<PRE>
	~cpp.vtab.type : ( NAT ) -&gt; SHAPE
</PRE>
the argument being <I>n + 1</I> where <I>n</I> is the number of virtual
functions in the class (there is also a token: 
<PRE>
	~cpp.vtab.diag : () -&gt; SHAPE
</PRE>
which is used in the diagnostic output for a generic virtual function
table).  The table is created using the token: 
<PRE>
	~cpp.vtab.make : ( EXP pti, EXP OFFSET, NAT, EXP NOF ) -&gt; EXP vt
</PRE>
where the first expression gives the address of the <A HREF="#rtti">run-time
type information structure</A> for the class, the second expression
gives the offset of the <I>vptr</I> field within the class (i.e. <I>voff</I>),
the integer constant is <I>n + 1</I>, and the final expression is
a 
<CODE>make_nof</CODE> construct giving information on each of the
<I>n</I>
virtual functions. 
</P>
<P>
The information given on each virtual function in this table has the
form of a <A HREF="#ptr_mem_func">pointer to function member</A> formed
using the token: 
<PRE>
	~cpp.pmf.make : ( EXP PROC, EXP OFFSET, EXP OFFSET ) -&gt; EXP pmf
</PRE>
as above, except that the third argument gives the offset of the base
class in virtual function tables such as <I>vtbl B::A</I>.  For pure
virtual functions the function pointer in this token is given by:
<PRE>
	~cpp.vtab.pure : () -&gt; EXP PROC
</PRE>
In the default implementation this gives a function 
<CODE>__TCPPLUS_pure</CODE> which just calls <CODE>abort</CODE>. 
</P>
<P>
To avoid duplicate copies of virtual function tables and run-time
type information structures being created, the ARM algorithm is used.
The virtual function table and run-time type information structure
for a class are defined in the module containing the definition of
the first non-inline, non-pure virtual function declared in that class.
If such a function does not exist then duplicate copies are created
in every module which requires them.  In the former case the virtual
function table will have an <A HREF="#other">external tag name</A>;
in the latter case it will be an internal tag.  This scheme can be
overridden using the <CODE>-jv</CODE> command-line option, which causes
local virtual function tables to be output for all classes. 
</P>
<P>
Note that the discussion above applies to both simple virtual function
tables, such as <I>vtbl B</I> above, and to those arising from base
classes, such as <I>vtbl B::A</I>.  <A NAME="override">We are now
in a position to precisely determine when <I>vtbl B::A</I> is an initial
segment of <I>vtbl B</I> and hence can be eliminated</A>.  Firstly,
<I>A</I> must be the first direct base class of <I>B</I> and cannot
be virtual.  This is to ensure both that there are no virtual functions
in <I>vtbl B</I> before those inherited from <I>A</I>, and that the
corresponding base class conversion is trivial so that the pointers
to function members of <I>B</I> comprising the virtual function table
can be equally regarded as pointers to function members of <I>A</I>.
The second requirement is that if a virtual function for <I>A</I>,
<I>f</I>, is overridden in <I>B</I> then the return type for <I>B::f</I>
cannot differ from the return type for <I>A::f</I> by a non-trivial
conversion (recall that ISO C++ allows the return types to differ
by a base class conversion).  In the non-trivial conversion case the
function entered in <I>vtbl B::A</I> needs to be, not <I>B::f</I>
as in <I>vtbl B</I>, but a stub function which calls <I>B::f</I> and
converts its return value to the return type of <I>A::f</I>. 
</P>
 
<H4>Calling virtual functions</H4>
<P>
The virtual function call mechanism is implemented using the token:
<PRE>
	~cpp.vtab.func : ( EXP ppvt, SIGNED_NAT ) -&gt; EXP ppmf
</PRE>
which has as its arguments a reference to the <I>vptr</I> field of
the object the function is to be called for, and the number of the
virtual function to be called.  It returns a reference to the corresponding
pointer to function member within the object's virtual function table.
The function is then called by extracting the base class offset to
be added, and the function to be called, from this reference using
the tokens: 
<PRE>
	~cpp.pmf.delta : ( ALIGNMENT a, EXP ppmf ) -&gt; EXP OFFSET ( a, a )
	~cpp.pmf.func : ( EXP ppmf ) -&gt; EXP PROC
</PRE>
described as part of the <A HREF="#ptr_mem_func">pointer to function
member call mechanism</A> above. 
</P>
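<P>
The call sequence above can be imitated with an explicit table.  This is
a sketch under stated assumptions: the structures <CODE>Pmf</CODE>,
<CODE>Vtab</CODE> and <CODE>Obj</CODE> and the helper functions are
hypothetical, and the storage layout chosen here (run-time type information
pointer, <I>voff</I>, then the entries) is not taken from the default
implementation.  Only the numbering of virtual functions from 1 and the use
of a function plus a base class offset follow the description.
</P>

```cpp
#include <cstddef>

// Hypothetical pointer to function member entry: the function to call
// and the base class offset to add to the object pointer.
struct Pmf {
    int ( *func ) ( void * ) ;
    std::ptrdiff_t delta ;
} ;

// Hypothetical virtual function table holding two virtual functions.
struct Vtab {
    const void *pti ;         // run-time type information (unused here)
    std::ptrdiff_t voff ;     // offset of the vptr field within the class
    Pmf entries [2] ;
} ;

struct Obj {
    const Vtab *vptr ;
    int value ;
} ;

inline int get_value ( void *p )
{
    return static_cast<Obj *> ( p )->value ;
}

inline int get_double ( void *p )
{
    return 2 * static_cast<Obj *> ( p )->value ;
}

// Rough counterpart of ~cpp.vtab.func: find entry number n (from 1).
inline const Pmf *vtab_func ( const Vtab *const *ppvt, int n )
{
    return &( *ppvt )->entries [ n - 1 ] ;
}

// Call virtual function number n for the given object.
inline int call_virtual ( Obj *obj, int n )
{
    const Pmf *ppmf = vtab_func ( &obj->vptr, n ) ;
    // Rough counterpart of ~cpp.pmf.delta: adjust the object pointer.
    void *self = reinterpret_cast<char *> ( obj ) + ppmf->delta ;
    // Rough counterpart of ~cpp.pmf.func: extract and call the function.
    return ppmf->func ( self ) ;
}
```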
 
<HR>
<H3><A NAME="rtti">2.6.14. Run-time type information</A></H3>
<P>
Each C++ type can be associated with a run-time type information structure
giving information about that type.  These type information structures
have shape given by the token: 
<PRE>
	~cpp.typeid.type : () -&gt; SHAPE
</PRE>
which corresponds to the representation for the standard type 
<CODE>std::type_info</CODE> declared in the header 
<CODE>&lt;typeinfo&gt;</CODE>.  Each type information structure consists
of a tag number, giving information on the kind of type represented,
a string literal, giving the name of the type, and a pointer to a
list of base type information structures.  These are combined to give
a type information structure using the token: 
<PRE>
	~cpp.typeid.make : ( SIGNED_NAT, EXP, EXP ) -&gt; EXP ti
</PRE>
Each base type information structure has shape given by the token:
<PRE>
	~cpp.baseid.type : () -&gt; SHAPE
</PRE>
It consists of a pointer to a type information structure, an expression
used to describe the offset of a base class, a pointer to the next
base type information structure in the list, and two integers giving
information on type qualifiers etc.  These are combined to give a
base type information structure using the token: 
<PRE>
	~cpp.baseid.make : ( EXP, EXP, EXP, SIGNED_NAT, SIGNED_NAT ) -&gt; EXP bi
</PRE>
</P>
<P>
The following table gives the various tag numbers used in type information
structures plus a list of the base type information structures associated
with each type.  Macros giving these tag numbers are provided in the
default implementation in a header, <CODE>interface.h</CODE>, which
is shared by the C++ producer. 
</P>
<P>
<CENTER>
<TABLE BORDER>
<TR><TH>Type</TH>
<TH>Form</TH>
<TH>Tag</TH>
<TH>Base information</TH>
<TR><TD ALIGN=CENTER>integer</TD>
<TD ALIGN=CENTER>-</TD>
<TD ALIGN=CENTER>0</TD>
<TD ALIGN=CENTER>-</TD>
<TR><TD ALIGN=CENTER>floating point</TD>
<TD ALIGN=CENTER>-</TD>
<TD ALIGN=CENTER>1</TD>
<TD ALIGN=CENTER>-</TD>
<TR><TD ALIGN=CENTER>void</TD>
<TD ALIGN=CENTER>-</TD>
<TD ALIGN=CENTER>2</TD>
<TD ALIGN=CENTER>-</TD>
<TR><TD ALIGN=CENTER>class or struct</TD>
<TD ALIGN=CENTER>class T</TD>
<TD ALIGN=CENTER>3</TD>
<TD ALIGN=CENTER>[base,access,virtual], ....</TD>
<TR><TD ALIGN=CENTER>union</TD>
<TD ALIGN=CENTER>union T</TD>
<TD ALIGN=CENTER>4</TD>
<TD ALIGN=CENTER>-</TD>
<TR><TD ALIGN=CENTER>enumeration</TD>
<TD ALIGN=CENTER>enum T</TD>
<TD ALIGN=CENTER>5</TD>
<TD ALIGN=CENTER>-</TD>
<TR><TD ALIGN=CENTER>pointer</TD>
<TD ALIGN=CENTER>cv T *</TD>
<TD ALIGN=CENTER>6</TD>
<TD ALIGN=CENTER>[T,cv,0]</TD>
<TR><TD ALIGN=CENTER>reference</TD>
<TD ALIGN=CENTER>cv T &amp;</TD>
<TD ALIGN=CENTER>7</TD>
<TD ALIGN=CENTER>[T,cv,0]</TD>
<TR><TD ALIGN=CENTER>pointer to member</TD>
<TD ALIGN=CENTER>cv T S::*</TD>
<TD ALIGN=CENTER>8</TD>
<TD ALIGN=CENTER>[S,0,0], [T,cv,0]</TD>
<TR><TD ALIGN=CENTER>array</TD>
<TD ALIGN=CENTER>cv T [n]</TD>
<TD ALIGN=CENTER>9</TD>
<TD ALIGN=CENTER>[T,cv,n]</TD>
<TR><TD ALIGN=CENTER>bitfield</TD>
<TD ALIGN=CENTER>cv T : n</TD>
<TD ALIGN=CENTER>10</TD>
<TD ALIGN=CENTER>[T,cv,n]</TD>
<TR><TD ALIGN=CENTER>C++ function</TD>
<TD ALIGN=CENTER>cv T ( S1, ...., Sn )</TD>
<TD ALIGN=CENTER>11</TD>
<TD ALIGN=CENTER>[T,cv,0], [S1,0,0], ...., [Sn,0,0]</TD>
<TR><TD ALIGN=CENTER>C function</TD>
<TD ALIGN=CENTER>cv T ( S1, ...., Sn )</TD>
<TD ALIGN=CENTER>12</TD>
<TD ALIGN=CENTER>[T,cv,0], [S1,0,0], ...., [Sn,0,0]</TD>
</TABLE>
</CENTER>
</P>
<P>
In the form column <CODE>cv T</CODE> is used to denote not only the
normal cv-qualifiers but, when <CODE>T</CODE> is a function type,
the member function cv-qualifiers.  Arrays with an unspecified bound
are treated as if their bound was zero.  Functions with ellipsis are
treated as if they had an extra parameter of a dummy type named 
<CODE>...</CODE> (see below).  Note the distinction between C++ and
C function types. 
</P>
<P>
Each base type information structure is described as a triple consisting
of a type and two integers.  One of these integers may be used to
encode a type qualifier, <CODE>cv</CODE>, as follows: 
</P>
<P>
<CENTER>
<TABLE BORDER>
<TR><TH>Qualifier</TH>   <TH>Encoding</TH>
<TR><TD ALIGN=CENTER>none</TD>  <TD ALIGN=CENTER>0</TD>
<TR><TD ALIGN=CENTER>const</TD>  <TD ALIGN=CENTER>1</TD>
<TR><TD ALIGN=CENTER>volatile</TD> <TD ALIGN=CENTER>2</TD>
<TR><TD ALIGN=CENTER>const volatile</TD><TD ALIGN=CENTER>3</TD>
</TABLE>
</CENTER>
</P>
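<P>
The qualifier encoding in the table above is a simple bit mask, with
<CODE>const</CODE> contributing 1 and <CODE>volatile</CODE> contributing 2.
A minimal sketch, using a hypothetical <CODE>cv_encoding</CODE> helper built
on the standard <CODE>&lt;type_traits&gt;</CODE> header, is:
</P>

```cpp
#include <type_traits>

// Encode the cv-qualifiers of T as in the table: const is 1, volatile is 2.
// The function name is an assumption made for this sketch.
template <class T>
constexpr int cv_encoding ()
{
    return ( std::is_const<T>::value ? 1 : 0 ) |
           ( std::is_volatile<T>::value ? 2 : 0 ) ;
}
```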
<P>
The base type information for a class consists of information on each
of its direct base classes.  This includes the offset of this base
within the class (for a virtual base class this is the offset of the
corresponding 
<I>ptr</I> field), whether the base is virtual (1) or not (0), and
the base class access, encoded as follows: 
</P>
<P>
<CENTER>
<TABLE BORDER>
<TR><TH>Access</TH>   <TH>Encoding</TH>
<TR><TD ALIGN=CENTER>public</TD> <TD ALIGN=CENTER>0</TD>
<TR><TD ALIGN=CENTER>protected</TD> <TD ALIGN=CENTER>1</TD>
<TR><TD ALIGN=CENTER>private</TD> <TD ALIGN=CENTER>2</TD>
</TABLE>
</CENTER>
</P>
<P>
For example, the run-time type information structures for the classes
declared in the <A HREF="#diamond">diamond lattice</A> above can be
represented as follows: 
<CENTER>
<IMG SRC="../images/rttiD.gif" ALT="typeid D">
</CENTER>
</P>
 
<H4>Defining run-time type information structures</H4>
<P>
For built-in types, the run-time type information structure may be
referenced by the token: 
<PRE>
	~cpp.typeid.basic : ( SIGNED_NAT ) -&gt; EXP pti
</PRE>
where the argument gives the encoding of the type as given in the
following table: 
</P>
<CENTER>
<TABLE BORDER>
<TR><TH>Type</TH>   <TH>Encoding</TH>
<TH>Type</TH>   <TH>Encoding</TH>
<TR><TD ALIGN=CENTER>char</TD>  <TD ALIGN=CENTER>0</TD>
<TD ALIGN=CENTER>unsigned long</TD> <TD ALIGN=CENTER>11</TD>
<TR><TD ALIGN=CENTER>(error)</TD> <TD ALIGN=CENTER>1</TD>
<TD ALIGN=CENTER>float</TD>  <TD ALIGN=CENTER>12</TD>
<TR><TD ALIGN=CENTER>void</TD>  <TD ALIGN=CENTER>2</TD>
<TD ALIGN=CENTER>double</TD> <TD ALIGN=CENTER>13</TD>
<TR><TD ALIGN=CENTER>(bottom)</TD> <TD ALIGN=CENTER>3</TD>
<TD ALIGN=CENTER>long double</TD> <TD ALIGN=CENTER>14</TD>
<TR><TD ALIGN=CENTER>signed char</TD> <TD ALIGN=CENTER>4</TD>
<TD ALIGN=CENTER>wchar_t</TD> <TD ALIGN=CENTER>16</TD>
<TR><TD ALIGN=CENTER>signed short</TD> <TD ALIGN=CENTER>5</TD>
<TD ALIGN=CENTER>bool</TD>  <TD ALIGN=CENTER>17</TD>
<TR><TD ALIGN=CENTER>signed int</TD> <TD ALIGN=CENTER>6</TD>
<TD ALIGN=CENTER>(ptrdiff_t)</TD> <TD ALIGN=CENTER>18</TD>
<TR><TD ALIGN=CENTER>signed long</TD> <TD ALIGN=CENTER>7</TD>
<TD ALIGN=CENTER>(size_t)</TD> <TD ALIGN=CENTER>19</TD>
<TR><TD ALIGN=CENTER>unsigned char</TD> <TD ALIGN=CENTER>8</TD>
<TD ALIGN=CENTER>(...)</TD>  <TD ALIGN=CENTER>20</TD>
<TR><TD ALIGN=CENTER>unsigned short</TD><TD ALIGN=CENTER>9</TD>
<TD ALIGN=CENTER>signed long long</TD>
<TD ALIGN=CENTER>23</TD>
<TR><TD ALIGN=CENTER>unsigned int</TD> <TD ALIGN=CENTER>10</TD>
<TD ALIGN=CENTER>unsigned long long</TD>
<TD ALIGN=CENTER>27</TD>
</TABLE>
</CENTER>
<P>
Note that the encoding for the basic integral types is the same as
that 
<A HREF="#arith">given above</A>.  The other types are assigned to
unused values.  Note that the encodings for <CODE>ptrdiff_t</CODE>
and 
<CODE>size_t</CODE> are not used; instead the encoding of the type
implementing each of them is used (using the standard tokens <CODE>ptrdiff_t</CODE> and 
<CODE>size_t</CODE>).  The encodings for <CODE>bool</CODE> and 
<CODE>wchar_t</CODE> are used because they are conceptually distinct
types even though they are implemented as one of the basic integral
types.  The type labelled <CODE>...</CODE> is the dummy used in the
representation of ellipsis functions.  The default implementation
uses an array of type information structures, <CODE>__TCPPLUS_typeid</CODE>,
to implement <CODE>~cpp.typeid.basic</CODE>. 
</P>
<P>
The run-time type information structures for classes are defined in
the same place as their <A HREF="#vtable">virtual function tables</A>.
Other run-time type information structures are defined in whatever
modules require them.  In the former case the type information structure
will have an <A HREF="#other">external tag name</A>; in the latter
case it will be an internal tag. 
</P>
 
<H4>Accessing run-time type information</H4>
<P>
The primary means of accessing the run-time type information for an
object is using the <CODE>typeid</CODE> construct.  In cases where
the operand type can be determined statically, the address of the
corresponding type information structure is returned.  In other cases
the token: 
<PRE>
	~cpp.typeid.ref : ( EXP ppvt ) -&gt; EXP pti
</PRE>
is used, where the argument gives a reference to the <I>vptr</I> field
of the object being checked.  From this information it is trivial
to trace the corresponding type information. 
</P>
<P>
Another means of querying the run-time type information for an object
is using the <CODE>dynamic_cast</CODE> construct.  When the result
cannot be determined statically, this is implemented using the token:
<PRE>
	~cpp.dynam.cast : ( EXP ppvt, EXP pti ) -&gt; EXP pv
</PRE>
where the first expression gives a reference to the <I>vptr</I> field
of the object being cast and the second gives the run-time type information
for the type being cast to.  In the default implementation this token
is implemented by the procedure <CODE>__TCPPLUS_dynamic_cast</CODE>.
The key point to note is that the virtual function table contains
the offset, <I>voff</I>, of the <I>vptr</I> field from the start of
the most complete object.  Thus it is possible to find the address
of the most complete object.  The run-time type information contains
enough information to determine whether this object has a sub-object
of the type being cast to, and if so, how to find the address of this
sub-object.  The result is returned as a <CODE>void *</CODE>, with
the null pointer indicating that the conversion is not possible. 
</P>
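<P>
The two steps of the cast, stepping back by <I>voff</I> to the most
complete object and then searching its base type information, can be
sketched as follows.  The structures and functions here are hypothetical
models and are not the layout used by <CODE>__TCPPLUS_dynamic_cast</CODE>;
the sketch also ignores virtual bases, access checks and ambiguous bases,
all of which a real implementation has to handle.
</P>

```cpp
#include <cstddef>

struct TypeInfo ;

// Hypothetical model of a base type information structure.
struct BaseInfo {
    const TypeInfo *type ;      // type of the base class
    std::ptrdiff_t offset ;     // offset of the base sub-object
    const BaseInfo *next ;      // next base in the list
} ;

// Hypothetical model of a type information structure.
struct TypeInfo {
    int tag ;
    const char *name ;
    const BaseInfo *bases ;
} ;

// Search an object of type ti at address p for a sub-object of type
// target, returning its address or the null pointer.
inline void *find_subobject ( void *p, const TypeInfo *ti,
                              const TypeInfo *target )
{
    if ( ti == target ) return p ;
    for ( const BaseInfo *b = ti->bases ; b != nullptr ; b = b->next ) {
        void *sub = static_cast<char *> ( p ) + b->offset ;
        void *q = find_subobject ( sub, b->type, target ) ;
        if ( q != nullptr ) return q ;
    }
    return nullptr ;
}

// Model of the cast: pvptr is the address of the vptr field, voff its
// offset from the start of the most complete object of type complete.
inline void *dynamic_cast_model ( void *pvptr, std::ptrdiff_t voff,
                                  const TypeInfo *complete,
                                  const TypeInfo *target )
{
    void *whole = static_cast<char *> ( pvptr ) - voff ;
    return find_subobject ( whole, complete, target ) ;
}
```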
 
<HR>
<H3><A NAME="init">2.6.15. Dynamic initialisation</A></H3>
<P>
The dynamic initialisation of variables with static storage duration
in C++ is implemented by means of the TDF <CODE>initial_value</CODE>
construct.  However in order for the producer to maintain control
over the order of initialisation, rather than each variable being
initialised separately using <CODE>initial_value</CODE>, a single
expression is created which initialises all the variables in a module,
and this initialiser expression is used to initialise a single dummy
variable using <CODE>initial_value</CODE>.  Note that, while this
enables the variables within a single module to be initialised in
the order in which they are defined, the order of initialisation between
different modules is unspecified. 
</P>
<P>
The implementation needs to keep a list of those variables with static
storage duration which have been initialised so that it can call the
destructors for these objects at the end of the program. This is done
by declaring a variable of shape: 
<PRE>
	~cpp.destr.type : () -&gt; SHAPE
</PRE>
for each such object with a non-trivial destructor.  Each element
of an array is considered a distinct object.  Immediately after the
variable has been initialised the token: 
<PRE>
	~cpp.destr.global : ( EXP pd, EXP POINTER c, EXP PROC ) -&gt; EXP TOP
</PRE>
is called to add the variable to the list of objects to be destroyed.
The first argument is the address of the dummy variable just declared,
the second is the address of the object to be destroyed, and the third
is the destructor to be used.  In this way a list giving the objects
to be destroyed, and the order in which to destroy them, is built
up.  Note that partially constructed objects are destroyed within
their constructors (see <A HREF="#partial">above</A>) so that only
completely constructed objects need to be considered. 
</P>
<P>
The implementation also needs to ensure that it calls the destructors
in this list at the end of the program, including when the program
ends via a call of 
<CODE>exit</CODE>.  This is done by calling the token: 
<PRE>
	~cpp.destr.init : () -&gt; EXP TOP
</PRE>
at the start of each <CODE>initial_value</CODE> construct.  In the
default implementation this uses <CODE>atexit</CODE> to register a
function, <CODE>__TCPPLUS_term</CODE>, which calls the destructors.
To aid alternative implementations the token: 
<PRE>
	~cpp.start : () -&gt; EXP TOP
</PRE>
is called at the start of the <CODE>main</CODE> function; however,
this has no effect in the default implementation. 
</P>
1445
 
1446
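<P>
Since <CODE>~cpp.destr.init</CODE> is called once per
<CODE>initial_value</CODE> construct, an implementation based on
<CODE>atexit</CODE> presumably has to guard against registering its
termination function more than once.  The following sketch shows one way of
doing this; <CODE>destr_init</CODE>, <CODE>term_handler</CODE> and the
counter are hypothetical names, and the guard is an assumption rather than a
description of the default implementation.
</P>

```cpp
#include <cstdlib>

static bool term_registered = false;
static int registrations = 0;   // demonstration counter only

// Hypothetical stand-in for __TCPPLUS_term: a real version would walk
// the list of objects to be destroyed.
static void term_handler() {}

// Modelled on ~cpp.destr.init: safe to call from every module
// initialiser, but registers the termination function only once.
void destr_init() {
    if (!term_registered) {
        term_registered = (std::atexit(term_handler) == 0);
        if (term_registered) ++registrations;
    }
}
```

<P>
Calling <CODE>destr_init</CODE> from several module initialisers then leaves
exactly one registered handler, which <CODE>exit</CODE> and a normal return
from <CODE>main</CODE> both run.
</P>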
<HR>
<H3><A NAME="except">2.6.16. Exception handling</A></H3>
<P>
Conceptually, exception handling can be described in terms of the
following diagram:
<CENTER>
<IMG SRC="../images/try.gif" ALT="try stack">
</CENTER>
At any point in the execution of the program there is a stack of currently
active <CODE>try</CODE> blocks and currently active local variables.
A <CODE>try</CODE> block is pushed onto the stack as it is entered and
popped from the stack when it is left (whether directly or via a jump).
A local variable with a non-trivial destructor is pushed onto the
stack just after its constructor has been called at the start of its
scope, and popped from the stack just before its destructor is called
at the end of its scope (including before jumps out of its scope).
Each element of an array is considered a separate object.  Each <CODE>try</CODE>
block has an associated list of handlers.  Each local variable has
an associated destructor.
</P>
<P>
Provided no exception is thrown this stack grows and shrinks in a
well-behaved manner as execution proceeds.  When an exception is thrown
an exception manager is invoked to find a matching exception handler.
The exception manager proceeds to execute a loop to unwind the stack
as follows.  If the stack is empty then the exception cannot be caught
and <CODE>std::terminate</CODE> is called.  Otherwise the top element
is popped from the stack.  If this is a local variable then the associated
destructor is called for the variable.  If the top element is a
<CODE>try</CODE> block then the current exception is compared in turn
to each of the associated handlers.  If a match is found then execution
jumps to the handler body, otherwise the exception manager continues
to the next element of the stack.
</P>
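<P>
The unwinding loop can be modelled directly.  The sketch below is purely
illustrative: <CODE>Element</CODE> and <CODE>unwind</CODE> are invented
names, handler matching is reduced to comparing type names, and the function
returns -1 where a real exception manager would call
<CODE>std::terminate</CODE>.
</P>

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// One element of the conceptual stack.
struct Element {
    bool is_try;                        // true: try block, false: local variable
    int id;                             // identifies a try block
    std::vector<std::string> handlers;  // handler type names (try blocks only)
    std::function<void()> destroy;      // destructor call (local variables only)
};

// Pops elements until a try block with a matching handler is found.
// Returns the id of that try block, or -1 if the stack is exhausted.
int unwind(std::vector<Element>& stack, const std::string& thrown) {
    while (!stack.empty()) {
        Element top = std::move(stack.back());
        stack.pop_back();
        if (!top.is_try) {
            top.destroy();              // local variable: run its destructor
            continue;
        }
        for (const std::string& h : top.handlers) {
            if (h == thrown || h == "...") return top.id;
        }
    }
    return -1;
}
```

<P>
With a stack holding, from bottom to top, a <CODE>try</CODE> block catching
<CODE>int</CODE>, a local variable, a <CODE>try</CODE> block catching
<CODE>double</CODE> and a second local variable, throwing an
<CODE>int</CODE> destroys both variables in reverse order and selects the
outer block.
</P>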
<P>
Note that this description is purely conceptual.  There is no need
for exception handling to be implemented by a stack in this way (although
the default implementation uses a similar technique).  It does, however,
serve to illustrate the various stages which must exist in any implementation.
</P>

<H4>Try blocks</H4>
<P>
At the start of a <CODE>try</CODE> block a variable of shape:
<PRE>
	~cpp.try.type : () -&gt; SHAPE
</PRE>
is declared corresponding to the stack element for this block.  This
is then initialised using the token:
<PRE>
	~cpp.try.begin : ( EXP ptb, EXP POINTER fa, EXP POINTER ca ) -&gt; EXP TOP
</PRE>
where the first argument is a pointer to this variable, the second
argument is the TDF <CODE>current_env</CODE> construct, and the third
argument is the result of the TDF <CODE>make_local_lv</CODE> construct
on the label which is used to mark the first handler associated with
the block.  Note that the last two arguments enable a TDF
<CODE>long_jump</CODE> construct to be applied to transfer control
to the first handler.
</P>
<P>
When control exits from a <CODE>try</CODE> block, whether by reaching
the end of the block or jumping out of it, the block is removed from
the stack using the token:
<PRE>
	~cpp.try.end : ( EXP ptb ) -&gt; EXP TOP
</PRE>
where the argument is a pointer to the <CODE>try</CODE> block variable.
</P>

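<P>
A loose analogy in C++ uses <CODE>setjmp</CODE> and
<CODE>std::longjmp</CODE> in place of the frame pointer, code pointer and
<CODE>long_jump</CODE>.  This is only a sketch under that assumption:
<CODE>TryRecord</CODE>, <CODE>try_begin</CODE>, <CODE>try_end</CODE> and
<CODE>jump_to_handlers</CODE> are hypothetical names, and the example avoids
objects with non-trivial destructors, for which <CODE>std::longjmp</CODE>
would be undefined.
</P>

```cpp
#include <csetjmp>
#include <cstdlib>

// Hypothetical stand-in for a variable of shape ~cpp.try.type.
struct TryRecord {
    TryRecord* prev;
    std::jmp_buf env;   // plays the role of the frame and code pointers
};

static TryRecord* try_top = nullptr;

// Modelled on ~cpp.try.begin and ~cpp.try.end: push and pop the record.
void try_begin(TryRecord* ptb) { ptb->prev = try_top; try_top = ptb; }
void try_end(TryRecord* ptb) { try_top = ptb->prev; }

// Pop the innermost try block and transfer control to its handlers.
[[noreturn]] void jump_to_handlers() {
    if (try_top == nullptr) std::abort();   // nothing left to catch
    TryRecord* r = try_top;
    try_top = r->prev;
    std::longjmp(r->env, 1);
}

// Returns 0 if the block completes normally, 1 if its handler is entered.
int run_block(bool raise) {
    TryRecord rec;
    try_begin(&rec);
    if (setjmp(rec.env) == 0) {
        if (raise) jump_to_handlers();      // try-block body
        try_end(&rec);                      // normal exit from the block
        return 0;
    }
    return 1;                               // first handler
}
```

<P>
In both outcomes the record has been removed from the stack by the time
<CODE>run_block</CODE> returns, either by <CODE>try_end</CODE> or by the
jump itself.
</P>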
<H4>Local variables</H4>
<P>
The technique used to add a local variable with a non-trivial destructor
to the stack is similar to that used in the dynamic initialisation
of global variables.  A local variable of shape <CODE>~cpp.destr.type</CODE>
is declared at the start of the variable scope.  This is initialised
just after the constructor for the variable is called using the token:
<PRE>
	~cpp.destr.local : ( EXP pd, EXP POINTER c, EXP PROC ) -&gt; EXP TOP
</PRE>
where the first argument is a pointer to the variable being initialised,
the second is a pointer to the local variable to be destroyed, and
the third is the destructor to be called.  At the end of the variable
scope, just before its destructor is called, the token:
<PRE>
	~cpp.destr.end : ( EXP pd ) -&gt; EXP TOP
</PRE>
where the argument is a pointer to the destructor variable, is called
to remove the local variable destructor from the stack.  Note that
partially constructed objects are destroyed within their constructors
(see <A HREF="#partial">above</A>) so that only completely constructed
objects need to be considered.
</P>
<P>
In cases where the local variable may be conditionally initialised
(for example a temporary variable in the second operand of a <CODE>||</CODE>
operation) the local variable of shape <CODE>~cpp.destr.type</CODE>
is initialised to the value given by the token:
<PRE>
	~cpp.destr.null : () -&gt; EXP d
</PRE>
(normally it is left uninitialised).  Before the destructor for this
variable is called the value of the token:
<PRE>
	~cpp.destr.ptr : ( EXP pd ) -&gt; EXP POINTER c
</PRE>
is tested.  If <CODE>~cpp.destr.local</CODE> has been called for this
variable then this token returns a pointer to the variable, otherwise
it returns a null pointer.  The token <CODE>~cpp.destr.end</CODE>
and the destructor are only called if this token indicates that the
variable has been initialised.
</P>

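<P>
The conditional case can be sketched as follows.  The record type
<CODE>LocalDestr</CODE> and the functions <CODE>destr_null</CODE>,
<CODE>destr_local</CODE> and <CODE>destr_ptr</CODE> are hypothetical
stand-ins named after the tokens, and an <CODE>int</CODE> with a counting
"destructor" takes the place of a real temporary.
</P>

```cpp
// Hypothetical stand-in for a variable of shape ~cpp.destr.type.
struct LocalDestr {
    void* object;
    void (*destroy)(void*);
};

// Modelled on ~cpp.destr.null: a record that refers to no object.
LocalDestr destr_null() { return LocalDestr{nullptr, nullptr}; }

// Modelled on ~cpp.destr.local: note that object c is now constructed.
void destr_local(LocalDestr* pd, void* c, void (*dtor)(void*)) {
    pd->object = c;
    pd->destroy = dtor;
}

// Modelled on ~cpp.destr.ptr: null unless destr_local has been called.
void* destr_ptr(const LocalDestr* pd) { return pd->object; }

static int destroyed = 0;
void count_destroy(void*) { ++destroyed; }

// Models `lhs || temporary`, where the temporary exists only if the
// second operand is evaluated.
bool or_with_temporary(bool lhs) {
    LocalDestr rec = destr_null();
    int temp = 0;
    bool result = lhs;
    if (!lhs) {
        temp = 1;                                   // "construct" the temporary
        destr_local(&rec, &temp, count_destroy);
        result = (temp != 0);
    }
    if (destr_ptr(&rec) != nullptr) {
        // ~cpp.destr.end would be called here, then the destructor.
        rec.destroy(destr_ptr(&rec));
    }
    return result;
}
```

<P>
The destructor runs only on the path where the second operand was actually
evaluated.
</P>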
<H4>Throwing an exception</H4>
<P>
When a <CODE>throw</CODE> expression with an argument is encountered
a number of steps are performed.  Firstly, space is allocated to hold
the exception value using the token:
<PRE>
	~cpp.except.alloc : ( EXP VARIETY size_t ) -&gt; EXP pv
</PRE>
the argument of which gives the size of the value.  The space allocated
is returned as an expression of type <CODE>void *</CODE>.  Secondly,
the exception value is copied into the space allocated, using a copy
constructor if appropriate.  Finally the exception is raised using
the token:
<PRE>
	~cpp.except.throw : ( EXP pv, EXP pti, EXP PROC ) -&gt; EXP BOTTOM
</PRE>
The first argument gives the pointer to the exception value, returned
by <CODE>~cpp.except.alloc</CODE>, the second argument gives a pointer
to the run-time type information for the exception type, and the third
argument gives the destructor to be called to destroy the exception
value (if any). This token sets the current exception to the given
values and invokes the exception manager as above.
</P>
<P>
A <CODE>throw</CODE> expression without an argument results in a call
to the token:
<PRE>
	~cpp.except.rethrow : () -&gt; EXP BOTTOM
</PRE>
which re-invokes the exception manager with the current exception.
If there is no current exception then the implementation should call
<CODE>std::terminate</CODE>.
</P>

<H4>Handling an exception</H4>
<P>
The exception manager proceeds to find an exception handler in the manner
described above, unwinding the stack and calling destructors for local
variables.  When a <CODE>try</CODE> block is popped from the stack
a TDF <CODE>long_jump</CODE> is applied to transfer control to its
list of handlers.  For each handler in turn it is checked whether
the handler can catch the current exception.  For <CODE>...</CODE>
handlers this is always true; for other handlers it is checked using
the token:
<PRE>
	~cpp.except.catch : ( EXP pti ) -&gt; EXP VARIETY int
</PRE>
where the argument is a pointer to the run-time type information for
the handler type.  This token gives 1 if the exception is caught by
this handler, and 0 otherwise.  If the exception is not caught by
the handler then the next handler is checked, until there are no more
handlers associated with the <CODE>try</CODE> block.  In this case
control is passed back to the exception manager by re-throwing the
current exception using <CODE>~cpp.except.rethrow</CODE>.
</P>
<P>
If an exception is caught by a handler then a number of steps are
performed. Firstly, if appropriate, the handler variable is initialised
by copying the current exception value.  A pointer to the current
exception value can be obtained using the token:
<PRE>
	~cpp.except.value : () -&gt; EXP pv
</PRE>
Once this initialisation is complete the token:
<PRE>
	~cpp.except.caught : () -&gt; EXP TOP
</PRE>
is called to indicate that the exception has been caught.  The handler
body is then executed.  When control exits from the handler, whether
by reaching the end of the handler or by jumping out of it, the token:
<PRE>
	~cpp.except.end : () -&gt; EXP TOP
</PRE>
is called to indicate that handling of the exception is complete.  Note
that the implementation should call the destructor for the current
exception and free the space allocated by <CODE>~cpp.except.alloc</CODE>
at this point. Execution then continues with the statement following
the handler.
</P>
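<P>
The lifetime of the exception value, from allocation to the end of the
handler, can be sketched without any transfer of control.  Everything here
is an assumption made for illustration: the functions are hypothetical
stand-ins named after the tokens, <CODE>except_set</CODE> covers only the
part of <CODE>~cpp.except.throw</CODE> that records the current exception,
and type matching is reduced to comparing names, whereas a real
implementation must also consider base classes and pointer conversions.
</P>

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

struct CurrentException {
    void* value = nullptr;
    const char* type_name = nullptr;      // stand-in for run-time type information
    void (*destroy)(void*) = nullptr;
};
static CurrentException current;

void* except_alloc(std::size_t size) { return std::malloc(size); }

// Records the current exception (the non-jumping part of a throw).
void except_set(void* pv, const char* type_name, void (*dtor)(void*)) {
    current.value = pv;
    current.type_name = type_name;
    current.destroy = dtor;
}

bool except_catch(const char* handler_type) {
    return current.type_name != nullptr &&
           std::string(current.type_name) == handler_type;
}

void* except_value() { return current.value; }

// Destroys the exception value and frees its storage.
void except_end() {
    if (current.destroy != nullptr) current.destroy(current.value);
    std::free(current.value);
    current = CurrentException{};
}

static int live_payloads = 0;
struct Payload {
    int code;
    explicit Payload(int c) : code(c) { ++live_payloads; }
    Payload(const Payload& o) : code(o.code) { ++live_payloads; }
    ~Payload() { --live_payloads; }
};
void destroy_payload(void* p) { static_cast<Payload*>(p)->~Payload(); }

// Walks through the steps for `throw Payload(code)` caught by value.
int throw_and_handle(int code) {
    Payload original(code);
    void* pv = except_alloc(sizeof(Payload));
    if (pv == nullptr) std::abort();
    new (pv) Payload(original);                 // copy into the allocated space
    except_set(pv, "Payload", destroy_payload);
    int seen = -1;
    if (except_catch("Payload")) {
        Payload handler_var(*static_cast<Payload*>(except_value()));
        seen = handler_var.code;                // handler body
    }
    except_end();
    return seen;
}
```

<P>
After <CODE>throw_and_handle</CODE> returns, every copy of the payload has
been destroyed and there is no current exception.
</P>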
<P>
To conclude, the TDF generated for a <CODE>try</CODE> block and its
associated list of handlers has the form:
<PRE>
	variable (
	    long_jump_access,
	    stack_tag,
	    make_value ( ~cpp.try.type ),
	    conditional (
	        handler_label,
	        sequence (
	            ~cpp.try.begin (
	                obtain_tag ( stack_tag ),
	                current_env,
	                make_local_lv ( handler_label ) ),
	            <I>try-block-body</I>,
	            ~cpp.try.end ),
	        conditional (
	            catch_label_1,
	            sequence (
	                integer_test (
	                    not_equal,
	                    catch_label_1,
	                    ~cpp.except.catch (
	                        <I>handler-1-typeid</I> ) ),
	                variable (
	                    handler_tag_1,
	                    <I>handler-1-init</I> (
	                        ~cpp.except.value ),
	                    sequence (
	                        ~cpp.except.caught,
	                        <I>handler-1-body</I> ) ),
	                ~cpp.except.end ),
	            conditional (
	                catch_label_2,
	                <I>further-handlers</I>,
	                ~cpp.except.rethrow ) ) ) )
</PRE>
</P>
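<P>
At the source level the nested <CODE>conditional</CODE> constructs above
correspond to trying each handler in order and re-throwing if none matches.
The following standard C++ example, whose function name
<CODE>classify</CODE> is invented for illustration, shows the behaviour that
the generated TDF has to reproduce.
</P>

```cpp
#include <stdexcept>
#include <string>

// Reports which handler, if any, dealt with the exception.
std::string classify(int kind) {
    try {
        try {
            if (kind == 0) throw std::out_of_range("range");
            if (kind == 1) throw std::runtime_error("runtime");
            if (kind == 2) throw 42;
            return "no exception";
        } catch (const std::out_of_range&) {    // handler 1 is tested first
            return "handler 1";
        } catch (const std::runtime_error&) {   // then handler 2
            return "handler 2";
        }
        // No handler matched: the exception continues to propagate, which
        // is the role of the final ~cpp.except.rethrow in the skeleton.
    } catch (...) {
        return "outer handler";
    }
    return "not reached";
}
```

<P>
An <CODE>int</CODE> matches neither inner handler and so reaches the
enclosing <CODE>try</CODE> block.
</P>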
<P>
Note that for a local variable to maintain its previous value when
an exception is caught in this way it is necessary to declare it
using the TDF <CODE>long_jump_access</CODE> construct.  Any local
variable which contains a <CODE>try</CODE> block in its scope is declared
in this way.
</P>
<P>
To aid implementations in the writing of exception managers the following
standard tokens are provided:
<PRE>
	~cpp.ptr.code : () -&gt; SHAPE POINTER ca
	~cpp.ptr.frame : () -&gt; SHAPE POINTER fa
	~cpp.except.jump : ( EXP POINTER fa, EXP POINTER ca ) -&gt; EXP BOTTOM
</PRE>
These give the shape of the TDF <CODE>make_local_lv</CODE> construct,
the shape of the TDF <CODE>current_env</CODE> construct, and direct
access to the TDF <CODE>long_jump</CODE> construct.  The exception manager
in the default implementation is a function called <CODE>__TCPPLUS_throw</CODE>.
</P>

<H4>Exception specifications</H4>
<P>
If a function is declared with an exception specification then extra
code needs to be generated in the function definition to catch any
unexpected exceptions thrown by the function and to call
<CODE>std::unexpected</CODE>. Since this is a potentially high overhead
for small functions, this extra code is not generated if it can be proved
that such unexpected exceptions can never be thrown (the analysis is
essentially the same as that in the
<A HREF="pragma.html#exception">exception analysis</A> check).
</P>
<P>
Exception specifications are implemented by enclosing the entire
function definition in a <CODE>try</CODE> block.  The handler for
this block uses <CODE>~cpp.except.catch</CODE> to check whether the
current exception can be caught by any of the types listed in the
exception specification.  If so the current exception is re-thrown.
If none of these types catch the current exception then the token:
<PRE>
	~cpp.except.bad : ( SIGNED_NAT ) -&gt; EXP TOP
</PRE>
is called.  The argument is 1 if the exception specification includes
the special type <CODE>std::bad_exception</CODE>, and 0 otherwise.
The implementation should call <CODE>std::unexpected</CODE>, but how
any exceptions thrown during this call are to be handled depends on
the value of the argument.
</P>

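<P>
The wrapping can be imitated by hand in C++.  The sketch below emulates a
function declared as <CODE>int guarded ( int ) throw ( std::range_error )</CODE>.
Dynamic exception specifications and <CODE>std::unexpected</CODE> are no
longer part of current C++, so a hypothetical <CODE>except_bad</CODE>
function that throws a marker type stands in for
<CODE>~cpp.except.bad</CODE>; all names are invented for illustration.
</P>

```cpp
#include <stdexcept>

// Hypothetical marker showing that the "unexpected" path was taken.
struct UnexpectedCalled {};

// Stand-in for ~cpp.except.bad; a real version calls std::unexpected.
[[noreturn]] void except_bad() { throw UnexpectedCalled{}; }

// The original function body.
int body(int x) {
    if (x < 0) throw std::range_error("negative");
    if (x == 0) throw std::logic_error("zero");
    return x * 2;
}

// The body enclosed in a try block that enforces the specification.
int guarded(int x) {
    try {
        return body(x);
    } catch (const std::range_error&) {
        throw;              // listed in the specification: re-throw
    } catch (...) {
        except_bad();       // anything else is unexpected
    }
}
```

<P>
A <CODE>std::range_error</CODE> passes through unchanged, while the
<CODE>std::logic_error</CODE> is diverted to <CODE>except_bad</CODE>.
</P>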
<HR>
<H3><A NAME="mangle">2.6.17. Mangled identifier names</A></H3>
<P>
In a similar fashion to other C++ compilers, the C++ producer needs
a method of mapping C++ identifiers to a form suitable for further
processing, namely TDF tag names.  This mangled name contains an encoding
of the identifier name, its parent namespace or class and its type.
Identifiers with C linkage are not mangled.  The producer contains
a built-in <A HREF="man.html#unmangle">name unmangler</A>
which performs the reverse operation of transforming the mangled form
of an identifier name back to the underlying identifier.  This can
be useful when analysing system linker errors.
</P>
<P>
Note that the type of an identifier forms part of its mangled name
not only for functions, but also for variables.  Many other compilers
do not mangle variable names; however, the ISO C++ rules on namespaces
and variables with C linkage make it necessary (this can be suppressed
using the <CODE>-j-n</CODE> command-line option).  Declaring the language
linkage of a variable inconsistently can therefore lead to linking
errors with the C++ producer which are not detected by other compilers.
A common example is:
<PRE>
	extern int errno ;
</PRE>
which, leaving aside whether <CODE>errno</CODE> is actually an external
variable, should be:
<PRE>
	extern &quot;C&quot; int errno ;
</PRE>
</P>
<P>
As described above, the mangled form of an identifier has three components:
the identifier name, the identifier namespace and the identifier type.
Two underscores (<CODE>__</CODE>) are used to separate the name component
from the namespace and type components.  The mangling scheme used
is based on that described in the ARM.  The description below is not
complete; the mangling and unmangling routines themselves should be
consulted for a complete description.
</P>

<H4>Mangling identifier names</H4>
<P>
Simple identifier names are mapped to themselves.  Unicode characters
of the forms <CODE>\u</CODE><I>xxxx</I> and <CODE>\U</CODE><I>xxxxxxxx</I>
are mapped to <CODE>__k</CODE><I>xxxx</I> and <CODE>__K</CODE><I>xxxxxxxx</I>
respectively, where the hex digits are output in their canonical lower-case
form.  Constructors are mapped to <CODE>__ct</CODE> and destructors
to <CODE>__dt</CODE>.  Conversion functions are mapped to
<CODE>__op</CODE><I>type</I> where <I>type</I> is the mangled form
of the conversion type.  Overloaded operator functions,
<CODE>operator@</CODE>, are mapped as follows:
</P>
<CENTER>
1784
<TABLE BORDER>
1785
<TR><TH>Operator</TH>   <TH>Mapping</TH>
1786
<TH>Operator</TH>   <TH>Mapping</TH>
1787
<TH>Operator</TH>   <TH>Mapping</TH>
1788
<TR><TD ALIGN=CENTER>&amp;</TD>  <TD ALIGN=CENTER>__ad</TD>
1789
<TD ALIGN=CENTER>&amp;=</TD> <TD ALIGN=CENTER>__aad</TD>
1790
<TD ALIGN=CENTER>[]</TD>  <TD ALIGN=CENTER>__vc</TD>
1791
<TR><TD ALIGN=CENTER>-&gt;</TD>  <TD ALIGN=CENTER>__rf</TD>
1792
<TD ALIGN=CENTER>-&gt;*</TD> <TD ALIGN=CENTER>__rm</TD>
1793
<TD ALIGN=CENTER>=</TD>  <TD ALIGN=CENTER>__as</TD>
1794
<TR><TD ALIGN=CENTER>,</TD>  <TD ALIGN=CENTER>__cm</TD>
1795
<TD ALIGN=CENTER>~</TD>  <TD ALIGN=CENTER>__co</TD>
1796
<TD ALIGN=CENTER>/</TD>  <TD ALIGN=CENTER>__dv</TD>
1797
<TR><TD ALIGN=CENTER>/=</TD>  <TD ALIGN=CENTER>__adv</TD>
1798
<TD ALIGN=CENTER>==</TD>  <TD ALIGN=CENTER>__eq</TD>
1799
<TD ALIGN=CENTER>()</TD>  <TD ALIGN=CENTER>__cl</TD>
1800
<TR><TD ALIGN=CENTER>&gt;</TD>  <TD ALIGN=CENTER>__gt</TD>
1801
<TD ALIGN=CENTER>&gt;=</TD>  <TD ALIGN=CENTER>__ge</TD>
1802
<TD ALIGN=CENTER>&lt;</TD>  <TD ALIGN=CENTER>__lt</TD>
1803
<TR><TD ALIGN=CENTER>&lt;=</TD>  <TD ALIGN=CENTER>__le</TD>
1804
<TD ALIGN=CENTER>&amp;&amp;</TD> <TD ALIGN=CENTER>__aa</TD>
1805
<TD ALIGN=CENTER>||</TD>  <TD ALIGN=CENTER>__oo</TD>
1806
<TR><TD ALIGN=CENTER>&lt;&lt;</TD> <TD ALIGN=CENTER>__ls</TD>
1807
<TD ALIGN=CENTER>&lt;&lt;=</TD> <TD ALIGN=CENTER>__als</TD>
1808
<TD ALIGN=CENTER>-</TD>  <TD ALIGN=CENTER>__mi</TD>
1809
<TR><TD ALIGN=CENTER>-=</TD>  <TD ALIGN=CENTER>__ami</TD>
1810
<TD ALIGN=CENTER>--</TD>  <TD ALIGN=CENTER>__mm</TD>
1811
<TD ALIGN=CENTER>!</TD>  <TD ALIGN=CENTER>__nt</TD>
1812
<TR><TD ALIGN=CENTER>!=</TD>  <TD ALIGN=CENTER>__ne</TD>
1813
<TD ALIGN=CENTER>|</TD>  <TD ALIGN=CENTER>__or</TD>
1814
<TD ALIGN=CENTER>|=</TD>  <TD ALIGN=CENTER>__aor</TD>
1815
<TR><TD ALIGN=CENTER>+</TD>  <TD ALIGN=CENTER>__pl</TD>
1816
<TD ALIGN=CENTER>+=</TD>  <TD ALIGN=CENTER>__apl</TD>
1817
<TD ALIGN=CENTER>++</TD>  <TD ALIGN=CENTER>__pp</TD>
1818
<TR><TD ALIGN=CENTER>%</TD>  <TD ALIGN=CENTER>__md</TD>
1819
<TD ALIGN=CENTER>%=</TD>  <TD ALIGN=CENTER>__amd</TD>
1820
<TD ALIGN=CENTER>&gt;&gt;</TD> <TD ALIGN=CENTER>__rs</TD>
1821
<TR><TD ALIGN=CENTER>&gt;&gt;=</TD> <TD ALIGN=CENTER>__ars</TD>
1822
<TD ALIGN=CENTER>*</TD>  <TD ALIGN=CENTER>__ml</TD>
1823
<TD ALIGN=CENTER>*=</TD>  <TD ALIGN=CENTER>__aml</TD>
1824
<TR><TD ALIGN=CENTER>^</TD>  <TD ALIGN=CENTER>__er</TD>
1825
<TD ALIGN=CENTER>^=</TD>  <TD ALIGN=CENTER>__aer</TD>
1826
<TD ALIGN=CENTER>delete</TD> <TD ALIGN=CENTER>__dl</TD>
1827
<TR><TD ALIGN=CENTER>delete []</TD> <TD ALIGN=CENTER>__vd</TD>
1828
<TD ALIGN=CENTER>new</TD>  <TD ALIGN=CENTER>__nw</TD>
1829
<TD ALIGN=CENTER>new []</TD> <TD ALIGN=CENTER>__vn</TD>
1830
<TR><TD ALIGN=CENTER>?:</TD>  <TD ALIGN=CENTER>__cn</TD>
1831
<TD ALIGN=CENTER>:</TD>  <TD ALIGN=CENTER>__cs</TD>
1832
<TD ALIGN=CENTER>::</TD>  <TD ALIGN=CENTER>__cc</TD>
1833
<TR><TD ALIGN=CENTER>.</TD>  <TD ALIGN=CENTER>__df</TD>
1834
<TD ALIGN=CENTER>.*</TD>  <TD ALIGN=CENTER>__dm</TD>
1835
<TD ALIGN=CENTER>abs</TD>  <TD ALIGN=CENTER>__ab</TD>
1836
<TR><TD ALIGN=CENTER>max</TD>  <TD ALIGN=CENTER>__mx</TD>
1837
<TD ALIGN=CENTER>min</TD>  <TD ALIGN=CENTER>__mn</TD>
1838
<TD ALIGN=CENTER>sizeof</TD> <TD ALIGN=CENTER>__sz</TD>
1839
<TR><TD ALIGN=CENTER>typeid</TD> <TD ALIGN=CENTER>__td</TD>
1840
<TD ALIGN=CENTER>vtable</TD> <TD ALIGN=CENTER>__tb</TD>
1841
<TD ALIGN=CENTER>-</TD>  <TD ALIGN=CENTER>-</TD>
1842
</TABLE>
1843
</CENTER>
1844
<P>
Note that this table contains a number of operators which are not
part of C++ or cannot be overloaded in C++.  These are used in the
representation of target dependent integer constants.
</P>

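<P>
The operator mapping is a straightforward table lookup.  The sketch below
covers only a subset of the table, and the function name
<CODE>mangle_operator</CODE> is invented for illustration; the producer's
own mangling routines remain the definitive source.
</P>

```cpp
#include <map>
#include <string>

// Maps an operator spelling to its mangled name component.
// Returns an empty string for operators outside this partial table.
std::string mangle_operator(const std::string& op) {
    static const std::map<std::string, std::string> table = {
        {"+", "__pl"},   {"-", "__mi"},   {"*", "__ml"},  {"/", "__dv"},
        {"==", "__eq"},  {"!=", "__ne"},  {"=", "__as"},  {"!", "__nt"},
        {"()", "__cl"},  {"[]", "__vc"},  {"new", "__nw"},
        {"delete", "__dl"},
    };
    auto it = table.find(op);
    return it == table.end() ? std::string() : it->second;
}
```

<P>
Appending the separator and the mangled class name to the result for
<CODE>!</CODE> gives <CODE>__nt__1A</CODE> for a member of class
<CODE>A</CODE>.
</P>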
<H4>Mangling namespace names</H4>
<P>
The global namespace is mapped to an empty string.  Simple namespace
and class names are mapped as above, but are preceded by a series
of decimal digits giving the length of the mangled name.  Nested namespaces
and classes are represented by a sequence of such namespace names,
preceded by the number of elements in the sequence.  This takes the
form <CODE>Q</CODE><I>digit</I> if there are fewer than 10 elements,
or <CODE>Q_</CODE><I>digits</I><CODE>_</CODE> if there are 10 or more.
Note that members of anonymous classes or namespaces are local
to their translation unit, and so do not have external tag names.
</P>

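<P>
The namespace encoding can be sketched as a short function.  The name
<CODE>mangle_scope</CODE> is hypothetical, the names passed in are taken to
be simple identifiers, and the sketch assumes that a sequence of exactly 10
elements uses the <CODE>Q_</CODE><I>digits</I><CODE>_</CODE> form, since a
single digit cannot express it.
</P>

```cpp
#include <string>
#include <vector>

// Encodes a chain of enclosing namespaces or classes, outermost first.
std::string mangle_scope(const std::vector<std::string>& names) {
    if (names.empty()) return "";               // the global namespace
    std::string out;
    if (names.size() > 1) {                     // nested: prefix the count
        if (names.size() < 10) {
            out = "Q" + std::to_string(names.size());
        } else {
            out = "Q_" + std::to_string(names.size()) + "_";
        }
    }
    for (const std::string& n : names) {
        out += std::to_string(n.size()) + n;    // length, then the name
    }
    return out;
}
```

<P>
A single namespace <CODE>N</CODE> gives <CODE>1N</CODE>, while
<CODE>Outer::Inner</CODE> gives <CODE>Q25Outer5Inner</CODE>.
</P>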
<H4>Mangling types</H4>
<P>
The mangling of types is essentially similar to that used in the
<A HREF="dump.html">symbol table dump</A> format.  The type used in
the mangled name for an identifier ignores the return type for a function
and ignores the most significant bound for an array.
</P>
<P>
The built-in types are mapped in precisely the same way as in the
<A HREF="dump.html#built-in">symbol table dump</A>.  Class and enumeration
types are mapped to their type names mangled in the same way as the
namespace names above.  The exception to this is that in a class member,
the parent class is mapped to <CODE>X</CODE>.
</P>
<P>
The composite types are again mapped in a similar fashion to that
in the <A HREF="dump.html#composite">dump file</A>.  For example,
<CODE>PCc</CODE> represents <CODE>const char *</CODE>.  The only difficult
case concerns function parameter types where the ARM
<CODE>T</CODE> and <CODE>N</CODE> encodings are used for duplicate
parameter types.  The function return type is included in the mangled
form except for function identifier types.  In the cases where the
identifier is known always to represent a function (constructors,
destructors etc.) the initial <CODE>F</CODE>
indicating a function type is also omitted.
</P>
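<P>
The <CODE>T</CODE> encoding can be illustrated with a small sketch.  This is
an assumption-laden simplification rather than the producer's algorithm: the
function name <CODE>mangle_params</CODE> is invented, the parameter types are
supplied already mangled, only encodings longer than one character are
treated as candidates for compression, and the <CODE>N</CODE> encoding for
runs of repeated types is not modelled at all.
</P>

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds the parameter part of a function type from already-mangled
// parameter types, replacing a repeated multi-character type by
// T followed by the 1-based position of its first occurrence.
std::string mangle_params(const std::vector<std::string>& encoded) {
    std::string out = "F";
    for (std::size_t i = 0; i < encoded.size(); ++i) {
        std::size_t first = i;
        if (encoded[i].size() > 1) {
            for (std::size_t j = 0; j < i; ++j) {
                if (encoded[j] == encoded[i]) { first = j; break; }
            }
        }
        if (first < i) {
            out += "T" + std::to_string(first + 1);
        } else {
            out += encoded[i];
        }
    }
    return out;
}
```

<P>
For parameters <CODE>A *</CODE>, <CODE>int</CODE> and <CODE>A *</CODE>,
encoded as <CODE>P1A</CODE>, <CODE>i</CODE> and <CODE>P1A</CODE>, this gives
<CODE>FP1AiT1</CODE>, which agrees with the mangled name
<CODE>f__FP1AiT1</CODE> listed in the examples at the end of this section.
</P>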
<P>
The types of template functions and classes are represented by the
underlying template and the template arguments giving rise to the
instance.  Template classes are preceded by <CODE>t</CODE>; template
functions are preceded by <CODE>G</CODE> rather than <CODE>F</CODE>.
Type arguments are represented by <CODE>Z</CODE> followed by the type
value; non-type arguments are represented by the argument type followed
by the argument value.  In the underlying type the template parameters
are represented by <CODE>m0</CODE>, <CODE>m1</CODE> etc. An alternative
scheme, in which the mangled form of a template function includes
the type of that instance, rather than the underlying template, can
be enabled using the <CODE>-j-f</CODE>
command-line option.
</P>

<H4><A NAME="other">Other mangled names</A></H4>
<P>
The <A HREF="#vtable">virtual function table</A> for a class, when
this is a variable with external linkage, is named
<CODE>__vt__</CODE><I>type</I>, where <I>type</I> is the mangled form
of the class name.  The virtual function table for a base class is named
<CODE>__vt__</CODE><I>base</I> where <I>base</I> is a sequence of mangled
class names specifying the base class.  The
<A HREF="#rtti">run-time type information structure</A>
for a type, when this is a variable with external linkage, is named
<CODE>__ti__</CODE><I>type</I>, where <I>type</I> is the mangled form
of the type name.
</P>

<H4>Mangled name examples</H4>
<P>
The following gives some examples of the name mangling scheme:
<PRE>
	class A {
	    static int a ;			// a__1Ai
	public :
	    A () ;				// __ct__1A
	    A ( int ) ;				// __ct__1Ai
	    A ( const A &amp; ) ;			// __ct__1ARCX
	    virtual ~A () ;			// __dt__1A
	    operator bool () ;			// __opb__1A
	    bool operator! () ;			// __nt__1A
	} ;

	// virtual function table	__vt__1A
	// run-time type information	__ti__1A

	int f ( A *, int, A * ) ;		// f__FP1AiT1
	int b = 2 ;				// b__i
	int c [3] ;				// c__A_i

	namespace N {
	    int *p = 0 ;			// p__1NPi
	}
</PRE>
</P>

<HR>
<P><I>Part of the <A HREF="../index.html">TenDRA Web</A>.<BR>Crown
Copyright &copy; 1998.</I></P>
</BODY>
</HTML>