<?xml version="1.0" standalone="no"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
  "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">

<!--
  $Id$
-->

<book>
10
  <bookinfo>
11
    <title>C++ Producer Guide</title>
12
 
13
    <corpauthor>The TenDRA Project</corpauthor>
14
 
15
    <author>
16
      <firstname>Jeroen</firstname>
17
      <surname>Ruigrok van der Werven</surname>
18
    </author>
19
    <authorinitials>JRvdW</authorinitials>
20
    <pubdate>2004</pubdate>
21
 
22
    <copyright>
23
      <year>2004</year>
24
      <year>2005</year>
25
 
26
      <holder>The TenDRA Project</holder>
27
    </copyright>
28
 
29
    <copyright>
30
      <year>1998</year>
31
 
32
      <holder>DERA</holder>
33
    </copyright>
34
  </bookinfo>
35
 
36
  <chapter>
37
  <sect1 id="intro">
38
    <title>Introduction</title>
39
 
40
    <para>This document is a technical overview of the TenDRA C++
      to TDF/ANDF producer.  It is divided into two broad areas: descriptions
      of the <A HREF="#interface">public interfaces</A> of the producer, and
      an overview of the producer <A HREF="#program">source code</A>.</para>
44
 
45
    <para>Whereas the interface description contains most of the information
46
      which would be required in a users' guide, it is not necessarily in a
47
      readily digestible form.  The C++ producer is designed to complement the
48
      existing TenDRA C to TDF producer; although they are completely distinct
49
      programs, the same design philosophy underlies both and they share a
50
      number of common interfaces.  There are no radical differences between
51
      the two producers, besides the fact that the C++ producer covers a
52
      vastly larger and more complex language.  This means that much of the
53
      <A HREF="#tdfc">existing documentation on the C producer</A> can be
54
      taken as also applying to the C++ producer.  This document tries to make
55
      clear where the C++ producer extends the C producer's interfaces, and
56
      those portions of these interfaces which are not directly applicable to
57
      C++.</para>
58
 
59
    <para>
60
    Familiarity with both C++ and TDF is assumed. The version of C++
61
    implemented is that given by the <A HREF="#cplusplus">draft ISO C++
62
    standard</A>.  All references to &quot;ISO C++&quot; within the document
63
    should strictly be qualified using the word &quot;draft&quot;, but
64
    for convenience this has been left implicit.  The C++ producer has
65
    a number of switches which allow it to be configured for older dialects
66
    of C++.  In particular, the version of C++ described in the <A HREF="#arm">ARM
67
    (Annotated Reference Manual)</A> is fully supported. 
68
    </para>
69
 
70
    <para>The <A HREF="#tdf">TDF specification</A> (version 4.0) may be consulted
71
    for a description of the compiler intermediate language used.  The
72
    paper 
73
    <A HREF="#port"><I>TDF and Portability</I></A> provides a useful (if
74
    slightly old) introduction to some of the ideas relating to static
75
    program analysis and interface checking which underlie the whole TenDRA
76
    compilation system. 
77
    </para>
78
 
79
    <para>
80
    The warning sign: 
81
 
82
    <IMG SRC="../images/warn.gif" ALT="warning"/>
83
 
84
    is used within the document to indicate areas where the implementation
85
    is currently incomplete or incorrect. 
86
    </para>
87
 
88
    <sect2 id="update">
89
      <title>1.1. Updated introduction</title>
90
 
91
      <para>Since this document was originally written, the old C producer,
92
        <I>tdfc</I>, has been replaced by a new C producer, <I>tdfc2</I>,
93
        which is just a modified version of the C++ producer, <I>tcpplus</I>.
94
        All C producer documentation continues to apply to the new C producer,
95
        but the new C producer also has many of the features described in this
96
        document as only applying to the C++ producer.</para>
97
    </sect2>
98
  </sect1>
99
 
100
  <sect1 id="interface">
101
    <title>Interface descriptions</title>
102
  <para>
103
  The most important public interfaces of the C++ producer are the ISO
104
  C++ standard and the TDF 4.0 specification; however there are other
105
  interfaces, mostly common to both the C and C++ producers, which are
106
  described in this section. 
107
  </para>
108
  <para>
109
  An important design criterion of the C++ producer was that it should
110
  be strictly ISO conformant by default, but have a method whereby dialect
111
  features and extra static program analysis can be enabled. This compiler
112
  configuration is controlled by the 
113
  <A HREF="pragma.html"><code>#pragma TenDRA</code> directives</A>
114
  described in the first section. 
115
  </para>
116
  <para>
117
  The requirement that the C and C++ producers should be able to translate
118
  portable C or C++ programs into target independent TDF requires a
119
  mechanism whereby the target dependent implementations of APIs can
120
  be represented.  This mechanism, the <A HREF="token.html"><code>#pragma
121
  token</code> syntax</A>, is described in the following section.  Note
122
  that at present this mechanism only contains support for C APIs; it
123
  is considered that the C++ language itself contains sufficient interface
124
  mechanisms for C++ APIs to be described. 
125
  </para>
126
  <para>
127
  The C and C++ producers provide two mechanisms whereby type and declaration
128
  information derived from a translation unit can be stored to a file
129
  for post-processing by other tools.  The first is the 
130
  <A HREF="dump.html">symbol table dump</A>, which is a public interface
131
  designed for use by third party tools.  The second is the 
132
  <A HREF="link.html">C/C++ spec file</A>, which is designed for ease
133
  of reading and writing by the producers themselves, and is used for
134
  intermodule analysis. 
135
  </para>
136
  <para>
137
  The mapping from C++ to TDF implemented by the C++ producer is largely
138
  straightforward.  There are however target dependencies arising within
139
  the language itself which require special handling.  These are represented
140
  by certain <A HREF="lib.html">standard tokens</A> which the producer
141
  requires to be defined on the target machine.  These tokens are also
142
  used to describe the interface between the producer and the run-time
143
  system.  Note that the C++ producer is primarily concerned with the
144
  C++ language, not with the standard C++ library. An example implementation
145
  of those library components which are required as an integral part
146
  of the language (memory allocation, exception handling, run-time type
147
  information etc.) is provided. Otherwise, libraries should be obtained
148
  from third parties.  A number of hints on <A HREF="std.html">integrating
149
  such libraries</A> with the C++ producer are given. 
150
  </para>
151
  </sect1>
152
 
153
  <sect1 id="program">
154
    <title>Program overview</title>
155
  <para>
156
  The C++ producer is a large program (over 200000 lines, including
157
  automatically generated code) written in C.  The
  <A HREF="style.html#language">coding conventions</A> used, the 
  <A HREF="style.html#api">API</A> observed and the basic organisation
  of the <A HREF="style.html#src">source code</A> are described in the
  first section. 
162
  </para>
163
  <para>
164
  One of the design methods used in the C++ producer is the extensive
165
  use of automatic code generation tools.  The type system is based
166
  around the <code>calculus</code> tool, which allows complex type systems
167
  to be described in a simple format.  The interface generated by
  <code>calculus</code> allows for rigorous static type checking, generic type constructors
169
  for lists, stacks etc., encapsulation of the operations on the types
170
  within the system, and optional run-time checking for null pointers
171
  and discriminated union tags.  An overview is given of the <A HREF="alg.html">type
172
  system</A> used as the basis of the C++ producer design.  Also see
173
  the 
174
  <A HREF="../utilities/calc.html"><code>calculus</code> users' guide</A>.
175
  </para>
176
  <para>
177
  The other general purpose code generation tool used in the C++ producer
178
  is the parser generator, <code>sid</code>.  A brief description of
179
  the problems in writing a <A HREF="parse.html">C++ parser</A> is given.
180
  Also see the <A HREF="../utilities/sid.html"><code>sid</code> users'
181
  guide</A>. 
182
  </para>
183
  <para>
184
  The other code generation tools used were written specifically for
185
  the C++ producer.  The error reporting routines within the producer
186
  are based on an <A HREF="error.html">error catalogue</A>, from which
187
  code for constructing and printing errors is generated.  The 
188
  <A HREF="tdf.html">TDF output routines</A> are based on primitives
189
  automatically generated from a standard database describing the TDF
190
  specification. 
191
  </para>
192
  <para>
193
  The program itself is well commented, so no lower level program documentation
194
  has been provided.  When performing development work the producer
195
  should be compiled with the <code>DEBUG</code> macro defined. This
196
  enables the <code>calculus</code> run-time checks, along with other
197
  assertions, and makes available the debugging routines, 
198
  <code>DEBUG_</code><I>type</I>, which can be used to print an object
199
  from the internal type system. 
200
  </para>
201
  </sect1>
202
 
203
  <sect1 id="reference">
204
    <title>References</title>
205
    <itemizedlist>
206
      <listitem><A id="cplusplus"><B>Working paper for Draft Proposed
207
      International Standard for Information Systems - Programming Language
208
      C++</B></A>, X3J16/96-0225, December 1996:     
209
      <A HREF="http://www.cygnus.com/misc/wp/dec96pub/">
210
      <code>http://www.cygnus.com/misc/wp/dec96pub/</code></A> or     
211
      <A HREF="http://www.maths.warwick.ac.uk/c++/pub/wp/html/cd2/">
212
      <code>http://www.maths.warwick.ac.uk/c++/pub/wp/html/cd2/</code></A>.
213
      </listitem>
214
      <listitem><A id="arm"><B>The Annotated C++ Reference Manual</B></A>,
215
      Margaret Ellis and Bjarne Stroustrup, ISBN 0-201-51459-1,
216
      Addison-Wesley, 1990:     
217
      <A HREF="http://heg-school.aw.com/cseng/authors/ellis/annocpp/annocpp.html">
218
      <code>http://heg-school.aw.com/cseng/authors/ellis/annocpp/annocpp.html</code>
219
      </A>
220
      </listitem>
221
      <listitem><A id="tdf"><B>TDF Specification, Issue 4.0</B></A>: 
222
      <A HREF="../tdf/spec1.html">attached</A>. 
223
      </listitem>
224
      <listitem><A id="tdfc"><B>C Checker Reference Manual</B></A>: 
225
      <A HREF="../tdfc/tdfc1.html">attached</A>. 
226
      </listitem>
227
      <listitem><A id="port"><B>TDF and Portability</B></A>: 
228
      <A HREF="../port/port1.html">attached</A>. 
229
      </listitem>
230
      <listitem><A id="cstyle"><B>C Coding Standards</B></A>,
231
      DRA/CIS(SE2)/WI/94/57/2.0 (OSSG internal document). 
232
      </listitem>
233
    </itemizedlist>
234
  </sect1>
235
 
236
  <sect1>
237
  <title>
238
  C++ Producer Guide: Invocation 
239
  </title>
240
 
241
  <sect2>
242
    <title>2.1. Invocation</title>
243
  <para>
244
  This section describes how the C++ to TDF producer, 
245
  <code>tcpplus</code>, fits into the overall compilation scheme controlled
246
  by the TenDRA compiler front-end, <code>tcc</code>, or the TenDRA
247
  checker front-end, <code>tchk</code>.  While it is possible to use
248
  <code>tcpplus</code> as a stand-alone program, it is recommended that
249
  it should be invoked via <code>tcc</code> or <code>tchk</code>. The
250
  <code>tcc</code> users' guide should be consulted for more details.
251
  </para>
252
  <para>
253
  <code>tcc</code> and <code>tchk</code> require the <code>-Yc++</code>
254
  command-line option in order to enable their C++ capabilities.  Files
255
  with a <code>.C</code> suffix are recognised as C++ source files and
256
  passed to <code>tcpplus</code> for processing (see 
257
  <A HREF="#compile">below</A>).  It is possible to change the suffix
258
  used for C++ source files; for example <code>-sC:cc</code> causes
259
  <code>.cc</code> files to be recognised as C++ source files.  An interesting
260
  variation is <code>-sC:c</code> which causes C source files to be
261
  processed by the C++ producer.  Similarly <code>.I</code> files are
262
  recognised as preprocessed C++ source files and <code>.K</code>
263
  files are recognised as C++ spec files. 
264
  </para>
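  <para>
  The suffix rules above can be pictured as a small lookup table.  The
  following C++ sketch is purely illustrative: the names
  <code>FileKind</code>, <code>SuffixTable</code>, <code>remap</code> and
  <code>classify</code> are hypothetical and are not taken from
  <code>tcc</code>, and it assumes that <code>-sC:cc</code> replaces the
  default suffix rather than adding a second one. 
  </para>
  <programlisting><![CDATA[
```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model of the suffix handling described above.
// The real logic lives in tcc; none of these names come from it.
enum class FileKind { CppSource, PreprocessedCpp, CppSpec, Unknown };

class SuffixTable {
public:
    // Default suffixes as documented: .C, .I and .K
    SuffixTable()
        : suffixes_{ {"C", FileKind::CppSource},
                     {"I", FileKind::PreprocessedCpp},
                     {"K", FileKind::CppSpec} } {}

    // Models an option such as -sC:cc.
    // Assumption: the new suffix replaces the old one for that kind.
    void remap(FileKind kind, const std::string& suffix) {
        for (auto it = suffixes_.begin(); it != suffixes_.end(); ) {
            if (it->second == kind) it = suffixes_.erase(it);
            else ++it;
        }
        suffixes_[suffix] = kind;
    }

    // Classifies a file name by the text after its last '.'
    FileKind classify(const std::string& path) const {
        const std::string::size_type dot = path.rfind('.');
        if (dot == std::string::npos || dot + 1 == path.size()) {
            return FileKind::Unknown;
        }
        const auto it = suffixes_.find(path.substr(dot + 1));
        return it == suffixes_.end() ? FileKind::Unknown : it->second;
    }

private:
    std::map<std::string, FileKind> suffixes_;
};
```
  ]]></programlisting>
  <para>
  In this model, calling <code>remap(FileKind::CppSource, "c")</code>
  corresponds to the <code>-sC:c</code> variation, after which
  <code>.c</code> files are classified as C++ source. 
  </para>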
265
  <para>
266
  Most of the command-line option handling for <code>tcpplus</code>
267
  is done by <code>tcc</code> and <code>tchk</code>; however, it is possible
268
  to pass the option <I>opt</I> directly to <code>tcpplus</code> using
269
  the option <code>-Wx,</code><I>opt</I> to <code>tcc</code> or <code>tchk</code>.
270
  Similarly <code>-Wg,</code><I>opt</I> and <code>-WS,</code><I>opt</I>
271
  can be used to pass options to the C++ preprocessor and the C++ spec
272
  linker (both of which are actually <code>tcpplus</code> invoked with
273
  different options) respectively. 
274
  </para>
275
 
276
 
277
  <sect3 id="compile">
278
    <title>2.1.1. Compilation scheme</title>
279
  <para>
280
  The overall compilation scheme controlled by <code>tcc</code>, as
281
  it relates to the C++ producer, can be represented as follows: 
282
 
283
  <IMG SRC="../images/compile.gif" ALT="compilation scheme"/>
284
 
285
  Each C++ source file, <code>a.C</code> say, is processed using 
286
  <code>tcpplus</code> to give an output TDF capsule, <code>a.j</code>,
287
  which is passed to the installer phase of <code>tcc</code>.  The capsule
288
  is linked with any target dependent token definition libraries, translated
289
  to assembler and assembled to give a binary object file, 
290
  <code>a.o</code>.  The various object files comprising the program
291
  are then linked with the system libraries to give a final executable,
292
  <code>a.out</code>. 
293
  </para>
294
  <para>
295
  In addition to this main compilation scheme, <code>tcpplus</code>
296
  can be made to output a <A HREF="link.html">C++ spec
297
  file</A>
298
  for each C++ source file, <code>a.K</code> say.  These C++ spec files
299
  can be linked, using <code>tcpplus</code> in its spec linker mode,
300
  to give an additional TDF capsule, <code>x.j</code> say, and a combined
301
  C++ spec file, <code>x.K</code>.  The main purpose of this C++ spec
302
  linking is to perform intermodule checks on the program; however, in
  the course of this checking, exported templates which are defined in
304
  one module and used in another are instantiated.  This extra code
305
  is output to <code>x.j</code>, which is then installed and linked
306
  in the normal way. 
307
  </para>
308
  <para>
309
  Note that intermodule checks, and hence intermodule template instantiations,
310
  are only performed if the <code>-im</code> option is passed to <code>tcc</code>.
311
  </para>
312
  <para>
313
  The TenDRA checker, <code>tchk</code>, is similar to <code>tcc</code>
314
  except that it disables TDF output and has intermodule analysis enabled
315
  by default. 
316
  </para>
317
  </sect3>
318
 
319
  <sect3 id="option">
320
    <title>2.1.2. Producer options</title>
321
  <para>
322
  The general form for the invocation of <code>tcpplus</code> is as
323
  follows: 
324
  <programlisting>
325
  	tcpplus [ <I>options</I> ] [ <I>input-file</I> ] .... [ <I>output-file</I> ]
326
  </programlisting>
327
  The output file can alternatively be specified using the 
328
  <A HREF="#output"><code>-o</code> option</A>.  If no output file is
329
  given, or the output file is <code>-</code>, the standard output is
330
  used.  In general there can be any number of input files.  If no input
331
  file is given, or the input file is <code>-</code>, the standard input
332
  is used. 
333
  </para>
334
  <para>
335
  <code>tcpplus</code> has three modes which determine the form of its
336
  input and output files.  The default mode is compilation, in which
337
  a single input C++ source file is translated into an output TDF capsule.
338
  In preprocessing mode, specified using the 
339
  <A HREF="#preproc"><code>-E</code> option</A>, a single input C++
340
  source file is preprocessed into an output C++ source file.  Note
341
  that the preprocessor is built into <code>tcpplus</code>, rather than,
342
  as with most other compilers, being a separate program.  The final
343
  mode is 
344
  <A HREF="link.html">C++ spec linking</A>, specified using the 
345
  <A HREF="#linker"><code>-S</code> option</A>.  Any number of C++ spec
346
  input files are linked and any code generated as a result (for example,
347
  template instantiations) is written to the output TDF capsule. 
348
  </para>
349
  <para>
350
  In either compilation or spec linking mode, a C++ spec output file
351
  can be generated, in addition to the TDF capsule, using the 
352
  <A HREF="#spec"><code>-s</code> option</A>.  In any mode a symbol
353
  table dump output file can be generated using the <A HREF="#dump"><code>-d</code>
354
  option</A>. 
355
  </para>
356
  <para>
357
  Command-line options can appear in any order and can be interspersed
358
  with the input and output files, except following a <code>--</code>
359
  option.  All the multi-part options can be given either as one or
360
  two command-line arguments, so that <code>-I</code><I>directory</I>
361
  and 
362
  <code>-I</code> <I>directory</I> are equivalent.  The recognised options
363
  are as follows: 
364
 
365
  <itemizedlist>
366
 
367
  <listitem><B>-A<I>predicate</I>(<I>tokens</I>)</B>
368
  Asserts that the given predicate is true, that is to say: 
369
  <programlisting>
370
  	#assert <I>predicate</I> ( <I>tokens</I> )
371
  </programlisting>
372
  The special case <code>-A-</code> undefines all the built-in predicates
373
  (of which there are none).  Use of this option automatically enables
374
  support for the <A HREF="pragma.html#ppdir"><code>#assert</code> and
375
  <code>#unassert</code> directives</A>. 
376
  </listitem>
377
 
378
  <listitem><B>-D<I>macro</I></B>
379
  <B>-D<I>macro</I>=<I>tokens</I></B>
380
  Defines the given macro to be 1 in the first case, or the given sequence
381
  of preprocessing tokens in the second case, that is to say: 
382
  <programlisting>
383
  	#define <I>macro</I> 1
384
  	#define <I>macro tokens</I>
385
  </programlisting>
386
  respectively.  In fact <code>-D</code> and <code>-U</code> options
387
  to 
388
  <code>tcc</code> are not passed as <code>-D</code> and <code>-U</code>
389
  options to <code>tcpplus</code>.  Instead a 
390
  <A HREF="#start-up">start-up</A> file containing the equivalent 
391
  <code>#define</code> and <code>#undef</code> directives is used. 
392
  </listitem>
393
 
394
  <listitem><A id="preproc"><B>-E</B></A>
395
  Enables preprocessing mode in which the input C++ source file is preprocessed
396
  into the output file. 
397
  </listitem>
398
 
399
  <listitem><B>-F<I>file</I></B>
400
  Causes a list of command-line options to be read from <I>file</I>.
401
  Other than empty lines and lines beginning with <code>#</code>, each
402
  line in the file is treated as if it had been specified as a separate
403
  command-line option. 
404
  </listitem>
405
 
406
  <listitem><B>-H</B>
407
  Enables verbose inclusion mode in which warnings are printed at the
408
  start and end of each included source file. 
409
  </listitem>
410
 
411
  <listitem><B>-I<I>directory</I></B>
412
  Adds the given directory to the list searched for included source
413
  files. No such directories are built into the producer by default.
414
  </listitem>
415
 
416
  <listitem><A id="directory"><B>-N<I>name</I>:<I>directory</I></B></A>
417
  This is identical to <code>-I</code><I>directory</I> except that it
418
  also associates the given identifier with the directory.  The directory
419
  name can be used to specify a <A HREF="pragma.html#scope">compilation
420
  profile</A> to be used on files included from this directory. 
421
  </listitem>
422
 
423
  <listitem><A id="linker"><B>-S</B></A>
424
  Enables C++ spec linker mode, in which any number of C++ spec input
425
  files are linked together. 
426
  </listitem>
427
 
428
  <listitem><B>-U<I>macro</I></B>
429
  Undefines the given macro, that is to say: 
430
  <programlisting>
431
  	#undef <I>macro</I>
432
  </programlisting>
433
  The special case <code>-U-</code> undefines all the built-in macros.
434
  These may be described as follows: 
435
  <programlisting>
436
  	#define __FILE__		<I>(current file)</I>
437
  	#define __LINE__		<I>(current line)</I>
438
  	#define __TIME__		<I>(current time)</I>
439
  	#define __DATE__		<I>(current date)</I>
440
  	#define __STDC__		1
441
  	#define __STDC_VERSION__	199409L
442
  	#define __cplusplus		199711L
443
  </programlisting>
444
  The actual value of <code>__cplusplus</code> gives the date of the
445
  draft ISO C++ standard on which the current version of the producer
446
  is based. The value given above gives the expected date of the final
447
  C++ standard. 
448
  </listitem>
449
 
450
  <listitem><B>-V</B>
451
  Causes the name of each function to be printed to the standard output
452
  as it is compiled. 
453
  </listitem>
454
 
455
  <listitem><B>-W<I>option</I></B>
456
  Sets the given <A HREF="pragma.html#low">compiler option</A> to give
457
  a warning, that is to say: 
458
  <programlisting>
459
  	#pragma TenDRA option &quot;<I>option</I>&quot; warning
460
  </programlisting>
461
  The special case <code>-Wall</code> enables a wide range of warnings.
462
  </listitem>
463
 
464
  <listitem><B>-X</B>
465
  Disables exception handling.  The <A HREF="lib.html#except">current
  implementation</A> can impose a large run-time overhead, which is wasted if exceptions are not required.
467
  The effect of linking any module compiled with this option with a
468
  module which throws an exception is undefined.  This is equivalent
469
  to <A HREF="#output"><code>-j-e</code></A>. 
470
  </listitem>
471
 
472
  <listitem><B>-a</B>
473
  Causes complete program analysis to be applied.  That is to say it
474
  is assumed that no other translation units need to be linked in order
475
  for the program to execute. 
476
  </listitem>
477
 
478
  <listitem><B>-c</B>
479
  Disables TDF output.  The output file will still be a valid TDF capsule,
480
  but it will contain no information.  This is equivalent to 
481
  <A HREF="#output"><code>-j-c</code></A>. 
482
  </listitem>
483
 
484
  <listitem><para><A id="dump"><B>-d<I>opt</I>=<I>dump-file</I></B></A>
485
  Specifies the given file as a <A HREF="dump.html">symbol table dump</A>
486
  output file.  <I>opt</I> will be a series of characters describing
487
  the information to be dumped, as follows: 
488
 
489
  <table>
490
  <tr><th>Key</th>
491
  <th>Description</th>
492
  </tr>
493
  <tr><td><code>a</code></td>
494
  <td>equivalent to <code>ehlmu</code></td>
495
  </tr>
496
  <tr><td><code>c</code></td>
497
  <td>dump string literals</td>
498
  </tr>
499
  <tr><td><code>e</code></td>
500
  <td>dump error messages</td>
501
  </tr>
502
  <tr><td><code>h</code></td>
503
  <td>dump header information</td>
504
  </tr>
505
  <tr><td><code>k</code></td>
506
  <td>dump keyword identifiers</td>
507
  </tr>
508
  <tr><td><code>l</code></td>
509
  <td>dump local variables</td>
510
  </tr>
511
  <tr><td><code>m</code></td>
512
  <td>dump macro identifiers</td>
513
  </tr>
514
  <tr><td><code>s</code></td>
515
  <td>dump scope information</td>
516
  </tr>
517
  <tr><td><code>u</code></td>
518
  <td>dump identifier usage information</td>
519
  </tr>
520
  </table>
521
 
522
  </para>
523
  <para>
524
  Note that these correspond to the <code>tcc -sym</code> options. 
525
  </para>
526
  </listitem>
527
 
528
  <listitem><A id="end-up"><B>-e<I>file</I></B></A>
529
  Specifies the given file as an end-up file.  This is equivalent to
530
  adding: 
531
  <programlisting>
532
  	#include &quot;<I>file</I>&quot;
533
  </programlisting>
534
  at the end of the input source file.  More than one end-up file may
535
  be given; they are processed in the order given. 
536
  </listitem>
537
 
538
  <listitem><A id="start-up"><B>-f<I>file</I></B></A>
539
  Specifies the given file as a start-up file.  This is equivalent to
540
  adding: 
541
  <programlisting>
542
  	#include &quot;<I>file</I>&quot;
543
  </programlisting>
544
  at the start of the input source file.  More than one start-up file
545
  may be given; they are processed in the order given. 
546
  </listitem>
547
 
548
  <listitem><B>-g</B>
549
  Specifies that the output TDF capsule should also contain information
550
  to allow for the generation of run-time debugging directives.  This
551
  is equivalent to <A HREF="#output"><code>-jg</code></A>. 
552
  </listitem>
553
 
554
  <listitem><B>-h</B>
555
  Causes a full list of command-line options to be printed.  This includes
556
  a number not documented here which are unlikely to prove useful to
557
  the normal user. 
558
  </listitem>
559
 
560
  <listitem><A id="output"><B>-j<I>opt</I></B></A>
561
  Sets the TDF output options given by <I>opt</I>.  This consists of
562
  a sequence of characters describing the options to be enabled or disabled.
563
  By default, or following a <code>+</code>, the options are enabled;
564
  following a <code>-</code> they are disabled.  The available options
565
  are as follows: 
566
 
  <table>
  <tr><th>Key</th>
  <th>Default</th>
  <th>Description</th>
  </tr>
  <tr><td><code>a</code></td>
  <td>off</td>
  <td>output external names for local objects</td>
  </tr>
  <tr><td><code>b</code></td>
  <td>off</td>
  <td>work round old installer bugs</td>
  </tr>
  <tr><td><code>c</code></td>
  <td>on</td>
  <td>output TDF capsule</td>
  </tr>
  <tr><td><code>d</code></td>
  <td>off</td>
  <td>output termination function</td>
  </tr>
  <tr><td><code>e</code></td>
  <td>on</td>
  <td>output exceptions</td>
  </tr>
  <tr><td><code>f</code></td>
  <td>on</td>
  <td>mangle template function signatures</td>
  </tr>
  <tr><td><code>g</code></td>
  <td>off</td>
  <td>output debugging information</td>
  </tr>
  <tr><td><code>i</code></td>
  <td>off</td>
  <td>output dynamic initialisers as a function</td>
  </tr>
  <tr><td><code>n</code></td>
  <td>on</td>
  <td>mangle object names</td>
  </tr>
  <tr><td><code>o</code></td>
  <td>off</td>
  <td>order class data members by access</td>
  </tr>
  <tr><td><code>p</code></td>
  <td>on</td>
  <td>output partial destructors</td>
  </tr>
  <tr><td><code>r</code></td>
  <td>on</td>
  <td>output run-time type information</td>
  </tr>
  <tr><td><code>s</code></td>
  <td>on</td>
  <td>output shared string literals</td>
  </tr>
  <tr><td><code>t</code></td>
  <td>off</td>
  <td>output token declarations</td>
  </tr>
  <tr><td><code>u</code></td>
  <td>on</td>
  <td>output unused static variables</td>
  </tr>
  <tr><td><code>v</code></td>
  <td>off</td>
  <td>output local virtual function tables</td>
  </tr>
  </table>
 
  </listitem>
638
 
639
  <listitem><A id="error"><B>-m<I>opt</I></B></A>
640
  Sets the error formatting options given by <I>opt</I>.  This consists
641
  of a sequence of characters describing the options to be enabled or
642
  disabled. By default, or following a <code>+</code>, the options are
643
  enabled; following a <code>-</code> they are disabled.  The available
644
  options are as follows: 
645
 
646
  <table>
647
  <tr><th>Key</th>
648
  <th>Default</th>
649
  <th>Description</th>
650
  </tr>
651
  <tr><td><code>c</code></td>
652
  <td>off</td>
653
  <td>show source code with error</td>
654
  </tr>
655
  <tr><td><code>e</code></td>
656
  <td>off</td>
657
  <td>show error name</td>
658
  </tr>
659
  <tr><td><code>f</code></td>
660
  <td>on</td>
661
  <td>reliable <code>fseek</code> function</td>
662
  </tr>
663
  <tr><td><code>g</code></td>
664
  <td>off</td>
665
  <td>record statement locations</td>
666
  </tr>
667
  <tr><td><code>i</code></td>
668
  <td>on</td>
669
  <td>reliable <code>stat</code> function</td>
670
  </tr>
671
  <tr><td><code>k</code></td>
672
  <td>off</td>
673
  <td>enable C++ spec output</td>
674
  </tr>
675
  <tr><td><code>l</code></td>
676
  <td>off</td>
677
  <td>output full error location</td>
678
  </tr>
679
  <tr><td><code>s</code></td>
680
  <td>on</td>
681
  <td>output ISO section number</td>
682
  </tr>
683
  <tr><td><code>t</code></td>
684
  <td>off</td>
685
  <td>use <code>typedef</code> names in errors</td>
686
  </tr>
687
  <tr><td><code>w</code></td>
688
  <td>off</td>
689
  <td>disable warnings</td>
690
  </tr>
691
  <tr><td><code>z</code></td>
692
  <td>off</td>
693
  <td>continue after error</td>
694
  </tr>
695
  </table>
696
 
697
  </listitem>
698
 
699
  <listitem><A id="table"><B>-n<I>port-table</I></B></A>
700
  Specifies that the given <A HREF="pragma.html#table">portability table</A>
701
  should be used to specify the basic configuration parameters. 
702
  </listitem>
703
 
704
  <listitem><A id="output"><B>-o<I>output-file</I></B></A>
705
  Gives an alternative method of specifying the output file. 
706
  </listitem>
707
 
708
  <listitem><B>-q</B>
709
  Causes the program to quit immediately without processing its input
710
  files. This is useful primarily in version and command-line option
711
  queries. 
712
  </listitem>
713
 
714
  <listitem><A id="spec"><B>-s<I>spec-file</I></B></A>
715
  Specifies the given file as a C++ spec output file. 
716
  </listitem>
717
 
718
  <listitem><B>-t</B>
719
  Specifies that token declarations should be included in the output
720
  TDF capsule.  While these are strictly unnecessary, they help when
721
  pretty-printing the output.  This is equivalent to 
722
  <A HREF="#output"><code>-jt</code></A>. 
723
  </listitem>
724
 
725
  <listitem><A id="unmangle"><B>-u</B></A>
726
  The form: 
727
  <programlisting>
728
  	tcpplus -u <I>name</I> .... <I>name</I>
729
  </programlisting>
730
  can be used to print the unmangled forms of a list of 
731
  <A HREF="lib.html#mangle">mangled identifier names</A> to the standard
732
  output. 
733
  </listitem>
734
 
735
  <listitem><B>-v</B>
736
  Causes the C++ producer version number, plus information on the versions
737
  of C++ and TDF supported, to be printed to the standard error. 
738
  </listitem>
739
 
740
  <listitem><B>-w</B>
741
  Disables all warning messages.  This is equivalent to 
742
  <A HREF="#error"><code>-mw</code></A>. 
743
  </listitem>
744
 
745
  <listitem><B>-z</B>
746
  Forces an output file to be created even if compilation errors occur.
747
  The effect of installing a TDF capsule produced using this option
748
  is undefined.  This is equivalent to <A HREF="#error"><code>-mz</code></A>.
749
  </listitem>
750
 
751
  <listitem><B>--</B>
752
  Marks the last option.  Any subsequent arguments are interpreted as
753
  input and output files even if they resemble command-line options.
754
  </listitem>
755
 
756
  </itemizedlist>
757
  </para>
758
  </sect3>
759
  </sect2>
760
 
761
  <sect2>
762
    <title>2.2. Compiler configuration</title>
763
  <para>
764
  This section describes how the C++ producer can be configured to apply
765
  extra static checks or to support various dialects of C++.  In all
766
  cases the default behaviour is precisely that specified in the ISO
767
  C++ standard with no extra checks. 
768
  </para>
769
  <para>
770
  Certain very basic configuration information is specified using a
771
  <A HREF="#table">portability table</A>; however, the primary method
  of configuration is by means of <code>#pragma</code> directives. 
  These directives may be placed within the program itself; however,
774
  it is generally more convenient to group them into a 
775
  <A HREF="man.html#start-up">start-up file</A> in order to create a
776
  <A id="usr">user-defined compilation profile</A>.  The 
777
  <code>#pragma</code> directives recognised by the C++ producer have
778
  one of the equivalent forms: 
779
  <programlisting>
  	#pragma TenDRA ....
  	#pragma TenDRA++ ....
  </programlisting>
783
  Some of these are common to the C and C++ producers (although often
784
  with differing default behaviour).  The C producer will ignore any
785
  <code>TenDRA++</code> directives, so these may be used in compilation
786
  profiles which are to be used by both producers.  In the descriptions
787
  below, the presence of a <code>++</code> is used to indicate a directive
788
  which is C++ specific; the other directives are common to both producers.
789
  </para>
790
  <para>
791
  Within the description of the <code>#pragma</code> syntax, <I>on</I>
792
  stands for <code>on</code>, <code>off</code> or <code>warning</code>,
793
  <I>allow</I> stands for <code>allow</code>, <code>disallow</code>
794
  or 
795
  <code>warning</code>, <I>string-literal</I> is any string literal,
796
  <I>integer-literal</I> is any integer literal, <I>identifier</I> is
797
  any simple, unqualified identifier name, and <I>type-id</I> is any
798
  type identifier.  Other syntactic items are described in the text.
799
  A 
800
  <A HREF="pragma1.html">complete grammar</A> for the <code>#pragma</code>
801
  directives accepted by the C++ producer is given as an annex. 
802
  </para>
803
 
804
 
805
  <sect3 id="table">
806
    <title>2.2.1. Portability tables</title>
807
  <para>
808
  Certain very basic configuration information is read from a file called
809
  a portability table, which may be specified to the producer using
810
  a 
811
  <A HREF="man.html#table"><code>-n</code> option</A>.  This information
812
  includes the minimum sizes of the basic integral types, the 
813
  <A HREF="#char">sign of plain <code>char</code></A>, and whether signed
814
  types can be assumed to be symmetric (for example, [-127,127]) or
815
  maximum (for example, [-128,127]). 
816
  </para>
817
  <para>
818
  The default portability table values, which are built into the producer,
819
  can be expressed in the form: 
820
  <programlisting>
  	char_bits			8
  	short_bits			16
  	int_bits			16
  	long_bits			32
  	signed_range			symmetric
  	char_type			either
  	ptr_int				none
  	ptr_fn				no
  	non_prototype_checks		yes
  	multibyte			1
  </programlisting>
832
  This illustrates the syntax for the portability table; note that all
833
  ten entries are required, even though the last four are ignored. 
834
  </para>
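  <para>
  As an illustration only, a table for a hypothetical target with a 32-bit
  <code>int</code>, signed plain <code>char</code> and a maximum signed
  range might look like the sketch below.  The values are assumptions
  chosen for the example rather than a description of any real machine,
  and it is assumed that <code>char_type</code> accepts <code>signed</code>
  in the same way as it accepts <code>either</code>.  The last four
  entries are copied unchanged from the defaults, since they are ignored: 
  <programlisting>
  	char_bits			8
  	short_bits			16
  	int_bits			32
  	long_bits			32
  	signed_range			maximum
  	char_type			signed
  	ptr_int				none
  	ptr_fn				no
  	non_prototype_checks		yes
  	multibyte			1
  </programlisting>
  </para>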
835
  </sect3>  
836
 
837
  <sect3 id="low">
838
    <title>2.2.2. Low level configuration</title>
839
  <para>
840
  The simplest level of configuration is to reset the severity level
841
  of a particular error message using: 
842
  <programlisting>
  	#pragma TenDRA++ error <I>string-literal on</I>
  	#pragma TenDRA++ error <I>string-literal allow</I>
  </programlisting>
846
  The given <I>string-literal</I> should name an error from the 
847
  <A HREF="error.html">error catalogue</A>.  A severity of <code>on</code>
848
  or <code>disallow</code> indicates that the associated diagnostic
849
  message should be an error, which causes the compilation to fail.
850
  A severity of 
851
  <code>warning</code> indicates that the associated diagnostic message
852
  should be a warning, which is printed but allows the compilation to
853
  continue.  A severity of <code>off</code> or <code>allow</code>
854
  indicates that the associated error should be ignored.  Reducing the
855
  severity of any error from its default value, other than via one of
856
  the dialect directives described in this section, results in undefined
857
  behaviour. 
858
  </para>
859
  <para>
860
  The next level of configuration is to reset the severity level of
861
  a particular compiler option using: 
862
  <programlisting>
  	#pragma TenDRA++ option <I>string-literal on</I>
  	#pragma TenDRA++ option <I>string-literal allow</I>
  </programlisting>
866
  The given <I>string-literal</I> should name an option from the option
867
  catalogue.  The simplest form of compiler option just sets the severity
868
  level of one or more error messages.  Some of these options may require
869
  additional processing to be applied.</para>
870
  <para>
871
  It is possible to link a particular error message to a particular
872
  compiler option using: 
873
  <programlisting>
874
  	#pragma TenDRA++ error <I>string-literal</I> as option <I>string-literal</I>
875
  </programlisting>
876
  </para>
877
  <para>
878
  Note that the directive: 
879
  <programlisting>
880
  	#pragma TenDRA++ use error <I>string-literal</I> 
881
  </programlisting>
882
  can be used to raise a given error at any point in a translation unit
883
  in a similar fashion to the <code>#error</code> directive.  The values
884
  of any parameters for this error are unspecified. 
885
  </para>
886
  <para>
887
  The directives just described give the primitive operations on error
888
  messages and compiler options.  Many of the remaining directives in
889
  this section are merely higher level ways of expressing these primitives.
890
  </para>
891
  </sect3>  
892
 
893
  <sect3 id="scope">
894
    <title>2.2.3. Checking scopes</title>
895
  <para>
896
  Most compiler options are scoped.  A checking scope may be defined
897
  by enclosing a list of declarations within: 
898
  <programlisting>
  	#pragma TenDRA begin
  	....
  	#pragma TenDRA end
  </programlisting>
903
  If the final <code>end</code> directive is omitted then the scope
904
  ends at the end of the translation unit.  Checking scopes may be nested
905
  in the obvious way.  A checking scope inherits its initial set of
906
  checks from its enclosing scope (this includes the implicit main checking
907
  scope consisting of the entire input file).  Any checks switched on
908
  or off within a scope apply only to the remainder of that scope and
909
  any scope it contains.  A particular check can only be set once in
910
  a given scope. The set of applied checks reverts to its previous state
911
  at the end of the scope.</para>
912
  <para>
913
  A checking scope can be named using the directives: 
914
  <programlisting>
  	#pragma TenDRA begin name environment <I>identifier</I>
  	....
  	#pragma TenDRA end
  </programlisting>
919
  Checking scope names occupy a namespace distinct from any other namespace
920
  within the translation unit.  A named scope defines a set of modifications
921
  to the current checking scope.  These modifications may be reapplied
922
  within a different scope using: 
923
  <programlisting>
924
  	#pragma TenDRA use environment <I>identifier</I>
925
  </programlisting>
926
  The default behaviour is not to allow checks set in the named checking
927
  scope to be reset in the current scope.  This can however be modified
928
  using: 
929
  <programlisting>
930
  	#pragma TenDRA use environment <I>identifier</I> reset <I>allow</I>
931
  </programlisting>
932
  </para>
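  <para>
  A minimal sketch of how these directives might fit together is given
  here.  The environment name <code>strict</code> is hypothetical, and the
  two checks recorded in it are simply directives described elsewhere in
  this section, chosen for illustration: 
  <programlisting>
  	#pragma TenDRA begin name environment strict
  	#pragma TenDRA nested comment analysis warning
  	#pragma TenDRA unknown pragma warning
  	#pragma TenDRA end

  	#pragma TenDRA begin
  	#pragma TenDRA use environment strict
  	....
  	#pragma TenDRA end
  </programlisting>
  The declarations in the second scope would be checked with the
  modifications recorded under <code>strict</code>; the set of applied
  checks then reverts at the closing <code>end</code> directive. 
  </para>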
933
  <para>
934
  Another use of a named checking scope is to associate a checking scope
935
  with a named include file directory.  This is done using: 
936
  <programlisting>
937
  	#pragma TenDRA directory <I>identifier</I> use environment <I>identifier</I>
938
  </programlisting>
939
  where the directory name is one introduced via a 
940
  <A HREF="man.html#directory"><code>-N</code> command-line option</A>.
941
  The effect of this directive, if a <code>#include</code> directive
942
  is found to resolve to a file from the given directory, is as if the
943
  file was enclosed in directives of the form: 
944
  <programlisting>
  	#pragma TenDRA begin
  	#pragma TenDRA use environment <I>identifier</I> reset allow
  	....
  	#pragma TenDRA end
  </programlisting>
950
  </para>
951
  <para>
952
  The checks applied to the expansion of a macro definition are those
953
  from the scope in which the macro was defined, not that in which it
954
  was expanded. The macro arguments are checked in the scope in which
955
  they are specified, that is to say, the scope in which the macro is
956
  expanded.  This enables macro definitions to remain localised with
957
  respect to checking scopes. 
958
  </para>
959
  </sect3>  
960
 
961
  <sect3 id="limits">
962
    <title>2.2.4. Implementation limits</title>
963
  <para>
964
  This table gives the default implementation limits imposed by the
965
  C++ producer for the various implementation quantities listed in Annex
966
  B of the ISO C++ standard, together with the minimum limits allowed
967
  in ISO C and C++.  A default limit of <I>none</I> means that the quantity
968
  is limited only by the size of the host machine (either <code>ULONG_MAX</code>
969
  or until it runs out of memory).  A limit of <I>target</I> means that
  while no limit is imposed by the C++ front-end, particular target
  machines may impose such limits. 
972
  </para>
973
 
974
  <table>
  <tr><th>Quantity identifier</th>
  <th>Min C limit</th>  <th>Min C++ limit</th>
  <th>Default limit</th>
  </tr>
  <tr><td>statement_depth</td>
  <td>15</td>  <td>256</td>
  <td>none</td>
  </tr>
  <tr><td>hash_if_depth</td>
  <td>8</td>  <td>256</td>
  <td>none</td>
  </tr>
  <tr><td>declarator_max</td>
  <td>12</td>  <td>256</td>
  <td>none</td>
  </tr>
  <tr><td>paren_depth</td>
  <td>32</td>  <td>256</td>
  <td>none</td>
  </tr>
  <tr><td>name_limit</td>
  <td>31</td>  <td>1024</td>
  <td>none</td>
  </tr>
  <tr><td>extern_name_limit</td>
  <td>6</td>  <td>1024</td>
  <td>target</td>
  </tr>
  <tr><td>external_ids</td>
  <td>511</td>  <td>65536</td>
  <td>target</td>
  </tr>
  <tr><td>block_ids</td>
  <td>127</td>  <td>1024</td>
  <td>none</td>
  </tr>
  <tr><td>macro_ids</td>
  <td>1024</td>  <td>65536</td>
  <td>none</td>
  </tr>
  <tr><td>func_pars</td>
  <td>31</td>  <td>256</td>
  <td>none</td>
  </tr>
  <tr><td>func_args</td>
  <td>31</td>  <td>256</td>
  <td>none</td>
  </tr>
  <tr><td>macro_pars</td>
  <td>31</td>  <td>256</td>
  <td>none</td>
  </tr>
  <tr><td>macro_args</td>
  <td>31</td>  <td>256</td>
  <td>none</td>
  </tr>
  <tr><td>line_length</td>
  <td>509</td>  <td>65536</td>
  <td>none</td>
  </tr>
  <tr><td>string_length</td>
  <td>509</td>  <td>65536</td>
  <td>none</td>
  </tr>
  <tr><td>sizeof_object</td>
  <td>32767</td>  <td>262144</td>
  <td>target</td>
  </tr>
  <tr><td>include_depth</td>
  <td>8</td>  <td>256</td>
  <td>256</td>
  </tr>
  <tr><td>switch_cases</td>
  <td>257</td>  <td>16384</td>
  <td>none</td>
  </tr>
  <tr><td>data_members</td>
  <td>127</td>  <td>16384</td>
  <td>none</td>
  </tr>
  <tr><td>enum_consts</td>
  <td>127</td>  <td>4096</td>
  <td>none</td>
  </tr>
  <tr><td>nested_class</td>
  <td>15</td>  <td>256</td>
  <td>none</td>
  </tr>
  <tr><td>atexit_funcs</td>
  <td>32</td>  <td>32</td>
  <td>target</td>
  </tr>
  <tr><td>base_classes</td>
  <td>N/A</td>  <td>16384</td>
  <td>none</td>
  </tr>
  <tr><td>direct_bases</td>
  <td>N/A</td>  <td>1024</td>
  <td>none</td>
  </tr>
  <tr><td>class_members</td>
  <td>N/A</td>  <td>4096</td>
  <td>none</td>
  </tr>
  <tr><td>virtual_funcs</td>
  <td>N/A</td>  <td>16384</td>
  <td>none</td>
  </tr>
  <tr><td>virtual_bases</td>
  <td>N/A</td>  <td>1024</td>
  <td>none</td>
  </tr>
  <tr><td>static_members</td>
  <td>N/A</td>  <td>1024</td>
  <td>none</td>
  </tr>
  <tr><td>friends</td>
  <td>N/A</td>  <td>4096</td>
  <td>none</td>
  </tr>
  <tr><td>access_declarations</td>
  <td>N/A</td>  <td>4096</td>
  <td>none</td>
  </tr>
  <tr><td>ctor_initializers</td>
  <td>N/A</td>  <td>6144</td>
  <td>none</td>
  </tr>
  <tr><td>scope_qualifiers</td>
  <td>N/A</td>  <td>256</td>
  <td>none</td>
  </tr>
  <tr><td>external_specs</td>
  <td>N/A</td>  <td>1024</td>
  <td>none</td>
  </tr>
  <tr><td>template_pars</td>
  <td>N/A</td>  <td>1024</td>
  <td>none</td>
  </tr>
  <tr><td>instance_depth</td>
  <td>N/A</td>  <td>17</td>
  <td>17</td>
  </tr>
  <tr><td>exception_handlers</td>
  <td>N/A</td>  <td>256</td>
  <td>none</td>
  </tr>
  <tr><td>exception_specs</td>
  <td>N/A</td>  <td>256</td>
  <td>none</td>
  </tr>
  </table>
1128
 
1129
  <para>
1130
  It is possible to impose lower limits on most of the quantities listed
1131
  above by means of the directive: 
1132
  <programlisting>
1133
  	#pragma TenDRA++ option value <I>string-literal integer-literal</I>
1134
  </programlisting>
1135
  where <I>string-literal</I> gives one of the quantity identifiers
1136
  listed above and <I>integer-literal</I> gives the limit to be imposed.
1137
  An error is reported if the quantity exceeds this limit (note however
1138
  that checks have not yet been implemented for all of the quantities
1139
  listed).  Note that the <A HREF="#identifier"><code>name_limit</code></A>
1140
  and 
1141
  <A HREF="#include"><code>include_depth</code></A> implementation limits
1142
  can be set using dedicated directives. 
1143
  </para>
1144
  <para>
1145
  The maximum number of errors allowed before the producer bails out
1146
  can be set using the directive:
1147
  <programlisting>
1148
  	#pragma TenDRA++ set error limit <I>integer-literal</I>
1149
  </programlisting>
1150
  The default value is 32.
1151
  </para>
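  <para>
  As a sketch, a compilation profile which holds code to the minimum ISO
  C++ limits for statement nesting and function parameters, and which
  gives up after ten errors, might contain the lines below.  The quantity
  identifiers come from the table above; the numeric values are chosen
  purely for illustration, and a check may not yet be implemented for
  every quantity: 
  <programlisting>
  	#pragma TenDRA++ option value "statement_depth" 256
  	#pragma TenDRA++ option value "func_pars" 256
  	#pragma TenDRA++ set error limit 10
  </programlisting>
  </para>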
1152
  </sect3>  
1153
 
1154
  <sect3 id="lex">
1155
    <title>2.2.5. Lexical analysis</title>
1156
  <para>
1157
  During lexical analysis, a source file which is not empty should end
1158
  in a newline character.  It is possible to relax this constraint using
1159
  the directive: 
1160
  <programlisting>
1161
  	#pragma TenDRA no nline after file end <I>allow</I>
1162
  </programlisting>
1163
  </para>
1164
  </sect3>  
1165
 
1166
  <sect3 id="keyword">
1167
    <title>2.2.6. Keywords</title>
1168
  <para>
1169
  In several places in this section it is described how to introduce
1170
  keywords for TenDRA language extensions.  By default, no such extra
1171
  keywords are defined.  There are also low-level directives for defining
1172
  and undefining keywords.  The directive: 
1173
  <programlisting>
1174
  	#pragma TenDRA++ keyword <I>identifier</I> for keyword <I>identifier</I> 
1175
  </programlisting>
1176
  can be used to introduce a keyword (the first identifier) standing
1177
  for the standard C++ keyword given by the second identifier.  The
1178
  directive: 
1179
  <programlisting>
1180
  	#pragma TenDRA++ keyword <I>identifier</I> for operator <I>operator</I> 
1181
  </programlisting>
1182
  can similarly be used to introduce a keyword giving an alternative
1183
  representation for the given operator or punctuator, as, for example,
1184
  in: 
1185
  <programlisting>
1186
  	#pragma TenDRA++ keyword and for operator &amp;&amp;
1187
  </programlisting>
1188
  Finally the directive: 
1189
  <programlisting>
1190
  	#pragma TenDRA++ undef keyword <I>identifier</I> 
1191
  </programlisting>
1192
  can be used to undefine a keyword. 
1193
  </para>
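  <para>
  The three forms might be combined as in the following sketch.  The
  spelling <code>__const</code> is a hypothetical dialect keyword invented
  for the example; it is introduced as an alias for <code>const</code>
  and later removed again: 
  <programlisting>
  	#pragma TenDRA++ keyword or for operator ||
  	#pragma TenDRA++ keyword __const for keyword const
  	....
  	#pragma TenDRA++ undef keyword __const
  </programlisting>
  </para>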
1194
  </sect3>  
1195
 
1196
  <sect3 id="comment">
1197
    <title>2.2.7. Comments</title>
1198
  <para>
1199
  C-style comments do not nest.  The directive: 
1200
  <programlisting>
1201
  	#pragma TenDRA nested comment analysis <I>on</I>
1202
  </programlisting>
1203
  enables a check for the characters <code>/*</code> within C-style
1204
  comments. 
1205
  </para>
1206
  </sect3>  
1207
 
1208
  <sect3 id="identifier-names">
1209
    <title>2.2.8. Identifier names</title>
1210
  <para>
1211
  During lexical analysis, each character in the source file has an
1212
  associated look-up value which is used to determine whether the character
1213
  can be used in an identifier name, is a white space character etc.
1214
  These values are stored in a simple look-up table.  It is possible
1215
  to set the look-up value using: 
1216
  <programlisting>
1217
  	#pragma TenDRA++ character <I>character-literal</I> as <I>character-literal</I> allow 
1218
  </programlisting>
1219
  which sets the look-up for the first character to be the default look-up
1220
  for the second character.  The form: 
1221
  <programlisting>
1222
  	#pragma TenDRA++ character <I>character-literal</I> disallow 
1223
  </programlisting>
1224
  sets the look-up of the character to be that of an invalid character.
1225
  The forms: 
1226
  <programlisting>
  	#pragma TenDRA++ character <I>string-literal</I> as <I>character-literal</I> allow 
  	#pragma TenDRA++ character <I>string-literal</I> disallow 
  </programlisting>
  can be used to modify the look-up values for the set of characters
  given by the string literal.  For example: 
  <programlisting>
  	#pragma TenDRA character '$' as 'a' allow
  	#pragma TenDRA character '\r' as ' ' allow
  </programlisting>
1236
  allows <code>$</code> to be used in identifier names (like <code>a</code>)
1237
  and carriage return to be a white space character.  The former is
1238
  a common dialect feature and can also be controlled by the directive:
1239
  <programlisting>
1240
  	#pragma TenDRA dollar as ident <I>allow</I>
1241
  </programlisting>
1242
  </para>
1243
  <para>
1244
  The maximum number of characters allowed in an identifier name can
1245
  be set using the directives: 
1246
  <programlisting>
  	#pragma TenDRA set name limit <I>integer-literal</I>
  	#pragma TenDRA++ set name limit <I>integer-literal</I> warning 
  </programlisting>
1250
  This length is given by the <code>name_limit</code> implementation
1251
  quantity  
1252
  <A HREF="#limits">mentioned above</A>.  Identifiers which exceed this
1253
  length raise an error or a warning, but are not truncated. 
1254
  </para>
1255
  </sect3>  
1256
 
1257
  <sect3 id="int">
1258
    <title>2.2.9. Integer literals</title>
1259
  <para>
1260
  The rules for finding the type of an integer literal can be described
1261
  using directives of the form: 
1262
  <programlisting>
1263
  	#pragma TenDRA integer literal <I>literal-spec</I>
1264
  </programlisting>
1265
  where: 
1266
  <programlisting>
  	<I>literal-spec</I> :
  		<I>literal-base literal-suffix<SUB>opt</SUB> literal-type-list</I>

  	<I>literal-base</I> :
  		octal
  		decimal
  		hexadecimal

  	<I>literal-suffix</I> :
  		unsigned
  		long
  		unsigned long
  		long long
  		unsigned long long

  	<I>literal-type-list</I> :
  		* <I>literal-type-spec</I>
  		<I>integer-literal literal-type-spec</I> | <I>literal-type-list</I>
  		? <I>literal-type-spec</I> | <I>literal-type-list</I>

  	<I>literal-type-spec</I> :
  		: <I>type-id</I>
  		* <I>allow<SUB>opt</SUB></I> : <I>identifier</I>
  		* * <I>allow<SUB>opt</SUB></I> :
  </programlisting>
1292
  Each directive gives a literal base and suffix, describing the form
1293
  of an integer literal, and a list of possible types for literals of
1294
  this form. This list gives a mapping from the value of the literal
1295
  to the type to be used to represent the literal.  There are three
1296
  cases for the literal type; it may be a given integral type, it may
1297
  be calculated using a given <A HREF="lib.html#literal">literal type
1298
  token</A>, or it may cause an error to be raised.  There are also
1299
  three cases for describing a literal range; it may be given by values
1300
  less than or equal to a given integer literal, it may be given by
1301
  values which are guaranteed to fit into a given integral type, or
  it may match any value.  For example: 
  <programlisting>
  	#pragma token PROC ( VARIETY c ) VARIETY l_i # ~lit_int
  	#pragma TenDRA integer literal decimal 32767 : int | ** : l_i
  </programlisting>
  describes how to find the type of a decimal literal with no suffix.
  Values less than or equal to 32767 have type <code>int</code>; larger
1309
  values have target dependent type calculated using the token 
1310
  <code>~lit_int</code>.  Introducing a <code>warning</code> into the
1311
  directive will cause a warning to be printed if the token is used
1312
  to calculate the value. 
1313
  </para>
1314
  <para>
1315
  Note that this scheme extends that implemented by the C producer,
1316
  because of the need for more accurate information in the C++ producer.
1317
  For example, the specification above does not fully express the ISO
1318
  rule that the type of a decimal integer is the first of the types
1319
  <code>int</code>, <code>long</code> and <code>unsigned long</code>
1320
  which it fits into (it only expresses the first step).  However with
1321
  the C++ extensions it is possible to write: 
1322
  <programlisting>
  	#pragma token PROC ( VARIETY c ) VARIETY l_i # ~lit_int
  	#pragma TenDRA integer literal decimal ? : int | ? : long |\
  	    ? : unsigned long | ** : l_i
  </programlisting>
1327
  </para>
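  <para>
  Suffixed literals can be described in the same way.  The sketch below is
  read directly off the grammar above rather than taken from any start-up
  file shipped with the producer, so it should be treated as an assumption
  about how such rules might be written.  It gives a decimal literal with a
  <code>long</code> suffix the type <code>long</code> if the value is
  guaranteed to fit and <code>unsigned long</code> otherwise, and gives
  every decimal literal with an <code>unsigned long</code> suffix the type
  <code>unsigned long</code>: 
  <programlisting>
  	#pragma TenDRA integer literal decimal long ? : long | * : unsigned long
  	#pragma TenDRA integer literal decimal unsigned long * : unsigned long
  </programlisting>
  </para>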
1328
  </sect3>  
1329
 
1330
  <sect3 id="char">
1331
    <title>2.2.10. Character literals and built-in types</title>
1332
  <para>
1333
  By default, a simple character literal has type <code>int</code> in
1334
  C and type <code>char</code> in C++.  The type of such literals can
1335
  be controlled using the directive: 
1336
  <programlisting>
1337
  	#pragma TenDRA++ set character literal : <I>type-id</I> 
1338
  </programlisting>
1339
  The type of a wide character literal is given by the implementation
1340
  defined type <code>wchar_t</code>.  By default, the definition of
1341
  this type is taken from the target machine's <code>&lt;stddef.h&gt;</code>
1342
  C header (note that in ISO C++, <code>wchar_t</code> is actually a
1343
  keyword, but its underlying representation must be the same as in
1344
  C). This definition can be overridden in the producer by means of
1345
  the directive: 
1346
  <programlisting>
1347
  	#pragma TenDRA set wchar_t : <I>type-id</I>
1348
  </programlisting>
1349
  for an integral type <I>type-id</I>.  Similarly, the definitions of
1350
  the other implementation dependent integral types which arise naturally
1351
  within the language - the type of the difference of two pointers,
1352
  <code>ptrdiff_t</code>, and the type of the <code>sizeof</code>
1353
  operator, <code>size_t</code> - given in the <code>&lt;stddef.h&gt;</code>
1354
  header can be overridden using the directives: 
1355
  <programlisting>
  	#pragma TenDRA set ptrdiff_t : <I>type-id</I>
  	#pragma TenDRA set size_t : <I>type-id</I>
  </programlisting>
1359
  These directives are useful when targeting a specific machine on which
1360
  the definitions of these types are known; while they may not affect
1361
  the code generated they can cut down on spurious conversion warnings.
1362
  Note that although these types are built into the producer they are
1363
  not visible to the user unless an appropriate header is included (with
1364
  the exception of the keyword <code>wchar_t</code> in ISO C++), however
1365
  the directives: 
1366
  <programlisting>
1367
  	#pragma TenDRA++ type <I>identifier</I> for <I>type-name</I> 
1368
  </programlisting>
1369
  can be used to make these types visible.  They are equivalent to a
1370
  <code>typedef</code> declaration of <I>identifier</I> as the given
1371
  built-in type, <code>ptrdiff_t</code>, <code>size_t</code> or 
1372
  <code>wchar_t</code>. 
1373
  </para>
1374
  <para>
1375
  Whether plain <code>char</code> is signed or unsigned is implementation
1376
  dependent.  By default the implementation is determined by the definition
1377
  of the <A HREF="lib.html#arith"><code>~char</code> token</A>, however
1378
  this can be overridden in the producer either by means of the 
1379
  <A HREF="#table">portability table</A> or by the directive: 
1380
  <programlisting>
1381
  	#pragma TenDRA character <I>character-sign</I>
1382
  </programlisting>
1383
  where <I>character-sign</I> can be <code>signed</code>, 
1384
  <code>unsigned</code> or <code>either</code> (the default).  Again
1385
  this directive is useful primarily when targeting a specific machine
1386
  on which the signedness of <code>char</code> is known. 
1387
  </para>
1388
  </sect3>  
1389
 
1390
  <sect3 id="string">
1391
    <title>2.2.11. String literals</title>
1392
  <para>
1393
  By default, character string literals have type <code>char [n]</code>
1394
  in C and older dialects of C++, but type <code>const char [n]</code>
1395
  in ISO C++.  Similarly wide string literals have type <code>wchar_t
1396
  [n]</code>
1397
  or <code>const wchar_t [n]</code>.  Whether string literals are 
1398
  <code>const</code> or not can be controlled using the two directives:
1399
  <programlisting>
  	#pragma TenDRA++ set string literal : const 
  	#pragma TenDRA++ set string literal : no const 
  </programlisting>
1403
  In the case where literals are <code>const</code>, the array-to-pointer
1404
  conversion is allowed to cast away the <code>const</code> to allow
1405
  for a degree of backwards compatibility.  The status of this deprecated
1406
  conversion can be controlled using the directive: 
1407
  <programlisting>
1408
  	#pragma TenDRA writeable string literal <I>allow</I>
1409
  </programlisting>
1410
  (yes, I know that that should be <code>writable</code>).  Note that
1411
  this directive has a slightly different meaning in the C producer.
1412
  </para>
1413
  <para>
1414
  Adjacent string literal tokens of similar types (either both character
  string literals or both wide string literals) are concatenated at
  an early stage in the parser; however, it is unspecified what happens if
  a character string literal token is adjacent to a wide string literal
  token.  By default this gives an error, but the directive: 
1419
  <programlisting>
1420
  	#pragma TenDRA unify incompatible string literal <I>allow</I>
1421
  </programlisting>
1422
  can be used to enable the strings to be concatenated to give a wide
1423
  string literal. 
1424
  </para>
1425
  <para>
1426
  If a <code>'</code> or <code>&quot;</code> character does not have
1427
  a matching closing quote on the same line then it is undefined whether
1428
  an implementation should report an unterminated string or treat the
1429
  quote as a single unknown character.  By default, the C++ producer
1430
  treats this as an unterminated string, but this behaviour can be controlled
1431
  using the directive: 
1432
  <programlisting>
1433
  	#pragma TenDRA unmatched quote <I>allow</I>
1434
  </programlisting>
1435
  </para>
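  <para>
  As a sketch of how the <code>const</code> and conversion directives
  interact, the fragment below makes string literals <code>const</code>
  and downgrades the deprecated conversion to a warning.  The comments
  describe the behaviour that would be expected from the description
  above under these settings; they are not recorded producer output: 
  <programlisting>
  	#pragma TenDRA++ set string literal : const
  	#pragma TenDRA writeable string literal warning

  	const char *p = "abc" ;	// no conversion needed
  	char *q = "abc" ;	// deprecated conversion, expected to warn
  </programlisting>
  </para>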
1436
  </sect3>  
1437
 
1438
  <sect3 id="escape">
1439
    <title>2.2.12. Escape sequences</title>
1440
  <para>
1441
  By default, if the character following the <code>\</code> in an escape
1442
  sequence is not one of those listed in the ISO C or C++ standards
1443
  then an error is given.  This behaviour, which is left unspecified
1444
  by the standards, can be controlled by the directive: 
1445
  <programlisting>
1446
  	#pragma TenDRA unknown escape <I>allow</I>
1447
  </programlisting>
1448
  The result is that the <code>\</code> in unknown escape sequences
1449
  is ignored, so that <code>\z</code> is interpreted as <code>z</code>,
1450
  for example.  Individual escape sequences can be enabled or disabled
1451
  using the directives: 
1452
  <programlisting>
  	#pragma TenDRA++ escape <I>character-literal</I> as <I>character-literal</I> allow 
  	#pragma TenDRA++ escape <I>character-literal</I> disallow 
  </programlisting>
  so that, for example: 
  <programlisting>
  	#pragma TenDRA++ escape 'e' as '\033' allow 
  	#pragma TenDRA++ escape 'a' disallow 
  </programlisting>
1461
  sets <code>\e</code> to be the ASCII escape character and disables
1462
  the alert character <code>\a</code>. 
1463
  </para>
1464
  <para>
1465
  By default, if the value of a character, given for example by a 
1466
  <code>\x</code> escape sequence, does not fit into its type then an
1467
  error is given.  This implementation dependent behaviour can however
1468
  be controlled by the directive: 
1469
  <programlisting>
1470
  	#pragma TenDRA character escape overflow <I>allow</I>
1471
  </programlisting>
1472
  the value being converted to its type in the normal way. 
1473
  </para>
1474
  </sect3>  
1475
 
1476
  <sect3 id="ppdir">
1477
    <title>2.2.13. Preprocessing directives</title>
1478
  <para>
1479
  Non-standard preprocessing directives can be controlled using the
1480
  directives: 
1481
  <programlisting>
  	#pragma TenDRA directive <I>ppdir allow</I>
  	#pragma TenDRA directive <I>ppdir</I> (ignore) <I>allow</I>
  </programlisting>
1485
  where <I>ppdir</I> can be <code>assert</code>, <code>file</code>,
1486
  <code>ident</code>, <code>import</code> (C++ only), 
1487
  <code>include_next</code> (C++ only), <code>unassert</code>,
1488
  <code>warning</code> (C++ only) or <code>weak</code>.  The second form
1489
  causes the directive to be processed but ignored (note that there is no
1490
  <code>(ignore) disallow</code> form).  The treatment of other unknown
1491
  preprocessing directives can be controlled using: 
1492
  <programlisting>
1493
  	#pragma TenDRA unknown directive <I>allow</I>
1494
  </programlisting>
1495
  Cases where the token following the <code>#</code> in a preprocessing
1496
  directive is not an identifier can be controlled using: 
1497
  <programlisting>
1498
  	#pragma TenDRA no directive/nline after ident <I>allow</I>
1499
  </programlisting>
1500
  When permitted, unknown preprocessing directives are ignored. 
1501
  </para>
  <para>
  By default, unknown <code>#pragma</code> directives are ignored without
  comment; however, this behaviour can be modified using the directive:
  <programlisting>
  	#pragma TenDRA unknown pragma <I>allow</I>
  </programlisting>
  Note that any unknown <code>#pragma TenDRA</code> directives always
  give an error.
  </para>
  <para>
  Older preprocessors allowed text after <code>#else</code> and
  <code>#endif</code> directives.  The following directive can be used
  to enable such behaviour:
  <programlisting>
  	#pragma TenDRA text after directive <I>allow</I>
  </programlisting>
  Such text after a directive is ignored.
  </para>
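  <para>
  As an illustration of this last directive, the following sketch labels the
  branches of a conditional in the older style.  The macro name
  <code>TRACE</code> and the variable <code>trace_level</code> are
  invented for the example, and <code>allow</code> is assumed to be an
  acceptable value for the <I>allow</I> placeholder:
  <programlisting>
  	#pragma TenDRA text after directive allow

  	#ifdef TRACE
  	int trace_level = 1 ;
  	#else TRACE		// trailing text: ignored when permitted
  	int trace_level = 0 ;
  	#endif TRACE		// trailing text: ignored when permitted
  </programlisting>
  </para>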
  <para>
  Some older preprocessors have problems with white space in preprocessing
  directives, either at the start of the line before the initial
  <code>#</code>, or between the <code>#</code> and the directive identifier.
  Such white space can be detected using the directives:
  <programlisting>
  	#pragma TenDRA indented # directive <I>allow</I>
  	#pragma TenDRA indented directive after # <I>allow</I>
  </programlisting>
  respectively.
  </para>
  </sect3>

  <sect3 id="target-if">
    <title>2.2.14. Target dependent conditional inclusion</title>
  <para>
  One of the effects of trying to compile code in a target independent
  manner is that it is not always possible to completely evaluate the
  condition in a <code>#if</code> directive.  Thus the conditional inclusion
  needs to be preserved until the installer phase.  This can only be
  done if the target dependent <code>#if</code> is more structured than
  is normally required for preprocessing directives.  There are two cases.
  In the first, where the <code>#if</code> appears in a statement, it
  is treated as if it were an <code>if</code> statement with braces enclosing
  its branches; that is:
  <programlisting>
  	#if cond
  	    true_statements
  	#else
  	    false_statements
  	#endif
  </programlisting>
  maps to:
  <programlisting>
  	if ( cond ) {
  	    true_statements
  	} else {
  	    false_statements
  	}
  </programlisting>
  The second case, where the <code>#if</code> appears in a list of
  declarations, normally gives an error.  This can however be overridden
  by the directive:
  <programlisting>
  	#pragma TenDRA++ conditional declaration <I>allow</I>
  </programlisting>
  which causes both branches of the <code>#if</code> to be analysed.
  </para>
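  <para>
  A minimal sketch of the statement case is given below.  The macro
  <code>WIDE_TARGET</code> and the function <code>pick</code> are
  hypothetical; in practice the condition would depend on a value which the
  producer cannot evaluate.  Because the branches are treated as braced
  blocks, a declaration made inside one of them would presumably not be
  visible after the <code>#endif</code>, so the sketch declares its
  variable before the <code>#if</code>:
  <programlisting>
  	#ifndef WIDE_TARGET
  	#define WIDE_TARGET 0		// stand-in for a target dependent value
  	#endif

  	int pick ( int wide, int narrow )
  	{
  	    int result ;		// declared outside both branches
  	#if WIDE_TARGET
  	    result = wide ;
  	#else
  	    result = narrow ;
  	#endif
  	    return result ;
  	}
  </programlisting>
  Written this way, the function has the same meaning whether the condition
  is resolved during preprocessing or preserved as an <code>if</code>
  statement.
  </para>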
  </sect3>

  <sect3 id="include">
    <title>2.2.15. File inclusion directives</title>
  <para>
  There is a maximum depth of nested <code>#include</code>
  directives allowed by the C++ producer. This depth is given by the
  <code>include_depth</code> implementation quantity
  <A HREF="#limits">mentioned above</A>.  Its value is fairly small
  in order to detect recursive inclusions.  The maximum depth can be
  set using:
  <programlisting>
  	#pragma TenDRA includes depth <I>integer-literal</I>
  </programlisting>
  </para>
  <para>
  A further check, for full pathnames in <code>#include</code> directives
  (which may not be portable), can be enabled using the directive:
  <programlisting>
  	#pragma TenDRA++ complete file includes <I>allow</I>
  </programlisting>
  </para>
  </sect3>

  <sect3 id="macro">
    <title>2.2.16. Macro definitions</title>
  <para>
  By default, multiple consistent definitions of a macro are allowed.
  This behaviour can be controlled using the directive:
  <programlisting>
  	#pragma TenDRA extra macro definition <I>allow</I>
  </programlisting>
  The ISO C/C++ rules for determining whether two macro definitions
  are consistent are fairly restrictive.  A more relaxed rule allowing
  for consistent renaming of macro parameters can be enabled using:
  <programlisting>
  	#pragma TenDRA weak macro equality <I>allow</I>
  </programlisting>
  </para>
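  <para>
  The difference between the two rules can be sketched as follows, using a
  hypothetical macro <code>SQUARE</code> and assuming that
  <code>allow</code> is an acceptable value for the <I>allow</I>
  placeholder:
  <programlisting>
  	#pragma TenDRA weak macro equality allow

  	#define SQUARE( x )	( ( x ) * ( x ) )
  	#define SQUARE( x )	( ( x ) * ( x ) )	// identical: consistent under the ISO rules
  	#define SQUARE( y )	( ( y ) * ( y ) )	// parameter renamed: consistent only under the relaxed rule

  	int square_of ( int n )
  	{
  	    return SQUARE ( n ) ;
  	}
  </programlisting>
  All three definitions expand in the same way, so the choice of rule
  affects only whether the third is reported.
  </para>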
  <para>
  In the definition of macros with parameters, a <code>#</code> in the
  replacement list must be followed by a parameter name, indicating
  the stringising operation.  This behaviour can be controlled by the
  directive:
  <programlisting>
  	#pragma TenDRA no ident after # <I>allow</I>
  </programlisting>
  which allows a <code>#</code> which is not followed by a parameter
  name to be treated as a normal preprocessing token.
  </para>
  <para>
  In a list of macro arguments, the effect of a sequence of preprocessing
  tokens which otherwise resembles a preprocessing directive is undefined.
  The C++ producer treats such directives as normal sequences of preprocessing
  tokens, but can be made to report such behaviour using:
  <programlisting>
  	#pragma TenDRA directive as macro argument <I>allow</I>
  </programlisting>
  </para>
  </sect3>

  <sect3 id="empty">
    <title>2.2.17. Empty source files</title>
  <para>
  ISO C requires that a translation unit should contain at least one
  declaration.  C++ and older dialects of C allow translation units
  which contain no declarations.  This behaviour can be controlled using
  the directive:
  <programlisting>
  	#pragma TenDRA no external declaration <I>allow</I>
  </programlisting>
  </para>
  </sect3>

  <sect3 id="std">
    <title>2.2.18. The <code>std</code> namespace</title>
  <para>
  Several classes declared in the <code>std</code> namespace arise naturally
  as part of the C++ language specification.  These are as follows:
  <programlisting>
  	std::type_info		// type of typeid construct
  	std::bad_cast		// thrown by dynamic_cast construct
  	std::bad_typeid		// thrown by typeid construct
  	std::bad_alloc		// thrown by new construct
  	std::bad_exception	// used in exception specifications
  </programlisting>
  The definitions of these classes are found, when needed, by looking
  up the appropriate class name in the <code>std</code> namespace.
  Depending on the context, an error may be reported if the class is
  not found. It is possible to modify the namespace which is searched
  for these classes using the directive:
  <programlisting>
  	#pragma TenDRA++ set std namespace : <I>scope-name</I>
  </programlisting>
  where <I>scope-name</I> can be an identifier giving a namespace name
  or <code>::</code>, indicating the global namespace.
  </para>
  </sect3>

  <sect3 id="linkage">
    <title>2.2.19. Object linkage</title>
  <para>
  If an object is declared with both external and internal linkage in
  the same translation unit then, by default, an error is given.  This
  behaviour can be changed using the directive:
  <programlisting>
  	#pragma TenDRA incompatible linkage <I>allow</I>
  </programlisting>
  When incompatible linkages are allowed, whether the resultant identifier
  has external or internal linkage can be set using one of the directives:
  <programlisting>
  	#pragma TenDRA linkage resolution : off
  	#pragma TenDRA linkage resolution : (external) <I>on</I>
  	#pragma TenDRA linkage resolution : (internal) <I>on</I>
  </programlisting>
  </para>
  <para>
  It is possible to declare objects with external linkage in a block.
  C leaves it undefined whether declarations of the same object in different
  blocks, such as:
  <programlisting>
  	void f ()
  	{
  	    extern int a ;
  	    ....
  	}

  	void g ()
  	{
  	    extern double a ;
  	    ....
  	}
  </programlisting>
  are checked for compatibility.  However in C++ the one definition
  rule implies that such declarations are indeed checked for compatibility.
  The status of this check can be set using the directive:
  <programlisting>
  	#pragma TenDRA unify external linkage <I>on</I>
  </programlisting>
  Note that it is not possible in ISO C or C++ to declare objects or
  functions with internal linkage in a block.  While <code>static</code>
  object definitions in a block have a specific meaning, there is no
  real reason why <code>static</code> functions should not be declared
  in a block.  This behaviour can be enabled using the directive:
  <programlisting>
  	#pragma TenDRA block function static <I>allow</I>
  </programlisting>
  </para>
  <para>
  Inline functions have external linkage by default in ISO C++, but
  internal linkage in older dialects.  The default linkage can be set
  using the directive:
  <programlisting>
  	#pragma TenDRA++ inline linkage <I>linkage-spec</I>
  </programlisting>
  where <I>linkage-spec</I> can be <code>external</code> or
  <code>internal</code>.  Similarly <code>const</code> objects have
  internal linkage by default in C++, but external linkage in C.  The
  default linkage can be set using the directive:
  <programlisting>
  	#pragma TenDRA++ const linkage <I>linkage-spec</I>
  </programlisting>
  </para>
  <para>
  Older dialects of C treated all identifiers with external linkage
  as if they had been declared <code>volatile</code> (i.e. by being
  conservative in optimising such values).  This behaviour can be enabled
  using the directive:
  <programlisting>
  	#pragma TenDRA external volatile_t
  </programlisting>
  </para>
  <para>
  It is possible to set the default language linkage using the directive:
  <programlisting>
  	#pragma TenDRA++ external linkage <I>string-literal</I>
  </programlisting>
  This is equivalent to enclosing the rest of the current checking scope
  in:
  <programlisting>
  	extern <I>string-literal</I> {
  	    ....
  	}
  </programlisting>
  It is unspecified what happens if such a directive is used within
  an explicit linkage specification and does not nest correctly.  This
  directive is particularly useful when used in a <A HREF="#scope">named
  environment</A> associated with an include directory.  For example,
  it can be used to express the fact that all the objects declared in
  headers included from that directory have C linkage.
  </para>
  <para>
  A change in ISO C++ relative to older dialects is that the language
  linkage of a function now forms part of the function type.  For example:
  <programlisting>
  	extern &quot;C&quot; int f ( int ) ;
  	int ( *pf ) ( int ) = f ;		// error
  </programlisting>
  The directive:
  <programlisting>
  	#pragma TenDRA++ external function linkage <I>on</I>
  </programlisting>
  can be used to control whether function types with differing language
  linkages, but which are otherwise compatible, are considered compatible
  or not.
  </para>
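  <para>
  Code which needs to be accepted whatever the state of this check can name
  the language linkage in the pointer type itself.  The following is a
  minimal sketch; the names <code>twice</code>, <code>c_func</code>,
  <code>handler</code> and <code>apply</code> are invented for the example:
  <programlisting>
  	extern "C" int twice ( int n )
  	{
  	    return 2 * n ;
  	}

  	extern "C" typedef int ( *c_func ) ( int ) ;	// pointer to a function with C linkage

  	c_func handler = twice ;			// language linkages agree

  	int apply ( int n )
  	{
  	    return handler ( n ) ;
  	}
  </programlisting>
  </para>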
  </sect3>

  <sect3 id="static">
    <title>2.2.20. Static identifiers</title>
  <para>
  By default, objects and functions with internal linkage are mapped
  to tags without external names in the output TDF capsule.  Thus such
  names are not available to the installer and it needs to make up internal
  names to represent such objects in its output.  This is not desirable
  in such operations as profiling, where a meaningful internal name
  is needed to make sense of the output.  The directive:
  <programlisting>
  	#pragma TenDRA preserve <I>identifier-list</I>
  </programlisting>
  can be used to preserve the names of the given list of identifiers
  with internal linkage.  This is done using the <code>static_name_def</code>
  TDF construct.  The form:
  <programlisting>
  	#pragma TenDRA preserve *
  </programlisting>
  will preserve the names of all identifiers with internal linkage in
  this way.
  </para>
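  <para>
  For instance, to keep the name of a single internal function visible to a
  profiler, something like the following might be used.  The function names
  are hypothetical, and placing the directive after the definition of
  <code>bump</code> is an assumption rather than a documented requirement:
  <programlisting>
  	static int bump ( int n )
  	{
  	    return n + 1 ;
  	}

  	#pragma TenDRA preserve bump

  	int successor ( int n )
  	{
  	    return bump ( n ) ;
  	}
  </programlisting>
  </para>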
  </sect3>

  <sect3 id="decl_none">
    <title>2.2.21. Empty declarations</title>
  <para>
  ISO C++ requires every declaration or member declaration to introduce
  one or more names into the program.  The directive:
  <programlisting>
  	#pragma TenDRA unknown struct/union <I>allow</I>
  </programlisting>
  can be used to relax one particular instance of this rule, by allowing
  anonymous class definitions (recall that anonymous unions are objects,
  not types, in C++ and so are not covered by this rule).  The C++ grammar
  also allows a solitary semicolon as a declaration or member declaration;
  however such a declaration does not introduce a name and so contravenes
  the rule above.  The rule can be relaxed in this case using the directive:
  <programlisting>
  	#pragma TenDRA extra ; <I>allow</I>
  </programlisting>
  Note that the C++ grammar explicitly allows for an extra semicolon
  following an inline member function definition, but that semicolons
  following other function definitions are actually empty declarations
  of the form above.  A solitary semicolon in a statement is interpreted
  as an empty expression statement rather than an empty declaration
  statement.
  </para>
  </sect3>

  <sect3 id="implicit">
    <title>2.2.22. Implicit <code>int</code></title>
  <para>
  The C &quot;implicit <code>int</code>&quot; rule, whereby a type of
  <code>int</code>
  is inferred in a list of type or declaration specifiers which does
  not contain a type name, has been removed in ISO C++, although it
  was supported in older dialects of C++.  This check is controlled
  by the directive:
  <programlisting>
  	#pragma TenDRA++ implicit int type <I>allow</I>
  </programlisting>
  Partial relaxations of this rule are allowed.  The directive:
  <programlisting>
  	#pragma TenDRA++ implicit int type for const/volatile <I>allow</I>
  </programlisting>
  will allow for implicit <code>int</code> when the list of type specifiers
  contains a cv-qualifier.  Similarly the directive:
  <programlisting>
  	#pragma TenDRA implicit int type for function return <I>allow</I>
  </programlisting>
  will allow for implicit <code>int</code> in the return type of a function
  definition (this excludes constructors, destructors and conversion
  functions, where special rules apply).  A function definition is the
  only kind of declaration in ISO C where a declaration specifier is
  not required. Older dialects of C allowed declaration specifiers to
  be omitted in other cases.  Support for this behaviour can be enabled
  using:
  <programlisting>
  	#pragma TenDRA implicit int type for external declaration <I>allow</I>
  </programlisting>
  The four cases can be demonstrated in the following example:
  <programlisting>
  	extern a ;		// implicit int
  	const b = 1 ;		// implicit const int

  	f ()			// implicit function return
  	{
  	    return 2 ;
  	}

  	c = 3 ;			// error: not allowed in C++
  </programlisting>
  </para>
  </sect3>

  <sect3 id="longlong">
    <title>2.2.23. Extended integral types</title>
  <para>
  The <code>long long</code> integral types are not part of ISO C or
  C++ by default; however, support for them can be enabled using the
  directive:
  <programlisting>
  	#pragma TenDRA longlong type <I>allow</I>
  </programlisting>
  This support includes allowing <code>long long</code> in type specifiers
  and allowing <code>LL</code> and <code>ll</code> as integer literal
  suffixes.
  </para>
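  <para>
  A short sketch of what this support permits is given below, with invented
  names and assuming that <code>allow</code> is an acceptable value for the
  <I>allow</I> placeholder:
  <programlisting>
  	#pragma TenDRA longlong type allow

  	long long big = 1234567890123LL ;	// LL suffix
  	long long small = 42ll ;		// ll suffix

  	long long thousands ()
  	{
  	    return big / 1000 ;
  	}
  </programlisting>
  </para>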
  <para>
  There is a further directive given by the two cases:
  <programlisting>
  	#pragma TenDRA set longlong type : long long
  	#pragma TenDRA set longlong type : long
  </programlisting>
  which can be used to control the implementation of the <code>long
  long</code> types.  Either they can be mapped to the
  <A HREF="lib.html#arith">default representation</A>, which is guaranteed
  to contain at least 64 bits, or they can be mapped to the corresponding
  <code>long</code> types.
  </para>
  <para>
  Because these <code>long long</code> types are not an intrinsic part
  of C++ the implementation does not integrate them into the language
  as fully as is possible.  This is to prevent the presence or otherwise
  of
  <code>long long</code> types affecting the semantics of code which
  does not use them.  For example, it would be possible to extend the
  rules for the types of integer literals, integer promotion types and
  arithmetic types to say that if the given value does not fit into
  the standard integral types then the extended types are tried.  This
  has not been done, although these rules could be implemented by changing
  the definitions of the <A HREF="lib.html#arith">standard tokens</A>
  used to determine these types.  By default, only the rules for arithmetic
  types involving a <code>long long</code> operand and for <code>LL</code>
  integer literals mention <code>long long</code> types.
  </para>
  </sect3>

  <sect3 id="bitfield-types">
    <title>2.2.24. Bitfield types</title>
  <para>
  The C++ rules on bitfield types differ slightly from the C rules.
  Firstly any integral or enumeration type is allowed in a bitfield,
  and secondly the bitfield width may exceed the underlying type size
  (the extra bits being treated as padding).  These properties can be
  controlled using the directives:
  <programlisting>
  	#pragma TenDRA extra bitfield int type <I>allow</I>
  	#pragma TenDRA bitfield overflow <I>allow</I>
  </programlisting>
  respectively.
  </para>
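  <para>
  Both properties appear in the following sketch, in which the type and
  member names are invented for the example and <code>char</code> is
  assumed to be narrower than 12 bits:
  <programlisting>
  	enum colour { red, green, blue } ;

  	struct pixel {
  	    colour shade : 4 ;		// enumeration type used as a bitfield type
  	    char fill : 12 ;		// wider than char: the extra bits are padding
  	} ;

  	colour shade_of ( pixel p )
  	{
  	    return p.shade ;
  	}
  </programlisting>
  </para>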
  </sect3>

  <sect3 id="elab">
    <title>2.2.25. Elaborated type specifiers</title>
  <para>
  In elaborated type specifiers, the class key (<code>class</code>,
  <code>struct</code>, <code>union</code> or <code>enum</code>) should
  agree with any previous declaration of the type (except that <code>class</code>
  and <code>struct</code> are interchangeable).  This requirement can
  be relaxed using the directive:
  <programlisting>
  	#pragma TenDRA ignore struct/union/enum tag <I>on</I>
  </programlisting>
  </para>
  <para>
  In ISO C and C++ it is not possible to give a forward declaration
  of an enumeration type.  This constraint can be relaxed using the
  directive:
  <programlisting>
  	#pragma TenDRA forward enum declaration <I>allow</I>
  </programlisting>
  Until the end of its definition, an enumeration type is treated as
  an incomplete type (as with class types).  In enumeration definitions,
  and a couple of other contexts where comma-separated lists are required,
  the directive:
  <programlisting>
  	#pragma TenDRA extra , <I>allow</I>
  </programlisting>
  can be used to allow a trailing comma at the end of the list.
  </para>
  <para>
  The directive:
  <programlisting>
  	#pragma TenDRA complete struct/union analysis <I>on</I>
  </programlisting>
  can be used to enable a check that every class or union has been completed
  within each translation unit in which it is declared.
  </para>
  </sect3>

  <sect3 id="impl_func">
    <title>2.2.26. Implicit function declarations</title>
  <para>
  C, but not C++, allows calls to undeclared functions, the function
  being declared implicitly.  It is possible to enable support for implicit
  function declarations using the directive:
  <programlisting>
  	#pragma TenDRA implicit function declaration <I>on</I>
  </programlisting>
  Such implicitly declared functions have C linkage and type
  <code>int ( ... )</code>.
  </para>
  </sect3>

  <sect3 id="weak">
    <title>2.2.27. Weak function prototypes</title>
  <para>
  The C producer supports a concept, weak prototypes, whereby type checking
  can be applied to the arguments of a non-prototype function.  This
  checking can be enabled using the directive:
  <programlisting>
  	#pragma TenDRA weak prototype analysis <I>on</I>
  </programlisting>
  The concept of weak prototypes is not applicable to C++, where all
  functions are prototyped.  The C++ producer does allow the syntax
  for explicit weak prototype declarations, but treats them as if they
  were normal prototypes.  These declarations are denoted by means of
  a keyword,
  <code>WEAK</code> say, introduced by the directive:
  <programlisting>
  	#pragma TenDRA keyword <I>identifier</I> for weak
  </programlisting>
  preceding the <code>(</code> of the function declarator.  The directives:
  <programlisting>
  	#pragma TenDRA prototype <I>allow</I>
  	#pragma TenDRA prototype (weak) <I>allow</I>
  </programlisting>
  which can be used in the C producer to warn of prototype or weak prototype
  declarations, are similarly ignored by the C++ producer.
  </para>
  <para>
  The C producer also allows the directives:
  <programlisting>
  	#pragma TenDRA argument <I>type-id</I> as <I>type-id</I>
  	#pragma TenDRA argument <I>type-id</I> as ...
  	#pragma TenDRA extra ... <I>allow</I>
  	#pragma TenDRA incompatible promoted function argument <I>allow</I>
  </programlisting>
  which control the compatibility of function types.  These directives
  are ignored by the C++ producer (some of them would make sense in
  the context of C++ but would over-complicate function overloading).
  </para>
  </sect3>

  <sect3 id="printf">
    <title>2.2.28. <code>printf</code> and <code>scanf</code>
  argument checking</title>
  <para>
  The C producer includes a number of checks that the arguments in a
  call to a function in the <code>printf</code> or <code>scanf</code>
  families match the given format string.  The check is implemented
  by using the directives:
  <programlisting>
  	#pragma TenDRA type <I>identifier</I> for ... printf
  	#pragma TenDRA type <I>identifier</I> for ... scanf
  </programlisting>
  to introduce a type representing a <code>printf</code> or <code>scanf</code>
  format string.  For most purposes this type is treated as <code>const
  char *</code>, but when it appears in a function declaration it alerts
  the producer that any extra arguments passed to that function should
  match the format string passed as the corresponding argument.  The
  TenDRA API headers conditionally declare <code>printf</code>,
  <code>scanf</code> and similar functions in something like the form:
  <programlisting>
  	#ifdef __NO_PRINTF_CHECKS
  	typedef const char *__printf_string ;
  	#else
  	#pragma TenDRA type __printf_string for ... printf
  	#endif

  	int printf ( __printf_string, ... ) ;
  	int fprintf ( FILE *, __printf_string, ... ) ;
  	int sprintf ( char *, __printf_string, ... ) ;
  </programlisting>
  These declarations can be skipped, effectively disabling this check,
  by defining the <code>__NO_PRINTF_CHECKS</code> macro.
  </para>
  <para>
  <IMG SRC="../images/warn.gif" ALT="warning"/>
  These <code>printf</code> and <code>scanf</code> format string checks
  have not yet been implemented in the C++ producer due to the presence
  of an alternative, type checked, I/O package, namely
  <code>&lt;iostream&gt;</code>.  The format string types are simply
  treated as <code>const char *</code>.
  </para>
  </sect3>

  <sect3 id="typedef">
    <title>2.2.29. Type declarations</title>
  <para>
  C does not allow multiple definitions of a <code>typedef</code> name,
  whereas C++ allows multiple consistent definitions.  This behaviour
  can be controlled using the directive:
  <programlisting>
  	#pragma TenDRA extra type definition <I>allow</I>
  </programlisting>
  </para>
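  <para>
  For example, the following repeated definition, using the invented names
  <code>counter</code> and <code>increment</code>, is consistent and so
  acceptable in C++:
  <programlisting>
  	typedef unsigned long counter ;
  	typedef unsigned long counter ;		// same type: a consistent redefinition

  	counter increment ( counter c )
  	{
  	    return c + 1 ;
  	}
  </programlisting>
  </para>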
  </sect3>

  <sect3 id="compatible">
    <title>2.2.30. Type compatibility</title>
  <para>
  The directive:
  <programlisting>
  	#pragma TenDRA incompatible type qualifier <I>allow</I>
  </programlisting>
  allows objects to be redeclared with different cv-qualifiers (normally
  such redeclarations would be incompatible).  The composite type is
  qualified using the join of the cv-qualifiers in the various redeclarations.
  </para>
  <para>
  The directive:
  <programlisting>
  	#pragma TenDRA compatible type : <I>type-id</I> == <I>type-id</I> : <I>allow</I>
  </programlisting>
  asserts that the given two types are compatible.  Currently the only
  implemented version is <code>char * == void *</code> which enables
  <code>char *</code> to be used as a generic pointer as it was in older
  dialects of C.
  </para>
  </sect3>

  <sect3 id="complete">
    <title>2.2.31. Incomplete types</title>
  <para>
  Some dialects of C allow incomplete arrays as member types.  These
  are generally used as a place-holder at the end of a structure to
  allow for the allocation of an arbitrarily sized array.  Support for
  this feature can be enabled using the directive:
  <programlisting>
  	#pragma TenDRA incomplete type as object type <I>allow</I>
  </programlisting>
  </para>
  </sect3>

  <sect3 id="type-conversions">
    <title>2.2.32. Type conversions</title>
  <para>
  There are a number of directives which allow various classes of type
  conversion to be checked.  The directives:
  <programlisting>
  	#pragma TenDRA conversion analysis (int-int explicit) <I>on</I>
  	#pragma TenDRA conversion analysis (int-int implicit) <I>on</I>
  </programlisting>
  will check for unsafe explicit or implicit conversions between arithmetic
  types.  Similarly conversions between pointers and arithmetic types
  can be checked using:
  <programlisting>
  	#pragma TenDRA conversion analysis (int-pointer explicit) <I>on</I>
  	#pragma TenDRA conversion analysis (int-pointer implicit) <I>on</I>
  </programlisting>
  or equivalently:
  <programlisting>
  	#pragma TenDRA conversion analysis (pointer-int explicit) <I>on</I>
  	#pragma TenDRA conversion analysis (pointer-int implicit) <I>on</I>
  </programlisting>
  Conversions between pointer types can be checked using:
  <programlisting>
  	#pragma TenDRA conversion analysis (pointer-pointer explicit) <I>on</I>
  	#pragma TenDRA conversion analysis (pointer-pointer implicit) <I>on</I>
  </programlisting>
  </para>
  <para>
  There are some further variants which can be used to enable useful
  sets of conversion checks.  For example:
  <programlisting>
  	#pragma TenDRA conversion analysis (int-int) <I>on</I>
  </programlisting>
  enables both implicit and explicit arithmetic conversion checks.
  The directives:
  <programlisting>
  	#pragma TenDRA conversion analysis (int-pointer) <I>on</I>
  	#pragma TenDRA conversion analysis (pointer-int) <I>on</I>
  	#pragma TenDRA conversion analysis (pointer-pointer) <I>on</I>
  </programlisting>
  are equivalent to their corresponding explicit forms (because the
  implicit forms are illegal by default).  The directive:
  <programlisting>
  	#pragma TenDRA conversion analysis <I>on</I>
  </programlisting>
  is equivalent to the four directives just given.  It enables checks
  on implicit and explicit arithmetic conversions, explicit arithmetic
  to pointer conversions and explicit pointer conversions.
  </para>
  <para>
  The default settings for these checks are determined by the implicit
  and explicit conversions allowed in C++.  Note that there are differences
  between the conversions allowed in C and C++.  For example, an arithmetic
  type can be converted implicitly to an enumeration type in C, but
  not in C++.  The directive:
  <programlisting>
  	#pragma TenDRA conversion analysis (int-enum implicit) <I>on</I>
  </programlisting>
  can be used to control the status of this conversion.  The level of
  severity for an error message arising from such a conversion is the
  maximum of the severity set by this directive and that set by the
  <code>int-int implicit</code> directive above.
  </para>
  <para>
  The implicit pointer conversions described above do not include conversions
  to and from the generic pointer <code>void *</code>, which have their
  own controlling directives.  A pointer of type <code>void *</code>
  can be converted implicitly to another pointer type in C but not in
  C++; this is controlled by the directive:
  <programlisting>
  	#pragma TenDRA++ conversion analysis (void*-pointer implicit) <I>on</I>
  </programlisting>
  The reverse conversion, from a pointer type to <code>void *</code>
  is allowed in both C and C++, and has a controlling directive:
  <programlisting>
  	#pragma TenDRA++ conversion analysis (pointer-void* implicit) <I>on</I>
  </programlisting>
  </para>
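  <para>
  The two directions can be illustrated by the following sketch, in which
  all the names are invented for the example:
  <programlisting>
  	int cells [ 4 ] = { 1, 2, 3, 4 } ;

  	int third ()
  	{
  	    void *generic = cells ;		// pointer to void *: implicit in C and C++
  	    int *typed = ( int * ) generic ;	// void * to pointer: needs a cast in C++
  	    return typed [ 2 ] ;
  	}
  </programlisting>
  </para>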
2192
  <para>
2193
  In ISO C and C++, a function pointer can only be cast to other function
2194
  pointers, not to object pointers or <code>void *</code>.  Many dialects
2195
  however allow function pointers to be cast to and from other pointers.
2196
  This behaviour can be controlled using the directive: 
2197
  <programlisting>
2198
  	#pragma TenDRA function pointer as pointer <I>allow</I>
2199
  </programlisting>
2200
  which causes function pointers to be treated in the same way as all
2201
  other pointers. 
2202
  </para>
2203
  <para>
2204
  The integer conversion checks described above only apply to unsafe
2205
  conversions.  A simple-minded check for shortening conversions is
2206
  not adequate, as is shown by the following example: 
2207
  <programlisting>
2208
  	char a = 1, b = 2 ;
2209
  	char c = a + b ;
2210
  </programlisting>
2211
  the sum <code>a + b</code> is evaluated as an <code>int</code> which
2212
  is then shortened to a <code>char</code>.  Any check which does not
2213
  distinguish this sort of &quot;safe&quot; shortening conversion from
2214
  unsafe shortening conversions such as: 
2215
  <programlisting>
2216
  	int a = 1, b = 2 ;
2217
  	char c = a + b ;
2218
  </programlisting>
2219
  is not likely to be very useful.  The producer therefore associates
2220
  two types with each integral expression; the first is the normal,
2221
  representation type and the second is the underlying, semantic type.
2222
  Thus in the first example, the representation type of <code>a + b</code>
2223
  is <code>int</code>, but semantically it is still a <code>char</code>.
2224
  The conversion analysis is based on the semantic types. 
2225
  </para>
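  <para>
  The representation type can be observed directly in standard C++
  (C++11 or later).  The following is only a sketch, not taken from the
  producer; the function name <code>sum</code> is chosen purely for
  illustration: 
  <programlisting>
  	char sum ( char a, char b )
  	{
  	    // both operands are promoted, so a + b has the size of an int
  	    static_assert ( sizeof ( a + b ) == sizeof ( int ),
  			    "char + char is represented as int" ) ;
  	    char c = a + b ;		// shortened back to char
  	    return c ;
  	}
  </programlisting>
  Under the scheme described above, the value shortened in the initialisation
  of <code>c</code> has semantic type <code>char</code>, so the conversion
  is not treated as unsafe. 
  </para>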
2226
  <para>
2227
  <IMG SRC="../images/warn.gif" ALT="warning"/>
2228
  The C producer supports a directive: 
2229
  <programlisting>
2230
  	#pragma TenDRA keyword <I>identifier</I> for type representation
2231
  </programlisting>
2232
  whereby a keyword can be introduced which can be used to explicitly
2233
  declare a type with given representation and semantic components.
2234
  Unfortunately this makes the <A HREF="parse.html">C++ grammar</A>
2235
  ambiguous, so it has not yet been implemented in the C++ producer.
2236
  </para>
2237
  <para>
2238
  It is possible to allow individual conversions by means of conversion
2239
  tokens.  A <A HREF="token.html">procedure token</A> which takes one
2240
  rvalue expression program parameter and returns an rvalue expression,
2241
  such as: 
2242
  <programlisting>
2243
  	#pragma token PROC ( EXP : t : ) EXP : s : conv #
2244
  </programlisting>
2245
  can be regarded as mapping expressions of type <code>t</code> to expressions
2246
  of type <code>s</code>.  The directive: 
2247
  <programlisting>
2248
  	#pragma TenDRA conversion <I>identifier-list</I> allow
2249
  </programlisting>
2250
  can be used to nominate such a token as a conversion token.  That
2251
  is to say, if the conversion, whether explicit or implicit, from <code>t</code>
2252
  to <code>s</code> cannot be done by other means, it is done by applying
2253
  the token <code>conv</code>, so: 
2254
  <programlisting>
2255
  	t a ;
2256
  	s b = a ;		// maps to conv ( a )
2257
  </programlisting>
2258
  Note that, unlike conversion functions, conversion tokens can be applied
2259
  to any types. 
2260
  </para>
2261
  </sect3>  
2262
 
2263
  <sect3 id="cast">
2264
    <title>2.2.33. Cast expressions</title>
2265
  <para>
2266
  ISO C++ introduces the constructs <code>static_cast</code>, 
2267
  <code>const_cast</code> and <code>reinterpret_cast</code>, which can
2268
  be used in various contexts where an old style explicit cast would
2269
  previously have been used.  By default, an explicit cast can perform
2270
  any combination of the conversions performed by these three constructs.
2271
  To aid migration to the new style casts the directives: 
2272
  <programlisting>
2273
  	#pragma TenDRA++ explicit cast as <I>cast-state allow</I> 
2274
  	#pragma TenDRA++ explicit cast <I>allow</I> 
2275
  </programlisting>
2276
  where <I>cast-state</I> is defined as follows: 
2277
  <programlisting>
2278
  	<I>cast-state</I> :
2279
  		static_cast
2280
  		const_cast
2281
  		reinterpret_cast
2282
  		static_cast | <I>cast-state</I>
2283
  		const_cast | <I>cast-state</I>
2284
  		reinterpret_cast | <I>cast-state</I>
2285
  </programlisting>
2286
  can be used to restrict the conversions which can be performed using
2287
  explicit casts.  The first form sets the interpretation of explicit
2288
  cast to be combinations of the given constructs; the second resets
2289
  the interpretation to the default.  For example: 
2290
  <programlisting>
2291
  	#pragma TenDRA++ explicit cast as static_cast | const_cast allow
2292
  </programlisting>
2293
  means that conversions requiring <code>reinterpret_cast</code> (the
  least portable conversions) cannot be performed using explicit casts,
  but have to be written as a <code>reinterpret_cast</code>
2296
  construct.  Changing <code>allow</code> to <code>warning</code> will
2297
  also cause a warning to be issued for every explicit cast expression.
2298
  </para>
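  <para>
  As a rough illustration of the restricted setting, the sketch below
  shows which old style casts would still be accepted.  The function
  names are hypothetical, and the comments paraphrase the behaviour
  described above rather than actual producer output: 
  <programlisting>
  	#pragma TenDRA++ explicit cast as static_cast | const_cast allow

  	long widen ( int i )
  	{
  	    return ( long ) i ;		// a static_cast conversion: accepted
  	}

  	char *unconst ( const char *p )
  	{
  	    return ( char * ) p ;	// a const_cast conversion: accepted
  	}

  	long *as_longs ( char *p )
  	{
  	    // ( long * ) p would require reinterpret_cast, so with the
  	    // directive above it has to be written out explicitly
  	    return reinterpret_cast &lt; long * &gt; ( p ) ;
  	}
  </programlisting>
  </para>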
2299
  </sect3>  
2300
 
2301
  <sect3 id="ellipsis">
2302
    <title>2.2.34. Ellipsis functions</title>
2303
  <para>
2304
  The directive: 
2305
  <programlisting>
2306
  	#pragma TenDRA ident ... <I>allow</I>
2307
  </programlisting>
2308
  may be used to enable or disable the use of <code>...</code> as a
2309
  primary expression in a function defined with ellipsis.  The type
2310
  of such an expression is implementation defined.  This expression
2311
  is used in the definition of the <A HREF="lib.html#ellipsis"><code>va_start</code>
  macro</A> in the <code>&lt;stdarg.h&gt;</code> header.  This header
2314
  automatically enables this switch. 
2315
  </para>
2316
  </sect3>  
2317
 
2318
  <sect3 id="overload">
2319
    <title>2.2.35. Overloaded functions</title>
2320
  <para>
2321
  Older dialects of C++ did not report ambiguous overloaded function
2322
  resolutions, but instead resolved the call to the first of the most
2323
  viable candidates to be declared.  This behaviour can be controlled
2324
  using the directive: 
2325
  <programlisting>
2326
  	#pragma TenDRA++ ambiguous overload resolution <I>allow</I> 
2327
  </programlisting>
2328
  There are occasions when the resolution of an overloaded function
2329
  call is not clear.  The directive: 
2330
  <programlisting>
2331
  	#pragma TenDRA++ overload resolution <I>allow</I> 
2332
  </programlisting>
2333
  can be used to report the resolution of any such call (whether explicit
2334
  or implicit) where there is more than one viable candidate. 
2335
  </para>
2336
  <para>
2337
  An interesting consequence of compiling C++ in a target independent
2338
  manner is that certain overload resolutions can only be determined
2339
  at install-time. For example, in: 
2340
  <programlisting>
2341
  	int f ( int ) ;
2342
  	int f ( unsigned int ) ;
2343
  	int f ( long ) ;
2344
  	int f ( unsigned long ) ;
2345
 
2346
  	int a = f ( sizeof ( int ) ) ;	// which f?
2347
  </programlisting>
2348
  the type of the <code>sizeof</code> operator, <code>size_t</code>,
2349
  is target dependent, but its promotion must be one of the types 
2350
  <code>int</code>, <code>unsigned int</code>, <code>long</code> or
2351
  <code>unsigned long</code>.  Thus the call to <code>f</code> always
2352
  has a unique resolution, but which overload is chosen is target dependent.  The
2353
  equivalent directives: 
2354
  <programlisting>
2355
  	#pragma TenDRA++ conditional overload resolution <I>allow</I> 
2356
  	#pragma TenDRA++ conditional overload resolution (complete) <I>allow</I> 
2357
  </programlisting>
2358
  can be used to warn about such target dependent overload resolutions.
2359
  By default, such resolutions are only allowed if there is a unique
2360
  resolution for each possible implementation of the argument types
2361
  (note that, for simplicity, the possibility of <code>long long</code>
2362
  implementation types is ignored).  The directive: 
2363
  <programlisting>
2364
  	#pragma TenDRA++ conditional overload resolution (incomplete) <I>allow</I> 
2365
  </programlisting>
2366
  can be used to allow target dependent overload resolutions which only
2367
  have resolutions for some of the possible implementation types (if
2368
  one of the <code>f</code> declarations above was removed, for example).
2369
  If the implementation does not match one of these types then an install-time
2370
  error is given. 
2371
  </para>
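  <para>
  The incomplete case can be sketched by dropping one overload from the
  example above.  This is an illustration only; the comments restate the
  behaviour described in the text and are not actual producer or
  installer output: 
  <programlisting>
  	#pragma TenDRA++ conditional overload resolution (incomplete) allow

  	int f ( int ) ;
  	int f ( unsigned int ) ;
  	int f ( long ) ;
  	// no f ( unsigned long ) is declared

  	// resolved at install-time; an install-time error is given on a
  	// target where size_t promotes to unsigned long
  	int a = f ( sizeof ( int ) ) ;
  </programlisting>
  </para>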
2372
  <para>
2373
  There are restrictions on the set of candidate functions involved
2374
  in a target dependent overload resolution.  Most importantly, it should
2375
  be possible to bring their return types to a common type, as if by
2376
  a series of <code>?:</code> operations.  This common type is the type
2377
  of the target dependent call.  By this means, target dependent types
2378
  are prevented from propagating further out into the program.  Note
2379
  that since sets of overloaded functions usually have the same semantics,
2380
  this does not usually present a problem. 
2381
  </para>
2382
  </sect3>  
2383
 
2384
  <sect3 id="expressions">
2385
    <title>2.2.36. Expressions</title>
2386
  <para>
2387
  The directive: 
2388
  <programlisting>
2389
  	#pragma TenDRA operator precedence analysis <I>on</I> 
2390
  </programlisting>
2391
  can be used to enable a check for expressions where the operator precedence
2392
  is not necessarily what might be expected.  The intended precedence
2393
  can be clarified by means of explicit parentheses.  The precedence
2394
  levels checked are as follows: 
2395
  <itemizedlist>
2396
  <listitem><code>&amp;&amp;</code> versus <code>||</code>. 
2397
  </listitem>
2398
  <listitem><code>&lt;&lt;</code> and <code>&gt;&gt;</code> versus binary
2399
  <code>+</code> and <code>-</code>. 
2400
  </listitem>
2401
  <listitem>Binary <code>&amp;</code> versus binary <code>+</code>, <code>-</code>,
  <code>==</code>, <code>!=</code>, <code>&gt;</code>, <code>&gt;=</code>,
  <code>&lt;</code> and <code>&lt;=</code>. 
2404
  </listitem>
2405
  <listitem><code>^</code> versus binary <code>&amp;</code>, <code>+</code>,
2406
  <code>-</code>, <code>==</code>, <code>!=</code>, <code>&gt;</code>,
2407
  <code>&gt;=</code>, <code>&lt;</code> and <code>&lt;=</code>. 
2408
  </listitem>
2409
  <listitem><code>|</code> versus binary <code>^</code>, <code>&amp;</code>,
2410
  <code>+</code>, <code>-</code>, <code>==</code>, <code>!=</code>,
2411
  <code>&gt;</code>, <code>&gt;=</code>, <code>&lt;</code> and <code>&lt;=</code>. 
2413
  </listitem>
2414
  </itemizedlist>
2415
  Also checked are expressions such as <code>a &lt; b &lt; c</code>
2416
  which do not have their normal mathematical meaning.  For example,
2417
  in: 
2418
  <programlisting>
2419
  	d = a &lt;&lt; b + c ;	// precedence is a &lt;&lt; ( b + c )
2420
  </programlisting>
2421
  the precedence is counter-intuitive, although strangely enough, it
2422
  isn't in: 
2423
  <programlisting>
2424
  	cout &lt;&lt; b + c ;		// precedence is cout &lt;&lt; ( b + c )
2425
  </programlisting>
2426
  </para>
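  <para>
  The sketch below, which uses hypothetical function names, shows how
  explicit parentheses make the intended grouping visible for two of
  the cases checked: 
  <programlisting>
  	int shift_sum ( int a, int b, int c )
  	{
  	    return a &lt;&lt; ( b + c ) ;	// same meaning as a &lt;&lt; b + c
  	}

  	int sum_shift ( int a, int b, int c )
  	{
  	    return ( a &lt;&lt; b ) + c ;	// what a reader might have expected
  	}

  	bool in_order ( int a, int b, int c )
  	{
  	    // a &lt; b &lt; c would compare the result of a &lt; b with c
  	    return a &lt; b &amp;&amp; b &lt; c ;
  	}
  </programlisting>
  </para>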
2427
  <para>
2428
  Other dubious arithmetic operations can be checked for using the directive:
2429
  <programlisting>
2430
  	#pragma TenDRA integer operator analysis <I>on</I>
2431
  </programlisting>
2432
  This includes checks for operations, such as division by a negative
2433
  value, which are implementation dependent, and those such as testing
2434
  whether an unsigned value is less than zero, which serve no purpose.
2435
  Similarly the directive: 
2436
  <programlisting>
2437
  	#pragma TenDRA++ pointer operator analysis <I>on</I> 
2438
  </programlisting>
2439
  checks for dubious pointer operations.  This includes very simple
2440
  bounds checking for arrays and checking that only the simple literal
2441
  <code>0</code>
2442
  is used in null pointer constants: 
2443
  <programlisting>
2444
  	char *p = 1 - 1 ;	// valid, but weird
2445
  </programlisting>
2446
  </para>
2447
  <para>
2448
  The directive: 
2449
  <programlisting>
2450
  	#pragma TenDRA integer overflow analysis <I>on</I>
2451
  </programlisting>
2452
  is used to control the treatment of overflows in the evaluation of
2453
  integer constant expressions.  This includes the detection of division
2454
  by zero. 
2455
  </para>
2456
  </sect3>  
2457
 
2458
  <sect3 id="initialiser-expressions">
2459
    <title>2.2.37. Initialiser expressions</title>
2460
  <para>
2461
  C, unlike C++, allows only constant expressions in static initialisers.
2462
  The directive: 
2463
  <programlisting>
2464
  	#pragma TenDRA variable initialization <I>allow</I>
2465
  </programlisting>
2466
  can be used to enable support for C++-style dynamic initialisers.  Conversely,
2467
  it can be used in C++ to detect such dynamic initialisers. 
2468
  </para>
2469
  <para>
2470
  In older dialects of C it was not possible to initialise an automatic
2471
  variable of structure or union type.  This can be checked for using
2472
  the directive: 
2473
  <programlisting>
2474
  	#pragma TenDRA initialization of struct/union (auto) <I>allow</I>
2475
  </programlisting>
2476
  </para>
2477
  <para>
2478
  The directive: 
2479
  <programlisting>
2480
  	#pragma TenDRA++ complete initialization analysis <I>on</I> 
2481
  </programlisting>
2482
  can be used to check aggregate initialisers.  The initialiser should
2483
  be fully bracketed (i.e. with no elision of braces), and should have
2484
  an entry for each member of the structure or array. 
2485
  </para>
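  <para>
  For instance, with two illustrative aggregate types (the names
  <code>point</code> and <code>line</code> are assumptions made for
  this sketch), only the first of the following initialisers would
  satisfy the check: 
  <programlisting>
  	struct point { int x ; int y ; } ;
  	struct line { point from ; point to ; } ;

  	line a = { { 0, 0 }, { 1, 1 } } ;	// fully bracketed and complete
  	line b = { 0, 0, 1, 1 } ;		// braces elided
  	line c = { { 0, 0 } } ;			// no entry for the member to
  </programlisting>
  All three are valid C++; the members of <code>c.to</code> are simply
  zero-initialised. 
  </para>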
2486
  </sect3>  
2487
 
2488
  <sect3 id="lvalue">
2489
    <title>2.2.38. Lvalue expressions</title>
2490
  <para>
2491
  C++ defines the results of several operations to be lvalues, whereas
2492
  they are rvalues in C.  The directive: 
2493
  <programlisting>
2494
  	#pragma TenDRA conditional lvalue <I>allow</I>
2495
  </programlisting>
2496
  is used to apply the C++ rules for lvalues in conditional (<code>?:</code>)
2497
  expressions. 
2498
  </para>
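  <para>
  A minimal sketch of the C++ rule, using hypothetical names: when both
  operands are lvalues of the same type, the result of <code>?:</code>
  can itself be assigned to. 
  <programlisting>
  	int lo = 0, hi = 0 ;

  	void bump ( bool upper )
  	{
  	    // the conditional expression designates either hi or lo
  	    ( upper ? hi : lo ) += 1 ;
  	}
  </programlisting>
  </para>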
2499
  <para>
2500
  Older dialects of C++ allowed <code>this</code> to be treated as an
2501
  lvalue. It is possible to enable support for this dialect feature
2502
  using the directive: 
2503
  <programlisting>
2504
  	#pragma TenDRA++ this lvalue <I>allow</I> 
2505
  </programlisting>
2506
  although it is recommended that programs using this feature
  be modified instead. 
2508
  </para>
2509
  </sect3>  
2510
 
2511
  <sect3 id="discard">
2512
    <title>2.2.39. Discarded expressions</title>
2513
  <para>
2514
  The directive: 
2515
  <programlisting>
2516
  	#pragma TenDRA discard analysis <I>on</I>
2517
  </programlisting>
2518
  can be used to enable a check for values which are calculated but
2519
  not used.  There are three checks controlled by this directive, each
2520
  of which can be controlled independently.  The directive: 
2521
  <programlisting>
2522
  	#pragma TenDRA discard analysis (function return) <I>on</I>
2523
  </programlisting>
2524
  checks for functions which return a value which is not used.  The
2525
  check needs to be enabled for both the declaration and the call of
2526
  the function in order for a discarded function return to be reported.
2527
  Discarded returns for overloaded operator functions are never reported.
2528
  The directive: 
2529
  <programlisting>
2530
  	#pragma TenDRA discard analysis (value) <I>on</I>
2531
  </programlisting>
2532
  checks for other expressions which are not used.  Finally, the directive:
2533
  <programlisting>
2534
  	#pragma TenDRA discard analysis (static) <I>on</I>
2535
  </programlisting>
2536
  checks for variables with internal linkage which are defined but not
2537
  used. 
2538
  </para>
2539
  <para>
2540
  An unused function return or other expression can be asserted to be
2541
  deliberately discarded by explicitly casting it to <code>void</code>
2542
  or, equivalently, preceding it by a keyword introduced using the directive:
2543
  <programlisting>
2544
  	#pragma TenDRA keyword <I>identifier</I> for discard value
2545
  </programlisting>
2546
  A static variable can be asserted to be deliberately unused by including
2547
  it in the list of identifiers in a directive of the form: 
2548
  <programlisting>
2549
  	#pragma TenDRA suspend static <I>identifier-list</I>
2550
  </programlisting>
2551
  </para>
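  <para>
  The following sketch shows the kinds of statement affected.  The
  function names are hypothetical and the comments paraphrase the checks
  described above rather than actual diagnostics: 
  <programlisting>
  	#pragma TenDRA discard analysis on

  	int next_id ()
  	{
  	    static int id = 0 ;
  	    return ++id ;
  	}

  	void g ( int x )
  	{
  	    next_id () ;		// function return discarded
  	    ( void ) next_id () ;	// deliberately discarded
  	    x + 1 ;			// value calculated but not used
  	}
  </programlisting>
  </para>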
2552
  </sect3>  
2553
 
2554
  <sect3 id="if">
2555
    <title>2.2.40. Conditional and iteration statements</title>
2556
  <para>
2557
  The directive: 
2558
  <programlisting>
2559
  	#pragma TenDRA const conditional <I>allow</I> 
2560
  </programlisting>
2561
  can be used to enable a check for constant expressions used in conditional
2562
  contexts.  A literal constant is allowed in the condition of a <code>while</code>,
  <code>for</code> or <code>do</code> statement to allow for
2564
  such common constructs as: 
2565
  <programlisting>
2566
  	while ( true ) {
2567
  	    // while statement body
2568
  	}
2569
  </programlisting>
2570
  and target dependent constant expressions are allowed in the condition
2571
  of an <code>if</code> statement, but otherwise constant conditions
2572
  are reported according to the status of this check. 
2573
  </para>
2574
  <para>
2575
  The common error of writing <code>=</code> rather than <code>==</code>
2576
  in conditions can be detected using the directive: 
2577
  <programlisting>
2578
  	#pragma TenDRA assignment as bool <I>allow</I>
2579
  </programlisting>
2580
  which can be used to disallow such assignment expressions in contexts
2581
  where a boolean is expected.  The error message can be suppressed
2582
  by enclosing the assignment within parentheses. 
2583
  </para>
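  <para>
  A deliberate assignment in a condition can be written as in this
  sketch (the function is hypothetical): 
  <programlisting>
  	#pragma TenDRA assignment as bool warning

  	int first_nonzero ( const int *p, int n )
  	{
  	    int v = 0 ;
  	    for ( int i = 0 ; i &lt; n ; i++ ) {
  		// if ( v = p [ i ] ) would be reported
  		if ( ( v = p [ i ] ) ) break ;
  	    }
  	    return v ;
  	}
  </programlisting>
  </para>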
2584
  <para>
2585
  Another common error associated with iteration statements, particularly
2586
  with certain <A HREF="style.html">heretical</A> brace styles, is the
2587
  accidental insertion of an extra semicolon as in: 
2588
  <programlisting>
2589
  	for ( init ; cond ; step ) ;
2590
  	{
2591
  	    // for statement body
2592
  	}
2593
  </programlisting>
2594
  The directive: 
2595
  <programlisting>
2596
  	#pragma TenDRA extra ; after conditional <I>allow</I>
2597
  </programlisting>
2598
  can be used to enable a check for such suspicious empty iteration
2599
  statement bodies (it actually checks for <code>;{</code>). 
2600
  </para>
2601
  </sect3>  
2602
 
2603
  <sect3 id="switch">
2604
    <title>2.2.41. Switch statements</title>
2605
  <para>
2606
  A <code>switch</code> statement is said to be exhaustive if its control
2607
  statement is guaranteed to take one of the values of its 
2608
  <code>case</code> labels, or if it has a <code>default</code> label.
2609
  The TenDRA C and C++ producers allow a <code>switch</code> statement
2610
  to be asserted to be exhaustive using the syntax: 
2611
  <programlisting>
2612
  	switch ( cond ) EXHAUSTIVE {
2613
  	    // switch statement body
2614
  	}
2615
  </programlisting>
2616
  where <code>EXHAUSTIVE</code> is either the directive: 
2617
  <programlisting>
2618
  	#pragma TenDRA exhaustive
2619
  </programlisting>
2620
  or a keyword introduced using: 
2621
  <programlisting>
2622
  	#pragma TenDRA keyword <I>identifier</I> for exhaustive
2623
  </programlisting>
2624
  Knowing whether a <code>switch</code> statement is exhaustive or not
2625
  means that checks relying on flow analysis (including variable usage
2626
  checks) can be applied more precisely. 
2627
  </para>
2628
  <para>
2629
  In certain circumstances it is possible to deduce whether a 
2630
  <code>switch</code> statement is exhaustive or not.  For example,
2631
  the directive: 
2632
  <programlisting>
2633
  	#pragma TenDRA enum switch analysis <I>on</I> 
2634
  </programlisting>
2635
  enables a check on <code>switch</code> statements on values of enumeration
2636
  type.  Such statements should be exhaustive, either explicitly by
2637
  using the <code>EXHAUSTIVE</code> keyword or declaring a 
2638
  <code>default</code> label, or implicitly by having a <code>case</code>
2639
  label for each enumerator.  Conversely, the value of each <code>case</code>
2640
  label should equal the value of an enumerator.  For the purposes of
2641
  this check, boolean values are treated as if they were declared using
2642
  an enumeration type of the form: 
2643
  <programlisting>
2644
  	enum bool { false = 0, true = 1 } ;
2645
  </programlisting>
2646
  </para>
2647
  <para>
2648
  A common source of errors in <code>switch</code> statements is the
2649
  fall-through from one <code>case</code> or <code>default</code>
2650
  statement to the next.  A check for this can be enabled using: 
2651
  <programlisting>
2652
  	#pragma TenDRA fall into case <I>allow</I>
2653
  </programlisting>
2654
  <code>case</code> or <code>default</code> labels where fall-through
2655
  from the previous statement is intentional can be marked by preceding
2656
  them by a keyword, <code>FALL_THRU</code> say, introduced using the
2657
  directive: 
2658
  <programlisting>
2659
  	#pragma TenDRA keyword <I>identifier</I> for fall into case
2660
  </programlisting>
2661
  </para>
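  <para>
  Putting these directives together gives something like the following
  sketch.  The function <code>weight</code> is hypothetical, and the
  comments describe the expected effect of the check rather than quoting
  producer output: 
  <programlisting>
  	#pragma TenDRA keyword FALL_THRU for fall into case
  	#pragma TenDRA fall into case warning

  	int weight ( char c )
  	{
  	    int w = 0 ;
  	    switch ( c ) {
  		case 'a' :
  		    w += 1 ;
  		FALL_THRU case 'b' :	// marked as intentional
  		    w += 1 ;
  		case 'c' :		// unmarked fall-through from case 'b'
  		    w += 1 ;
  		    break ;
  		default :
  		    break ;
  	    }
  	    return w ;
  	}
  </programlisting>
  </para>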
2662
  </sect3>  
2663
 
2664
  <sect3 id="for">
2665
    <title>2.2.42. For statements</title>
2666
  <para>
2667
  In ISO C++ the scope of a variable declared in a for-init-statement
2668
  is the body of the <code>for</code> statement; in older dialects it
2669
  extended to the end of the enclosing block.  So: 
2670
  <programlisting>
2671
  	for ( int i = 0 ; i &lt; 10 ; i++ ) {
2672
  	    // for statement body
2673
  	}
2674
  	return i ;	// OK in older dialects, error in ISO C++
2675
  </programlisting>
2676
  This behaviour is controlled by the directive: 
2677
  <programlisting>
2678
  	#pragma TenDRA++ for initialization block <I>on</I> 
2679
  </programlisting>
2680
  with a state of <code>on</code> corresponding to the ISO rules and 
2681
  <code>off</code> to the older rules.  Perhaps most useful is the 
2682
  <code>warning</code> state which implements the old rules but gives
2683
  a warning if a variable declared in a for-init-statement is used outside
2684
  the corresponding <code>for</code> statement body.  A program which
2685
  does not give such warnings should compile correctly under either
2686
  set of rules. 
2687
  </para>
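  <para>
  Code which needs the loop variable after the loop can be made
  independent of this setting by declaring the variable in the enclosing
  block, as in this sketch (the function name is hypothetical): 
  <programlisting>
  	int first_multiple ( int n, int limit )
  	{
  	    int i ;			// declared outside the for statement
  	    for ( i = 1 ; i &lt; limit ; i++ ) {
  		if ( i % n == 0 ) break ;
  	    }
  	    return i ;			// valid under both sets of rules
  	}
  </programlisting>
  </para>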
2688
  </sect3>  
2689
 
2690
  <sect3 id="return">
2691
    <title>2.2.43. Return statements</title>
2692
  <para>
2693
  In C, but not in C++, it is possible to have a <code>return</code>
2694
  statement without an expression in a function which does not return
2695
  <code>void</code>.  It is possible to enable this behaviour using
2696
  the directive: 
2697
  <programlisting>
2698
  	#pragma TenDRA incompatible void return <I>allow</I>
2699
  </programlisting>
2700
  Note that this check includes the implicit <code>return</code> caused
2701
  by falling off the end of a function.  The effect of such a 
2702
  <code>return</code> statement is undefined.  The C++ rule that falling
2703
  off the end of <code>main</code> is equivalent to returning a value
2704
  of 0 overrides this check. 
2705
  </para>
2706
  </sect3>  
2707
 
2708
  <sect3 id="reach">
2709
    <title>2.2.44. Unreached code analysis</title>
2710
  <para>
2711
  The directive: 
2712
  <programlisting>
2713
  	#pragma TenDRA unreachable code <I>allow</I>
2714
  </programlisting>
2715
  enables a flow analysis check to detect unreachable code.  It is possible
2716
  to assert that a statement is reached or not reached by preceding
2717
  it by a keyword introduced by one of the directives: 
2718
  <programlisting>
2719
  	#pragma TenDRA keyword <I>identifier</I> for set reachable
2720
  	#pragma TenDRA keyword <I>identifier</I> for set unreachable
2721
  </programlisting>
2722
  </para>
2723
  <para>
2724
  The fact that certain functions, such as <code>exit</code>, never
  return to their caller can be exploited in the flow analysis routines.  The
2726
  equivalent directives: 
2727
  <programlisting>
2728
  	#pragma TenDRA bottom <I>identifier</I>
2729
  	#pragma TenDRA++ type <I>identifier</I> for bottom
2730
  </programlisting>
2731
  can be used to introduce a <code>typedef</code> declaration for the
2732
  type, bottom, returned by such functions.  The TenDRA API headers
2733
  declare 
2734
  <code>exit</code> and similar functions in this way, for example:
2735
  <programlisting>
2736
  	#pragma TenDRA bottom __bottom
2737
  	__bottom exit ( int ) ;
2738
  	__bottom abort ( void ) ;
2739
  </programlisting>
2740
  The bottom type is compatible with <code>void</code> in function declarations
2741
  to allow such functions to be redeclared in their conventional form.
2742
  </para>
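  <para>
  User code can apply the same technique.  In the sketch below
  <code>fatal</code> is a hypothetical function assumed never to return;
  the type name <code>__bottom</code> is the one used in the example above: 
  <programlisting>
  	#pragma TenDRA bottom __bottom
  	__bottom fatal ( const char * ) ;

  	int checked_div ( int a, int b )
  	{
  	    if ( b == 0 ) {
  		fatal ( "division by zero" ) ;
  		// flow analysis can treat this point as unreachable
  	    }
  	    return a / b ;
  	}
  </programlisting>
  </para>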
2743
  </sect3>  
2744
 
2745
  <sect3 id="variable">
2746
    <title>2.2.45. Variable flow analysis</title>
2747
  <para>
2748
  The directive: 
2749
  <programlisting>
2750
  	#pragma TenDRA variable analysis <I>on</I>
2751
  </programlisting>
2752
  enables checks on the uses of automatic variables and function parameters.
2753
  These checks detect: 
2754
  <itemizedlist>
2755
  <listitem>If a variable is not used in its scope. 
2756
  </listitem>
2757
  <listitem>If the value of a variable is used before it has been assigned
2758
  to. 
2759
  </listitem>
2760
  <listitem>If a variable is assigned to twice without an intervening use.
2761
  </listitem>
2762
  <listitem>If a variable is assigned to twice without an intervening sequence
2763
  point. 
2764
  </listitem>
2765
  </itemizedlist>
2766
  as illustrated by the variables <code>a</code>, <code>b</code>, 
2767
  <code>c</code> and <code>d</code> respectively in: 
2768
  <programlisting>
2769
  	void f ()
2770
  	{
2771
  	    int a ;			// a never used
2772
  	    int b ;
2773
  	    int c = b ;			// b not initialised
2774
  	    c = 0 ;			// c assigned to twice
2775
  	    int d = 0 ;
2776
  	    d = ++d ;			// d assigned to twice
2777
  	}
2778
  </programlisting>
2779
  The second, and more particularly the third, of these checks requires
2780
  some fairly sophisticated flow analysis, so any hints which can be
2781
  picked up from <A HREF="#switch">exhaustive <code>switch</code>
2782
  statements</A> etc. are likely to increase the accuracy of the errors
2783
  detected. 
2784
  </para>
2785
  <para>
2786
  In a non-static member function the various non-static data members
2787
  are analysed as if they were automatic variables.  It is checked that
2788
  each member is initialised in a constructor.  A common source of initialisation
2789
  problems in a constructor is that the base classes and members are
2790
  initialised in the canonical order of virtual bases, non-virtual direct
2791
  bases and members in the order of their declaration, rather than in
2792
  the order in which their initialisers appear in the constructor definition.
2793
  Therefore a check that the initialisers appear in the canonical order
2794
  is also applied. 
2795
  </para>
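  <para>
  The ordering problem can be seen in a small sketch (the class
  <code>range</code> is hypothetical).  Here <code>len</code> depends
  on <code>lo</code>, which is safe only because <code>lo</code> is
  declared, and therefore initialised, first: 
  <programlisting>
  	struct range {
  	    int lo ;
  	    int len ;

  	    // initialisers given in declaration order: lo, then len
  	    range ( int a, int b ) : lo ( a ), len ( b - lo ) { }
  	} ;
  </programlisting>
  Had the members been declared in the opposite order, <code>len</code>
  would have been computed from an uninitialised <code>lo</code>,
  whatever the order of the initialisers in the constructor. 
  </para>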
2796
  <para>
2797
  It is possible to change the state of a variable during the variable
2798
  analysis using the directives: 
2799
  <programlisting>
2800
  	#pragma TenDRA set <I>expression</I>
2801
  	#pragma TenDRA discard <I>expression</I>
2802
  </programlisting>
2803
  The first asserts that the variable given by the <I>expression</I>
2804
  has been assigned to; the second asserts that the variable is not
2805
  used.  An alternative way of expressing this is by means of keywords:
2806
  <programlisting>
2807
  	SET ( <I>expression</I> )
2808
  	DISCARD ( <I>expression</I> )
2809
  </programlisting>
2810
  introduced using the directives: 
2811
  <programlisting>
2812
  	#pragma TenDRA keyword <I>identifier</I> for set
2813
  	#pragma TenDRA keyword <I>identifier</I> for discard variable
2814
  </programlisting>
2815
  respectively.  These expressions can appear in expression statements
2816
  and as the first argument of a comma expression. 
2817
  </para>
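  <para>
  A sketch of how these keywords might be used follows.  The function
  <code>fill</code> is hypothetical and is assumed to store a value
  through its argument: 
  <programlisting>
  	#pragma TenDRA variable analysis on
  	#pragma TenDRA keyword SET for set
  	#pragma TenDRA keyword DISCARD for discard variable

  	extern void fill ( int * ) ;

  	int f ( int spare )
  	{
  	    int x ;
  	    fill ( &amp;x ) ;
  	    SET ( x ) ;			// assert that x has been assigned to
  	    DISCARD ( spare ) ;		// assert that spare is deliberately unused
  	    return x ;
  	}
  </programlisting>
  </para>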
2818
  <para>
2819
  <IMG SRC="../images/warn.gif" ALT="warning"/>
2820
  The variable flow analysis checks have not yet been completely implemented.
2821
  They may not detect errors in certain circumstances and for extremely
2822
  convoluted code may occasionally report spurious errors. 
2823
  </para>
2824
  </sect3>  
2825
 
2826
  <sect3 id="hide">
2827
    <title>2.2.46. Variable hiding</title>
2828
  <para>
2829
  The directive: 
2830
  <programlisting>
2831
  	#pragma TenDRA variable hiding analysis <I>on</I>
2832
  </programlisting>
2833
  can be used to enable a check for hiding of other variables and, in
2834
  member functions, data members, by local variable declarations. 
2835
  </para>
2836
  </sect3>  
2837
 
2838
  <sect3 id="exception">
2839
    <title>2.2.47. Exception analysis</title>
2840
  <para>
2841
  The ISO C++ rules do not require exception specifications to be checked
2842
  statically.  This is to facilitate the integration of large systems
2843
  where a single change in an exception specification could have ramifications
2844
  throughout the system.  However it is often useful to apply such checks,
2845
  which can be enabled using the directive: 
2846
  <programlisting>
2847
  	#pragma TenDRA++ throw analysis <I>on</I>
2848
  </programlisting>
2849
  This detects any potentially uncaught exceptions and other exception
2850
  problems.  In the error messages arising from this check, an uncaught
2851
  exception of type <code>...</code> means that an uncaught exception
2852
  of an unknown type (arising, for example, from a function without
2853
  an exception specification) may be thrown.  For example: 
2854
  <programlisting>
2855
  	void f ( int ) throw ( int ) ;
2856
  	void g ( int ) throw ( long ) ;
2857
  	void h ( int ) ;
2858
 
2859
  	void e () throw ( int )
2860
  	{
2861
  	    f ( 1 ) ;			// OK
2862
  	    g ( 2 ) ;			// uncaught 'long' exception
2863
  	    h ( 3 ) ;			// uncaught '...' exception
2864
  	}
2865
  </programlisting>
2866
  </para>
2867
  </sect3>  
2868
 
2869
  <sect3 id="template">
2870
    <title>2.2.48. Template compilation</title>
2871
  <para>
2872
  The C++ producer makes the distinction between exported templates,
2873
  which may be used in one module and defined in another, and non-exported
2874
  templates, which must be defined in every module in which they are
2875
  used. As in the ISO C++ standard, the <code>export</code> keyword
2876
  is used to distinguish between the two cases.  In the past, different
2877
  compilers have had different template compilation models; either all
2878
  templates were exported or no templates were exported.  The latter
2879
  is easily emulated: if the <code>export</code> keyword is not used
2880
  then no templates will be exported.  To emulate the former behaviour
2881
  the directive: 
2882
  <programlisting>
2883
  	#pragma TenDRA++ implicit export template <I>on</I>
2884
  </programlisting>
2885
  can be used to treat all templates as if they had been declared using
2886
  the <code>export</code> keyword. 
2887
  </para>
2888
  <para>
2889
  <IMG SRC="../images/warn.gif" ALT="warning"/>
2890
  The automatic instantiation of exported templates has not yet been
2891
  implemented correctly.  It is intended that such instantiations will
2892
  be generated during <A HREF="link.html">intermodule analysis</A>
2893
  (where they conceptually belong).  At present it is necessary to work
2894
  round this using explicit instantiations. 
2895
  </para>
2896
  </sect3>  
2897
 
2898
  <sect3 id="catch_all">
2899
    <title>2.2.49. Other checks</title>
2900
  <para>
2901
  Several checks of varying utility have been implemented in the C++
2902
  producer but do not as yet have individual directives controlling
2903
  their use.  These can be enabled <I>en masse</I> using the directive:
2904
  <programlisting>
2905
  	#pragma TenDRA++ catch all <I>allow</I> 
2906
  </programlisting>
2907
  It is intended that this directive will be phased out as these checks
2908
  are assigned controlling directives.  It is possible to achieve finer
2909
  control over these checks by enabling their individual error messages
2910
  <A HREF="#low">as described above</A>. 
2911
  </para>
2912
  </sect3>
2913
  </sect2>
2914
 
2915
  <sect2 id="token">
2916
    <title>2.3. Token syntax</title>
2917
  <para>
2918
  The C and C++ producers allow place-holders for various categories
2919
  of syntactic classes to be expressed using directives of the form:
2920
  <programlisting>
2921
  	#pragma TenDRA token <I>token-spec</I>
2922
  </programlisting>
2923
  or simply: 
2924
  <programlisting>
2925
  	#pragma token <I>token-spec</I>
2926
  </programlisting>
2927
  These place-holders are represented as TDF tokens and hence are called
2928
  tokens.  Each token stands for a type, expression or other construct
  which is to be represented by a named TDF token in the producer
2930
  output.  This mechanism is used, for example, to allow C API specifications
2931
  to be represented target independently.  The types, functions and
2932
  expressions comprising the API can be described using <code>#pragma
2933
  token</code> directives and the target dependent definitions of these
2934
  tokens, representing the implementation of the API on a particular
2935
  machine, can be linked in later.  This mechanism is described in detail
2936
  elsewhere. 
2937
  </para>
2938
  <para>
2939
  A <A HREF="pragma1.html#token">summary of the grammar</A> for the
2940
  <code>#pragma token</code> directives accepted by the C++ producer
2941
  is given as an annex. 
2942
  </para>
2943
 
2944
 
2945
  <sect3 id="spec">
2946
    <title>2.3.1. Token specifications</title>
2947
  <para>
2948
  A token specification is divided into two components, a 
2949
  <I>token-introduction</I> giving the token sort, and a 
2950
  <I>token-identification</I> giving the internal and external token
2951
  names: 
2952
  <programlisting>
2953
  	<I>token-spec</I> :
2954
  		<I>token-introduction token-identification</I>
2955
 
2956
  	<I>token-introduction</I> :
2957
  		<I>exp-token</I>
2958
  		<I>statement-token</I>
2959
  		<I>type-token</I>
2960
  		<I>member-token</I>
2961
  		<I>procedure-token</I>
2962
 
2963
  	<I>token-identification</I> :
2964
  		<I>token-namespace<SUB>opt</SUB> identifier</I> # <I>external-identifier<SUB>opt</SUB></I>
2965
 
2966
  	<I>token-namespace</I> :
2967
  		TAG
2968
 
2969
  	<I>external-identifier</I> :
2970
  		-
2971
  		<I>preproc-token-list</I>
2972
  </programlisting>
2973
  The <code>TAG</code> qualifier is used to indicate that the internal
2974
  name lies in the C tag namespace.  This only makes sense for structure
2975
  and union types.  The external token name can be given by any sequence
2976
  of preprocessing tokens.  These tokens are not macro expanded.  If
2977
  no external name is given then the internal name is used.  The special
2978
  external name <code>-</code> is used to indicate that the token does
2979
  not have an associated external name, and hence is local to the current
2980
  translation unit.  Such a local token must be defined.  White space
2981
  in the external name (other than at the start or end) is used to indicate
2982
  that a TDF unique name should be used.  The white space serves as
2983
  a separator for the unique name components. 
2984
  </para>
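  <para>
  As an illustration, the following directives sketch the different forms
  of <I>token-identification</I>.  All the internal and external names are
  hypothetical, chosen only to exercise the grammar above; they are not
  taken from any real API description.
  </para>
  <programlisting><![CDATA[
```c
/* External name omitted: the internal name size_type is also used
   as the external name. */
#pragma token TYPE size_type #

/* Internal name in the tag namespace (struct stat_buf), with an
   explicit external name. */
#pragma token STRUCT TAG stat_buf # example.stat_buf

/* White space inside the external name: a TDF unique name with the
   components "example" and "limit". */
#pragma token EXP const : int : limit # example limit

/* External name "-": the token is local to this translation unit,
   so it must be defined here. */
#pragma token EXP rvalue : int : local_flag # -
#define local_flag 1
```
]]></programlisting>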
2985
 
2986
  <H4><A id="expression-tokens">Expression tokens</A></H4>
2987
  <para>
2988
  Expression tokens are specified as follows: 
2989
  <programlisting>
2990
  	<I>exp-token</I> :
2991
  		EXP <I>exp-storage<SUB>opt</SUB></I> : <I>type-id</I> :
2992
  		NAT
2993
  		INTEGER
2994
  </programlisting>
2995
  representing an expression of the given type, a non-negative integer
2996
  constant and general integer constant, respectively.  Each expression
2997
  has an associated storage class: 
2998
  <programlisting>
2999
  	<I>exp-storage</I> :
3000
  		lvalue
3001
  		rvalue
3002
  		const
3003
  </programlisting>
3004
  indicating whether it is an lvalue, an rvalue or a compile-time constant
3005
  expression.  An absent <I>exp-storage</I> is equivalent to 
3006
  <code>rvalue</code>.  All expression tokens lie in the macro namespace;
3007
  that is, they may potentially be defined as macros. 
3008
  </para>
3009
  <para>
3010
  For backwards compatibility with the C producer, the directive:
3011
  <programlisting>
3012
  	#pragma TenDRA++ rvalue token as const <I>allow</I>
3013
  </programlisting>
3014
  causes <code>rvalue</code> tokens to be treated as <code>const</code>
3015
  tokens.</para>
3016
 
3017
  <H4>Statement tokens</H4>
3018
  <para>
3019
  Statement tokens are specified as follows: 
3020
  <programlisting>
3021
  	<I>statement-token</I> :
3022
  		STATEMENT
3023
  </programlisting>
3024
  All statement tokens lie in the macro namespace. 
3025
  </para>
3026
 
3027
  <H4>Type tokens</H4>
3028
  <para>
3029
  Type tokens are specified as follows: 
3030
  <programlisting>
3031
  	<I>type-token</I> :
3032
  		TYPE
3033
  		VARIETY
3034
  		VARIETY signed
3035
  		VARIETY unsigned
3036
  		FLOAT
3037
  		ARITHMETIC
3038
  		SCALAR
3039
  		CLASS
3040
  		STRUCT
3041
  		UNION
3042
  </programlisting>
3043
  representing a generic type, an integral type, a signed integral type,
3044
  an unsigned integral type, a floating point type, an arithmetic (integral
3045
  or floating point) type, a scalar (arithmetic or pointer) type, a
3046
  class type, a structure type and a union type respectively. 
3047
  </para>
3048
  <para>
3049
  <IMG SRC="../images/warn.gif" ALT="warning"/>
3050
  Floating-point, arithmetic and scalar token types have not yet been
3051
  implemented correctly in either the C or C++ producers. 
3052
  </para>
3053
 
3054
  <H4><A id="member">Member tokens</A></H4>
3055
  <para>
3056
  Member tokens are specified as follows: 
3057
  <programlisting>
3058
  	<I>member-token</I> :
3059
  		MEMBER <I>access-specifier<SUB>opt</SUB> member-type-id</I> : <I>type-id</I> :
3060
  </programlisting>
3061
  where an <I>access-specifier</I> of <code>public</code> is assumed
3062
  if none is given.  The member type is given by: 
3063
  <programlisting>
3064
  	<I>member-type-id</I> :
3065
  		<I>type-id</I>
3066
  		<I>type-id</I> % <I>constant-expression</I>
3067
  </programlisting>
3068
  where <code>%</code> is used to denote bitfield members (since 
3069
  <code>:</code> is used as a separator).  The second type denotes the
3070
  structure or union the given member belongs to.  Different types can
3071
  have members with the same internal name, but the external token name
3072
  must be unique.  Note that only non-static data members can be represented
3073
  in this form. 
3074
  </para>
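  <para>
  The following sketch shows how member tokens might be written for a
  structure token.  The structure, member and external names are
  assumptions made for the example.
  </para>
  <programlisting><![CDATA[
```c
/* A structure token for the members to belong to. */
#pragma token STRUCT TAG point #

/* An int member x of struct point; public access is assumed since no
   access-specifier is given. */
#pragma token MEMBER int : struct point : x # example.point.x

/* A 3-bit unsigned bitfield member: % takes the place of the usual :
   because : is already used as a separator. */
#pragma token MEMBER unsigned % 3 : struct point : flags # example.point.flags
```
]]></programlisting>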
3075
  <para>
3076
  Two declarations for the same <code>MEMBER</code> token (including token
3077
  definitions) should have the same type; however, the directive:
3078
  <programlisting>
3079
  	#pragma TenDRA++ incompatible member declaration <I>allow</I>
3080
  </programlisting>
3081
  allows declarations with different types, provided these types have the
3082
  same size and alignment requirements.
3083
  </para>
3084
 
3085
  <H4>Procedure tokens</H4>
3086
  <para>
3087
  Procedure, or high-level, tokens are specified in one of three ways:
3088
  <programlisting>
3089
  	<I>procedure-token</I> :
3090
  		<I>general-procedure</I>
3091
  		<I>simple-procedure</I>
3092
  		<I>function-procedure</I>
3093
  </programlisting>
3094
  All procedure tokens (except ellipsis functions - see below) lie in
3095
  the macro namespace.  The most general form of procedure token specifies
3096
  two sets of parameters.  The bound parameters are those which are
3097
  used in encoding the actual TDF output, and the program parameters
3098
  are those which are <A HREF="#args">specified in the program</A>.
3099
  The program parameters are expressed in terms of the bound parameters.
3100
  A program parameter can be an expression token parameter, a statement
3101
  token parameter, a member token parameter, a procedure token parameter
3102
  or any type.  The bound parameters are deduced from the program parameters
3103
  by a similar process to that used in template argument deduction.
3104
  <programlisting>
3105
  	<I>general-procedure</I> :
3106
  		PROC { <I>bound-toks<SUB>opt</SUB></I> | <I>prog-pars<SUB>opt</SUB></I> } <I>token-introduction</I>
3108
 
3109
  	<I>bound-toks</I> :
3110
  		<I>bound-token</I>
3111
  		<I>bound-token</I> , <I>bound-toks</I>
3112
 
3113
  	<I>bound-token</I> :
3114
  		<I>token-introduction token-namespace<SUB>opt</SUB> identifier</I>
3115
 
3116
  	<I>prog-pars</I> :
3117
  		<I>program-parameter</I>
3118
  		<I>program-parameter</I> , <I>prog-pars</I>
3119
 
3120
  	<I>program-parameter</I> :
3121
  		EXP <I>identifier</I>
3122
  		STATEMENT <I>identifier</I>
3123
  		TYPE <I>type-id</I>
3124
  		MEMBER <I>type-id</I> : <I>identifier</I>
3125
  		PROC <I>identifier</I>
3126
  </programlisting>
3127
  </para>
3128
  <para>
3129
  The simplest form of a <I>general-procedure</I> is one in which the
3130
  <I>prog-pars</I> correspond precisely to the <I>bound-toks</I>.  In
3131
  this case the syntax: 
3132
  <programlisting>
3133
  	<I>simple-procedure</I> :
3134
  		PROC ( <I>simple-toks<SUB>opt</SUB></I> ) <I>token-introduction</I>
3135
 
3136
  	<I>simple-toks</I> :
3137
  		<I>simple-token</I>
3138
  		<I>simple-token</I> , <I>simple-toks</I>
3139
 
3140
  	<I>simple-token</I> :
3141
  		<I>token-introduction token-namespace<SUB>opt</SUB> identifier<SUB>opt</SUB></I>
3142
  </programlisting>
3143
  may be used.  Note that the parameter names are optional. 
3144
  </para>
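  <para>
  To make the two forms concrete, the directives below sketch a simple
  procedure token, the same token written in the general form, and a
  general procedure whose bound type parameter is deduced from its program
  parameter.  The token names are hypothetical, and given the
  implementation caveats on <code>PROC</code> tokens these should be read
  as instances of the grammar rather than as tested input.
  </para>
  <programlisting><![CDATA[
```c
/* simple-procedure: two int expressions in, one int expression out.
   The optional parameter names are omitted. */
#pragma token PROC ( EXP rvalue : int :, EXP rvalue : int : ) EXP rvalue : int : max_int #

/* The same sort written as a general-procedure: the program
   parameters a and b correspond exactly to the bound tokens. */
#pragma token PROC { EXP rvalue : int : a, EXP rvalue : int : b | EXP a, EXP b } EXP rvalue : int : max_int2 #

/* general-procedure with a bound type t that does not appear among the
   program parameters; it is deduced from the argument given for e. */
#pragma token PROC { TYPE t, EXP rvalue : t : e | EXP e } EXP rvalue : t : identity #
```
]]></programlisting>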
3145
  <para>
3146
  A function token is specified as follows: 
3147
  <programlisting>
3148
  	<I>function-procedure</I> :
3149
  		FUNC <I>type-id</I> :
3150
  </programlisting>
3151
  where the given type is a function type.  This has two effects: firstly
3152
  a function with the given type is declared; secondly, if the function
3153
  type has the form: 
3154
  <programlisting>
3155
  	r ( p1, ...., pn )
3156
  </programlisting>
3157
  a procedure token with sort: 
3158
  <programlisting>
3159
  	PROC ( EXP rvalue : p1 :, ...., EXP rvalue : pn : ) EXP rvalue : r :
3160
  </programlisting>
3161
  is declared.  For ellipsis function types only the function, not the
3162
  token, is declared.  Note that the token behaves like a macro definition
3163
  of the corresponding function.  Unless explicitly enclosed in a linkage
3164
  specification, a function declared using a <code>FUNC</code>
3165
  token has C linkage.  Note that it is possible for two <code>FUNC</code>
3166
  tokens to have the same internal name, because of function overloading;
3167
  however external names must be unique. 
3168
  </para>
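  <para>
  For example, under the rules just described, the directives below would
  behave as the comments indicate.  The function names are hypothetical.
  </para>
  <programlisting><![CDATA[
```c
/* Declares the function  int add ( int, int )  and also a procedure
   token of sort
       PROC ( EXP rvalue : int :, EXP rvalue : int : ) EXP rvalue : int :
   which behaves like a macro definition of add. */
#pragma token FUNC int ( int, int ) : add #

/* Ellipsis function type: only the function is declared, not the token. */
#pragma token FUNC int ( const char *, ... ) : log_message #
```
]]></programlisting>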
3169
  <para>
3170
  The directive: 
3171
  <programlisting>
3172
  	#pragma TenDRA incompatible interface declaration <I>allow</I>
3173
  </programlisting>
3174
  can be used to allow incompatible redeclarations of functions declared
3175
  using <code>FUNC</code> tokens.  The token declaration takes precedence.
3176
  </para>
3177
  <para>
3178
  <IMG SRC="../images/warn.gif" ALT="warning"/>
3179
  Certain of the more complex examples of <code>PROC</code> tokens such
3180
  as tokens with <code>PROC</code> parameters, have not
3181
  been implemented in either the C or C++ producers. 
3182
  </para>
3183
  </sect3>
3184
 
3185
  <sect3 id="token-arguments">
3186
    <title>2.3.2. Token arguments</title>
3187
  <para>
3188
  As mentioned above, the program parameters for a <code>PROC</code>
3189
  token are those specified in the program itself.  These arguments
3190
  are expressed as a comma-separated list enclosed in brackets, the
3191
  form of each argument being determined by the corresponding program
3192
  parameter. 
3193
  </para>
3194
  <para>
3195
  An <code>EXP</code> argument is an assignment expression.  This must
3196
  be an lvalue for <code>lvalue</code> tokens and a constant expression
3197
  for 
3198
  <code>const</code> tokens.  The argument is converted to the token
3199
  type (for <code>lvalue</code> tokens this is essentially a conversion
3200
  between the corresponding reference types).  A <code>NAT</code> or
3201
  <code>INTEGER</code> argument is an integer constant expression. 
3202
  In the former case this must be non-negative. 
3203
  </para>
3204
  <para>
3205
  A <code>STATEMENT</code> argument is a statement.  This statement
3206
  should not contain any labels or any <code>goto</code> or <code>return</code>
3207
  statements. 
3208
  </para>
3209
  <para>
3210
  A type argument is a type identifier.  This must name a type of the
3211
  correct category for the corresponding token.  For example, a 
3212
  <code>VARIETY</code> token requires an integral type. 
3213
  </para>
3214
  <para>
3215
  <A id="offset">A member argument must describe the offset of a member
3216
  or nested member of the given structure or union type</A>.  The type
3217
  of the member should agree with that of the <code>MEMBER</code> token.
3218
  The general form of a member offset can be described in terms of member
3219
  selectors and array indexes as follows: 
3220
  <programlisting>
3221
  	<I>member-offset</I> :
3222
  		::<I><SUB>opt</SUB> id-expression</I>
3223
  		<I>member-offset</I> . ::<I><SUB>opt</SUB> id-expression</I>
3224
  		<I>member-offset</I> [ <I>constant-expression</I> ]
3225
  </programlisting>
3226
  </para>
3227
  <para>
3228
  A <code>PROC</code> argument is an identifier.  This identifier must
3229
  name a <code>PROC</code> token of the appropriate sort. 
3230
  </para>
3231
  </sect3>  
3232
 
3233
  <sect3 id="tokdef">
3234
    <title>2.3.3. Defining tokens</title>
3235
  <para>
3236
  Given a token specification of a syntactic object and a normal language
3237
  definition of the same object (including macro definitions if the
3238
  token lies in the macro namespace), the producers attempt to unify
3239
  the two by defining the TDF token in terms of the given definition.
3240
  Whether the token specification occurs before or after the language
3241
  definition is immaterial.  Unification also takes place in situations
3242
  where, for example, two types are known to be compatible.  Multiple
3243
  consistent explicit token definitions are allowed by default when
3244
  allowed by the language; this is controlled by the directive: 
3245
  <programlisting>
3246
  	#pragma TenDRA compatible token <I>allow</I>
3247
  </programlisting>
3248
  The default unification behaviour may be modified using the directives:
3249
  <programlisting>
3250
  	#pragma TenDRA no_def <I>token-list</I>
3251
  	#pragma TenDRA define <I>token-list</I>
3252
  	#pragma TenDRA reject <I>token-list</I>
3253
  </programlisting>
3254
  or equivalently: 
3255
  <programlisting>
3256
  	#pragma no_def <I>token-list</I>
3257
  	#pragma define <I>token-list</I>
3258
  	#pragma ignore <I>token-list</I>
3259
  </programlisting>
3260
  which set the state of the tokens given in <I>token-list</I>.  A state
3261
  of <code>no_def</code> means that no unification is attempted and
3262
  that any attempt to explicitly define the token results in an error.
3263
  A state of <code>define</code> means that unification takes place
3264
  and that the token must be defined somewhere in the translation unit.
3265
  A state of <code>reject</code> means that unification takes place as
3266
  normal, but any resulting token definition is discarded and not output
3267
  to the TDF capsule. 
3268
  </para>
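  <para>
  A short sketch of the three states follows.  The token names and the
  definitions chosen for them are illustrative assumptions only.
  </para>
  <programlisting><![CDATA[
```c
#pragma token EXP rvalue : int : buffer_size #
#pragma token TYPE handle_t #
#pragma token EXP rvalue : int : debug_level #

/* define: unification takes place and buffer_size must be defined
   somewhere in this translation unit. */
#pragma define buffer_size
#define buffer_size 512

/* reject: handle_t is unified with the typedef as normal, but the
   resulting token definition is not output to the TDF capsule. */
#pragma TenDRA reject handle_t
typedef int handle_t;

/* no_def: no unification is attempted; an explicit definition of
   debug_level after this point would be an error. */
#pragma no_def debug_level
```
]]></programlisting>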
3269
  <para>
3270
  If a token with the state <code>define</code> is not defined, then the
3271
  behaviour depends on the sort of the token.  A <code>FUNC</code> token
3272
  is implicitly defined in terms of its underlying function, such as:
3273
  <programlisting>
3274
  	#define f( a1, ...., an )	( f ) ( a1, ...., an )
3275
  </programlisting>
3276
  Other undefined tokens cause an error.  This behaviour can be modified
3277
  using the directives:
3278
  <programlisting>
3279
  	#pragma TenDRA++ implicit token definition <I>allow</I>
3280
  	#pragma TenDRA++ no token definition <I>allow</I>
3281
  </programlisting>
3282
  respectively.</para>
3283
  <para>
3284
  The primitive operations, <code>no_def</code>, <code>define</code> and
3285
  <code>reject</code>, can also be expressed using the context sensitive
3286
  directive: 
3287
  <programlisting>
3288
  	#pragma TenDRA interface <I>token-list</I>
3289
  </programlisting>
3290
  or equivalently: 
3291
  <programlisting>
3292
  	#pragma interface <I>token-list</I>
3293
  </programlisting>
3294
  By default this is equivalent to <code>no_def</code>, but may be modified
3295
  by inclusion using one of the directives: 
3296
  <programlisting>
3297
  	#pragma TenDRA extend <I>header-name</I>
3298
  	#pragma TenDRA implement <I>header-name</I>
3299
  </programlisting>
3300
  or equivalently: 
3301
  <programlisting>
3302
  	#pragma extend interface <I>header-name</I>
3303
  	#pragma implement interface <I>header-name</I>
3304
  </programlisting>
3305
  These are equivalent to: 
3306
  <programlisting>
3307
  	#include <I>header-name</I>
3308
  </programlisting>
3309
  except that the form <code>[....]</code> is allowed as a header name.
3310
  This is equivalent to <code>&lt;....&gt;</code> except that it starts
3311
  the directory search after the point at which the including file was
3312
  found, rather than at the start of the path (i.e. it is equivalent
3313
  to the 
3314
  <code>#include_next</code> directive found in some preprocessors).
3315
  The effect of the <code>extend</code> directive on the state of the
3316
  <code>interface</code> directive is as follows: 
3317
  <programlisting>
3318
  	no_def -&gt; no_def
3319
  	define -&gt; reject
3320
  	reject -&gt; reject
3321
  </programlisting>
3322
  The effect of the <code>implement</code> directive is as follows:
3323
  <programlisting>
3324
  	no_def -&gt; define
3325
  	define -&gt; define
3326
  	reject -&gt; reject
3327
  </programlisting>
3328
  That is to say, an <code>implement</code> directive will cause all
3329
  the tokens in the given header to be defined and their definitions
3330
  output. Any tokens included in this header by <code>extend</code>
3331
  may be defined, but their definitions will not be output.  This is
3332
  precisely the behaviour which is required to ensure that each token
3333
  is defined exactly once in an API library build. 
3334
  </para>
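  <para>
  The two transition tables can be summarised as a small function.  This is
  only a sketch of the mapping as tabulated above; the enumeration and
  function names are hypothetical and are not taken from the producer
  sources.
  </para>
  <programlisting><![CDATA[
```c
/* States that an interface directive can denote. */
enum token_state { STATE_NO_DEF, STATE_DEFINE, STATE_REJECT };

/* The two inclusion directives that modify the state. */
enum include_kind { INCLUDE_EXTEND, INCLUDE_IMPLEMENT };

/* Return the state denoted by `interface` inside a header included by
   an extend or implement directive, given the enclosing state. */
enum token_state interface_state(enum token_state outer, enum include_kind kind)
{
    if (kind == INCLUDE_EXTEND) {
        /* no_def -> no_def, define -> reject, reject -> reject */
        return outer == STATE_NO_DEF ? STATE_NO_DEF : STATE_REJECT;
    }
    /* implement: no_def -> define, define -> define, reject -> reject */
    return outer == STATE_REJECT ? STATE_REJECT : STATE_DEFINE;
}
```
]]></programlisting>
  <para>
  Starting from the default <code>no_def</code> state, a header reached
  through <code>implement</code> and then <code>extend</code> therefore ends
  in the <code>reject</code> state, which matches the behaviour described
  for API library builds.
  </para>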
3335
  <para>
3336
  The lists of tokens in the directives above are expressed in the form:
3337
  <programlisting>
3338
  	<I>token-list</I> :
3339
  		<I>token-id token-list<SUB>opt</SUB></I>
3340
  		# <I>preproc-token-list</I>
3341
  </programlisting>
3342
  where a <I>token-id</I> represents an internal token name: 
3343
  <programlisting>
3344
  	<I>token-id</I> :
3345
  		<I>token-namespace<SUB>opt</SUB> identifier</I>
3346
  		<I>type-id</I> . <I>identifier</I>
3347
  </programlisting>
3348
  Note that member tokens are specified by means of both the member
3349
  name and its parent type.  In this type specifier, <code>TAG</code>,
3350
  rather than 
3351
  <code>class</code>, <code>struct</code> or <code>union</code>, may
3352
  be used in elaborated type specifiers for structure and union tokens.
3353
  If the 
3354
  <I>token-id</I> names an overloaded function then the directive is
3355
  applied to all <code>FUNC</code> tokens of that name.  It is possible
3356
  to be more selective using the <code>#</code> form which allows the
3357
  external token name to be specified.  Such an entry must be the last
3358
  in a <I>token-list</I>. 
3359
  </para>
3360
  <para>
3361
  A related directive has the form: 
3362
  <programlisting>
3363
  	#pragma TenDRA++ undef token <I>token-list</I>
3364
  </programlisting>
3365
  which undefines all the given tokens so that they are no longer visible.
3366
  </para>
3367
  <para>
3368
  As noted above, a macro is only considered as a token definition if
3369
  the token lies in the macro namespace.  Tokens which are not in the
3370
  macro namespace, such as types and members, cannot be defined using
3371
  macros. Occasionally API implementations do define member selectors
3372
  as macros in terms of other member selectors.  Such a token needs
3373
  to be explicitly defined using a directive of the form: 
3374
  <programlisting>
3375
  	#pragma TenDRA member definition <I>type-id</I> : <I>identifier member-offset</I>
3377
  </programlisting>
3378
  where <I>member-offset</I> is <A HREF="#offset">as above</A>. 
3379
  </para>
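  <para>
  As a sketch of how this might look, suppose an API describes a member
  <code>at_sec</code> which a particular implementation stores in a nested
  structure.  All of the names below are hypothetical.
  </para>
  <programlisting><![CDATA[
```c
/* API description: struct entry has a long member called at_sec. */
#pragma token STRUCT TAG entry #
#pragma token MEMBER long : struct entry : at_sec #

/* Implementation: the value actually lives in a nested member. */
struct when { long sec; long nsec; };
struct entry { struct when at; };

/* Define the member token as the offset of at.sec within struct entry.
   A macro mapping at_sec to at.sec would not define the token, since
   member tokens do not lie in the macro namespace. */
#pragma TenDRA member definition struct entry : at_sec at.sec
```
]]></programlisting>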
3380
  </sect3>
3381
  </sect2>
3382
 
3383
  <sect2>
3384
    <title>2.4. Symbol table dump</title>
3385
  <para>
3386
  The symbol table dump provides a method whereby third party tools
3387
  can interface with the C and C++ producers.  The producer outputs
3388
  information on the identifiers declared within a source file, their
3389
  uses etc. into a file which can then be post-processed by a separate
3390
  tool. Any error messages and warnings can also be included in this
3391
  file, allowing more sophisticated error presentation tools to be written.
3392
  </para>
3393
  <para>
3394
  The file to be used as the symbol table output file, plus details
3395
  of what information is to be included in the dump file can be specified
3396
  using the <A HREF="man.html#dump"><code>-d</code> command-line option</A>.
3397
  The format of the dump file is described below; a 
3398
  <A HREF="dump1.html">summary of the syntax</A> is given as an annex.
3399
  </para>
3400
 
3401
 
3402
  <sect3 id="lexical-elements">
3403
    <title>2.4.1. Lexical elements</title>
3404
  <para>
3405
  A symbol table dump file consists of a sequence of characters giving
3406
  information on identifiers, errors etc. arising from a translation
3407
  unit. The fundamental lexical tokens are a <I>number</I>, consisting
3408
  of a sequence of decimal digits, and a <I>string</I>, consisting of
3409
  a sequence of characters enclosed in angle brackets.  A <I>string</I>
3410
  can have one of two forms: 
3411
  <programlisting>
3412
  	<I>string</I> :
3413
  		&lt;<I>characters</I>&gt;
3414
  		&amp;<I>number</I>&lt;<I>characters</I>&gt;
3415
  </programlisting>
3416
  In the first form, the <I>characters</I> are terminated by the first
3417
  <code>&gt;</code> character encountered.  In the second form, the
3418
  number of characters is given by the preceding <I>number</I>.  No
3419
  white space is allowed either before or after the <I>number</I>. 
3420
  To aid parsers, the C++ producer always uses the second form for strings
3421
  containing more than 100 characters.  There are no escape characters
3422
  in strings; the 
3423
  <I>characters</I> can contain any characters, including newlines and
3424
  <code>#</code>, except that the first form cannot contain a 
3425
  <code>&gt;</code> character. 
3426
  </para>
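  <para>
  A reader for the two string forms might be sketched as follows.  The
  function name and interface are assumptions made for the example, and the
  input is assumed to be held in a NUL-terminated buffer.
  </para>
  <programlisting><![CDATA[
```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Read one dump-file string starting at p, in either form:
 *     <characters>           ended by the first '>' encountered
 *     &number<characters>    exactly `number` characters, which may
 *                            themselves include '>'
 * On success the start and length of the characters are stored and a
 * pointer just past the closing '>' is returned.  NULL is returned for
 * malformed input. */
const char *read_dump_string(const char *p, const char **chars, size_t *len)
{
    if (*p == '&') {
        char *end;
        unsigned long n;

        /* No white space is allowed before the number. */
        if (p[1] < '0' || p[1] > '9') return NULL;
        n = strtoul(p + 1, &end, 10);

        /* No white space is allowed after the number either. */
        if (*end != '<') return NULL;

        /* There must be n characters followed by the closing '>'. */
        if (strlen(end + 1) <= n) return NULL;
        if (end[1 + n] != '>') return NULL;

        *chars = end + 1;
        *len = (size_t)n;
        return end + n + 2;
    }
    if (*p == '<') {
        const char *close = strchr(p + 1, '>');
        if (close == NULL) return NULL;
        *chars = p + 1;
        *len = (size_t)(close - (p + 1));
        return close + 1;
    }
    return NULL;
}
```
]]></programlisting>
  <para>
  Because there are no escape characters, the counted form is the only way
  in which a string containing <code>&gt;</code> can be represented.
  </para>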
3427
  <para>
3428
  Space, tab and newline characters are white space.  Comments begin
3429
  with 
3430
  <code>#</code> and run to the end of the line.  Comments are treated
3431
  as white space.  All other characters are treated as distinct lexical
3432
  tokens. 
3433
  </para>
3434
  </sect3>  
3435
 
3436
  <sect3 id="main">
3437
    <title>2.4.2. Overall syntax</title>
3438
  <para>
3439
  A symbol table dump file takes the form of a list of commands of various
3440
  kinds conveying information on the analysed file.  This can be represented
3441
  as follows: 
3442
  <programlisting>
3443
  	<I>dump-file</I> :
3444
  		<I>command-list<SUB>opt</SUB></I>
3445
 
3446
  	<I>command-list</I> :
3447
  		<I>command command-list<SUB>opt</SUB></I>
3448
 
3449
  	<I>command</I> :
3450
  		<I>version-command</I>
3451
  		<I>identifier-command</I>
3452
  		<I>scope-command</I>
3453
  		<I>override-command</I>
3454
  		<I>base-command</I>
3455
  		<I>api-command</I>
3456
  		<I>template-command</I>
3457
  		<I>promotion-command</I>
3458
  		<I>error-command</I>
3459
  		<I>path-command</I>
3460
  		<I>file-command</I>
3461
  		<I>include-command</I>
3462
  		<I>string-command</I>
3463
  </programlisting>
3464
  The various kinds of command are discussed below.  The first command
3465
  in the dump file should be of the form: 
3466
  <programlisting>
3467
  	<I>version-command</I> :
3468
  		V <I>number number string</I>
3469
  </programlisting>
3470
  where the two numbers give the version of the dump file format (the
3471
  version described here is 1.1 so both numbers should be 1) and the
3472
  string gives the language being represented, for example, 
3473
  <code>&lt;C++&gt;</code>. 
3474
  </para>
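  <para>
  A tool reading a dump file might check the leading
  <I>version-command</I> along the following lines.  This is a minimal
  sketch: the function name is hypothetical, and for brevity it accepts only
  the first <I>string</I> form and a language name of at most 63
  characters.
  </para>
  <programlisting><![CDATA[
```c
#include <stdio.h>

/* Parse a command of the form  V number number <language>.
 * Stores the two version numbers and the language name and returns 1
 * on success, or 0 if the text does not have this form. */
int parse_version_command(const char *text, int *major, int *minor,
                          char lang[64])
{
    char close = 0;
    int n = sscanf(text, " V %d %d <%63[^>]%c", major, minor, lang, &close);
    return n == 4 && close == '>';
}
```
]]></programlisting>
  <para>
  The caller would then compare the two numbers with the format version it
  understands, which for the format described here means that both are 1.
  </para>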
3475
  </sect3>  
3476
 
3477
  <sect3 id="file-locations">
3478
    <title>2.4.3. File locations</title>
3479
  <para>
3480
  A location within a source file can be specified using three 
3481
  <I>number</I>s and two <I>string</I>s.  These give respectively, the
3482
  column number, the line number taking <code>#line</code> directives
3483
  into account, the line number not taking <code>#line</code> directives
3484
  into account, the file name taking <code>#line</code> directives into
3485
  account, and the file name not taking <code>#line</code> directives
3486
  into account.  Any or all of the trailing elements can be replaced
3487
  by 
3488
  <code>*</code> to indicate that they have not changed relative to
3489
  the last <I>location</I> given.  Note that for the two line numbers,
3490
  unchanged means that the difference of the line numbers, taking 
3491
  <code>#line</code> directives into account or not, is unchanged. 
3492
  Thus: 
3493
  <programlisting>
3494
  	<I>location</I> :
3495
  		<I>number number number string string</I>
3496
  		<I>number number number string</I> *
3497
  		<I>number number number</I> *
3498
  		<I>number number</I> *
3499
  		<I>number</I> *
3500
  		*
3501
  </programlisting>
3502
  Note that there is a concept of the <A id="crt_loc">current file
3503
  location</A>, relative to which other locations are given.  The initial
3504
  value of the current file location is undefined.  Unless otherwise
3505
  stated, all <I>location</I> elements update the current file location.
3506
  </para>
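  <para>
  The way in which an abbreviated <I>location</I> updates the current file
  location can be sketched as below.  The structure and function names are
  hypothetical.  The sketch also assumes one particular reading of the note
  on line numbers: when only the first line number is given, the second is
  adjusted so that the difference between the two is preserved.
  </para>
  <programlisting><![CDATA[
```c
#include <stddef.h>

/* Hypothetical record of the current file location. */
struct location {
    unsigned long column;
    unsigned long line;      /* taking #line directives into account */
    unsigned long raw_line;  /* not taking #line directives into account */
    const char *file;        /* taking #line directives into account */
    const char *raw_file;    /* not taking #line directives into account */
};

/* Apply a location whose first `given` elements (0 to 5) were written
 * out in full and whose remaining elements were replaced by `*`.
 * Arguments beyond the first `given` elements are ignored. */
void update_location(struct location *cur, int given,
                     unsigned long column, unsigned long line,
                     unsigned long raw_line,
                     const char *file, const char *raw_file)
{
    /* Difference between the two line numbers before the update. */
    long offset = (long)cur->line - (long)cur->raw_line;

    if (given >= 1) cur->column = column;
    if (given >= 2) {
        cur->line = line;
        /* An "unchanged" second line number keeps the old difference
           (assumed reading of the note above). */
        cur->raw_line = (unsigned long)((long)line - offset);
    }
    if (given >= 3) cur->raw_line = raw_line;
    if (given >= 4) cur->file = file;
    if (given >= 5) cur->raw_file = raw_file;
}
```
]]></programlisting>
  <para>
  Since the initial value of the current file location is undefined, the
  first <I>location</I> in a dump file would need to give all five elements
  for this scheme to be meaningful.
  </para>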
3507
  </sect3>  
3508
 
3509
  <sect3 id="identifiers">
3510
    <title>2.4.4. Identifiers</title>
3511
  <para>
3512
  Each identifier is represented in the symbol table dump by a unique
3513
  number.  The same number always represents the same identifier. 
3514
  </para>
3515
 
3516
  <H4><A id="hashid">Identifier names</A></H4>
3517
  <para>
3518
  The number representing an identifier is introduced in the first declaration
3519
  or use of that identifier and thereafter the number alone is used
3520
  to denote the identifier: 
3521
  <programlisting>
3522
  	<I>identifier</I> :
3523
  		<I>number</I> = <I>identifier-name access<SUB>opt</SUB> scope-identifier</I>
3524
  		<I>number</I>
3525
  </programlisting>
3526
  </para>
3527
  <para>
3528
  The identifier name is given by: 
3529
  <programlisting>
3530
  	<I>identifier-name</I> :
3531
  		<I>string</I>
3532
  		C <I>type</I>
3533
  		D <I>type</I>
3534
  		O <I>string</I>
3535
  		T <I>type</I>
3536
  </programlisting>
3537
  denoting respectively, a simple identifier name, a constructor for
3538
  a type, a destructor for a type, an overloaded operator function name,
3539
  and a conversion function name.  The empty string is used for anonymous
3540
  identifiers. 
3541
  </para>
3542
  <para>
3543
  The optional identifier access is given by: 
3544
  <programlisting>
3545
  	<I>access</I> :
3546
  		N
3547
  		B
3548
  		P
3549
  </programlisting>
3550
  denoting <code>public</code>, <code>protected</code> and 
3551
  <code>private</code> respectively.  An absent <I>access</I> is equivalent
3552
  to <code>public</code>.  Note that all identifiers, not just class
3553
  members, can have access specifiers; however the access of a non-member
3554
  is always <code>public</code>. 
3555
  </para>
3556
  <para>
3557
  The <A HREF="#scope">scope</A> (i.e. class, namespace, block etc.)
3558
  in which an identifier is declared is given by: 
3559
  <programlisting>
3560
  	<I>scope-identifier</I> :
3561
  		<I>identifier</I>
3562
  		*
3563
  </programlisting>
3564
  denoting either a named or an unnamed scope. 
3565
  </para>
3566
 
3567
  <H4><A id="use">Identifier uses</A></H4>
3568
  <para>
3569
  Each declaration or use of an identifier is represented by a command
3570
  of the form: 
3571
  <programlisting>
3572
  	<I>identifier-command</I> :
3573
  		D <I>identifier-info type-info</I>
3574
  		M <I>identifier-info type-info</I>
3575
  		T <I>identifier-info type-info</I>
3576
  		Q <I>identifier-info</I>
3577
  		U <I>identifier-info</I>
3578
  		L <I>identifier-info</I>
3579
  		C <I>identifier-info</I>
3580
  		W <I>identifier-info type-info</I>
3581
  </programlisting>
3582
  where: 
3583
  <programlisting>
3584
  	<I>identifier-info</I> :
3585
  		<I>identifier-key location identifier</I>
3586
  </programlisting>
3587
  gives the kind of identifier being declared or used, the location
3588
  of the declaration or use, and the number associated with the identifier.
3589
  Each declaration may, depending on the <I>identifier-key</I>, associate
3590
  various <I>type-info</I> with the identifier, giving its type etc.
3591
  </para>
3592
  <para>
3593
  The various kinds of <I>identifier-command</I> are described below.
3594
  Any can be preceded by <code>I</code> to indicate an implicit declaration
3595
  or use.  <code>D</code> denotes a definition.  <code>M</code> (make)
3596
  denotes a declaration.  <code>T</code> denotes a tentative definition
3597
  (C only).  <code>Q</code> denotes the end of a definition, for those
3598
  identifiers such as classes and functions whose definitions may be
3599
  spread over several lines.  <code>U</code> denotes an undefine operation
3600
  (such as <code>#undef</code> for macro identifiers).  <code>C</code>
3601
  denotes a call to a function identifier; <code>L</code> (load) denotes
3602
  other identifier uses.  Finally <code>W</code> denotes implicit type
3603
  information such as the C producer gleans from its 
3604
  <A HREF="pragma.html#weak">weak prototype analysis</A>. 
3605
  </para>
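  <para>
  The letters described above might be decoded as in the following sketch.
  The enumeration and function names are hypothetical, and the caller is
  assumed to know already that the text begins an
  <I>identifier-command</I> rather than one of the other kinds of command.
  </para>
  <programlisting><![CDATA[
```c
/* The kinds of identifier-command, named after the descriptions above. */
enum id_command {
    ID_DEFINITION,      /* D */
    ID_DECLARATION,     /* M (make) */
    ID_TENTATIVE,       /* T (C only) */
    ID_END_DEFINITION,  /* Q */
    ID_UNDEFINE,        /* U */
    ID_CALL,            /* C */
    ID_LOAD,            /* L */
    ID_WEAK_INFO,       /* W */
    ID_UNKNOWN
};

/* Classify the leading letters of an identifier-command.  *implicit is
 * set to 1 if the command is preceded by I, and to 0 otherwise. */
enum id_command identifier_command_kind(const char *cmd, int *implicit)
{
    *implicit = (cmd[0] == 'I');
    switch (cmd[*implicit ? 1 : 0]) {
    case 'D': return ID_DEFINITION;
    case 'M': return ID_DECLARATION;
    case 'T': return ID_TENTATIVE;
    case 'Q': return ID_END_DEFINITION;
    case 'U': return ID_UNDEFINE;
    case 'C': return ID_CALL;
    case 'L': return ID_LOAD;
    case 'W': return ID_WEAK_INFO;
    default:  return ID_UNKNOWN;
    }
}
```
]]></programlisting>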
3606
  <para>
3607
  The various <I>identifier-key</I>s and their associated <I>type-info</I>
3608
  fields are given by the following table: 
3609
  </para>
3610
 
3611
  <table>
3612
  <tr><th>Key</th>
3613
  <th>Type information</th>
3614
  <th>Description</th>
3615
  </tr>
3616
  <tr><td><code>K</code></td>
3617
  <td><code>*</code></td>
3618
  <td>keyword</td>
3619
  </tr>
3620
  <tr><td><code>MO</code></td>
3621
  <td><I>sort</I></td>
3622
  <td>object macro</td>
3623
  </tr>
3624
  <tr><td><code>MF</code></td>
3625
  <td><I>sort</I></td>
3626
  <td>function macro</td>
3627
  </tr>
3628
  <tr><td><code>MB</code></td>
3629
  <td><I>sort</I></td>
3630
  <td>built-in macro</td>
3631
  </tr>
3632
  <tr><td><code>TC</code></td>
3633
  <td><I>type</I></td>
3634
  <td>class tag</td>
3635
  </tr>
3636
  <tr><td><code>TS</code></td>
3637
  <td><I>type</I></td>
3638
  <td>structure tag</td>
3639
  </tr>
3640
  <tr><td><code>TU</code></td>
3641
  <td><I>type</I></td>
3642
  <td>union tag</td>
3643
  </tr>
3644
  <tr><td><code>TE</code></td>
3645
  <td><I>type</I></td>
3646
  <td>enumeration tag</td>
3647
  </tr>
3648
  <tr><td><code>TA</code></td>
3649
  <td><I>type</I></td>
3650
  <td><code>typedef</code> name</td>
3651
  </tr>
3652
  <tr><td><code>NN</code></td>
3653
  <td><code>*</code></td>
3654
  <td>namespace name</td>
3655
  </tr>
3656
  <tr><td><code>NA</code></td>
3657
  <td><I>scope-identifier</I></td>
3658
  <td>namespace alias</td>
3659
  </tr>
3660
  <tr><td><code>VA</code></td>
3661
  <td><I>type</I></td>
3662
  <td>automatic variable</td>
3663
  </tr>
3664
  <tr><td><code>VP</code></td>
3665
  <td><I>type</I></td>
3666
  <td>function parameter</td>
3667
  </tr>
3668
  <tr><td><code>VE</code></td>
3669
  <td><I>type</I></td>
3670
  <td><code>extern</code> variable</td>
3671
  </tr>
3672
  <tr><td><code>VS</code></td>
3673
  <td><I>type</I></td>
3674
  <td><code>static</code> variable</td>
3675
  </tr>
3676
  <tr><td><code>FE</code></td>
3677
  <td><I>type identifier<SUB>opt</SUB></I></td>
3678
  <td><code>extern</code> function</td>
3679
  </tr>
3680
  <tr><td><code>FS</code></td>
3681
  <td><I>type identifier<SUB>opt</SUB></I></td>
3682
  <td><code>static</code> function</td>
3683
  </tr>
3684
  <tr><td><code>FB</code></td>
3685
  <td><I>type identifier<SUB>opt</SUB></I></td>
3686
  <td>built-in operator function</td>
3687
  </tr>
3688
  <tr><td><code>CF</code></td>
3689
  <td><I>type identifier<SUB>opt</SUB></I></td>
3690
  <td>member function</td>
3691
  </tr>
3692
  <tr><td><code>CS</code></td>
3693
  <td><I>type identifier<SUB>opt</SUB></I></td>
3694
  <td><code>static</code> member function</td>
3695
  </tr>
3696
  <tr><td><code>CV</code></td>
3697
  <td><I>type identifier<SUB>opt</SUB></I></td>
3698
  <td>virtual member function</td>
3699
  </tr>
3700
  <tr><td><code>CM</code></td>
3701
  <td><I>type</I></td>
3702
  <td>data member</td>
3703
  </tr>
3704
  <tr><td><code>CD</code></td>
3705
  <td><I>type</I></td>
3706
  <td><code>static</code> data member</td>
3707
  </tr>
3708
  <tr><td><code>E</code></td>
3709
  <td><I>type</I></td>
3710
  <td>enumerator</td>
3711
  </tr>
3712
  <tr><td><code>L</code></td>
3713
  <td><code>*</code></td>
3714
  <td>label</td>
3715
  </tr>
3716
  <tr><td><code>XO</code></td>
3717
  <td><I>sort</I></td>
3718
  <td>object token</td>
3719
  </tr>
3720
  <tr><td><code>XF</code></td>
3721
  <td><I>sort</I></td>
3722
  <td>procedure token</td>
3723
  </tr>
3724
  <tr><td><code>XP</code></td>
3725
  <td><I>sort</I></td>
3726
  <td>token parameter</td>
3727
  </tr>
3728
  <tr><td><code>XT</code></td>
3729
  <td><I>sort</I></td>
3730
  <td>template parameter</td>
3731
  </tr>
3732
  </table>
3733
 
3734
  <para>
3735
  The function identifier keys can optionally be followed by
  <code>C</code>, indicating that the function has C linkage, and
  <code>I</code>, indicating that the function is inline.  By default,
3738
  functions declared in a C++ dump file have C++ linkage and functions
3739
  declared in a C dump file have C linkage.  The optional 
3740
  <I>identifier</I> which forms part of the <I>type-info</I> of these
3741
  functions is used to form linked lists of overloaded functions. 
3742
  </para>
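  <para>
  The suffix rule can be illustrated with a short Python sketch.  The
  helper name <code>split_function_key</code> and its
  <code>cpp_dump</code> parameter are hypothetical, and the sketch assumes
  that <code>C</code> is written before <code>I</code> when both are
  present, which the text above does not state.
  </para>

```python
# Function identifier keys taken from the table above.
FUNCTION_KEYS = {
    "FE": "extern function",
    "FS": "static function",
    "FB": "built-in operator function",
    "CF": "member function",
    "CS": "static member function",
    "CV": "virtual member function",
}


def split_function_key(key, cpp_dump=True):
    """Split a key such as 'FECI' into its kind, linkage and inline flag.

    Assumption: when both suffixes are present, C is written before I.
    """
    base, suffix = key[:2], key[2:]
    if base not in FUNCTION_KEYS:
        raise ValueError(f"not a function identifier key: {key!r}")
    if suffix not in ("", "C", "I", "CI"):
        raise ValueError(f"unexpected suffix in key: {key!r}")
    return {
        "kind": FUNCTION_KEYS[base],
        # C++ dump files default to C++ linkage, C dump files to C linkage.
        "c_linkage": "C" in suffix or not cpp_dump,
        "inline": "I" in suffix,
    }
```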
3743
 
3744
  <H4><A id="scope">Identifier scopes</A></H4>
3745
  <para>
3746
  Each identifier belongs to a scope, called its parent scope, in which
3747
  it is declared.  For example, the parent of a member of a class is
3748
  the class itself.  This information is expressed in an identifier
3749
  declaration using a <I>scope-identifier</I>.  In addition to the obvious
3750
  scopes such as classes and namespaces, there are other scopes such
3751
  as blocks in function definitions.  It is possible to introduce dummy
3752
  identifiers to name such scopes.  The parent of such a dummy identifier
3753
  will be the enclosing scope identifier, so these dummy identifiers
3754
  naturally represent the block structure.  The parent of the top-level
3755
  block in a function definition can be considered to be the function
3756
  itself. 
3757
  </para>
3758
  <para>
3759
  Information on the start and end of such scopes is given by: 
3760
  <programlisting>
3761
  	<I>scope-command</I> :
3762
  		SS <I>scope-key location identifier</I>
3763
  		SE <I>scope-key location identifier</I>
3764
  </programlisting>
3765
  where: 
3766
  <programlisting>
3767
  	<I>scope-key</I> :
3768
  		N
3769
  		S
3770
  		B
3771
  		D
3772
  		H
3773
  		CT
3774
  		CF
3775
  		CC
3776
  </programlisting>
3777
  gives the kind of scope involved: a namespace, a class, a block, some
3778
  other declarative scope, a declaration block (see below), a true conditional
3779
  scope, a false conditional scope or a target dependent conditional
3780
  scope. 
3781
  </para>
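  <para>
  A reader of the dump file might track these commands with a stack.  The
  following Python sketch works on commands that have already been split
  into <code>(command, scope-key, identifier)</code> tuples, omitting the
  <I>location</I> component.  It assumes that scopes are properly nested,
  which is not stated above; the function name
  <code>deepest_nesting</code> is hypothetical.
  </para>

```python
# Scope keys and their meanings, in the order given in the text.
SCOPE_KEYS = {
    "N": "namespace",
    "S": "class",
    "B": "block",
    "D": "other declarative scope",
    "H": "declaration block",
    "CT": "true conditional scope",
    "CF": "false conditional scope",
    "CC": "target dependent conditional scope",
}


def deepest_nesting(commands):
    """Return the maximum nesting depth reached by a list of SS/SE commands.

    Each command is a pre-split (command, scope_key, identifier) tuple.
    Assumption: every SE closes the most recently opened scope.
    """
    stack = []
    deepest = 0
    for command, key, identifier in commands:
        if key not in SCOPE_KEYS:
            raise ValueError(f"unknown scope key: {key!r}")
        if command == "SS":
            stack.append((key, identifier))
            deepest = max(deepest, len(stack))
        elif command == "SE":
            if not stack or stack[-1] != (key, identifier):
                raise ValueError(f"scope end without matching start: {identifier!r}")
            stack.pop()
        else:
            raise ValueError(f"not a scope command: {command!r}")
    if stack:
        raise ValueError("unterminated scope")
    return deepest
```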
3782
  <para>
3783
  A declaration block is a sequence of declarations enclosed in directives
3784
  of the form: 
3785
  <programlisting>
3786
  	#pragma TenDRA declaration block <I>identifier</I> begin
3787
  	....
3788
  	#pragma TenDRA declaration block end
3789
  </programlisting>
3790
  This allows the sequence of declarations to be associated with the
3791
  given 
3792
  <I>identifier</I> in the symbol dump file.  This technique is used
3793
  in the API description files to aid analysis tools in determining
3794
  which declarations are part of the API. 
3795
  </para>
3796
 
3797
  <H4><A id="scope">Other identifier information</A></H4>
3798
  <para>
3799
  Other information associated with an identifier may be expressed using
3800
  other dump commands.  For example: 
3801
  <programlisting>
3802
  	<I>override-command</I> :
3803
  		O <I>identifier identifier</I>
3804
  </programlisting>
3805
  is used to express the fact that the two <I>identifier</I>s are virtual
3806
  member functions, the first of which overrides the second. 
3807
  </para>
3808
  <para>
3809
  The command: 
3810
  <programlisting>
3811
  	<I>base-command</I> :
3812
  		B <I>identifier-key identifier base-graph</I>
3813
 
3814
  	<I>base-graph</I> :
3815
  		<I>base-class</I>
3816
  		<I>base-class</I> ( <I>base-list</I> )
3817
 
3818
  	<I>base-class</I> :
3819
  		<I>number</I> = V<I><SUB>opt</SUB> access<SUB>opt</SUB> type-name</I>
3820
  		<I>number</I> :
3821
 
3822
  	<I>base-list</I> :
3823
  		<I>base-graph base-list<SUB>opt</SUB></I>
3824
 
3825
  </programlisting>
3826
  associates a base class graph with a class identifier.  Any class
3827
  which does not have an associated <I>base-command</I> can be assumed
3828
  to have no base classes.  Each node in the graph is a <I>type-name</I>
3829
  with an associated list of base classes.  A <code>V</code> is used
3830
  to indicate a virtual base class.  Each node is numbered; duplicate
3831
  numbers are used to indicate bases identified via the virtual base
3832
  class structure.  Any base class can then be referred to as: 
3833
  <programlisting>
3834
  	<I>base-number</I> :
3835
  		<I>number</I> : <I>type-name</I>
3836
  </programlisting>
3837
  indicating the base class with the given number in the given class.
3838
  </para>
3839
  <para>
3840
  The command: 
3841
  <programlisting>
3842
  	<I>api-command</I> :
3843
  		X <I>identifier-key identifier string</I>
3844
  </programlisting>
3845
  associates the external token name given by the <I>string</I> with
3846
  the given tokenised identifier. 
3847
  </para>
3848
  <para>
3849
  The command: 
3850
  <programlisting>
3851
  	<I>template-command</I> :
3852
  		Z <I>identifier-key identifier token-application specialise-info</I>
3853
  </programlisting>
3854
  is used to introduce an identifier corresponding to an instance of
3855
  a template, <I>token-application</I>.  This instance may correspond
3856
  to a specialisation of the primary template; this information is represented
3857
  by: 
3858
  <programlisting>
3859
  	<I>specialise-info</I> :
3860
  		<I>identifier</I>
3861
  		<I>token-application</I>
3862
  		*
3863
  </programlisting>
3864
  where <code>*</code> indicates a non-specialised instance. 
3865
  </para>
3866
  </sect3>  
3867
 
3868
  <sect3 id="types">
3869
    <title>2.4.5. Types</title>
3870
  <para>
3871
  The <A id="built-in">built-in types</A> are represented in the symbol
3872
  table dump as follows: 
3873
  </para>
3874
 
3875
  <table>
3876
  <tr><th>Type</th>
3877
  <th>Encoding</th>
3878
  <th>Type</th>
3879
  <th>Encoding</th>
3880
  </tr>
3881
  <tr><td>char</td>
3882
  <td><code>c</code></td>
3883
  <td>float</td>
3884
  <td><code>f</code></td>
3885
  </tr>
3886
  <tr><td>signed char</td>
3887
  <td><code>Sc</code></td>
3888
  <td>double</td>
3889
  <td><code>d</code></td>
3890
  </tr>
3891
  <tr><td>unsigned char</td>
3892
  <td><code>Uc</code></td>
3893
  <td>long double</td>
3894
  <td><code>r</code></td>
3895
  </tr>
3896
  <tr><td>signed short</td>
3897
  <td><code>s</code></td>
3898
  <td>void</td>
3899
  <td><code>v</code></td>
3900
  </tr>
3901
  <tr><td>unsigned short</td>
3902
  <td><code>Us</code></td>
3903
  <td>(bottom)</td>
3904
  <td><code>u</code></td>
3905
  </tr>
3906
  <tr><td>signed int</td>
3907
  <td><code>i</code></td>
3908
  <td>bool</td>
3909
  <td><code>b</code></td>
3910
  </tr>
3911
  <tr><td>unsigned int</td>
3912
  <td><code>Ui</code></td>
3913
  <td>ptrdiff_t</td>
3914
  <td><code>y</code></td>
3915
  </tr>
3916
  <tr><td>signed long</td>
3917
  <td><code>l</code></td>
3918
  <td>size_t</td>
3919
  <td><code>z</code></td>
3920
  </tr>
3921
  <tr><td>unsigned long</td>
3922
  <td><code>Ul</code></td>
3923
  <td>wchar_t</td>
3924
  <td><code>w</code></td>
3925
  </tr>
3926
  <tr><td>signed long long</td>
3927
  <td><code>x</code></td>
3928
  <td>-</td>
3929
  <td>-</td>
3930
  </tr>
3931
  <tr><td>unsigned long long</td>
3932
  <td><code>Ux</code></td>
3933
  <td>-</td>
3934
  <td>-</td>
3935
  </tr>
3936
  </table>
3937
 
3938
  <para>
3939
  Named types (classes, enumeration types etc.) can be represented by
3940
  the corresponding identifier or token application: 
3941
  <programlisting>
3942
  	<I>type-name</I> :
3943
  		<I>identifier</I>
3944
  		<I>token-application</I>
3945
  </programlisting>
3946
  <A id="composite">Composite and qualified types</A> are represented
3947
  in terms of their subtypes as follows: 
3948
  </para>
3949
 
3950
  <table>
3951
  <tr><th>Type</th>
3952
  <th>Encoding</th>
3953
  </tr>
3954
  <tr><td><code>const</code> type</td>
3955
  <td><code>C</code> <I>type</I></td>
3956
  </tr>
3957
  <tr><td><code>volatile</code> type</td>
3958
  <td><code>V</code> <I>type</I></td>
3959
  </tr>
3960
  <tr><td>pointer type</td>
3961
  <td><code>P</code> <I>type</I></td>
3962
  </tr>
3963
  <tr><td>reference type</td>
3964
  <td><code>R</code> <I>type</I></td>
3965
  </tr>
3966
  <tr><td>pointer to member type</td>
3967
  <td><code>M</code> <I>type-name</I> <code>:</code> <I>type</I></td>
3968
  </tr>
3969
  <tr><td>function type</td>
3970
  <td><code>F</code> <I>type parameter-types</I></td>
3971
  </tr>
3972
  <tr><td>array type</td>
3973
  <td><code>A</code> <I>nat<SUB>opt</SUB></I> <code>:</code> <I>type</I></td>
3974
  </tr>
3975
  <tr><td>bitfield type</td>
3976
  <td><code>B</code> <I>nat</I> <code>:</code> <I>type</I></td>
3977
  </tr>
3978
  <tr><td>template type</td>
3979
  <td><code>t</code> <I>parameter-list<SUB>opt</SUB></I> <code>:</code> <I>type</I></td>
3980
  </tr>
3981
  <tr><td>promotion type</td>
3982
  <td><code>p</code> <I>type</I></td>
3983
  </tr>
3984
  <tr><td>arithmetic type</td>
3985
  <td><code>a</code> <I>type</I> <code>:</code> <I>type</I></td>
3986
  </tr>
3987
  <tr><td>integer literal type</td>
3988
  <td><code>n</code> <I>lit-base<SUB>opt</SUB> lit-suffix<SUB>opt</SUB></I></td>
3989
  </tr>
3990
  <tr><td>weak function prototype (C only)</td>
3991
  <td><code>W</code> <I>type parameter-types</I></td>
3992
  </tr>
3993
  <tr><td>weak parameter type (C only)</td>
3994
  <td><code>q</code> <I>type</I></td>
3995
  </tr>
3996
  </table>
3997
 
3998
  <para>
3999
  Other types can be represented by their textual representation using
4000
  the form <code>Q</code> <I>string</I>, or by <code>*</code>, indicating
4001
  an unknown type. 
4002
  </para>
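  <para>
  As an illustration, the following Python sketch decodes a subset of
  these encodings: the built-in types, the <code>C</code>,
  <code>V</code>, <code>P</code>, <code>R</code> and <code>p</code>
  prefixes, and <code>*</code>.  It assumes that the symbols are written
  without intervening white space, as in the <code>pUs</code> example
  given later in this section; the function name
  <code>decode_type</code> is hypothetical.
  </para>

```python
# Built-in type encodings from the first table in this section.
BUILTIN = {
    "c": "char", "Sc": "signed char", "Uc": "unsigned char",
    "s": "signed short", "Us": "unsigned short",
    "i": "signed int", "Ui": "unsigned int",
    "l": "signed long", "Ul": "unsigned long",
    "x": "signed long long", "Ux": "unsigned long long",
    "f": "float", "d": "double", "r": "long double",
    "v": "void", "u": "(bottom)", "b": "bool",
    "y": "ptrdiff_t", "z": "size_t", "w": "wchar_t",
}

# Single-letter prefixes that wrap exactly one subtype.
PREFIX = {
    "C": "const",
    "V": "volatile",
    "P": "pointer to",
    "R": "reference to",
    "p": "promotion of",
}


def decode_type(text, pos=0):
    """Return (description, next_position) for the type starting at pos."""
    # Try two-character built-ins (Sc, Uc, ...) before one-character ones.
    for width in (2, 1):
        symbol = text[pos:pos + width]
        if len(symbol) == width and symbol in BUILTIN:
            return BUILTIN[symbol], pos + width
    head = text[pos:pos + 1]
    if head in PREFIX:
        inner, end = decode_type(text, pos + 1)
        return f"{PREFIX[head]} {inner}", end
    if head == "*":
        return "unknown type", pos + 1
    raise ValueError(f"unsupported type encoding at offset {pos}: {text!r}")
```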
4003
  <para>
4004
  The parameter types for a function type are represented as follows:
4005
  <programlisting>
4006
  	<I>parameter-types</I> :
4007
  		: <I>exception-spec<SUB>opt</SUB> func-qualifier<SUB>opt</SUB></I> :
4008
  		. <I>exception-spec<SUB>opt</SUB> func-qualifier<SUB>opt</SUB></I> :
4009
  		. <I>exception-spec<SUB>opt</SUB> func-qualifier<SUB>opt</SUB></I> .
4010
  		, <I>type parameter-types</I>
4011
  </programlisting>
4012
  where the <code>::</code> form indicates that there are no further
4013
  parameters, the <code>.:</code> form indicates that the parameters
4014
  are terminated by an ellipsis, and the <code>..</code> form indicates
4015
  that no information is available on the further parameters (this can
4016
  only happen with non-prototyped functions in C).  The function qualifiers
4017
  are given by: 
4018
  <programlisting>
4019
  	<I>func-qualifier</I> :
4020
  		C <I>func-qualifier<SUB>opt</SUB></I>
4021
  		V <I>func-qualifier<SUB>opt</SUB></I>
4022
  </programlisting>
4023
  representing <code>const</code> and <code>volatile</code> member functions.
4024
  The function exception specifier is given by: 
4025
  <programlisting>
4026
  	<I>exception-spec</I> :
4027
  		( <I>exception-list<SUB>opt</SUB></I> )
4028
 
4029
  	<I>exception-list</I> :
4030
  		<I>type</I>
4031
  		<I>type</I> , <I>exception-list</I>
4032
  </programlisting>
4033
  with an absent exception specifier, as in C++, indicating that any
4034
  exception may be thrown. 
4035
  </para>
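  <para>
  The three terminating forms can be told apart mechanically.  The Python
  sketch below handles a deliberately small subset of
  <I>parameter-types</I>: parameters restricted to three one-letter
  built-in types, no exception specifier, and no white space between
  symbols.  These restrictions, and the name
  <code>decode_parameters</code>, are assumptions made for illustration.
  </para>

```python
# A small subset of the built-in type encodings, for illustration only.
SIMPLE_TYPES = {"c": "char", "i": "signed int", "d": "double"}

# First and last character of the terminating form.
ENDINGS = {
    "::": "no further parameters",
    ".:": "ellipsis",
    "..": "no information on further parameters",
}


def decode_parameters(text):
    """Return (parameter types, ending, func-qualifiers) for a short string."""
    params = []
    pos = 0
    # Each ", type" pair contributes one parameter.
    while text.startswith(",", pos):
        symbol = text[pos + 1:pos + 2]
        if symbol not in SIMPLE_TYPES:
            raise ValueError(f"unsupported parameter type in {text!r}")
        params.append(SIMPLE_TYPES[symbol])
        pos += 2
    rest = text[pos:]
    ending = rest[:1] + rest[-1:]
    qualifiers = rest[1:-1]  # any mix of C (const) and V (volatile)
    if len(rest) == 1 or ending not in ENDINGS:
        raise ValueError(f"bad terminator in {text!r}")
    if not set(qualifiers).issubset("CV"):
        raise ValueError(f"bad function qualifier in {text!r}")
    return params, ENDINGS[ending], qualifiers
```

  <para>
  Under these assumptions the parameters of a function taking a
  <code>char</code> and a <code>double</code> would be written
  <code>,c,d::</code>, and those of a <code>const</code> member function
  with no parameters would be written <code>:C:</code>.
  </para>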
4036
  <para>
4037
  Array and bitfield sizes are represented as follows: 
4038
  <programlisting>
4039
  	<I>nat</I> :
4040
  		+ <I>number</I>
4041
  		- <I>number</I>
4042
  		<I>identifier</I>
4043
  		<I>token-application</I>
4044
  		<I>string</I>
4045
  </programlisting>
4046
  where a <I>string</I> is used to hold a textual representation of
4047
  complex values. 
4048
  </para>
4049
  <para>
4050
  Template types are represented by a list of template parameters, which
4051
  will have previously been declared using the <code>XT</code> identifier
4052
  key, followed by the underlying type expressed in terms of these parameters.
4053
  The parameters are represented as follows: 
4054
  <programlisting>
4055
  	<I>parameter-list</I> :
4056
  		<I>identifier</I>
4057
  		<I>identifier</I> , <I>parameter-list</I>
4058
  </programlisting>
4059
  </para>
4060
  <para>
4061
  Integer literal types are represented by the value of the literal
4062
  followed by a representation of the literal base and suffix.  These
4063
  are given by: 
4064
  <programlisting>
4065
  	<I>lit-base</I> :
4066
  		O
4067
  		X
4068
  </programlisting>
4069
  representing octal and hexadecimal literals respectively (decimal
4070
  is the default), and: 
4071
  <programlisting>
4072
  	<I>lit-suffix</I> :
4073
  		U
4074
  		l
4075
  		Ul
4076
  		x
4077
  		Ux
4078
  </programlisting>
4079
  representing the <code>U</code>, <code>L</code>, <code>UL</code>,
4080
  <code>LL</code> and <code>ULL</code> suffixes respectively. 
4081
  </para>
4082
  <para>
4083
  Target dependent integral promotion types are represented using 
4084
  <code>p</code>, so for example the promotion of <code>unsigned short</code>
4085
  is represented as <code>pUs</code>.  Information on the other cases,
4086
  where the promotion type is known, can be given in a command of the
4087
  form: 
4088
  <programlisting>
4089
  	<I>promotion-command</I> :
4090
  		P <I>type</I> : <I>type</I>
4091
  </programlisting>
4092
  Thus the fact that the promotion of <code>short</code> is <code>int</code>
4093
  would be expressed by the command <code>Ps:i</code>. 
4094
  </para>
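  <para>
  A reader could decode such a command as in the following Python sketch.
  It accepts only the built-in integral types on either side of the colon
  and assumes the compact spelling used in the <code>Ps:i</code> example;
  the function name <code>parse_promotion</code> is hypothetical.
  </para>

```python
# Encodings of the built-in integral types, from the table above.
INTEGRAL = {
    "c": "char", "Sc": "signed char", "Uc": "unsigned char",
    "s": "signed short", "Us": "unsigned short",
    "i": "signed int", "Ui": "unsigned int",
    "l": "signed long", "Ul": "unsigned long",
    "x": "signed long long", "Ux": "unsigned long long",
}


def parse_promotion(command):
    """Return (type, promoted type) for a command such as 'Ps:i'."""
    if not command.startswith("P"):
        raise ValueError(f"not a promotion command: {command!r}")
    source, colon, target = command[1:].partition(":")
    if not colon or source not in INTEGRAL or target not in INTEGRAL:
        raise ValueError(f"unsupported promotion command: {command!r}")
    return INTEGRAL[source], INTEGRAL[target]
```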
4095
  </sect3>  
4096
 
4097
  <sect3 id="sort">
4098
    <title>2.4.6. Sorts</title>
4099
  <para>
4100
  A <I>sort</I> in the symbol table dump corresponds to the sort of
4101
  a token declared in the <A HREF="token.html#spec"><code>#pragma token</code>
4102
  syntax</A>.  Expression tokens are represented as follows: 
4103
  <programlisting>
4104
  	<I>expression-sort</I> :
4105
  		ZEL <I>type</I>
4106
  		ZER <I>type</I>
4107
  		ZEC <I>type</I>
4108
  		ZN
4109
  </programlisting>
4110
  corresponding to <code>lvalue</code>, <code>rvalue</code> and 
4111
  <code>const</code> <code>EXP</code> tokens of the given type, and
4112
  <code>NAT</code> or <code>INTEGER</code> tokens, respectively. Statement
  tokens are represented by: 
4114
  <programlisting>
4115
  	<I>statement-sort</I> :
4116
  		ZS
4117
  </programlisting>
4118
  </para>
4119
  <para>
4120
  Type tokens are represented as follows: 
4121
  <programlisting>
4122
  	<I>type-sort</I> :
4123
  		ZTO
4124
  		ZTI
4125
  		ZTF
4126
  		ZTA
4127
  		ZTP
4128
  		ZTS
4129
  		ZTU
4130
  </programlisting>
4131
  corresponding to <code>TYPE</code>, <code>VARIETY</code>, <code>FLOAT</code>,
  <code>ARITHMETIC</code>, <code>SCALAR</code>, <code>STRUCT</code> or
  <code>CLASS</code>, and <code>UNION</code> tokens respectively.  There
4135
  are corresponding <code>TAG</code> forms: 
4136
  <programlisting>
4137
  	<I>tag-type-sort</I> :
4138
  		ZTTS
4139
  		ZTTU
4140
  </programlisting>
4141
  </para>
4142
  <para>
4143
  Member tokens are represented using: 
4144
  <programlisting>
4145
  	<I>member-sort</I> :
4146
  		ZM <I>type</I> : <I>type-name</I>
4147
  </programlisting>
4148
  where the first type gives the member type and the second gives the
4149
  parent structure or union type. 
4150
  </para>
4151
  <para>
4152
  Procedure tokens can be represented using: 
4153
  <programlisting>
4154
  	<I>proc-sort</I> :
4155
  		ZPG <I>parameter-list<SUB>opt</SUB></I> ; <I>parameter-list<SUB>opt</SUB></I> : <I>sort</I>
4156
  		ZPS <I>parameter-list<SUB>opt</SUB></I> : <I>sort</I>
4157
  </programlisting>
4158
  The first form corresponds to the more general form of <code>PROC</code>
4159
  token, that expressed using <code>{ .... | .... }</code>, which has
4160
  separate lists of bound and program parameters.  These token parameters
4161
  will have previously been declared using the <code>XP</code> identifier
4162
  key.  The second form corresponds to the case where the bound and
4163
  program parameter lists are equal, that expressed as a <code>PROC</code>
4164
  token using <code>( .... )</code>.  A more specialised version of
4165
  this second form is a <code>FUNC</code> token, which is represented
4166
  as: 
4167
  <programlisting>
4168
  	<I>func-sort</I> :
4169
  		ZF <I>type</I>
4170
  </programlisting>
4171
  </para>
4172
  <para>
4173
  As noted above, template parameters are represented by a <I>sort</I>.
4174
  Template type parameters are represented by <code>ZTO</code>, while
4175
  template expression parameters are represented by <code>ZEC</code>
4176
  (recall that such parameters are always constant expressions).  The
4177
  remaining case, template template parameters, can be represented as:
4178
  <programlisting>
4179
  	<I>template-sort</I> :
4180
  		ZTt <I>parameter-list<SUB>opt</SUB></I> :
4181
  </programlisting>
4182
  </para>
4183
  <para>
4184
  Finally, the number of parameters in a macro definition is represented
4185
  by a <I>sort</I> of the form: 
4186
  <programlisting>
4187
  	<I>macro-sort</I> :
4188
  		ZUO
4189
  		ZUF <I>number</I>
4190
  </programlisting>
4191
  corresponding to an object-like macro and a function-like macro with
4192
  the given number of parameters, respectively. 
4193
  </para>
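  <para>
  The simpler sorts described in this section can be recognised with a
  pair of lookup tables, as in the Python sketch below.  It covers the
  sorts that take no operand and those followed by a single <I>type</I>
  or <I>number</I>, returning the operand text undecoded; the member,
  procedure and template template sorts are left out.  The function name
  <code>describe_sort</code> is hypothetical.
  </para>

```python
# Sorts that stand alone.
FIXED_SORTS = {
    "ZN": "NAT or INTEGER token",
    "ZS": "statement token",
    "ZTO": "TYPE token",
    "ZTI": "VARIETY token",
    "ZTF": "FLOAT token",
    "ZTA": "ARITHMETIC token",
    "ZTP": "SCALAR token",
    "ZTS": "STRUCT or CLASS token",
    "ZTU": "UNION token",
    "ZTTS": "TAG form of STRUCT or CLASS token",
    "ZTTU": "TAG form of UNION token",
    "ZUO": "object-like macro",
}

# Sorts followed by one operand (a type, or a parameter count for ZUF).
PREFIXED_SORTS = {
    "ZEL": "lvalue EXP token",
    "ZER": "rvalue EXP token",
    "ZEC": "const EXP token",
    "ZUF": "function-like macro",
    "ZF": "FUNC token",
}


def describe_sort(text):
    """Return (meaning, operand text) for a supported sort encoding."""
    if text in FIXED_SORTS:
        return FIXED_SORTS[text], ""
    for prefix, meaning in PREFIXED_SORTS.items():
        if text.startswith(prefix) and len(text) > len(prefix):
            return meaning, text[len(prefix):].strip()
    raise ValueError(f"unsupported sort: {text!r}")
```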
4194
  </sect3>  
4195
 
4196
  <sect3 id="token-applications">
4197
    <title>2.4.7. Token applications</title>
4198
  <para>
4199
  Given an identifier representing a <code>PROC</code> token or a template,
4200
  an application of that token or an instance of that template can be
4201
  represented using: 
4202
  <programlisting>
4203
  	<I>token-application</I> :
4204
  		T <I>identifier</I> , <I>token-argument-list</I> :
4205
  </programlisting>
4206
  where the token or template arguments are given by: 
4207
  <programlisting>
4208
  	<I>token-argument-list</I> :
4209
  		<I>token-argument</I>
4210
  		<I>token-argument</I> , <I>token-argument-list</I>
4211
  </programlisting>
4212
  Note that the case where there are no arguments is generally just
4213
  represented by <I>identifier</I>; this case is specified separately
4214
  in the rest of the grammar. 
4215
  </para>
4216
  <para>
4217
  A <I>token-argument</I> can represent a value of any of the sorts
4218
  listed above: expressions, integer constants, statements, types, members,
4219
  functions and templates.  These are given respectively by: 
4220
  <programlisting>
4221
  	<I>token-argument</I> :
4222
  		E <I>expression</I>
4223
  		N <I>nat</I>
4224
  		S <I>statement</I>
4225
  		T <I>type</I>
4226
  		M <I>member</I>
4227
  		F <I>identifier</I>
4228
  		C <I>identifier</I>
4229
  </programlisting>
4230
  where: 
4231
  <programlisting>
4232
  	<I>expression</I> :
4233
  		<I>nat</I>
4234
 
4235
  	<I>statement</I> :
4236
  		<I>expression</I>
4237
 
4238
  	<I>member</I> :
4239
  		<I>identifier</I>
4240
  		<I>string</I>
4241
  </programlisting>
4242
  </para>
4243
  </sect3>  
4244
 
4245
  <sect3 id="error">
4246
    <title>2.4.8. Errors</title>
4247
  <para>
4248
  Each error in the C++ <A HREF="error.html">error catalogue</A> is
4249
  represented by a number.  These numbers happen to correspond to the
4250
  position of the error within the catalogue, but in general this need
4251
  not be the case.  The first use of each error introduces the error
4252
  number by associating it with a <I>string</I> giving the error name.
4253
  This has the form <code>cpp.</code><I>error</I> where <I>error</I>
4254
  gives an error name from the C++ (<code>cpp</code>) error catalogue.
4255
  Thus: 
4256
  <programlisting>
4257
  	<I>error-name</I> :
4258
  		<I>number</I> = <I>string</I>
4259
  		<I>number</I>
4260
  </programlisting>
4261
  </para>
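  <para>
  A reader therefore needs to remember the name introduced by the first
  use of each number.  A minimal Python sketch of such a table follows;
  the class name <code>ErrorNames</code> is hypothetical, and any error
  name passed to it in practice would come from the error catalogue.
  </para>

```python
class ErrorNames:
    """Remember the name bound to each error number on its first use."""

    def __init__(self):
        self._names = {}

    def resolve(self, number, name=None):
        """Handle 'number = string' (name given) or a bare 'number'."""
        if name is not None:
            self._names[number] = name
        # A bare number that was never introduced raises KeyError.
        return self._names[number]
```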
4262
  <para>
4263
  Each error message written to the symbol table dump has the form:
4264
  <programlisting>
4265
  	<I>error-command</I> :
4266
  		ES <I>location error-info</I>
4267
  		EW <I>location error-info</I>
4268
  		EI <I>location error-info</I>
4269
  		EF <I>location error-info</I>
4270
  		EC <I>error-info</I>
4271
  		EA <I>error-argument</I>
4272
  </programlisting>
4273
  denoting constraint errors, warnings, internal errors, fatal errors,
4274
  continuation errors and error arguments respectively.  Note that an
4275
  error message may consist of several components: the initial error
  plus a number of continuation errors.  Each error message may also
  have a number of error arguments associated with it.  This error information
4278
  is given by: 
4279
  <programlisting>
4280
  	<I>error-info</I> :
4281
  		<I>error-name number number</I>
4282
  </programlisting>
4283
  where the first <I>number</I> gives the number of error arguments
4284
  which should be read, and the second is nonzero to indicate that a
4285
  continuation error should be read. 
4286
  </para>
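  <para>
  The two numbers in <I>error-info</I> tell a reader how many further
  commands belong to the same message.  The Python sketch below groups
  pre-split commands accordingly.  It assumes that the <code>EA</code>
  commands for an error come directly after it and before any
  <code>EC</code> continuation, an ordering that the description above
  does not spell out.  Locations are omitted, and the function name
  <code>read_message</code> is hypothetical.
  </para>

```python
def read_message(records, start=0):
    """Collect one error message starting at records[start].

    Error records are (kind, name, arg_count, continued) tuples and
    argument records are ("EA", argument) tuples.
    Returns (parts, next_index), where each part is (kind, name, args).
    """
    parts = []
    pos = start
    continued = 1
    while continued != 0:
        kind, name, arg_count, continued = records[pos]
        # Only the first component may be ES, EW, EI or EF.
        allowed = ("EC",) if parts else ("ES", "EW", "EI", "EF")
        if kind not in allowed:
            raise ValueError(f"unexpected command {kind!r} at record {pos}")
        pos += 1
        arg_records = records[pos:pos + arg_count]
        if len(arg_records) != arg_count or any(r[0] != "EA" for r in arg_records):
            raise ValueError(f"missing error argument after record {pos - 1}")
        parts.append((kind, name, [r[1] for r in arg_records]))
        pos += arg_count
    return parts, pos
```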
4287
  <para>
4288
  Each error argument has one of the forms: 
4289
  <programlisting>
4290
  	<I>error-argument</I> :
4291
  		B <I>base-number</I>
4292
  		C <I>scope-identifier</I>
4293
  		E <I>expression</I>
4294
  		H <I>identifier-name</I>
4295
  		I <I>identifier</I>
4296
  		L <I>location</I>
4297
  		N <I>nat</I>
4298
  		S <I>string</I>
4299
  		T <I>type</I>
4300
  		V <I>number</I>
4301
  		V - <I>number</I>
4302
  </programlisting>
4303
  corresponding to the various syntactic categories described above.
4304
  Note that a <I>location</I> error argument, while expressed relative
4305
  to the 
4306
  <A HREF="#crt_loc">current file location</A>, does not change this
4307
  location. 
4308
  </para>
4309
  </sect3>  
4310
 
4311
  <sect3 id="file">
4312
    <title>2.4.9. File inclusions</title>
4313
  <para>
4314
  It is possible to include information on header files within the symbol
4315
  table dump.  Firstly a number is associated with each directory on
4316
  the <code>#include</code> search path: 
4317
  <programlisting>
4318
  	<I>path-command</I> :
4319
  		FD <I>number</I> = <I>string string<SUB>opt</SUB></I>
4320
  </programlisting>
4321
  The first <I>string</I> gives the directory pathname; the second,
4322
  if present, gives the associated directory name as specified in the
4323
  <A HREF="man.html#directory"><code>-N</code> command-line option</A>.
4324
  </para>
4325
  <para>
4326
  Now the start and end of each file are marked using: 
4327
  <programlisting>
4328
  	<I>file-command</I> :
4329
  		FS <I>location directory</I>
4330
  		FE <I>location</I>
4331
  </programlisting>
4332
  where <I>directory</I> gives the number of the directory in the search
4333
  path where the file was found, or <code>*</code> if the file was found
4334
  by other means.  It is worth noting that if, for example, a function
4335
  definition is the last item in a file, the <code>FE</code> command
4336
  will appear in the symbol table dump before the <code>QFE</code> command
4337
  for the end of the function definition.  This is because lexical analysis,
4338
  where the end of file is detected, takes place before parsing, where
4339
  the end of function is detected. 
4340
  </para>
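  <para>
  The <code>FD</code> commands amount to a numbered table which later
  <code>FS</code> commands index.  A minimal Python sketch, using the
  hypothetical helper names <code>add_directory</code> and
  <code>file_directory</code> and working on fields that have already
  been split out of the dump file:
  </para>

```python
def add_directory(table, number, pathname, name=None):
    """Record an FD command: the pathname plus the optional -N name."""
    table[number] = (pathname, name)


def file_directory(table, directory):
    """Resolve the directory field of an FS command to a pathname.

    Returns None for '*', meaning the file was found by other means.
    """
    if directory == "*":
        return None
    return table[int(directory)][0]
```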
4341
  <para>
4342
  A <code>#include</code> directive, whether explicit or implicit, can
4343
  be represented using: 
4344
  <programlisting>
4345
  	<I>include-command</I> :
4346
  		FIA <I>location string</I>
4347
  		FIQ <I>location string</I>
4348
  		FIN <I>location string</I>
4349
  		FIS <I>location string</I>
4350
  		FIE <I>location string</I>
4351
  		FIR <I>location</I>
4352
  </programlisting>
4353
  the first three corresponding to header names of the forms 
4354
  <code>&lt;....&gt;</code>, <code>&quot;....&quot;</code> and <code>[....]</code>
4355
  respectively, the next two corresponding to
  <A HREF="man.html#start-up">start-up</A> and
  <A HREF="man.html#end-up">end-up</A> files, and the final form
4358
  being used to resume the original file after the <code>#include</code>
4359
  directive has been processed. 
4360
  </para>
4361
  </sect3>  
4362
 
4363
  <sect3 id="string-literals">
4364
    <title>2.4.10. String literals</title>
4365
  <para>
4366
  It is possible to dump information on string literals to the symbol
4367
  table dump file using the commands: 
4368
  <programlisting>
4369
  	<I>string-command</I> :
4370
  		A <I>location string</I>
4371
  		AC <I>location string</I>
4372
  		AL <I>location string</I>
4373
  		ACL <I>location string</I>
4374
  </programlisting>
4375
  representing string literals, character literals, wide string literals
4376
  and wide character literals respectively.  The given <I>string</I>
4377
  gives the string text. 
4378
  </para>
4379
  </sect3>
4380
  </sect2>
4381
 
4382
  <sect2>
4383
    <title>2.5. Intermodule analysis</title>
4384
  <para>
4385
  <IMG SRC="../images/warn.gif" ALT="warning"/>
4386
  The C++ spec linking routines have not yet been completely implemented,
4387
  and so are disabled in the current version of the C++ producer. 
4388
  </para>
4389
  <para>
4390
  A C++ spec file is a dump of the C++ producer's <A HREF="alg.html">internal
4391
  representation</A> of a translation unit.  Such files can be written
4392
  to, and read from, disk to perform such operations as intermodule
4393
  analysis. 
4394
  </para>
4395
  <para>
4396
  Note that the format of a C++ spec file is specific to the C++ producer
4397
  and may change between releases to reflect modifications in the internal
4398
  type system.  The C producer has a similar dump format, called a C
4399
  spec file; however, the two are incompatible.  If intermodule analysis
4400
  between C and C++ source files is required then the <A HREF="dump.html">symbol
4401
  table dump</A> format should be used. 
4402
  </para>
4403
  </sect2>
4404
 
4405
  <sect2>
4406
    <title>2.6. Implementation details</title>
4407
  <para>
4408
  This section describes various implementation details of the
4409
  C++ producer TDF output.  In particular it describes the standard
4410
  TDF tokens used to represent the target dependent aspects of the language
4411
  and to provide links into the run-time system.  Many of these tokens
4412
  are common to the C and C++ producers.  Those which are unique to
4413
  the C++ producer have names of the form <code>~cpp.*</code>.  Note
4414
  that the description is in terms of TDF tokens, not the internal tokens
4415
  introduced by the 
4416
  <A HREF="token.html"><code>#pragma token</code> syntax</A>. 
4417
  </para>
4418
  <para>
4419
  There are two levels of implementation in the run-time system.  The
4420
  actual interface between the producer and the run-time system is given
4421
  by the standard tokens.  The provided implementation defines these
4422
  tokens in a way appropriate to itself.  An alternative implementation
4423
  would have to define the tokens differently.  It is intended that
4424
  the standard tokens are sufficiently generic to allow a variety of
4425
  implementations to hook into the producer output in the manner they
4426
  require. 
4427
  </para>
4428
 
4429
 
4430
  <sect3 id="arith">
4431
    <title>2.6.1. Arithmetic types</title>
4432
  <para>
4433
  The representations of the basic arithmetic types are target dependent,
4434
  so, for example, an <code>int</code> may contain 16, 32, 64 or some
4435
  other number of bits.  Thus it is necessary to introduce a token to
4436
  stand for each of the built-in arithmetic types (including the 
4437
  <A HREF="pragma.html#longlong"><code>long long</code> types</A>).
4438
  Each integral type is represented by a <code>VARIETY</code> token
4439
  as follows: </para>
4440
 
4441
  <table>
4442
  <tr><th>Type</th>
4443
  <th>Token</th>
4444
  <th>Encoding</th>
4445
  </tr>
4446
  <tr><td>char</td>
4447
  <td>~char</td>
4448
  <td>0</td>
4449
  </tr>
4450
  <tr><td>signed char</td>
4451
  <td>~signed_char</td>
4452
  <td>0 | 4 = 4</td>
4453
  </tr>
4454
  <tr><td>unsigned char</td>
4455
  <td>~unsigned_char</td>
4456
  <td>0 | 8 = 8</td>
4457
  </tr>
4458
  <tr><td>signed short</td>
4459
  <td>~signed_short</td>
4460
  <td>1 | 4 = 5</td>
4461
  </tr>
4462
  <tr><td>unsigned short</td>
4463
  <td>~unsigned_short</td>
4464
  <td>1 | 8 = 9</td>
4465
  </tr>
4466
  <tr><td>signed int</td>
4467
  <td>~signed_int</td>
4468
  <td>2 | 4 = 6</td>
4469
  </tr>
4470
  <tr><td>unsigned int</td>
4471
  <td>~unsigned_int</td>
4472
  <td>2 | 8 = 10</td>
4473
  </tr>
4474
  <tr><td>signed long</td>
4475
  <td>~signed_long</td>
4476
  <td>3 | 4 = 7</td>
4477
  </tr>
4478
  <tr><td>unsigned long</td>
4479
  <td>~unsigned_long</td>
4480
  <td>3 | 8 = 11</td>
4481
  </tr>
4482
  <tr><td>signed long long</td>
4483
  <td>~signed_longlong</td>
4484
  <td>3 | 4 | 16 = 23</td>
4485
  </tr>
4486
  <tr><td>unsigned long long</td>
4487
  <td>~unsigned_longlong</td>
4488
  <td>3 | 8 | 16 = 27</td>
4489
  </tr>
4490
  </table>
4491
 
4492
  <para>
4493
  Similarly each floating point type is represented by a 
4494
  <code>FLOATING_VARIETY</code> token: 
4495
  </para>
4496
 
4497
  <table>
4498
  <tr><th>Type</th>   <th>Token</th>
4499
  </tr>
4500
  <tr><td>float</td>  <td>~float</td>
4501
  </tr>
4502
  <tr><td>double</td> <td>~double</td>
4503
  </tr>
4504
  <tr><td>long double</td> <td>~long_double</td>
4505
  </tr>
4506
  </table>
4507
 
4508
  <para>
4509
  Each integral type also has an encoding as a <code>SIGNED_NAT</code>
4510
  as shown above.  This number is a bit pattern built up from the following
4511
  values: 
4512
  </para>
4513
 
4514
  <table>
4515
  <tr><th>Type</th>   <th>Encoding</th>
4516
  </tr>
4517
  <tr><td>char</td>  <td>0</td>
4518
  </tr>
4519
  <tr><td>short</td>  <td>1</td>
4520
  </tr>
4521
  <tr><td>int</td>  <td>2</td>
4522
  </tr>
4523
  <tr><td>long</td>  <td>3</td>
4524
  </tr>
4525
  <tr><td>signed</td> <td>4</td>
4526
  </tr>
4527
  <tr><td>unsigned</td> <td>8</td>
4528
  </tr>
4529
  <tr><td>long long</td> <td>16</td>
4530
  </tr>
4531
  </table>
4532
 
4533
  <para>
4534
  Any target dependent integral type can be represented by a 
4535
  <code>SIGNED_NAT</code> token using this encoding.  This representation,
4536
  rather than one based on <code>VARIETY</code>s, is used for ease of
4537
  manipulation.  The token: 
4538
  <programlisting>
4539
  	~convert : ( SIGNED_NAT ) -&gt; VARIETY
4540
  </programlisting>
4541
  gives the mapping from the integral encoding to the representing variety.
4542
  For example, it will map <code>6</code> to <code>~signed_int</code>.
4543
  </para>
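  <para>
  The bit pattern and the mapping performed by <code>~convert</code> can
  be modelled in a few lines of Python.  This is only an illustration of
  the tables above, not the TDF definition of the token; the function
  names <code>encode</code> and <code>convert</code> are hypothetical.
  </para>

```python
# Component values from the table above.
BASE = {"char": 0, "short": 1, "int": 2, "long": 3}
SIGNED, UNSIGNED, LONG_LONG = 4, 8, 16

# Encoding to VARIETY token name, from the first table in this section.
VARIETY_TOKENS = {
    0: "~char", 4: "~signed_char", 8: "~unsigned_char",
    5: "~signed_short", 9: "~unsigned_short",
    6: "~signed_int", 10: "~unsigned_int",
    7: "~signed_long", 11: "~unsigned_long",
    23: "~signed_longlong", 27: "~unsigned_longlong",
}


def encode(base, sign=None, long_long=False):
    """Build the SIGNED_NAT encoding of an integral type."""
    code = BASE[base]
    if sign == "signed":
        code |= SIGNED
    elif sign == "unsigned":
        code |= UNSIGNED
    if long_long:
        # long long is encoded as long with the extra bit 16 set.
        code |= LONG_LONG
    return code


def convert(code):
    """Model of ~convert: map an encoding to its VARIETY token name."""
    return VARIETY_TOKENS[code]
```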
4544
  <para>
4545
  The token: 
4546
  <programlisting>
4547
  	~promote : ( SIGNED_NAT ) -&gt; SIGNED_NAT
4548
  </programlisting>
4549
  describes how to form the promotion of an integral type according
4550
  to the ISO C/C++ value preserving rules, and is used by the producer
4551
  to represent target dependent promotion types.  For example, the promotion
4552
  of <code>unsigned short</code> may be <code>int</code> or <code>unsigned
4553
  int</code> depending on the representation of these types; that is
4554
  to say, <code>~promote ( 9 )</code> will be <code>6</code> on some
4555
  machines and <code>10</code> on others.  Although <code>~promote</code>
4556
  is used by default, a program may specify another token with the same
4557
  sort signature to be used in its place by means of the directive:
4558
  <programlisting>
4559
  	#pragma TenDRA compute promote <I>identifier</I>
4560
  </programlisting>
4561
  For example, a standard token <code>~sign_promote</code> is defined
4562
  which gives the older C sign preserving promotion rules.  In addition,
4563
  the promotion of an individual type can be specified using: 
4564
  <programlisting>
4565
  	#pragma TenDRA promoted <I>type-id</I> : <I>promoted-type-id</I>
4566
  </programlisting>
4567
  </para>
4568
  <para>
4569
  The token: 
4570
  <programlisting>
4571
  	~arith_type : ( SIGNED_NAT, SIGNED_NAT ) -&gt; SIGNED_NAT
4572
  </programlisting>
4573
  similarly describes how to form the usual arithmetic result type from
4574
  two promoted integral operand types.  For example, the arithmetic
4575
  type of <code>long</code> and <code>unsigned int</code> may be 
4576
  <code>long</code> or <code>unsigned long</code> depending on the representation
4577
  of these types; that is to say, 
4578
  <code>~arith_type ( 7, 10 )</code> will be <code>7</code> on some
4579
  machines and <code>11</code> on others. 
4580
  </para>
4581
  <para>
4582
  Any tokenised type declared using: 
4583
  <programlisting>
4584
  	#pragma token VARIETY v # tv
4585
  </programlisting>
4586
  will be represented by a <code>SIGNED_NAT</code> token with external
4587
  name 
4588
  <code>tv</code> corresponding to the encoding of <code>v</code>. 
4589
  Special cases of this are the implementation dependent integral types
4590
  which arise naturally within the language.  The external token names
4591
  for these types are given below: 
4592
  </para>
4593
 
4594
  <table>
4595
  <tr><th>Type</th>   <th>Token</th>
4596
  </tr>
4597
  <tr><td>bool</td>  <td>~cpp.bool</td>
4598
  </tr>
4599
  <tr><td>ptrdiff_t</td> <td>ptrdiff_t</td>
4600
  </tr>
4601
  <tr><td>size_t</td> <td>size_t</td>
4602
  </tr>
4603
  <tr><td>wchar_t</td> <td>wchar_t</td>
4604
  </tr>
4605
  </table>
4606
 
4607
  <para>
4608
  So, for example, a <code>sizeof</code> expression has shape 
4609
  <code>~convert ( size_t )</code>.  The token <code>~cpp.bool</code>
4610
  is defined in the default implementation, but the other tokens are
4611
  defined according to their definitions on the target machine in the
4612
  normal API library building mechanism. 
4613
  </para>
4614
  </sect3>  
4615
 
4616
  <sect3 id="literal">
4617
    <title>2.6.2. Integer literal types</title>
4618
  <para>
4619
  The <A HREF="pragma.html#int">type of an integer literal</A> is defined
4620
  in terms of the first in a list of possible integral types.  The first
4621
  type in which the literal value can be represented gives the type
4622
  of the literal.  For small literals it is possible to work out the
  type exactly; for larger literals, however, the result is target dependent.
4624
  For example, the literal <code>50000</code> will have type <code>int</code>
4625
  on machines in which <code>50000</code> fits into an <code>int</code>,
4626
  and 
4627
  <code>long</code> otherwise.  This target dependent mapping is given
4628
  by a series of tokens of the form: 
4629
  <programlisting>
4630
  	~lit_* : ( SIGNED_NAT ) -&gt; SIGNED_NAT
4631
  </programlisting>
4632
  which map a literal value to the representation of an integral type.
4633
  The token used depends on the list of possible types, which in turn
4634
  depends on the base used to represent the literal and the integer
4635
  suffix used, as given in the following table: 
4636
  </para>
4637
 
4638
  <table>
  <tr><th>Base</th> <th>Suffix</th> <th>Token</th> <th>Types</th>
  </tr>
  <tr><td>decimal</td> <td>none</td> <td>~lit_int</td> <td>int, long, unsigned long</td>
  </tr>
  <tr><td>octal</td> <td>none</td> <td>~lit_hex</td> <td>int, unsigned int, long, unsigned long</td>
  </tr>
  <tr><td>hexadecimal</td> <td>none</td> <td>~lit_hex</td> <td>int, unsigned int, long, unsigned long</td>
  </tr>
  <tr><td>any</td> <td>U</td> <td>~lit_unsigned</td> <td>unsigned int, unsigned long</td>
  </tr>
  <tr><td>any</td> <td>L</td> <td>~lit_long</td> <td>long, unsigned long</td>
  </tr>
  <tr><td>any</td> <td>UL</td> <td>~lit_ulong</td> <td>unsigned long</td>
  </tr>
  <tr><td>any</td> <td>LL</td> <td>~lit_longlong</td> <td>long long, unsigned long long</td>
  </tr>
  <tr><td>any</td> <td>ULL</td> <td>~lit_ulonglong</td> <td>unsigned long long</td>
  </tr>
  </table>
4685
 
4686
  <para>
4687
  Thus, for example, the shape of the integer literal 50000 is: 
4688
  <programlisting>
4689
  	~convert ( ~lit_int ( 50000 ) )
4690
  </programlisting>
4691
  </para>
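  <para>
  The selection performed by a token such as <code>~lit_int</code> amounts
  to taking the first type in its list that can hold the literal value.
  A minimal sketch on a host compiler, using a hypothetical
  <code>LitType</code> enumeration in place of the integral type encodings
  that the real token returns:
  </para>

```cpp
#include <cassert>
#include <climits>

// Hypothetical stand-ins for the integral type encodings.
enum class LitType { Int, Long, UnsignedLong };

// First type in the list int, long, unsigned long that can hold the value.
// Values too large for unsigned long are not handled by this sketch.
constexpr LitType lit_int(unsigned long long value) {
    return value <= static_cast<unsigned long long>(INT_MAX)    ? LitType::Int
           : value <= static_cast<unsigned long long>(LONG_MAX) ? LitType::Long
                                                                : LitType::UnsignedLong;
}

// Run-time form of a basic check: zero always fits in int.
inline void check_lit_int() {
    assert(lit_int(0) == LitType::Int);
}
```

  <para>
  With this sketch, <code>lit_int ( 50000 )</code> yields
  <code>LitType::Int</code> where <code>int</code> is wide enough and
  <code>LitType::Long</code> otherwise, mirroring the example in the text.
  </para>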
4692
  </sect3>  
4693
 
4694
  <sect3 id="bitfield">
4695
    <title>2.6.3. Bitfield types</title>
4696
  <para>
4697
  The sign of a plain bitfield type, declared without using 
4698
  <code>signed</code> or <code>unsigned</code>, is left unspecified
4699
  in C and C++.  The token: 
4700
  <programlisting>
4701
  	~cpp.bitf_sign : ( SIGNED_NAT ) -&gt; BOOL
4702
  </programlisting>
4703
  is used to give a mapping from integral types to the sign of a plain
4704
  bitfield of that type, in a form suitable for use in the TDF 
4705
  <code>bfvar_bits</code> construct.  (Note that <code>~cpp.bitf_sign</code>
4706
  should have been a standard C token but was omitted.) 
4707
  </para>
4708
  </sect3>  
4709
 
4710
  <sect3 id="pointer">
4711
    <title>2.6.4. Generic pointers</title>
4712
  <para>
4713
  TDF has no concept of a generic pointer type, so tokens are used to
4714
  defer the representation of <code>void *</code> and the basic operations
4715
  on it to the target machine.  The fundamental token is: 
4716
  <programlisting>
4717
  	~ptr_void : () -&gt; SHAPE
4718
  </programlisting>
4719
  which gives the representation of <code>void *</code>.  This shape
4720
  will be denoted by <code>pv</code> in the description of the following
4721
  tokens.  It is not guaranteed that <code>pv</code> is a TDF <code>pointer</code>
4722
  shape, although normally it will be implemented as a pointer to a
4723
  suitable alignment. 
4724
  </para>
4725
  <para>
4726
  The token: 
4727
  <programlisting>
4728
  	~null_pv : () -&gt; EXP pv
4729
  </programlisting>
4730
  gives the value of a null pointer of type <code>void *</code>.  Generic
4731
  pointers can also be converted to and from other pointers.  These
4732
  conversions are represented by the tokens: 
4733
  <programlisting>
4734
  	~to_ptr_void : ( ALIGNMENT a, EXP POINTER a ) -&gt; EXP pv
4735
  	~from_ptr_void : ( ALIGNMENT a, EXP pv ) -&gt; EXP POINTER a
4736
  </programlisting>
4737
  where the given alignment describes the destination or source pointer
4738
  type.  Finally a generic pointer may be tested against the null pointer
4739
  or two generic pointers may be compared.  These operations are represented
4740
  by the tokens: 
4741
  <programlisting>
4742
  	~pv_test : ( EXP pv, LABEL, NTEST ) -&gt; EXP TOP
4743
  	~cpp.pv_compare : ( EXP pv, EXP pv, LABEL, NTEST ) -&gt; EXP TOP
4744
  </programlisting>
4745
  where the given <code>NTEST</code> gives the comparison to be applied
4746
  and the given label gives the destination to jump to if the test fails.
4747
  (Note that <code>~cpp.pv_compare</code> should have been a standard
4748
  C token but was omitted.) 
4749
  </para>
4750
  </sect3>  
4751
 
4752
  <sect3 id="undefined-conversions">
4753
    <title>2.6.5. Undefined conversions</title>
4754
  <para>
4755
  Several conversions in C and C++ can only be represented by undefined
4756
  TDF.  For example, converting a pointer to an integer can only be
4757
  represented in TDF by forming a union of the pointer and integer shapes,
4758
  putting the pointer into the union and pulling the integer out.  Such
4759
  conversions are tokenised.  Undefined conversions not mentioned below
4760
  may be performed by combining those given with the standard, well-defined,
4761
  conversions. 
4762
  </para>
4763
  <para>
4764
  The token: 
4765
  <programlisting>
4766
  	~ptr_to_ptr : ( ALIGNMENT a, ALIGNMENT b, EXP POINTER a ) -&gt; EXP POINTER b
4767
  </programlisting>
4768
  is used to convert between two incompatible pointer types.  The first
4769
  alignment describes the source pointer shape while the second describes
4770
  the destination pointer shape.  Note that if the destination alignment
4771
  is greater than the source alignment then the source pointer can be
4772
  used in most TDF constructs in place of the destination pointer, so
4773
  the use of <code>~ptr_to_ptr</code> can be omitted (the exception
4774
  is 
4775
  <code>pointer_test</code> which requires equal alignments).  Base
4776
  class pointer conversions are examples of these well-behaved, alignment
4777
  preserving conversions. 
4778
  </para>
4779
  <para>
4780
  The tokens: 
4781
  <programlisting>
4782
  	~f_to_pv : ( EXP PROC ) -&gt; EXP pv
4783
  	~pv_to_f : ( EXP pv ) -&gt; EXP PROC
4784
  </programlisting>
4785
  are used to convert pointers to functions to and from <code>void *</code>
4786
  (these conversions are not allowed in ISO C/C++ but are in older dialects).
4787
  </para>
4788
  <para>
4789
  The tokens: 
4790
  <programlisting>
4791
  	~i_to_p : ( VARIETY v, ALIGNMENT a, EXP INTEGER v ) -&gt; EXP POINTER a
4792
  	~p_to_i : ( ALIGNMENT a, VARIETY v, EXP POINTER a ) -&gt; EXP INTEGER v
4793
  	~i_to_pv : ( VARIETY v, EXP INTEGER v ) -&gt; EXP pv
4794
  	~pv_to_i : ( VARIETY v, EXP pv ) -&gt; EXP INTEGER v
4795
  </programlisting>
4796
  are used to convert integers to and from <code>void *</code> and other
4797
  pointers. 
4798
  </para>
4799
  </sect3>  
4800
 
4801
  <sect3 id="div">
4802
    <title>2.6.6. Integer division</title>
4803
  <para>
4804
  The precise form of the integer division and remainder operations
4805
  in C and C++ is left unspecified with respect to the sign of the result
4806
  if either operand is negative.  The tokens: 
4807
  <programlisting>
4808
  	~div : ( EXP INTEGER v, EXP INTEGER v ) -&gt; EXP INTEGER v
4809
  	~rem : ( EXP INTEGER v, EXP INTEGER v ) -&gt; EXP INTEGER v
4810
  </programlisting>
4811
  are used to represent integer division and remainder.  They will map
4812
  onto one of the pairs of TDF constructs, <code>div0</code> and <code>rem0</code>,
4813
  <code>div1</code> and <code>rem1</code> or <code>div2</code> and 
4814
  <code>rem2</code>. 
4815
  </para>
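  <para>
  The difference between the candidate constructs only shows when an
  operand is negative.  The sketch below assumes the usual reading of the
  TDF specification, in which <code>div1</code> and <code>rem1</code> round
  towards negative infinity and <code>div2</code> and <code>rem2</code>
  round towards zero (<code>div0</code> and <code>rem0</code> permit
  either).  The function names are hypothetical, and the code relies on
  the C++11 rule that <code>/</code> truncates towards zero.
  </para>

```cpp
#include <cassert>

// Round towards zero; the remainder takes the sign of the dividend.
constexpr int trunc_div(int a, int b) { return a / b; }
constexpr int trunc_rem(int a, int b) { return a % b; }

// Round towards negative infinity; the remainder takes the sign of the divisor.
constexpr int floor_div(int a, int b) {
    return a / b - ((a % b != 0 && ((a % b < 0) != (b < 0))) ? 1 : 0);
}
constexpr int floor_rem(int a, int b) { return a - b * floor_div(a, b); }

static_assert(trunc_div(-7, 2) == -3 && trunc_rem(-7, 2) == -1, "truncating pair");
static_assert(floor_div(-7, 2) == -4 && floor_rem(-7, 2) == 1, "flooring pair");

// The two pairs agree whenever both operands are non-negative.
inline void check_division() {
    assert(trunc_div(7, 2) == floor_div(7, 2));
    assert(trunc_rem(7, 2) == floor_rem(7, 2));
}
```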
4816
  </sect3>  
4817
 
4818
  <sect3 id="call">
4819
    <title>2.6.7. Calling conventions</title>
4820
  <para>
4821
  The function calling conventions used by the C++ producer are essentially
4822
  the same as those used by the C producer with one exception.  That
4823
  is to say, all types except arrays are passed by value (note that
4824
  individual installers may modify these conventions to conform to their
4825
  own ABIs). 
4826
  </para>
4827
  <para>
4828
  The exception concerns classes with a non-trivial constructor, destructor
4829
  or assignment operator.  These classes are passed as function arguments
4830
  by taking a reference to a copy of the object (although it is often
4831
  possible to eliminate the copy and pass a reference to the object
4832
  directly).  They are passed as function return values by adding an
4833
  extra parameter to the start of the function parameters giving a reference
4834
  to a location into which the return value should be copied. 
4835
  </para>
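  <para>
  The convention for such classes can be pictured by writing the lowered
  form by hand.  In the sketch below <code>bump</code> is an ordinary
  source-level function and <code>bump_lowered</code> is a hypothetical
  hand-written equivalent; for simplicity it assigns into an existing
  object, whereas a real implementation would construct the result in the
  location supplied.
  </para>

```cpp
#include <cassert>

struct Counter {
    int value;
    explicit Counter(int v) : value(v) {}
    Counter(const Counter &other) : value(other.value) {}  // non-trivial copy constructor
    ~Counter() {}                                           // non-trivial destructor
};

// Source-level function taking and returning the class by value.
Counter bump(Counter c) { return Counter(c.value + 1); }

// Hand-lowered form: the extra first parameter refers to the location for
// the return value, and the argument is a reference to a copy.
void bump_lowered(Counter &result, Counter &arg_copy) {
    result = Counter(arg_copy.value + 1);
}

// What a caller of the lowered form has to do.
int call_bump_lowered(const Counter &original) {
    Counter copy(original);   // the caller makes the copy of the argument
    Counter result(0);        // the caller provides the result location
    bump_lowered(result, copy);
    return result.value;
}
```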
4836
 
4837
  <H4>Member functions</H4>
4838
  <para>
4839
  Non-static member functions are implemented in the obvious fashion,
4840
  by passing a pointer to the object the method is being applied to
4841
  as the first argument (or the second argument if the method has an
4842
  extra argument for its return value). 
4843
  </para>
4844
 
4845
  <H4><A id="ellipsis">Ellipsis functions</A></H4>
4846
  <para>
4847
  Calls to functions declared with ellipses are via the 
4848
  <code>apply_proc</code> TDF construct, with all the arguments being
4849
  treated as non-variable.  However the definition of such a function
4850
  uses the <code>make_proc</code> construct with a variable parameter.
4851
  This parameter can be referred to within the program using the 
4852
  <A HREF="pragma.html#ellipsis"><code>...</code> expression</A>.  The
4853
  type of this expression is given by the built-in token: 
4854
  <programlisting>
4855
  	~__va_t : () -&gt; SHAPE
4856
  </programlisting>
4857
  The <code>va_start</code> macro declared in the 
4858
  <code>&lt;stdarg.h&gt;</code> header then describes how the variable
4859
  parameter (expressed as <code>...</code>) can be converted to an expression
4860
  of type <code>va_list</code> suitable for use in the 
4861
  <code>va_arg</code> macro. 
4862
  </para>
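  <para>
  For reference, the source-level view of such a function uses only the
  standard <code>&lt;stdarg.h&gt;</code> macros.  The function
  <code>sum</code> below is a hypothetical example, not part of the
  producer:
  </para>

```cpp
#include <cassert>
#include <cstdarg>

// Adds up 'count' int arguments passed through the ellipsis.
int sum(int count, ...) {
    va_list args;
    va_start(args, count);            // locate the first optional argument
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);   // fetch the next optional argument
    }
    va_end(args);
    return total;
}
```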
4863
  <para>
4864
  Note that the variable parameter is in effect only being used to determine
  where the first optional parameter is defined.  The assumption is
  that all such parameters are located contiguously on the stack.  However,
  because calls to such functions do not use the variable parameter
  mechanism, this is not automatically the case.  Strictly
  speaking, this means that the implementation of ellipsis functions
  relies on undefined behaviour in TDF; given the non-type-safe function
  calling rules in C this is unavoidable, and installers need to make
  provision for such calls (by dumping any parameters from registers
  to the stack if necessary).  Given the theoretically type-safe nature
  of C++ it would be possible to avoid such undefined behaviour, but
  the need for C-compatible calling conventions prevents this. 
4876
  </para>
4877
  </sect3>  
4878
 
4879
  <sect3 id="ptr_mem">
4880
    <title>2.6.8. Pointers to data members</title>
4881
  <para>
4882
  The representation of, and operations on, pointers to data members
4883
  are represented by tokens to allow for a variety of implementations.
4884
  It is assumed that all pointers to data members (as opposed to pointers
4885
  to function members) are represented by the same shape: 
4886
  <programlisting>
4887
  	~cpp.pm.type : () -&gt; SHAPE
4888
  </programlisting>
4889
  This shape will be denoted by <code>pm</code> in the description of
4890
  the following tokens. 
4891
  </para>
4892
  <para>
4893
  There are two basic methods of constructing a pointer to a data member.
4894
  The first is to take the address of a data member of a class.  A data
4895
  member is represented in TDF by an expression which gives the offset
4896
  of the member from the start of its enclosing <code>compound</code>
4897
  shape (note that it is not possible to take the address of a member
4898
  of a virtual base). The mapping from this offset to a pointer to a
4899
  data member is given by: 
4900
  <programlisting>
4901
  	~cpp.pm.make : ( EXP OFFSET ) -&gt; EXP pm
4902
  </programlisting>
4903
  The second way of constructing a pointer to a data member is to use
4904
  a null pointer to member: 
4905
  <programlisting>
4906
  	~cpp.pm.null : () -&gt; EXP pm
4907
  </programlisting>
4908
  The other fundamental operation on a pointer to data member is to
4909
  turn it back into an offset expression which can be added to a pointer
4910
  to a class to access a member of that class in a <code>.*</code> or
4911
  <code>-&gt;*</code>
4912
  operation.  This is done by the token: 
4913
  <programlisting>
4914
  	~cpp.pm.offset : ( EXP pm, ALIGNMENT a ) -&gt; EXP OFFSET ( a, a )
4915
  </programlisting>
4916
  Note that it is necessary to specify an alignment in order to describe
4917
  the shape of the result.  The value of this token is undefined if
4918
  the given expression is a null pointer to data member. 
4919
  </para>
4920
  <para>
4921
  A pointer to a data member of a non-virtual base class can be converted
4922
  to a pointer to a data member of a derived class.  The reverse conversion
4923
  is also possible using <code>static_cast</code>.  If the base is a
4924
  <A HREF="#primary">primary base class</A> then these conversions are
4925
  trivial and have no effect.  Otherwise null pointers to data members
4926
  are converted to null pointers to data members, and the non-null cases
4927
  are handled by the tokens: 
4928
  <programlisting>
4929
  	~cpp.pm.cast : ( EXP pm, EXP OFFSET ) -&gt; EXP pm
4930
  	~cpp.pm.uncast : ( EXP pm, EXP OFFSET ) -&gt; EXP pm
4931
  </programlisting>
4932
  where the given offset is the offset of the base class within the
4933
  derived class.  It is also possible to convert between any two pointers
4934
  to data members using <code>reinterpret_cast</code>.  This conversion
4935
  is implied by the equality of representation between any two pointers
4936
  to data members and has no effect. 
4937
  </para>
4938
  <para>
4939
  The only remaining operations on pointers to data members are to test
  one against the null pointer to data member and to compare two pointers
  to data members.  These are represented by the tokens: 
4942
  <programlisting>
4943
  	~cpp.pm.test : ( EXP pm, LABEL, NTEST ) -&gt; EXP TOP
4944
  	~cpp.pm.compare : ( EXP pm, EXP pm, LABEL, NTEST ) -&gt; EXP TOP
4945
  </programlisting>
4946
  where the given <code>NTEST</code> gives the comparison to be applied
4947
  and the given label gives the destination to jump to if the test fails.
4948
  </para>
4949
  <para>
4950
  In the default implementation, pointers to data members are implemented
  as <code>int</code>.  The null pointer to member is represented by
  <code>0</code> and the address of a class member is represented by
  <code>1</code> plus the offset
  of the member (in bytes).  Casting to and from a derived class then
  corresponds to adding or subtracting the base class offset (in bytes),
  and pointer to member comparisons correspond to integer comparisons.
4956
  </para>
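  <para>
  An integer representation of this kind can be sketched as follows.  The
  sketch assumes that the null pointer to member is <code>0</code> and that
  a member is represented by <code>1</code> plus its byte offset, so that a
  member at offset zero remains distinct from null.  The function names
  echo the token names but are hypothetical C++ stand-ins.
  </para>

```cpp
#include <cassert>
#include <cstddef>

typedef int pm;          // integer representation of a pointer to data member

const pm pm_null = 0;    // assumed null representation

// Stand-in for ~cpp.pm.make: byte offset to pointer to member.
pm pm_make(std::ptrdiff_t offset) { return static_cast<pm>(1 + offset); }

// Stand-in for ~cpp.pm.offset: only meaningful for non-null values.
std::ptrdiff_t pm_offset(pm p) { return p - 1; }

// Stand-ins for ~cpp.pm.cast and ~cpp.pm.uncast (non-null values only).
pm pm_cast(pm p, std::ptrdiff_t base_off) { return static_cast<pm>(p + base_off); }
pm pm_uncast(pm p, std::ptrdiff_t base_off) { return static_cast<pm>(p - base_off); }

// Full base-to-derived conversion: null stays null, other values are cast.
pm to_derived(pm p, std::ptrdiff_t base_off) {
    return p == pm_null ? pm_null : pm_cast(p, base_off);
}
```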
4957
  </sect3>  
4958
 
4959
  <sect3 id="ptr_mem_func">
4960
    <title>2.6.9. Pointers to function members</title>
4961
  <para>
4962
  As with pointers to data members, pointers to function members and
4963
  the operations on them are represented by tokens to allow for a range
4964
  of implementations.  All pointers to function members are represented
4965
  by the same shape: 
4966
  <programlisting>
4967
  	~cpp.pmf.type : () -&gt; SHAPE
4968
  </programlisting>
4969
  This shape will be denoted by <code>pmf</code> in the description
4970
  of the following tokens.  Many of the tokens take an expression which
4971
  has a shape which is a pointer to the alignment of <code>pmf</code>.
4972
  This will be denoted by <code>ppmf</code>. 
4973
  </para>
4974
  <para>
4975
  There are two basic methods for constructing a pointer to a function
4976
  member.  The first is to take the address of a non-static member function
4977
  of a class.  There are two cases, depending on whether or not the
4978
  member function is virtual.  The non-virtual case is given by the
4979
  token: 
4980
  <programlisting>
4981
  	~cpp.pmf.make : ( EXP PROC, EXP OFFSET, EXP OFFSET ) -&gt; EXP pmf
4982
  </programlisting>
4983
  where the first argument is the address of the corresponding function,
4984
  the second argument gives any base class offset which is to be added
4985
  when calling this function (to deal with inherited member functions),
4986
  and the third argument is a zero offset. 
4987
  </para>
4988
  <para>
4989
  For virtual functions, a pointer to function member of the form above
4990
  is entered in the <A HREF="#vtable">virtual function table</A> for
4991
  the corresponding class.  The actual pointer to the virtual function
4992
  member then gives a reference into the virtual function table as follows:
4993
  <programlisting>
4994
  	~cpp.pmf.vmake : ( SIGNED_NAT, EXP OFFSET, EXP, EXP ) -&gt; EXP pmf
4995
  </programlisting>
4996
  where the first argument gives the index of the function within the
4997
  virtual function table, the second argument gives the offset of the
4998
  <I>vptr</I> field within the class, and the third and fourth arguments
4999
  are zero offsets. 
5000
  </para>
5001
  <para>
5002
  The second way of constructing a pointer to a function member is to
5003
  use a null pointer to function member: 
5004
  <programlisting>
5005
  	~cpp.pmf.null : () -&gt; EXP pmf
5006
  	~cpp.pmf.null2 : () -&gt; EXP pmf
5007
  </programlisting>
5008
  For technical reasons there are two versions of this token, although
5009
  they have the same value.  The first token is used in static initialisers;
5010
  the second token is used in other expressions. </para>
5011
  <para>
5012
  The cast operations on pointers to function members are more complex
5013
  than those on pointers to data members.  The value to be cast is copied
5014
  into a temporary and one of the tokens: 
5015
  <programlisting>
5016
  	~cpp.pmf.cast : ( EXP ppmf, EXP OFFSET, EXP, EXP OFFSET ) -&gt; EXP TOP
5017
  	~cpp.pmf.uncast : ( EXP ppmf, EXP OFFSET, EXP, EXP OFFSET ) -&gt; EXP TOP
5018
  </programlisting>
5019
  is applied to modify the value of the temporary according to the given
5020
  cast.  The first argument gives the address of the temporary, the
5021
  second gives the base class offset to be added or subtracted, the
5022
  third gives the number to be added or subtracted to convert virtual
5023
  function indexes for the base class into virtual function indexes
5024
  for the derived class, and the fourth gives the offset of the <I>vptr</I>
5025
  field within the class.  Again, the ability to use <code>reinterpret_cast</code>
5026
  to convert between any two pointer to function member types arises
5027
  because of the uniform representation of these types. 
5028
  </para>
5029
  <para>
5030
  As with pointers to data members, there are tokens implementing comparisons
5031
  on pointers to function members: 
5032
  <programlisting>
5033
  	~cpp.pmf.test : ( EXP ppmf, LABEL, NTEST ) -&gt; EXP TOP
5034
  	~cpp.pmf.compare : ( EXP ppmf, EXP ppmf, LABEL, NTEST ) -&gt; EXP TOP
5035
  </programlisting>
5036
  Note however that the arguments are passed by reference. 
5037
  </para>
5038
  <para>
5039
  The most important, and most complex, operation is calling a function
5040
  through a pointer to function member.  The first step is to copy the
5041
  pointer to function member into a temporary.  The token: 
5042
  <programlisting>
5043
  	~cpp.pmf.virt : ( EXP ppmf, EXP, ALIGNMENT ) -&gt; EXP TOP
5044
  </programlisting>
5045
  is then applied to the temporary to convert a pointer to a virtual
5046
  function member to a normal pointer to function member by looking
5047
  it up in the corresponding virtual function table.  The first argument
5048
  gives the address of the temporary, the second gives the object to
5049
  which the function is to be applied, and the third gives the alignment
5050
  of the corresponding class.  Now the base class conversion to be applied
5051
  to the object can be determined by applying the token: 
5052
  <programlisting>
5053
  	~cpp.pmf.delta : ( ALIGNMENT a, EXP ppmf ) -&gt; EXP OFFSET ( a, a )
5054
  </programlisting>
5055
  to the temporary to find the offset to be added.  Finally the function
5056
  to be called can be extracted from the temporary using the token:
5057
  <programlisting>
5058
  	~cpp.pmf.func : ( EXP ppmf ) -&gt; EXP PROC
5059
  </programlisting>
5060
  The function call then proceeds as normal. 
5061
  </para>
5062
  <para>
5063
  The default implementation is that described in the ARM, where each
5064
  pointer to function member is represented in the form: 
5065
  <programlisting>
5066
  	struct PTR_MEM_FUNC {
  	    short delta ;
  	    short index ;
  	    union {
  		void ( *func ) () ;
  		short off ;
  	    } u ;
  	} ;
5074
  </programlisting>
5075
  The <code>delta</code> field gives the base class offset (in bytes)
5076
  to be added before applying the function.  The <code>index</code>
5077
  field is 0 for null pointers, -1 for non-virtual function pointers
5078
  and the index into the virtual function table for virtual function
5079
  pointers (as described below these indexes start from 1).  For non-virtual
5080
  function pointers the function itself is given by the <code>u.func</code>
5081
  field. For virtual function pointers the offset of the <I>vptr</I>
5082
  field within the class is given by the <code>u.off</code> field. 
5083
  </para>
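  <para>
  The role of the <code>index</code> field as a discriminator can be
  sketched as below.  The structure is the one given above; the helper
  functions and the <code>PmfKind</code> enumeration are hypothetical, and
  setting <code>delta</code> to zero for null and virtual values is an
  assumption of the sketch rather than something stated in the text.
  </para>

```cpp
#include <cassert>

struct PTR_MEM_FUNC {
    short delta;
    short index;
    union {
        void (*func)();
        short off;
    } u;
};

enum PmfKind { pmf_null, pmf_non_virtual, pmf_virtual };

// index is 0 for null, -1 for non-virtual, and a table index (from 1) otherwise.
PmfKind classify(const PTR_MEM_FUNC &p) {
    if (p.index == 0) return pmf_null;
    if (p.index == -1) return pmf_non_virtual;
    return pmf_virtual;
}

PTR_MEM_FUNC make_null() {
    PTR_MEM_FUNC p;
    p.delta = 0;
    p.index = 0;
    p.u.func = nullptr;
    return p;
}

PTR_MEM_FUNC make_non_virtual(void (*func)(), short delta) {
    PTR_MEM_FUNC p;
    p.delta = delta;    // base class offset to add before the call
    p.index = -1;
    p.u.func = func;    // the function itself
    return p;
}

PTR_MEM_FUNC make_virtual(short index, short vptr_off) {
    PTR_MEM_FUNC p;
    p.delta = 0;        // assumption: no base class offset in this sketch
    p.index = index;    // index into the virtual function table
    p.u.off = vptr_off; // offset of the vptr field within the class
    return p;
}

// Placeholder target for the non-virtual case.
void sample_function() {}
```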
5084
  </sect3>  
5085
 
5086
  <sect3 id="class">
5087
    <title>2.6.10. Class layout</title>
5088
  <para>
5089
  Consider a class with no base classes: 
5090
  <programlisting>
5091
  	class A {
  	    // A's members
  	} ;
5094
  </programlisting>
5095
  Each object of class <I>A</I> needs its own copy of the non-static
5096
  data members of <I>A</I> and, for polymorphic types, a means of referencing
5097
  the virtual function table and run-time type information for <I>A</I>.
5098
  This is accomplished using a layout of the form: 
5099
 
5100
  <IMG SRC="../images/class.gif" ALT="class A"/>
5101
 
5102
  where the <I>A</I> component consists of the non-static data members
5103
  and 
5104
  <I>vptr A</I> is a pointer to the virtual function table for <I>A</I>.
5105
  For non-polymorphic classes the <I>vptr A</I> field is omitted; otherwise
5106
  space for <I>vptr A</I> needs to be allocated within the class and
5107
  the pointer needs to be initialised in each constructor for <I>A</I>.
5108
  The precise layout of the <A HREF="#vtable">virtual function table</A>
5109
  and the <A HREF="#rtti">run-time type information</A> is given below.
5110
  </para>
5111
  <para>
5112
  Two alternative ways of laying out the non-static data members within
5113
  the class are implemented.  The first, which is default, gives them
5114
  in the order in which they are declared in the class definition. 
5115
  The second lays out the <code>public</code>, the <code>protected</code>,
5116
  and the <code>private</code> members in three distinct sections, the
5117
  members within each section being given in the order in which they
5118
  are declared. The latter can be enabled using the <code>-jo</code>
5119
  command-line option. 
5120
  </para>
5121
  <para>
5122
  The offset of each member within the class (including <I>vptr A</I>)
5123
  can be calculated in terms of the offset of the previous member. 
5124
  The first member has offset zero.  The offset of any other member
5125
  is given by the offset of the previous member plus the size of the
5126
  previous member, rounded up to the alignment of the current member.
5127
  The overall size of the class is given by the offset of the last member
5128
  plus the size of the last member, rounded up using the token: 
5129
  <programlisting>
5130
  	~comp_off : ( EXP OFFSET ) -&gt; EXP OFFSET
5131
  </programlisting>
5132
  which allows for any target dependent padding at the end of the class.
5133
  The shape of the class is then a <code>compound</code> shape with
5134
  this offset. 
5135
  </para>
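  <para>
  The offset rule can be written out as a short calculation.  In the sketch
  below the <code>Member</code> structure and the function names are
  hypothetical, sizes and alignments are in bytes, and rounding the class
  size up to the largest member alignment is only an assumed stand-in for
  the target dependent <code>~comp_off</code> token.
  </para>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Member {
    std::size_t size;
    std::size_t align;
};

// Round n up to the next multiple of align.
std::size_t round_up(std::size_t n, std::size_t align) {
    return (n + align - 1) / align * align;
}

// Offset of each member: the end of the previous member, rounded up to
// the alignment of the current member.  The first member gets offset zero.
std::vector<std::size_t> member_offsets(const std::vector<Member> &members) {
    std::vector<std::size_t> offsets;
    std::size_t end = 0;
    for (const Member &m : members) {
        std::size_t off = round_up(end, m.align);
        offsets.push_back(off);
        end = off + m.size;
    }
    return offsets;
}

// Overall size: the end of the last member, padded (assumed stand-in for
// ~comp_off).  Classes with no members are handled separately.
std::size_t class_size(const std::vector<Member> &members) {
    std::size_t end = 0;
    std::size_t max_align = 1;
    for (const Member &m : members) {
        end = round_up(end, m.align) + m.size;
        if (m.align > max_align) max_align = m.align;
    }
    return round_up(end, max_align);
}
```

  <para>
  For members of size and alignment 1, 4 and 2 bytes respectively, the
  sketch gives offsets 0, 4 and 8 and an overall size of 12.
  </para>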
5136
  <para>
5137
  Classes with no members need to be treated slightly differently. 
5138
  The shape of such a class is given by the token: 
5139
  <programlisting>
5140
  	~cpp.empty.shape : () -&gt; SHAPE
5141
  </programlisting>
5142
  (recall that an empty class still has a nonzero size).  The token:
5143
  <programlisting>
5144
  	~cpp.empty.offset : () -&gt; EXP OFFSET
5145
  </programlisting>
5146
  is used to represent the offset required for an empty class when it
5147
  is used as a base class.  This may be a zero offset. 
5148
  </para>
5149
  <para>
5150
  Bitfield members provide a slight complication to the picture above.
5151
  The offset of a bitfield is additionally padded using the token: 
5152
  <programlisting>
5153
  	~pad : ( EXP OFFSET, SHAPE, SHAPE ) -&gt; EXP OFFSET
5154
  </programlisting>
5155
  where the two shapes give the type underlying the bitfield and the
5156
  bitfield itself. 
5157
  </para>
5158
  <para>
5159
  The layout of unions is similar to that of classes except that all
5160
  members have zero offset, and the size of the union is the maximum
5161
  of the sizes of its members, suitably padded.  Of course unions cannot
5162
  be polymorphic and cannot have base classes. 
5163
  </para>
5164
  <para>
5165
  Pointers to incomplete classes are represented by means of the alignment:
5166
  <programlisting>
5167
  	~cpp.empty.align : () -&gt; ALIGNMENT
5168
  </programlisting>
5169
  This token is also used for the alignment of a complete class if that
5170
  class is never used in the generated TDF in a manner which requires
5171
  it to be complete.  This can lead to savings on the size of the generated
5172
  code by preventing the need to define all the member offset tokens
5173
  in order to find the shape of the class. 
5174
  </para>
5175
  </sect3>  
5176
 
5177
  <sect3 id="derive">
5178
    <title>2.6.11. Derived class layout</title>
5179
  <para>
5180
  The description of the implementation of derived classes will be given
5181
  in terms of the example class hierarchy given by: 
5182
  <programlisting>
5183
  	class A {
  	    // A's members
  	} ;

  	class B : public A {
  	    // B's members
  	} ;

  	class C : public A {
  	    // C's members
  	} ;

  	class D : public B, public C {
  	    // D's members
  	} ;
5198
  </programlisting>
5199
  or, as a directed acyclic graph: 
5200
  </para>
5201
 
5202
  <IMG SRC="../images/graph.gif" ALT="class D"/>
5203
 
5204
 
5205
  <H4>Single inheritance</H4>
5206
  <para>
5207
  The layout of class <I>A</I> is given by: 
5208
 
5209
  <IMG SRC="../images/classA.gif" ALT="class A"/>
5210
 
5211
  as above.  Class <I>B</I> inherits all the members of class <I>A</I>
5212
  plus those members explicitly declared within class <I>B</I>.  In
5213
  addition, class <I>B</I> inherits all the virtual member functions
5214
  of <I>A</I>, some of which may be overridden in <I>B</I>, extended
5215
  by any additional virtual functions declared in <I>B</I>.  This may
5216
  be represented as follows: 
5217
 
5218
  <IMG SRC="../images/classB.gif" ALT="class B"/>
5219
 
5220
  where <I>A</I> denotes those members inherited from the base class
5221
  and 
5222
  <I>B</I> denotes those members added in the derived class.  Note that
5223
  an object of class <I>B</I> contains a sub-object of class <I>A</I>.
5224
  The fact that this sub-object is located at the start of <I>B</I>
5225
  means that the base class conversion from <I>B</I> to <I>A</I> is
5226
  trivial.  Any base class with this property is called a 
5227
  <A id="primary">primary base class</A>. 
5228
  </para>
5229
  <para>
5230
  Note that in theory two virtual function tables are required, the
5231
  normal virtual function table for <I>B</I>, denoted by <I>vtbl B</I>,
5232
  and a modified virtual function table for <I>A</I>, denoted by <I>vtbl
5233
  B::A</I>, taking into account any overriding virtual functions within
5234
  <I>B</I>, and pointing to <I>B</I>'s run-time type information.  This
5235
  latter means that the dynamic type information for the <I>A</I> sub-object
5236
  relates to 
5237
  <I>B</I> rather than <I>A</I>.  However these two tables can usually
5238
  be combined - if the virtual functions added in <I>B</I> are listed
5239
  in the virtual function table after those inherited from <I>A</I>
5240
  and the form of the overriding is <A HREF="#override">suitably well
5241
  behaved</A>
5242
  (in the sense defined below) then <I>vtbl B::A</I> is an initial segment
  of <I>vtbl B</I>.  It is also possible to remove the <I>vptr B</I>
5244
  field and use <I>vptr B::A</I> in its place in this case (it has to
5245
  be this way round to preserve the <I>A</I> sub-object).  Thus the
5246
  items shaded in the diagram can be removed. 
5247
  </para>
5248
  <para>
5249
  The class <I>C</I> is similarly given by: 
5250
 
5251
  <IMG SRC="../images/classC.gif" ALT="class C"/>
5252
 
5253
  </para>
5254
 
5255
  <H4>Multiple inheritance</H4>
5256
  <para>
5257
  Class <I>D</I> is more complex because of the presence of multiple
5258
  inheritance.  <I>D</I> inherits all the members of <I>B</I>, including
5259
  those which <I>B</I> inherits from <I>A</I>, plus all the members
5260
  of 
5261
  <I>C</I>, including those which <I>C</I> inherits from <I>A</I>. 
5262
  It also inherits all of the virtual member functions from <I>B</I>
5263
  and 
5264
  <I>C</I>, some of which may be overridden in <I>D</I>, extended by
5265
  any additional virtual functions declared in <I>D</I>.  This may be
5266
  represented as follows: 
5267
 
5268
  <IMG SRC="../images/classD.gif" ALT="class D"/>
5269
 
5270
  Note that there are two copies of <I>A</I> in <I>D</I> because virtual
5271
  inheritance has not been used. 
5272
  </para>
5273
  <para>
5274
  The <I>B</I> base class of <I>D</I> is essentially similar to the
5275
  single inheritance case already discussed; the <I>C</I> base class
5276
  is different however.  Note firstly that the <I>C</I> sub-object of
5277
  <I>D</I> is located at a non-zero offset, <I>delta D::C</I>, from
5278
  the start of the object. This means that the base class conversion
5279
  from <I>D</I> to <I>C</I>
5280
  consists of adding this offset (for pointer conversions things are
5281
  further complicated by the need to allow for null pointers).  Also
5282
  <I>vtbl D::C</I> is not an initial segment of <I>vtbl D</I> because
5283
  this contains the virtual functions inherited from <I>B</I> first,
5284
  followed by those inherited from <I>C</I>, followed by those first
5285
  declared in <I>D</I> (there are <A HREF="#override">other reasons</A>
5286
  as well).  Thus <I>vtbl D::C</I> cannot be eliminated. 
5287
  </para>
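  <para>
  The conversion from <I>D</I> to <I>C</I> described above can be sketched
  in ordinary C++.  This is only an illustration, not the producer's output:
  the helper names <code>delta_D_C</code>, <code>d_to_c</code> and
  <code>conversion_agrees</code> are hypothetical, and the pointer arithmetic
  through <code>char *</code> assumes a conventional object layout.
  </para>
  <programlisting><![CDATA[
```cpp
#include <cassert>
#include <cstddef>

struct A { int a = 0; virtual ~A() {} };
struct B : A { int b = 0; };
struct C : A { int c = 0; };
struct D : B, C { int d = 0; };   // non-virtual: D holds two A sub-objects

// Byte offset of the C sub-object within a D object; this plays the
// role of "delta D::C" in the text.
std::ptrdiff_t delta_D_C(D& d) {
    return reinterpret_cast<char*>(static_cast<C*>(&d)) -
           reinterpret_cast<char*>(&d);
}

// Hand-written D* -> C* conversion: add the offset, but map null to null.
C* d_to_c(D* p, std::ptrdiff_t delta) {
    if (p == nullptr) return nullptr;
    return reinterpret_cast<C*>(reinterpret_cast<char*>(p) + delta);
}

// Returns true if the hand-written conversion agrees with the compiler's.
bool conversion_agrees() {
    D d;
    std::ptrdiff_t delta = delta_D_C(d);
    assert(d_to_c(nullptr, delta) == nullptr);
    return d_to_c(&d, delta) == static_cast<C*>(&d);
}
```
]]></programlisting>
  <para>
  The null test is what makes the pointer conversion more involved than
  the plain addition of an offset.
  </para>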
5288
 
5289
  <H4>Virtual inheritance</H4>
5290
  <para>
5291
  Virtual inheritance introduces a further complication.  Now consider
5292
  the class hierarchy given by: 
5293
  <programlisting>
5294
  	class A {
5295
  	    // A's members
5296
  	} ;
5297
 
5298
  	class B : virtual public A {
5299
  	    // B's members
5300
  	} ;
5301
 
5302
  	class C : virtual public A {
5303
  	    // C's members
5304
  	} ;
5305
 
5306
  	class D : public B, public C {
5307
  	    // D's members
5308
  	} ;
5309
  </programlisting>
5310
  or, as a <A id="diamond">directed acyclic graph</A>: 
5311
 
5312
  <IMG SRC="../images/diamond.gif" ALT="class D"/>
5313
 
5314
  As before <I>A</I> is given by: 
5315
 
5316
  <IMG SRC="../images/classA.gif" ALT="class A"/>
5317
 
5318
  but now <I>B</I> is given by: 
5319
 
5320
  <IMG SRC="../images/virtualB.gif" ALT="class B"/>
5321
 
5322
  Rather than having the sub-object of class <I>A</I> directly as part
5323
  of 
5324
  <I>B</I>, the class now contains a pointer, <I>ptr A</I>, to this
5325
  sub-object.  The virtual sub-objects are always located at the end
  of a class layout, so their offset may vary for different objects;
  the offset of <I>ptr A</I>, however, is always fixed.  The <I>ptr A</I>
5328
  field is initialised in each constructor for <I>B</I>.  In order to
5329
  perform the base class conversion from <I>B</I> to <I>A</I>, the contents
5330
  of <I>ptr A</I> are taken (again provision needs to be made for null
5331
  pointers in pointer conversions).  In cases when the dynamic type
5332
  of the <I>B</I> object can be determined statically it is possible
5333
  to access the <I>A</I> sub-object directly by adding a suitable offset.
5334
  Because this conversion is non-trivial (see <A HREF="#override">below</A>)
5335
  the virtual function table <I>vtbl B::A</I> is not an initial segment
5336
  of 
5337
  <I>vtbl B</I> and cannot be eliminated. 
5338
  </para>
5339
  <para>
5340
  The class <I>C</I> is similarly given by: 
5341
 
5342
  <IMG SRC="../images/virtualC.gif" ALT="class C"/>
5343
 
5344
  Now the class <I>D</I> is given by: 
5345
 
5346
  <IMG SRC="../images/virtualD.gif" ALT="class D"/>
5347
 
5348
  Note that there is a single <I>A</I> sub-object of <I>D</I> referenced
5349
  by the <I>ptr A</I> fields in both the <I>B</I> and <I>C</I> sub-objects.
5350
  The elimination of <I>vtbl D::B</I> is as above. 
5351
  </para>
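  <para>
  As a rough model of the layout just described, the <I>ptr A</I> fields
  can be imitated with explicit pointers.  The structures <code>BPart</code>,
  <code>CPart</code> and <code>DLayout</code> below are hypothetical
  stand-ins for the diagrams, not the layout the producer actually emits;
  the classes <code>RB</code>, <code>RC</code> and <code>RD</code> check the
  same property using real virtual inheritance.
  </para>
  <programlisting><![CDATA[
```cpp
#include <cassert>

struct A { int a; };

// Hand-built stand-ins for the layouts in the diagrams.
struct BPart { A* ptr_A; int b; };   // B's own members plus its "ptr A" field
struct CPart { A* ptr_A; int c; };   // likewise for C

struct DLayout {
    BPart b;
    CPart c;
    int d;
    A a;   // the single virtual base sub-object, placed at the end

    // The most-derived "constructor" sets up both ptr A fields.
    DLayout() : b{&a, 0}, c{&a, 0}, d(0), a{0} {}
};

// Base class conversion to A: take the contents of ptr A, allowing for null.
A* b_to_a(BPart* p) { return p ? p->ptr_A : nullptr; }
A* c_to_a(CPart* p) { return p ? p->ptr_A : nullptr; }

// The same property, checked against the compiler's own virtual inheritance.
struct RB : virtual A {};
struct RC : virtual A {};
struct RD : RB, RC {};

bool single_shared_base() {
    RD d;
    return static_cast<A*>(static_cast<RB*>(&d)) ==
           static_cast<A*>(static_cast<RC*>(&d));
}
```
]]></programlisting>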
5352
  </sect3>  
5353
 
5354
  <sect3 id="constr">
5355
    <title>2.6.12. Constructors and destructors</title>
5356
  <para>
5357
  The implementation of constructors and destructors, whether explicitly
5358
  or implicitly defined, is slightly more complex than that of other
5359
  member functions.  For example, the constructors need to set up the
5360
  internal <I>vptr</I> and <I>ptr</I> fields mentioned above. 
5361
  </para>
5362
  <para>
5363
  The order of initialisation in a constructor is as follows: 
5364
  <itemizedlist>
5365
  <listitem>The internal <I>ptr</I> fields giving the locations of the virtual
5366
  base classes are initialised. 
5367
  </listitem>
5368
  <listitem>The constructors for the virtual base classes are called. 
5369
  </listitem>
5370
  <listitem>The constructors for the non-virtual direct base classes are called.
5371
  </listitem>
5372
  <listitem>The internal <I>vptr</I> fields giving the locations of the virtual
5373
  function tables are initialised. 
5374
  </listitem>
5375
  <listitem>The constructors for the members of the class are called. 
5376
  </listitem>
5377
  <listitem>The main constructor body is executed. 
5378
  </listitem>
5379
  </itemizedlist>
5380
  To ensure that each virtual base is only initialised once, if a class
  has a virtual base class then all its constructors have an implicit
  extra parameter of type <code>int</code>.  The first two steps above
  are then only applied if this flag is nonzero.  In normal applications
  of the constructor this argument will be 1; however, in base class
  initialisations such as those in the second and third steps above,
  it will be 0.
5387
  </para>
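  <para>
  The externally visible part of this ordering can be observed from
  standard C++.  The sketch below logs constructor and destructor calls
  for a diamond hierarchy with one data member; the names
  <code>events</code>, <code>M</code> and <code>trace_D</code> are
  illustrative only, and the internal <I>ptr</I> and <I>vptr</I> steps are
  not visible at this level.  The destructor events are included to show
  that destruction runs in the reverse order.
  </para>
  <programlisting><![CDATA[
```cpp
#include <cassert>
#include <string>
#include <vector>

static std::vector<std::string> events;   // records the observed order

struct A { A() { events.push_back("A"); }  ~A() { events.push_back("~A"); } };
struct M { M() { events.push_back("M"); }  ~M() { events.push_back("~M"); } };
struct B : virtual A { B() { events.push_back("B"); }  ~B() { events.push_back("~B"); } };
struct C : virtual A { C() { events.push_back("C"); }  ~C() { events.push_back("~C"); } };
struct D : B, C {
    M member;
    D() { events.push_back("D"); }
    ~D() { events.push_back("~D"); }
};

// Constructs and destroys one D, returning the sequence of events.
std::vector<std::string> trace_D() {
    events.clear();
    { D d; }
    return events;
}
```
]]></programlisting>
  <para>
  The virtual base <I>A</I> appears exactly once in the trace, before
  <I>B</I> and <I>C</I>, which is the effect the extra constructor
  argument is there to achieve.
  </para>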
5388
  <para>
5389
  Note that similar steps to protect virtual base classes are not taken
5390
  in an implicitly declared <code>operator=</code> function.  The order
5391
  of assignment in this case is as follows: 
5392
  <itemizedlist>
5393
  <listitem>The assignment operators for the direct base classes (both virtual
5394
  and non-virtual) are called. 
5395
  </listitem>
5396
  <listitem>The assignment operators for the members of the class are called.
5397
  </listitem>
5398
  <listitem>A reference to the object assigned to (i.e. <code>*this</code>)
  is returned.
5400
  </listitem>
5401
  </itemizedlist>
5402
  </para>
5403
  <para>
5404
  The order of destruction in a destructor is essentially the reverse
5405
  of the order of construction: 
5406
  <itemizedlist>
5407
  <listitem>The main destructor body is executed. 
5408
  </listitem>
5409
  <listitem>The destructors for the members of the class are called.
5410
  </listitem>
5411
  <listitem>The internal <I>vptr</I> fields giving the locations of the virtual
5412
  function tables are re-initialised. 
5413
  </listitem>
5414
  <listitem>The destructors for the non-virtual direct base classes are called.
5415
  </listitem>
5416
  <listitem>The destructors for the virtual base classes are called. 
5417
  </listitem>
5418
  <listitem>If necessary the space occupied by the object is deallocated.
5419
  </listitem>
5420
  </itemizedlist>
5421
  All destructors have an extra parameter of type <code>int</code>.
5422
  The virtual base classes are only destroyed if this flag is nonzero
5423
  when and-ed with 2.  The space occupied by the object is only deallocated
5424
  if this flag is nonzero when and-ed with 1.  This deallocation is
5425
  equivalent to inserting:  
5426
  <programlisting>
5427
  	delete this ;
5428
  </programlisting>
5429
  in the destructor.  The <code>operator delete</code> function is called
5430
  via the destructor in this way in order to implement the pseudo-virtual
5431
  nature of these deallocation functions.  Thus for normal destructor
5432
  calls the extra argument is 2, for base class destructor calls it
5433
  is 0, and for calls arising from a <code>delete</code> expression
5434
  it is 3. 
5435
  </para>
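  <para>
  The two bits of the extra destructor argument can be modelled as
  follows.  The constant names and the <code>DestructorSteps</code>
  structure are hypothetical; only the bit values 1 and 2 and the argument
  values 0, 2 and 3 come from the description above.
  </para>
  <programlisting><![CDATA[
```cpp
#include <cassert>

// Hypothetical names for the two bits of the extra destructor argument.
const int DESTR_DEALLOCATE = 1;      // free the storage afterwards
const int DESTR_VIRTUAL_BASES = 2;   // also destroy virtual base classes

struct DestructorSteps {
    bool virtual_bases_destroyed;
    bool storage_deallocated;
};

// Models only the two flag tests made at the end of a destructor.
DestructorSteps apply_destructor_flag(int flag) {
    DestructorSteps s;
    s.virtual_bases_destroyed = (flag & DESTR_VIRTUAL_BASES) != 0;
    s.storage_deallocated = (flag & DESTR_DEALLOCATE) != 0;
    return s;
}
```
]]></programlisting>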
5436
  <para>
5437
  The virtual function tables are initialised at this point in the
  constructor, and re-initialised in the destructor, to ensure that
  virtual functions called from base class initialisers are handled
  correctly (see ISO C++ 12.7).
5441
  </para>
5442
  <para>
5443
  A further complication arises from the need to destroy 
5444
  <A id="partial">partially constructed objects</A> if an exception
5445
  is thrown in a constructor.  A count is maintained of the number of
5446
  base classes and members constructed within a constructor.  If an
5447
  exception is thrown then it is caught in the constructor, the constructed
5448
  base classes and members are destroyed, and the exception is re-thrown.
5449
  The count variable is used to determine which bases and members need
5450
  to be destroyed. 
5451
  </para>
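  <para>
  The behaviour required of partially constructed objects can be seen in
  a small standard C++ example.  It shows only the observable effect, not
  the count variable used internally; the class names and the
  <code>trace_partial</code> helper are illustrative.
  </para>
  <programlisting><![CDATA[
```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

static std::vector<std::string> partial_events;

struct First {
    First() { partial_events.push_back("First"); }
    ~First() { partial_events.push_back("~First"); }
};
struct Second {
    Second() { throw std::runtime_error("Second failed"); }
    ~Second() { partial_events.push_back("~Second"); }
};
struct Whole {
    First first;
    Second second;   // throws, so Whole is only partially constructed
    Whole() { partial_events.push_back("Whole"); }
    ~Whole() { partial_events.push_back("~Whole"); }
};

// Attempts to construct a Whole and returns the sequence of events.
std::vector<std::string> trace_partial() {
    partial_events.clear();
    try {
        Whole w;
    } catch (const std::runtime_error&) {
        partial_events.push_back("caught");
    }
    return partial_events;
}
```
]]></programlisting>
  <para>
  Only the member that was fully constructed is destroyed; neither the
  constructor body nor the destructor of <code>Whole</code> runs.
  </para>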
5452
  <para>
5453
  <IMG SRC="../images/warn.gif" ALT="warning"/> These partial destructors
5454
  currently do not interact correctly with any exception specification
5455
  on the constructor.  Exceptions thrown within destructors are not
5456
  correctly handled either. 
5457
  </para>
5458
  </sect3>  
5459
 
5460
  <sect3 id="vtable">
5461
    <title>2.6.13. Virtual function tables</title>
5462
  <para>
5463
  The virtual functions in a polymorphic class are given in its virtual
5464
  function table in the following order: firstly those virtual functions
5465
  inherited from its direct base classes (which may be overridden in
5466
  the derived class) followed by those first declared in the derived
5467
  class in the order in which they are declared.  Note that this can
5468
  result in virtual functions inherited from virtual base classes appearing
5469
  more than once.  The virtual functions are numbered from 1 (this is
5470
  slightly more convenient than numbering from 0 in the default implementation).
5471
  </para>
5472
  <para>
5473
  The virtual function table for this class has shape: 
5474
  <programlisting>
5475
  	~cpp.vtab.type : ( NAT ) -&gt; SHAPE
5476
  </programlisting>
5477
  the argument being <I>n + 1</I> where <I>n</I> is the number of virtual
5478
  functions in the class (there is also a token: 
5479
  <programlisting>
5480
  	~cpp.vtab.diag : () -&gt; SHAPE
5481
  </programlisting>
5482
  which is used in the diagnostic output for a generic virtual function
5483
  table).  The table is created using the token: 
5484
  <programlisting>
5485
  	~cpp.vtab.make : ( EXP pti, EXP OFFSET, NAT, EXP NOF ) -&gt; EXP vt
5486
  </programlisting>
5487
  where the first expression gives the address of the <A HREF="#rtti">run-time
5488
  type information structure</A> for the class, the second expression
5489
  gives the offset of the <I>vptr</I> field within the class (i.e. <I>voff</I>),
5490
  the integer constant is <I>n + 1</I>, and the final expression is
5491
  a 
5492
  <code>make_nof</code> construct giving information on each of the
5493
  <I>n</I>
5494
  virtual functions. 
5495
  </para>
5496
  <para>
5497
  The information given on each virtual function in this table has the
5498
  form of a <A HREF="#ptr_mem_func">pointer to function member</A> formed
5499
  using the token: 
5500
  <programlisting>
5501
  	~cpp.pmf.make : ( EXP PROC, EXP OFFSET, EXP OFFSET ) -&gt; EXP pmf
5502
  </programlisting>
5503
  as above, except that the third argument gives the offset of the base
5504
  class in virtual function tables such as <I>vtbl B::A</I>.  For pure
5505
  virtual functions the function pointer in this token is given by:
5506
  <programlisting>
5507
  	~cpp.vtab.pure : () -&gt; EXP PROC
5508
  </programlisting>
5509
  In the default implementation this gives a function 
5510
  <code>__TCPPLUS_pure</code> which just calls <code>abort</code>. 
5511
  </para>
5512
  <para>
5513
  To avoid duplicate copies of virtual function tables and run-time
5514
  type information structures being created, the ARM algorithm is used.
5515
  The virtual function table and run-time type information structure
5516
  for a class are defined in the module containing the definition of
5517
  the first non-inline, non-pure virtual function declared in that class.
5518
  If such a function does not exist then duplicate copies are created
5519
  in every module which requires them.  In the former case the virtual
5520
  function table will have an <A HREF="#other">external tag name</A>;
5521
  in the latter case it will be an internal tag.  This scheme can be
5522
  overridden using the <code>-jv</code> command-line option, which causes
5523
  local virtual function tables to be output for all classes. 
5524
  </para>
5525
  <para>
5526
  Note that the discussion above applies to both simple virtual function
5527
  tables, such as <I>vtbl B</I> above, and to those arising from base
5528
  classes, such as <I>vtbl B::A</I>.  <A id="override">We are now
5529
  in a position to precisely determine when <I>vtbl B::A</I> is an initial
5530
  segment of <I>vtbl B</I> and hence can be eliminated</A>.  Firstly,
5531
  <I>A</I> must be the first direct base class of <I>B</I> and cannot
5532
  be virtual.  This is to ensure both that there are no virtual functions
5533
  in <I>vtbl B</I> before those inherited from <I>A</I>, and that the
5534
  corresponding base class conversion is trivial so that the pointers
5535
  to function members of <I>B</I> comprising the virtual function table
5536
  can be equally regarded as pointers to function members of <I>A</I>.
5537
  The second requirement is that if a virtual function for <I>A</I>,
5538
  <I>f</I>, is overridden in <I>B</I> then the return type for <I>B::f</I>
5539
  cannot differ from the return type for <I>A::f</I> by a non-trivial
5540
  conversion (recall that ISO C++ allows the return types to differ
5541
  by a base class conversion).  In the non-trivial conversion case the
5542
  function entered in <I>vtbl B::A</I> needs to be, not <I>B::f</I>
5543
  as in <I>vtbl B</I>, but a stub function which calls <I>B::f</I> and
5544
  converts its return value to the return type of <I>A::f</I>. 
5545
  </para>
5546
 
5547
  <H4>Calling virtual functions</H4>
5548
  <para>
5549
  The virtual function call mechanism is implemented using the token:
5550
  <programlisting>
5551
  	~cpp.vtab.func : ( EXP ppvt, SIGNED_NAT ) -&gt; EXP ppmf
5552
  </programlisting>
5553
  which has as its arguments a reference to the <I>vptr</I> field of
5554
  the object the function is to be called for, and the number of the
5555
  virtual function to be called.  It returns a reference to the corresponding
5556
  pointer to function member within the object's virtual function table.
5557
  The function is then called by extracting the base class offset to
5558
  be added, and the function to be called, from this reference using
5559
  the tokens: 
5560
  <programlisting>
5561
  	~cpp.pmf.delta : ( ALIGNMENT a, EXP ppmf ) -&gt; EXP OFFSET ( a, a )
5562
  	~cpp.pmf.func : ( EXP ppmf ) -&gt; EXP PROC
5563
  </programlisting>
5564
  described as part of the <A HREF="#ptr_mem_func">pointer to function
5565
  member call mechanism</A> above. 
5566
  </para>
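  <para>
  The call sequence can be imitated in C++ with an explicit table.  This
  is a sketch under stated assumptions rather than the default
  implementation: the structures <code>Pmf</code>, <code>VTable</code> and
  <code>Object</code>, and the functions <code>vtab_func</code> and
  <code>call_virtual</code>, are hypothetical analogues of the tokens
  above, and the table layout is chosen purely for illustration.
  </para>
  <programlisting><![CDATA[
```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a pointer to function member: a function plus
// the base class offset to add to the object pointer before the call.
struct Pmf {
    int (*func)(void* self);
    std::ptrdiff_t delta;
};

// Hypothetical table; virtual functions are numbered from 1 as in the text.
struct VTable {
    const char* type_name;   // stands in for the run-time type information
    std::ptrdiff_t voff;     // offset of the vptr field within the object
    const Pmf* entries;      // entries[k - 1] describes virtual function k
};

struct Object {
    const VTable* vptr;
    int value;
};

int get_value(void* self) { return static_cast<Object*>(self)->value; }
int get_double(void* self) { return 2 * static_cast<Object*>(self)->value; }

const Pmf object_entries[] = { { get_value, 0 }, { get_double, 0 } };
const VTable object_vtbl = { "Object", 0, object_entries };

// Analogue of ~cpp.vtab.func: from a reference to the vptr field and a
// function number, find the corresponding table entry.
const Pmf* vtab_func(const VTable* const* ppvt, int n) {
    return &(*ppvt)->entries[n - 1];
}

// Analogue of the call: extract the offset and the function, adjust, call.
int call_virtual(Object* obj, int n) {
    const Pmf* ppmf = vtab_func(&obj->vptr, n);
    void* self = reinterpret_cast<char*>(obj) + ppmf->delta;
    return ppmf->func(self);
}
```
]]></programlisting>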
5567
  </sect3>  
5568
 
5569
  <sect3 id="rtti">
5570
    <title>2.6.14. Run-time type information</title>
5571
  <para>
5572
  Each C++ type can be associated with a run-time type information structure
5573
  giving information about that type.  These type information structures
5574
  have shape given by the token: 
5575
  <programlisting>
5576
  	~cpp.typeid.type : () -&gt; SHAPE
5577
  </programlisting>
5578
  which corresponds to the representation for the standard type 
5579
  <code>std::type_info</code> declared in the header 
5580
  <code>&lt;typeinfo&gt;</code>.  Each type information structure consists
5581
  of a tag number, giving information on the kind of type represented,
5582
  a string literal, giving the name of the type, and a pointer to a
5583
  list of base type information structures.  These are combined to give
5584
  a type information structure using the token: 
5585
  <programlisting>
5586
  	~cpp.typeid.make : ( SIGNED_NAT, EXP, EXP ) -&gt; EXP ti
5587
  </programlisting>
5588
  Each base type information structure has shape given by the token:
5589
  <programlisting>
5590
  	~cpp.baseid.type : () -&gt; SHAPE
5591
  </programlisting>
5592
  It consists of a pointer to a type information structure, an expression
5593
  used to describe the offset of a base class, a pointer to the next
5594
  base type information structure in the list, and two integers giving
5595
  information on type qualifiers etc.  These are combined to give a
5596
  base type information structure using the token: 
5597
  <programlisting>
5598
  	~cpp.baseid.make : ( EXP, EXP, EXP, SIGNED_NAT, SIGNED_NAT ) -&gt; EXP bi
5599
  </programlisting>
5600
  </para>
5601
  <para>
5602
  The following table gives the various tag numbers used in type information
5603
  structures plus a list of the base type information structures associated
5604
  with each type.  Macros giving these tag numbers are provided in the
5605
  default implementation in a header, <code>interface.h</code>, which
5606
  is shared by the C++ producer. 
5607
  </para>
5608
  <para>
5609
 
5610
  <table>
  <tr><th>Type</th> <th>Form</th> <th>Tag</th> <th>Base information</th></tr>
  <tr><td>integer</td> <td>-</td> <td>0</td> <td>-</td></tr>
  <tr><td>floating point</td> <td>-</td> <td>1</td> <td>-</td></tr>
  <tr><td>void</td> <td>-</td> <td>2</td> <td>-</td></tr>
  <tr><td>class or struct</td> <td>class T</td> <td>3</td> <td>[base,access,virtual], ....</td></tr>
  <tr><td>union</td> <td>union T</td> <td>4</td> <td>-</td></tr>
  <tr><td>enumeration</td> <td>enum T</td> <td>5</td> <td>-</td></tr>
  <tr><td>pointer</td> <td>cv T *</td> <td>6</td> <td>[T,cv,0]</td></tr>
  <tr><td>reference</td> <td>cv T &amp;</td> <td>7</td> <td>[T,cv,0]</td></tr>
  <tr><td>pointer to member</td> <td>cv T S::*</td> <td>8</td> <td>[S,0,0], [T,cv,0]</td></tr>
  <tr><td>array</td> <td>cv T [n]</td> <td>9</td> <td>[T,cv,n]</td></tr>
  <tr><td>bitfield</td> <td>cv T : n</td> <td>10</td> <td>[T,cv,n]</td></tr>
  <tr><td>C++ function</td> <td>cv T ( S1, ...., Sn )</td> <td>11</td> <td>[T,cv,0], [S1,0,0], ...., [Sn,0,0]</td></tr>
  <tr><td>C function</td> <td>cv T ( S1, ...., Sn )</td> <td>12</td> <td>[T,cv,0], [S1,0,0], ...., [Sn,0,0]</td></tr>
  </table>
5682
 
5683
  </para>
5684
  <para>
5685
  In the form column <code>cv T</code> is used to denote not only the
5686
  normal cv-qualifiers but, when <code>T</code> is a function type,
5687
  the member function cv-qualifiers.  Arrays with an unspecified bound
5688
  are treated as if their bound was zero.  Functions with ellipsis are
5689
  treated as if they had an extra parameter of a dummy type named 
5690
  <code>...</code> (see below).  Note the distinction between C++ and
5691
  C function types. 
5692
  </para>
5693
  <para>
5694
  Each base type information structure is described as a triple consisting
5695
  of a type and two integers.  One of these integers may be used to
5696
  encode a type qualifier, <code>cv</code>, as follows: 
5697
  </para>
5698
  <para>
5699
 
5700
  <table>
  <tr><th>Qualifier</th> <th>Encoding</th></tr>
  <tr><td>none</td> <td>0</td></tr>
  <tr><td>const</td> <td>1</td></tr>
  <tr><td>volatile</td> <td>2</td></tr>
  <tr><td>const volatile</td> <td>3</td></tr>
  </table>
5712
 
5713
  </para>
5714
  <para>
5715
  The base type information for a class consists of information on each
5716
  of its direct base classes.  This includes the offset of this base
5717
  within the class (for a virtual base class this is the offset of the
5718
  corresponding 
5719
  <I>ptr</I> field), whether the base is virtual (1) or not (0), and
5720
  the base class access, encoded as follows: 
5721
  </para>
5722
  <para>
5723
 
5724
  <table>
  <tr><th>Access</th> <th>Encoding</th></tr>
  <tr><td>public</td> <td>0</td></tr>
  <tr><td>protected</td> <td>1</td></tr>
  <tr><td>private</td> <td>2</td></tr>
  </table>
5734
 
5735
  </para>
5736
  <para>
5737
  For example, the run-time type information structures for the classes
5738
  declared in the <A HREF="#diamond">diamond lattice</A> above can be
5739
  represented as follows: 
5740
 
5741
  <IMG SRC="../images/rttiD.gif" ALT="typeid D"/>
5742
 
5743
  </para>
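  <para>
  The structures and encodings described in this section can be mirrored
  in C++ for the diamond lattice.  The field names, the zero placeholder
  offsets, and the helpers <code>encode_cv</code> and <code>has_base</code>
  are assumptions made for illustration; they are not the definitions
  used by the default implementation.
  </para>
  <programlisting><![CDATA[
```cpp
#include <cassert>
#include <cstddef>

struct TypeInfo;

// Hypothetical mirror of a base type information structure.
struct BaseInfo {
    const TypeInfo* type;     // type information for the base
    std::ptrdiff_t offset;    // placeholder for the base class offset
    const BaseInfo* next;     // next entry in the list, or null
    int access;               // 0 public, 1 protected, 2 private
    int is_virtual;           // 1 for a virtual base, 0 otherwise
};

// Hypothetical mirror of a type information structure.
struct TypeInfo {
    int tag;                  // 3 denotes a class or struct
    const char* name;
    const BaseInfo* bases;
};

const int TAG_CLASS = 3;

// Diamond lattice: B and C derive virtually from A; D derives from B and C.
const TypeInfo ti_A = { TAG_CLASS, "A", nullptr };
const BaseInfo b_bases = { &ti_A, 0, nullptr, 0, 1 };
const TypeInfo ti_B = { TAG_CLASS, "B", &b_bases };
const BaseInfo c_bases = { &ti_A, 0, nullptr, 0, 1 };
const TypeInfo ti_C = { TAG_CLASS, "C", &c_bases };
const BaseInfo d_base_C = { &ti_C, 0, nullptr, 0, 0 };
const BaseInfo d_base_B = { &ti_B, 0, &d_base_C, 0, 0 };
const TypeInfo ti_D = { TAG_CLASS, "D", &d_base_B };

// Encodes a cv-qualifier as in the qualifier table: const = 1, volatile = 2.
int encode_cv(bool is_const, bool is_volatile) {
    return (is_const ? 1 : 0) + (is_volatile ? 2 : 0);
}

// True if `base` is `derived` itself or one of its direct or indirect bases.
bool has_base(const TypeInfo* derived, const TypeInfo* base) {
    if (derived == base) return true;
    for (const BaseInfo* b = derived->bases; b != nullptr; b = b->next) {
        if (has_base(b->type, base)) return true;
    }
    return false;
}
```
]]></programlisting>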
5744
 
5745
  <H4>Defining run-time type information structures</H4>
5746
  <para>
5747
  For built-in types, the run-time type information structure may be
5748
  referenced by the token: 
5749
  <programlisting>
5750
  	~cpp.typeid.basic : ( SIGNED_NAT ) -&gt; EXP pti
5751
  </programlisting>
5752
  where the argument gives the encoding of the type as given in the
5753
  following table: 
5754
  </para>
5755
 
5756
  <table>
  <tr><th>Type</th> <th>Encoding</th> <th>Type</th> <th>Encoding</th></tr>
  <tr><td>char</td> <td>0</td> <td>unsigned long</td> <td>11</td></tr>
  <tr><td>(error)</td> <td>1</td> <td>float</td> <td>12</td></tr>
  <tr><td>void</td> <td>2</td> <td>double</td> <td>13</td></tr>
  <tr><td>(bottom)</td> <td>3</td> <td>long double</td> <td>14</td></tr>
  <tr><td>signed char</td> <td>4</td> <td>wchar_t</td> <td>16</td></tr>
  <tr><td>signed short</td> <td>5</td> <td>bool</td> <td>17</td></tr>
  <tr><td>signed int</td> <td>6</td> <td>(ptrdiff_t)</td> <td>18</td></tr>
  <tr><td>signed long</td> <td>7</td> <td>(size_t)</td> <td>19</td></tr>
  <tr><td>unsigned char</td> <td>8</td> <td>(...)</td> <td>20</td></tr>
  <tr><td>unsigned short</td> <td>9</td> <td>signed long long</td> <td>23</td></tr>
  <tr><td>unsigned int</td> <td>10</td> <td>unsigned long long</td> <td>27</td></tr>
  </table>
5796
 
5797
  <para>
5798
  Note that the encoding for the basic integral types is the same as
5799
  that 
5800
  <A HREF="#arith">given above</A>.  The other types are assigned to
  unused values.  Note that the encodings for <code>ptrdiff_t</code>
  and <code>size_t</code> are not used directly; instead the encoding of
  the type which implements each of them is used (using the standard
  tokens <code>ptrdiff_t</code> and <code>size_t</code>).  The encodings
  for <code>bool</code> and
5806
  <code>wchar_t</code> are used because they are conceptually distinct
5807
  types even though they are implemented as one of the basic integral
5808
  types.  The type labelled <code>...</code> is the dummy used in the
5809
  representation of ellipsis functions.  The default implementation
5810
  uses an array of type information structures, <code>__TCPPLUS_typeid</code>,
5811
  to implement <code>~cpp.typeid.basic</code>. 
5812
  </para>
5813
  <para>
5814
  The run-time type information structures for classes are defined in
5815
  the same place as their <A HREF="#vtable">virtual function tables</A>.
5816
  Other run-time type information structures are defined in whatever
5817
  modules require them.  In the former case the type information structure
5818
  will have an <A HREF="#other">external tag name</A>; in the latter
5819
  case it will be an internal tag. 
5820
  </para>
5821
 
5822
  <H4>Accessing run-time type information</H4>
5823
  <para>
5824
  The primary means of accessing the run-time type information for an
5825
  object is using the <code>typeid</code> construct.  In cases where
5826
  the operand type can be determined statically, the address of the
5827
  corresponding type information structure is returned.  In other cases
5828
  the token: 
5829
  <programlisting>
5830
  	~cpp.typeid.ref : ( EXP ppvt ) -&gt; EXP pti
5831
  </programlisting>
5832
  is used, where the argument gives a reference to the <I>vptr</I> field
5833
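  of the object being checked.  Since the virtual function table records
  the address of the run-time type information structure for the class
  (the first argument of <code>~cpp.vtab.make</code>), the corresponding
  type information can be read from the table.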
5835
  </para>
5836
  <para>
5837
  Another means of querying the run-time type information for an object
5838
  is using the <code>dynamic_cast</code> construct.  When the result
5839
  cannot be determined statically, this is implemented using the token:
5840
  <programlisting>
5841
  	~cpp.dynam.cast : ( EXP ppvt, EXP pti ) -&gt; EXP pv
5842
  </programlisting>
5843
  where the first expression gives a reference to the <I>vptr</I> field
5844
  of the object being cast and the second gives the run-time type information
5845
  for the type being cast to.  In the default implementation this token
5846
  is implemented by the procedure <code>__TCPPLUS_dynamic_cast</code>.
5847
  The key point to note is that the virtual function table contains
5848
  the offset, <I>voff</I>, of the <I>vptr</I> field from the start of
5849
  the most complete object.  Thus it is possible to find the address
5850
  of the most complete object.  The run-time type information contains
5851
  enough information to determine whether this object has a sub-object
5852
  of the type being cast to, and if so, how to find the address of this
5853
  sub-object.  The result is returned as a <code>void *</code>, with
5854
  the null pointer indicating that the conversion is not possible. 
5855
  </para>
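  <para>
  The observable behaviour that this token has to provide can be checked
  with standard C++.  The example does not show how
  <code>__TCPPLUS_dynamic_cast</code> works internally; the helper names
  <code>complete_object</code> and <code>b_to_c</code> and the class
  <code>Unrelated</code> are illustrative.
  </para>
  <programlisting><![CDATA[
```cpp
#include <cassert>

struct A { virtual ~A() {} };
struct B : virtual A {};
struct C : virtual A {};
struct D : B, C {};
struct Unrelated { virtual ~Unrelated() {} };

// Address of the most complete object containing *p.
void* complete_object(A* p) { return dynamic_cast<void*>(p); }

// Cross-cast from a B sub-object to the C sub-object of the same complete
// object; yields a null pointer if there is no such sub-object.
C* b_to_c(B* p) { return dynamic_cast<C*>(p); }
```
]]></programlisting>
  <para>
  For a <I>D</I> object the cross-cast succeeds because the most complete
  object has a <I>C</I> sub-object; for a stand-alone <I>B</I> object it
  yields a null pointer.
  </para>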
5856
  </sect3>  
5857
 
5858
  <sect3 id="dynamic-initialisation">
5859
    <title>2.6.15. Dynamic initialisation</title>
5860
  <para>
5861
  The dynamic initialisation of variables with static storage duration
5862
  in C++ is implemented by means of the TDF <code>initial_value</code>
5863
  construct.  However in order for the producer to maintain control
5864
  over the order of initialisation, rather than each variable being
5865
  initialised separately using <code>initial_value</code>, a single
5866
  expression is created which initialises all the variables in a module,
5867
  and this initialiser expression is used to initialise a single dummy
5868
  variable using <code>initial_value</code>.  Note that, while this
5869
  enables the variables within a single module to be initialised in
5870
  the order in which they are defined, the order of initialisation between
5871
  different modules is unspecified. 
5872
  </para>
5873
  <para>
5874
  The implementation needs to keep a list of those variables with static
5875
  storage duration which have been initialised so that it can call the
5876
  destructors for these objects at the end of the program. This is done
5877
  by declaring a variable of shape: 
5878
  <programlisting>
5879
  	~cpp.destr.type : () -&gt; SHAPE
5880
  </programlisting>
5881
  for each such object with a non-trivial destructor.  Each element
5882
  of an array is considered a distinct object.  Immediately after the
5883
  variable has been initialised the token: 
5884
  <programlisting>
5885
  	~cpp.destr.global : ( EXP pd, EXP POINTER c, EXP PROC ) -&gt; EXP TOP
5886
  </programlisting>
5887
  is called to add the variable to the list of objects to be destroyed.
5888
  The first argument is the address of the dummy variable just declared,
5889
  the second is the address of the object to be destroyed, and the third
5890
  is the destructor to be used.  In this way a list giving the objects
5891
  to be destroyed, and the order in which to destroy them, is built
5892
  up.  Note that partially constructed objects are destroyed within
5893
  their constructors (see <A HREF="#partial">above</A>) so that only
5894
  completely constructed objects need to be considered. 
5895
  </para>
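  <para>
  One way to picture the list built up by <code>~cpp.destr.global</code>
  is as a singly linked list threaded through the dummy variables.  The
  sketch below assumes that objects are destroyed in the reverse order of
  their registration, as C++ requires for objects with static storage
  duration; the structure <code>DestrEntry</code> and the functions
  <code>destr_global</code>, <code>run_destructors</code> and
  <code>destroy_name</code> are hypothetical.
  </para>
  <programlisting><![CDATA[
```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical analogue of a variable of shape ~cpp.destr.type.
struct DestrEntry {
    DestrEntry* next;
    void* object;
    void (*destructor)(void*);
};

static DestrEntry* destr_list = nullptr;     // most recently registered first
static std::vector<std::string> destroyed;   // records the destruction order

// Analogue of ~cpp.destr.global: link the entry pd in front of the list.
void destr_global(DestrEntry* pd, void* object, void (*destructor)(void*)) {
    pd->next = destr_list;
    pd->object = object;
    pd->destructor = destructor;
    destr_list = pd;
}

// Analogue of the termination function: walk the list from its head, so
// that objects are destroyed in reverse order of registration.
void run_destructors() {
    while (destr_list != nullptr) {
        DestrEntry* e = destr_list;
        destr_list = e->next;
        e->destructor(e->object);
    }
}

// Stand-in "destructor" which just records the name of the object.
void destroy_name(void* p) {
    destroyed.push_back(*static_cast<std::string*>(p));
}
```
]]></programlisting>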
5896
  <para>
5897
  The implementation also needs to ensure that it calls the destructors
5898
  in this list at the end of the program, including calls of 
5899
  <code>exit</code>.  This is done by calling the token: 
5900
  <programlisting>
5901
  	~cpp.destr.init : () -&gt; EXP TOP
5902
  </programlisting>
5903
  at the start of each <code>initial_value</code> construct.  In the
5904
  default implementation this uses <code>atexit</code> to register a
5905
  function, <code>__TCPPLUS_term</code>, which calls the destructors.
5906
  To aid alternative implementations the token: 
5907
  <programlisting>
5908
  	~cpp.start : () -&gt; EXP TOP
5909
  </programlisting>
5910
  is called at the start of the <code>main</code> function, however
5911
  this has no effect in the default implementation. 
5912
  </para>
5913
  </sect3>  
5914
 
5915
  <sect3 id="except">
5916
    <title>2.6.16. Exception handling</title>
5917
  <para>
5918
  Conceptually, exception handling can be described in terms of the
5919
  following diagram: 
5920
 
5921
  <IMG SRC="../images/try.gif" ALT="try stack"/>
5922
 
5923
  At any point in the execution of the program there is a stack of currently
5924
  active <code>try</code> blocks and currently active local variables.
5925
  A 
5926
  <code>try</code> block is pushed onto the stack as it is entered and
5927
  popped from the stack when it is left (whether directly or via a jump).
5928
  A local variable with a non-trivial destructor is pushed onto the
5929
  stack just after its constructor has been called at the start of its
5930
  scope, and popped from the stack just before its destructor is called
5931
  at the end of its scope (including before jumps out of its scope).
5932
  Each element of an array is considered a separate object.  Each <code>try</code>
5933
  block has an associated list of handlers.  Each local variable has
5934
  an associated destructor. 
5935
  </para>
5936
  <para>
5937
  Provided no exception is thrown this stack grows and shrinks in a
5938
  well-behaved manner as execution proceeds.  When an exception is thrown
5939
  an exception manager is invoked to find a matching exception handler.
5940
  The exception manager proceeds to execute a loop to unwind the stack
5941
  as follows.  If the stack is empty then the exception cannot be caught
5942
  and 
5943
  <code>std::terminate</code> is called.  Otherwise the top element
5944
  is popped from the stack.  If this is a local variable then the associated
5945
  destructor is called for the variable.  If the top element is a 
5946
  <code>try</code> block then the current exception is compared in turn
5947
  to each of the associated handlers.  If a match is found then execution
5948
  jumps to the handler body, otherwise the exception manager continues
5949
  to the next element of the stack. 
5950
  </para>
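  <para>
  The unwinding loop can be simulated with a small data structure.  This
  is a model of the conceptual description only: handlers are matched by
  exact type name, destructors are represented by recording the variable
  name, and all of the names (<code>StackElement</code>,
  <code>UnwindResult</code>, <code>unwind</code>) are hypothetical.
  </para>
  <programlisting><![CDATA[
```cpp
#include <cassert>
#include <string>
#include <vector>

// One element of the conceptual stack: a try block or a local variable.
struct StackElement {
    bool is_try_block;
    std::vector<std::string> handlers;   // handler types, for a try block
    std::string variable;                // variable name, for a local variable
};

struct UnwindResult {
    bool caught;                          // false models a call to std::terminate
    std::string handler;                  // matching handler type, if caught
    std::vector<std::string> destroyed;   // locals destroyed, in order
};

// Pops elements until a try block with a matching handler is found.
UnwindResult unwind(std::vector<StackElement>& stack, const std::string& thrown) {
    UnwindResult r;
    r.caught = false;
    while (!stack.empty()) {
        StackElement top = stack.back();
        stack.pop_back();
        if (!top.is_try_block) {
            r.destroyed.push_back(top.variable);   // run the local's destructor
            continue;
        }
        for (const std::string& h : top.handlers) {
            if (h == thrown) {                     // exact match only in this sketch
                r.caught = true;
                r.handler = h;
                return r;
            }
        }
    }
    return r;   // stack exhausted: the exception cannot be caught
}
```
]]></programlisting>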
5951
  <para>
5952
  Note that this description is purely conceptual.  There is no need
5953
  for exception handling to be implemented by a stack in this way (although
5954
  the default implementation uses a similar technique).  It does however
5955
  serve to illustrate the various stages which must exist in any implementation.
5956
  </para>
5957
 
5958
  <H4>Try blocks</H4>
5959
  <para>
5960
  At the start of a <code>try</code> block a variable of shape: 
5961
  <programlisting>
5962
  	~cpp.try.type : () -&gt; SHAPE
5963
  </programlisting>
5964
  is declared corresponding to the stack element for this block.  This
5965
  is then initialised using the token: 
5966
  <programlisting>
5967
  	~cpp.try.begin : ( EXP ptb, EXP POINTER fa, EXP POINTER ca ) -&gt; EXP TOP
5968
  </programlisting>
5969
  where the first argument is a pointer to this variable, the second
  argument is the TDF <code>current_env</code> construct, and the third
  argument is the result of the TDF <code>make_local_lv</code> construct
  on the label which is used to mark the first handler associated with
  the block.  Note that the last two arguments enable a TDF
  <code>long_jump</code> construct to be applied to transfer control
  to the first handler.
  </para>
  <para>
5978
  When control exits from a <code>try</code> block, whether by reaching
5979
  the end of the block or jumping out of it, the block is removed from
5980
  the stack using the token: 
5981
  <programlisting>
5982
  	~cpp.try.end : ( EXP ptb ) -&gt; EXP TOP
5983
  </programlisting>
5984
  where the argument is a pointer to the <code>try</code> block variable.
5985
  </para>
5986
 
5987
  <H4>Local variables</H4>
5988
  <para>
5989
  The technique used to add a local variable with a non-trivial destructor
5990
  to the stack is similar to that used in the dynamic initialisation
5991
  of global variables.  A local variable of shape <code>~cpp.destr.type</code>
5992
  is declared at the start of the variable scope.  This is initialised
5993
  just after the constructor for the variable is called using the token:
5994
  <programlisting>
5995
  	~cpp.destr.local : ( EXP pd, EXP POINTER c, EXP PROC ) -&gt; EXP TOP
5996
  </programlisting>
5997
  where the first argument is a pointer to the variable being initialised,
5998
  the  second is a pointer to the local variable to be destroyed, and
5999
  the third is the destructor to be called.  At the end of the variable
6000
  scope, just before its destructor is called, the token: 
6001
  <programlisting>
6002
  	~cpp.destr.end : ( EXP pd ) -&gt; EXP TOP
6003
  </programlisting>
6004
  where the argument is a pointer to the destructor variable, is called
6005
  to remove the local variable destructor from the stack.  Note that
6006
  partially constructed objects are destroyed within their constructors
6007
  (see 
6008
  <A HREF="#partial">above</A>) so that only completely constructed
6009
  objects need to be considered. 
6010
  </para>
6011
  <para>
6012
  In cases where the local variable may be conditionally initialised
6013
  (for example a temporary variable in the second operand of a <code>||</code>
6014
  operation) the local variable of shape <code>~cpp.destr.type</code>
6015
  is initialised to the value given by the token: 
6016
  <programlisting>
6017
  	~cpp.destr.null : () -&gt; EXP d
6018
  </programlisting>
6019
  (normally it is  left uninitialised).  Before the destructor for this
6020
  variable is called the value of the token: 
6021
  <programlisting>
6022
  	~cpp.destr.ptr : ( EXP pd ) -&gt; EXP POINTER c
6023
  </programlisting>
6024
  is tested.  If <code>~cpp.destr.local</code> has been called for this
6025
  variable then this token returns a pointer to the variable, otherwise
6026
  it returns a null pointer.  The token <code>~cpp.destr.end</code>
6027
  and the destructor are only called if this token indicates that the
6028
  variable has been initialised. 
6029
  </para>
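  <para>
  The conditional case can be sketched in C++ as follows.  The record
  type and the three helper functions are hypothetical stand-ins for
  <code>~cpp.destr.type</code>, <code>~cpp.destr.null</code>,
  <code>~cpp.destr.local</code> and <code>~cpp.destr.ptr</code>; they
  only model the null test, and the stack maintenance performed by
  <code>~cpp.destr.end</code> is left out.
  </para>

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for a variable of shape ~cpp.destr.type.
struct DestrRecord {
    void* object;
    void (*destructor)(void*);
};

// Models ~cpp.destr.null: a record that refers to no object.
DestrRecord destr_null() { return DestrRecord{nullptr, nullptr}; }

// Models ~cpp.destr.local: records the object and its destructor.
void destr_local(DestrRecord& pd, void* c, void (*d)(void*)) {
    pd.object = c;
    pd.destructor = d;
}

// Models ~cpp.destr.ptr: null unless destr_local has been called.
void* destr_ptr(const DestrRecord& pd) { return pd.object; }

struct Temp {
    static int destroyed;
    bool flag;
    explicit Temp(bool f) : flag(f) {}
    ~Temp() { ++destroyed; }
};
int Temp::destroyed = 0;

void destroy_temp(void* p) { static_cast<Temp*>(p)->~Temp(); }

// Evaluates `lhs || Temp(rhs).flag`; the temporary exists only if lhs is false.
bool or_with_temporary(bool lhs, bool rhs) {
    DestrRecord pd = destr_null();
    alignas(Temp) unsigned char storage[sizeof(Temp)];
    bool result = lhs;
    if (!result) {
        Temp* t = new (storage) Temp(rhs);
        destr_local(pd, t, &destroy_temp);
        result = t->flag;
    }
    if (void* c = destr_ptr(pd)) {
        pd.destructor(c);  // runs only when the temporary was initialised
    }
    return result;
}
```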
6030
 
6031
  <H4>Throwing an exception</H4>
6032
  <para>
6033
  When a <code>throw</code> expression with an argument is encountered
6034
  a number of steps are performed.  Firstly, space is allocated to hold
6035
  the exception value using the token: 
6036
  <programlisting>
6037
  	~cpp.except.alloc : ( EXP VARIETY size_t ) -&gt; EXP pv
6038
  </programlisting>
6039
  the argument of which gives the size of the value.  The space allocated
6040
  is returned as an expression of type <code>void *</code>.  Secondly,
6041
  the exception value is copied into the space allocated, using a copy
6042
  constructor if appropriate.  Finally the exception is raised using
6043
  the token: 
6044
  <programlisting>
6045
  	~cpp.except.throw : ( EXP pv, EXP pti, EXP PROC ) -&gt; EXP BOTTOM
6046
  </programlisting>
6047
  The first argument gives the pointer to the exception value, returned
6048
  by 
6049
  <code>~cpp.except.alloc</code>, the second argument gives a pointer
6050
  to the run-time type information for the exception type, and the third
6051
  argument gives the destructor to be called to destroy the exception
6052
  value (if any). This token sets the current exception to the given
6053
  values and invokes the exception manager as above. 
6054
  </para>
6055
  <para>
6056
  A <code>throw</code> expression without an argument results in a call
6057
  to the token: 
6058
  <programlisting>
6059
  	~cpp.except.rethrow : () -&gt; EXP BOTTOM
6060
  </programlisting>
6061
  which re-invokes the exception manager with the current exception.
6062
  If there is no current exception then the implementation should call
6063
  <code>std::terminate</code>. 
6064
  </para>
6065
 
6066
  <H4>Handling an exception</H4>
6067
  <para>
6068
  The exception manager proceeds to find an exception handler in the manner
6069
  described above, unwinding the stack and calling destructors for local
6070
  variables.  When a <code>try</code> block is popped from the stack
6071
  a TDF <code>long_jump</code> is applied to transfer control to its
6072
  list of handlers.  For each handler in turn it is checked whether
6073
  the handler can catch the current exception.  For <code>...</code>
6074
  handlers this is always true; for other handlers it is checked using
6075
  the token: 
6076
  <programlisting>
6077
  	~cpp.except.catch : ( EXP pti ) -&gt; EXP VARIETY int
6078
  </programlisting>
6079
  where the argument is a pointer to the run-time type information for
6080
  the handler type.  This token gives 1 if the exception is caught by
6081
  this handler, and 0 otherwise.  If the exception is not caught by
6082
  the handler then the next handler is checked, until there are no more
6083
  handlers associated with the <code>try</code> block.  In this case
6084
  control is passed back to the exception manager by re-throwing the
6085
  current exception using <code>~cpp.except.rethrow</code>. 
6086
  </para>
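  <para>
  At the source level the same ordering is visible in ordinary C++:
  handlers are tried in the order written, and a <code>...</code> handler
  matches anything that reaches it.  The function name below is invented
  for the example.
  </para>

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Returns the name of the handler that catches the exception thrown for `which`.
std::string first_matching_handler(int which) {
    try {
        if (which == 0) throw 42;
        if (which == 1) throw std::runtime_error("failure");
        throw 'c';
    } catch (int) {
        return "int";             // checked first
    } catch (const std::exception&) {
        return "std::exception";  // matches via the base class
    } catch (...) {
        return "...";             // always matches
    }
}
```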
6087
  <para>
6088
  If an exception is caught by a handler then a number of steps are
6089
  performed. Firstly, if appropriate, the handler variable is initialised
6090
  by copying the current exception value.  A pointer to the current
6091
  exception value can be obtained using the token: 
6092
  <programlisting>
6093
  	~cpp.except.value : () -&gt; EXP pv
6094
  </programlisting>
6095
  Once this initialisation is complete the token: 
6096
  <programlisting>
6097
  	~cpp.except.caught : () -&gt; EXP TOP
6098
  </programlisting>
6099
  is called to indicate that the exception has been caught.  The handler
6100
  body is then executed.  When control exits from the handler, whether
6101
  by reaching the end of the handler or by jumping out of it, the token:
6102
  <programlisting>
6103
  	~cpp.except.end : () -&gt; EXP TOP
6104
  </programlisting>
6105
  is called to indicate that handling of the exception is complete.  Note
6106
  that the implementation should call the destructor for the current
6107
  exception and free the space allocated by <code>~cpp.except.alloc</code>
6108
  at this point. Execution then continues with the statement following
6109
  the handler. 
6110
  </para>
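  <para>
  The lifetime rule stated above (the exception value is destroyed when
  control leaves the handler) can be observed from ordinary C++.  The
  <code>Tracked</code> class is a hypothetical helper that counts live
  objects; the number of copies made while throwing may vary between
  compilers, so only the final count is meaningful.
  </para>

```cpp
#include <cassert>

// Counts how many Tracked objects are currently alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    Tracked(const Tracked&) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Returns the number of live objects observed inside the handler.
int live_objects_inside_handler() {
    int seen = 0;
    try {
        throw Tracked();
    } catch (const Tracked&) {
        seen = Tracked::live;  // the exception value still exists here
    }
    return seen;               // by now the exception value has been destroyed
}
```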
6111
  <para>
6112
  To conclude, the TDF generated for a <code>try</code> block and its
6113
  associated list of handlers has the form: 
6114
  <programlisting>
  	variable (
  	    long_jump_access,
  	    stack_tag,
  	    make_value ( ~cpp.try.type ),
  	    conditional (
  		handler_label,
  		sequence (
  		    ~cpp.try.begin (
  			obtain_tag ( stack_tag ),
  			current_env,
  			make_local_lv ( handler_label ) ),
  		    <I>try-block-body</I>,
  		    ~cpp.try.end ),
  		    conditional (
  			catch_label_1,
  			sequence (
  			    integer_test (
  				not_equal,
  				catch_label_1,
  				~cpp.except.catch (
  				    <I>handler-1-typeid</I> ) )
  			    variable (
  				handler_tag_1,
  				<I>handler-1-init</I> (
  				    ~cpp.except.value ),
  				sequence (
  				    ~cpp.except.caught,
  				    <I>handler-1-body</I> ) )
  			    ~cpp.except.end )
  			conditional (
  			    catch_label_2,
  			    <I>further-handlers</I>,
  			    ~cpp.except.rethrow ) ) ) )
  </programlisting>
6149
  </para>
6150
  <para>
6151
  Note that for a local variable to maintain its previous value when
6152
  an  exception is caught in this way it is necessary to declare it
6153
  using the TDF <code>long_jump_access</code> construct.  Any local
6154
  variable which contains a <code>try</code> block in its scope is declared
6155
  in this way. 
6156
  </para>
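  <para>
  The closest ISO C and C++ analogue of this requirement is the rule
  for <code>setjmp</code> and <code>longjmp</code>: a local variable
  modified between the two calls only has a reliable value afterwards
  if it is declared <code>volatile</code>.  The sketch below is an analogy
  in those terms, not generated TDF; the function name is invented.
  </para>

```cpp
#include <cassert>
#include <csetjmp>

// Returns the value of `progress` observed after control has jumped back.
int run_with_jump() {
    std::jmp_buf env;           // plays the role of the saved frame and code pointers
    volatile int progress = 0;  // volatile so that the value survives the jump
    if (setjmp(env) == 0) {
        progress = 1;
        progress = 2;
        std::longjmp(env, 1);   // analogous to a long_jump to the handler label
    }
    return progress;            // reached only after the jump
}
```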
6157
  <para>
6158
  To aid implementations in the writing of exception managers the following
6159
  standard tokens are provided: 
6160
  <programlisting>
  	~cpp.ptr.code : () -&gt; SHAPE POINTER ca
  	~cpp.ptr.frame : () -&gt; SHAPE POINTER fa
  	~cpp.except.jump : ( EXP POINTER fa, EXP POINTER ca ) -&gt; EXP BOTTOM
  </programlisting>
  These give the shape of the TDF <code>make_local_lv</code> construct,
  the shape of the TDF <code>current_env</code> construct, and direct
  access to the TDF <code>long_jump</code> construct.  The exception manager
  in the default implementation is a function called <code>__TCPPLUS_throw</code>.
6169
  </para>
6170
 
6171
  <H4>Exception specifications</H4>
6172
  <para>
6173
  If a function is declared with an exception specification then extra
6174
  code needs to be generated in the function definition to catch any
6175
  unexpected exceptions thrown by the function and to call <code>std::unexpected</code>.
  Since this is a potentially high overhead for small functions,
6177
  this extra code is not generated if it can be proved that such unexpected
6178
  exceptions can never be thrown (the analysis is essentially the same
6179
  as that in the 
6180
  <A HREF="pragma.html#exception">exception analysis</A> check). 
6181
  </para>
6182
  <para>
6183
  Exception specifications are implemented by enclosing the entire
6184
  function definition in a <code>try</code> block.  The handler for
6185
  this block uses <code>~cpp.except.catch</code> to check whether the
6186
  current exception can be caught by any of the types listed in the
6187
  exception specification.  If so the current exception is re-thrown.
6188
  If none of these types catch the current exception then the token:
6189
  <programlisting>
6190
  	~cpp.except.bad : ( SIGNED_NAT ) -&gt; EXP TOP
6191
  </programlisting>
6192
  is called.  The argument is 1 if the exception specification includes
6193
  the special type <code>std::bad_exception</code>, and 0 otherwise.
6194
  The implementation should call <code>std::unexpected</code>, but how
6195
  any exceptions thrown during this call are to be handled depends on
6196
  the value of the argument. 
6197
  </para>
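  <para>
  The wrapping described above can be emulated by hand.  The sketch below
  models a function declared with the specification
  <code>throw ( int, std::bad_exception )</code>.  It is a simplified
  model under stated assumptions: the function names are invented,
  <code>std::unexpected</code> is not called (it is absent from recent
  C++ standards), and the unexpected handler is assumed to fail, so that
  the exception is replaced by <code>std::bad_exception</code>.
  </para>

```cpp
#include <cassert>
#include <exception>

void throws_int() { throw 1; }
void throws_char() { throw 'x'; }

// Emulates calling a function declared `throw ( int, std::bad_exception )`.
void call_with_specification(void (*body)()) {
    try {
        body();
    } catch (int) {
        throw;  // listed type: the current exception is re-thrown unchanged
    } catch (const std::bad_exception&) {
        throw;  // listed type
    } catch (...) {
        // Unlisted type: this is where ~cpp.except.bad would be invoked.
        // Assumption: the unexpected handler itself fails, so the exception
        // is replaced by std::bad_exception because the specification lists it.
        throw std::bad_exception();
    }
}
```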
6198
  </sect3>  
6199
 
6200
  <sect3 id="mangle">
6201
    <title>2.6.17. Mangled identifier names</title>
6202
  <para>
6203
  In a similar fashion to other C++ compilers, the C++ producer needs
6204
  a method of mapping C++ identifiers to a form suitable for further
6205
  processing, namely TDF tag names.  This mangled name contains an encoding
6206
  of the identifier name, its parent namespace or class and its type.
6207
  Identifiers with C linkage are not mangled.  The producer contains
6208
  a built-in <A HREF="man.html#unmangle">name unmangler</A>
6209
  which performs the reverse operation of transforming the mangled form
6210
  of an identifier name back to the underlying identifier.  This can
6211
  be useful when analysing system linker errors. 
6212
  </para>
6213
  <para>
6214
  Note that the type of an identifier forms part of its mangled name
6215
  not only for functions, but also for variables.  Many other compilers
6216
  do not mangle variable names; however, the ISO C++ rules on namespaces
6217
  and variables with C linkage make it necessary (this can be suppressed
6218
  using the <code>-j-n</code> command-line option).  Declaring the language
6219
  linkage of a variable inconsistently can therefore lead to linking
6220
  errors with the C++ producer which are not detected by other compilers.
6221
  A common example is: 
6222
  <programlisting>
6223
  	extern int errno ;
6224
  </programlisting>
6225
  which, leaving aside whether <code>errno</code> is actually an external
6226
  variable, should be: 
6227
  <programlisting>
6228
  	extern &quot;C&quot; int errno ;
6229
  </programlisting>
6230
  </para>
6231
  <para>
6232
  As described above, the mangled form of an identifier has three components:
6233
  the identifier name, the identifier namespace and the identifier type.
6234
  Two underscores (<code>__</code>) are used to separate the name component
6235
  from the namespace and type components.  The mangling scheme used
6236
  is based on that described in the ARM.  The description below is not
6237
  complete; the mangling and unmangling routines themselves should be
6238
  consulted for a complete description. 
6239
  </para>
6240
 
6241
  <H4>Mangling identifier names</H4>
6242
  <para>
6243
  Simple identifier names are mapped to themselves.  Unicode characters
6244
  of the forms <code>\u</code><I>xxxx</I> and <code>\U</code><I>xxxxxxxx</I>
6245
  are mapped to <code>__k</code><I>xxxx</I> and <code>__K</code><I>xxxxxxxx</I>
6246
  respectively, where the hex digits are output in their canonical lower-case
6247
  form.  Constructors are mapped to <code>__ct</code> and destructors
6248
  to <code>__dt</code>.  Conversion functions are mapped to 
6249
  <code>__op</code><I>type</I> where <I>type</I> is the mangled form
6250
  of the conversion type.  Overloaded operator functions, 
6251
  <code>operator@</code>, are mapped as follows: 
6252
  </para>
6253
 
6254
  <table>
  <tr><th>Operator</th><th>Mapping</th> <th>Operator</th><th>Mapping</th> <th>Operator</th><th>Mapping</th></tr>
  <tr><td>&amp;</td><td>__ad</td> <td>&amp;=</td><td>__aad</td> <td>[]</td><td>__vc</td></tr>
  <tr><td>-&gt;</td><td>__rf</td> <td>-&gt;*</td><td>__rm</td> <td>=</td><td>__as</td></tr>
  <tr><td>,</td><td>__cm</td> <td>~</td><td>__co</td> <td>/</td><td>__dv</td></tr>
  <tr><td>/=</td><td>__adv</td> <td>==</td><td>__eq</td> <td>()</td><td>__cl</td></tr>
  <tr><td>&gt;</td><td>__gt</td> <td>&gt;=</td><td>__ge</td> <td>&lt;</td><td>__lt</td></tr>
  <tr><td>&lt;=</td><td>__le</td> <td>&amp;&amp;</td><td>__aa</td> <td>||</td><td>__oo</td></tr>
  <tr><td>&lt;&lt;</td><td>__ls</td> <td>&lt;&lt;=</td><td>__als</td> <td>-</td><td>__mi</td></tr>
  <tr><td>-=</td><td>__ami</td> <td>--</td><td>__mm</td> <td>!</td><td>__nt</td></tr>
  <tr><td>!=</td><td>__ne</td> <td>|</td><td>__or</td> <td>|=</td><td>__aor</td></tr>
  <tr><td>+</td><td>__pl</td> <td>+=</td><td>__apl</td> <td>++</td><td>__pp</td></tr>
  <tr><td>%</td><td>__md</td> <td>%=</td><td>__amd</td> <td>&gt;&gt;</td><td>__rs</td></tr>
  <tr><td>&gt;&gt;=</td><td>__ars</td> <td>*</td><td>__ml</td> <td>*=</td><td>__aml</td></tr>
  <tr><td>^</td><td>__er</td> <td>^=</td><td>__aer</td> <td>delete</td><td>__dl</td></tr>
  <tr><td>delete []</td><td>__vd</td> <td>new</td><td>__nw</td> <td>new []</td><td>__vn</td></tr>
  <tr><td>?:</td><td>__cn</td> <td>:</td><td>__cs</td> <td>::</td><td>__cc</td></tr>
  <tr><td>.</td><td>__df</td> <td>.*</td><td>__dm</td> <td>abs</td><td>__ab</td></tr>
  <tr><td>max</td><td>__mx</td> <td>min</td><td>__mn</td> <td>sizeof</td><td>__sz</td></tr>
  <tr><td>typeid</td><td>__td</td> <td>vtable</td><td>__tb</td> <td>-</td><td>-</td></tr>
  </table>
6332
 
6333
  <para>
6334
  Note that this table contains a number of operators which are not
6335
  part of C++ or cannot be overloaded in C++.  These are used in the
6336
  representation of target dependent integer constants. 
6337
  </para>
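  <para>
  A lookup of this kind is naturally expressed as a table.  The sketch
  below copies a handful of entries from the table above into a map;
  the function name <code>mangle_operator</code> is invented, and the
  producer's own mangling routine is not claimed to be organised this
  way.
  </para>

```cpp
#include <cassert>
#include <map>
#include <string>

// Maps an operator spelling to its mangled name; returns "" if not listed.
std::string mangle_operator(const std::string& op) {
    // A subset of the documented mappings, for illustration only.
    static const std::map<std::string, std::string> table = {
        {"+", "__pl"},   {"+=", "__apl"}, {"++", "__pp"},  {"-", "__mi"},
        {"=", "__as"},   {"==", "__eq"},  {"!", "__nt"},   {"<<", "__ls"},
        {"()", "__cl"},  {"[]", "__vc"},  {"new", "__nw"}, {"delete", "__dl"},
    };
    std::map<std::string, std::string>::const_iterator it = table.find(op);
    return it == table.end() ? std::string() : it->second;
}
```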
6338
 
6339
  <H4>Mangling namespace names</H4>
6340
  <para>
6341
  The global namespace is mapped to an empty string.  Simple namespace
6342
  and class names are mapped as above, but are preceded by a series
6343
  of decimal digits giving the length of the mangled name.  Nested namespaces
6344
  and classes are represented by a sequence of such namespace names,
6345
  preceded by the number of elements in the sequence.  This takes the
6346
  form <code>Q</code><I>digit</I> if there are fewer than 10 elements,
  or <code>Q_</code><I>digits</I><code>_</code> if there are 10 or
  more. Note that members of anonymous classes or namespaces are local
6350
  to their translation unit, and so do not have external tag names.
6351
  </para>
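  <para>
  The length-prefix and <code>Q</code> rules can be sketched as below.
  The function name <code>mangle_scope</code> is invented, and the sketch
  assumes that every component is a simple identifier which maps to itself,
  so that the mangled length equals the identifier length.
  </para>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Encodes a chain of enclosing namespaces or classes, outermost first.
// An empty chain stands for the global namespace.
std::string mangle_scope(const std::vector<std::string>& names) {
    std::string out;
    if (names.size() > 1) {
        // Nested scopes are preceded by the number of elements.
        if (names.size() < 10) {
            out = "Q" + std::to_string(names.size());
        } else {
            out = "Q_" + std::to_string(names.size()) + "_";
        }
    }
    for (const std::string& name : names) {
        out += std::to_string(name.size()) + name;  // decimal length, then the name
    }
    return out;
}
```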
6352
 
6353
  <H4>Mangling types</H4>
6354
  <para>
6355
  The mangling of types is essentially similar to that used in the 
6356
  <A HREF="dump.html">symbol table dump</A> format.  The type used in
6357
  the mangled name for an identifier ignores the return type for a function
6358
  and ignores the most significant bound for an array. 
6359
  </para>
6360
  <para>
6361
  The built-in types are mapped in precisely the same way as in the
6362
  <A HREF="dump.html#built-in">symbol table dump</A>.  Class and enumeration
6363
  types are mapped to their type names mangled in the same way as the
6364
  namespace names above.  The exception to this is that in a class member,
6365
  the parent class is mapped to <code>X</code>. 
6366
  </para>
6367
  <para>
6368
  The composite types are again mapped in a similar fashion to that
6369
  in the <A HREF="dump.html#composite">dump file</A>.  For example,
6370
  <code>PCc</code> represents <code>const char *</code>.  The only difficult
6371
  case concerns function parameter types where the ARM 
6372
  <code>T</code> and <code>N</code> encodings are used for duplicate
6373
  parameter types.  The function return type is included in the mangled
6374
  form except for function identifier types.  In the cases where the
6375
  identifier is known always to represent a function (constructors,
6376
  destructors etc.) the initial <code>F</code>
6377
  indicating a function type is also omitted. 
6378
  </para>
6379
  <para>
6380
  The types of template functions and classes are represented by the
6381
  underlying template and the template arguments giving rise to the
6382
  instance.  Template classes are preceded by <code>t</code>; template
6383
  functions are preceded by <code>G</code> rather than <code>F</code>.
6384
  Type arguments are represented by <code>Z</code> followed by the type
6385
  value; non-type arguments are represented by the argument type followed
6386
  by the argument value.  In the underlying type the template parameters
6387
  are represented by <code>m0</code>, <code>m1</code> etc. An alternative
6388
  scheme, in which the mangled form of a template function includes
6389
  the type of that instance, rather than the underlying template, can
6390
  be enabled using the <code>-j-f</code>
6391
  command-line option. 
6392
  </para>
6393
 
6394
  <H4><A id="other">Other mangled names</A></H4>
6395
  <para>
6396
  The <A HREF="#vtable">virtual function table</A> for a class, when
6397
  this is a variable with external linkage, is named <code>__vt__</code><I>type</I>,
  where <I>type</I> is the mangled form of the class name.  The
6399
  virtual function table for a base class is named <code>__vt__</code><I>base</I>
6400
  where <I>base</I> is a sequence of mangled class names specifying
6401
  the base class.  The <A HREF="#rtti">run-time type information structure</A>
6402
  for a type, when this is a variable with external linkage, is named
6403
  <code>__ti__</code><I>type</I>, where <I>type</I> is the mangled form
6404
  of the type name. 
6405
  </para>
6406
 
6407
  <H4>Mangled name examples</H4>
6408
  <para>
6409
  The following gives some examples of the name mangling scheme: 
6410
  <programlisting>
  	class A {
  	    static int a ;			// a__1Ai
  	public :
  	    A () ;				// __ct__1A
  	    A ( int ) ;				// __ct__1Ai
  	    A ( const A &amp; ) ;			// __ct__1ARCX
  	    virtual ~A () ;			// __dt__1A
  	    operator bool () ;			// __opb__1A
  	    bool operator! () ;			// __nt__1A
  	} ;

  	// virtual function table	__vt__1A
  	// run-time type information	__ti__1A

  	int f ( A *, int, A * ) ;		// f__FP1AiT1
  	int b = 2 ;				// b__i
  	int c [3] ;				// c__A_i

  	namespace N {
  	    int *p = 0 ;			// p__1NPi
  	}
  </programlisting>
6433
  </para>
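  <para>
  The way the three components are joined in these examples can be
  reproduced with simple string concatenation.  The helper names below
  are invented, the namespace and type codes are passed in already
  encoded, and the sketch has only been checked against the examples
  listed above.
  </para>

```cpp
#include <cassert>
#include <string>

// Joins the name component to the namespace and type components with "__".
std::string mangled_name(const std::string& name,
                         const std::string& scope_code,
                         const std::string& type_code) {
    return name + "__" + scope_code + type_code;
}

// Names of the virtual function table and run-time type information variables.
std::string vtable_name(const std::string& class_code) { return "__vt__" + class_code; }
std::string typeinfo_name(const std::string& type_code) { return "__ti__" + type_code; }
```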
6434
  </sect3>
6435
  </sect2>
6436
 
6437
  <sect2>
6438
    <title>2.7. Standard library</title>
6439
  <para>
6440
  At present the default implementation contains only a very small fraction
6441
  of the ISO C++ library, namely those headers - 
6442
  <code>&lt;exception&gt;</code>, <code>&lt;new&gt;</code> and 
6443
  <code>&lt;typeinfo&gt;</code> - which are an integral part of the
6444
  language specification.  These headers are also those which require
6445
  the most cooperation between the producer and the library implementation,
6446
  as described in the <A HREF="lib.html">previous section</A>. 
6447
  </para>
6448
  <para>
6449
  It is suggested that if further library components are required then
6450
  they be acquired from third parties.  It should be noted however that
6451
  such libraries may require <A HREF="#porting">some effort</A> to be
6452
  ported to an ISO compliant compiler; for example, some information
6453
  on porting the <code>libio</code> component of <code>libg++</code>,
6454
  which contains some very compiler-dependent code, is 
6455
  <A HREF="#libio">given below</A>.  Libraries compiled with other C++
6456
  compilers may not link correctly with modules compiled using <code>tcc</code>.
6457
  </para>
6458
 
6459
 
6460
  <sect3 id="porting">
6461
    <title>2.7.1. Common porting problems</title>
6462
  <para>
6463
  Experience in porting pre-ISO C++ programs has shown that the following
6464
  new ISO C++ features tend to cause the most problems: 
6465
  <itemizedlist>
6466
  <listitem><A HREF="pragma.html#implicit">Implicit <code>int</code></A> has
6467
  been banned. 
6468
  </listitem>
6469
  <listitem><A HREF="pragma.html#string">String literals are now <code>const</code>
6470
  </A>, although in simple assignments the <code>const</code> is
6471
  implicitly removed. 
6472
  </listitem>
6473
  <listitem>The scope of a <A HREF="pragma.html#for">variable declared in
6474
  a for-init-statement</A> is the <code>for</code> statement itself.
6475
  </listitem>
6476
  <listitem><A HREF="lib.html#mangle">Variables have linkage</A> and so should
6477
  be declared <code>extern &quot;C&quot;</code> if appropriate. 
6478
  </listitem>
6479
  <listitem>The standard C library is now declared in the <code>std</code>
6480
  namespace. 
6481
  </listitem>
6482
  <listitem>The <A HREF="pragma.html#template">template compilation model</A>
6483
  has been clarified.  The notation for explicit instantiation and 
6484
  specialisation has changed. 
6485
  </listitem>
6486
  <listitem>Templates are analysed at their point of definition as well as
6487
  their point of instantiation. 
6488
  </listitem>
6489
  <listitem><A HREF="pragma.html#keyword">New keywords</A> have been introduced.
6490
  </listitem>
6491
  </itemizedlist>
6492
  Note that many of these features have controlling <code>#pragma</code>
6493
  directives, so that it is possible to switch to using the pre-ISO
6494
  features. 
6495
  </para>
6496
  </sect3>  
6497
 
6498
  <sect3 id="libio">
6499
    <title>2.7.2. Porting <code>libio</code></title>
6500
  <para>
6501
  Perhaps the library component which is most likely to be required
6502
  is 
6503
  <code>&lt;iostream&gt;</code>.  A readily available freeware implementation
6504
  of a pre-ISO (i.e. non-template) <code>&lt;iostream&gt;</code>
6505
  package is given by the <code>libio</code> component of <code>libg++</code>.
6506
  This section describes some of the problems encountered in porting
6507
  this package (version 2.7.1).  
6508
  </para>
6509
  <para>
6510
  The <A HREF="man.html"><code>tcc</code> compiler flags</A> used in
6511
  porting <code>libio</code> were: 
6512
  <programlisting>
6513
  	tcc -Yposix -Yc++ -sC:cc
6514
  </programlisting>
6515
  indicating that the POSIX API is to be used and that the <code>.cc</code>
6516
  suffix is used to identify C++ source files. 
6517
  </para>
6518
  <para>
6519
  In <code>iostream.h</code>, <code>cin</code>, <code>cout</code>, 
6520
  <code>cerr</code> and <code>clog</code> should be declared with C
6521
  linkage, otherwise the C++ producer includes the type in the 
6522
  <A HREF="lib.html#mangle">mangled name</A> and the fake 
6523
  <code>iostream</code> hacks in <code>stdstream.cc</code> don't work.
6524
  The definition of <code>EOF</code> in this header can cause problems
6525
  if both <code>iostream.h</code> and <code>stdio.h</code> are included.
6526
  In this case <code>stdio.h</code> should be included first. 
6527
  </para>
6528
  <para>
6529
  In <code>stdstream.cc</code>, the <A HREF="lib.html#derive">correct
6530
  definitions</A> for the fake <code>iostream</code> structures are
6531
  as follows: 
6532
  <programlisting>
  	struct _fake_istream::myfields {
  	    _ios_fields *vb ;		// pointer to virtual base class ios
  	    _IO_ssize_t _gcount ;	// istream fields
  	    void *vptr ;		// pointer to virtual function table
  	} ;

  	struct _fake_ostream::myfields {
  	    _ios_fields *vb ;		// pointer to virtual base class ios
  	    void *vptr ;		// pointer to virtual function table
  	} ;
  </programlisting>
  The fake definition macros are then defined as follows:
  <programlisting>
  	#define OSTREAM_DEF( NAME, SBUF, TIE, EXTRA_FLAGS )\
  	    extern &quot;C&quot; _fake_ostream NAME = { { &amp;NAME.base, 0 }, .... } ;

  	#define ISTREAM_DEF( NAME, SBUF, TIE, EXTRA_FLAGS )\
  	    extern &quot;C&quot; _fake_istream NAME = { { &amp;NAME.base, 0, 0 }, .... } ;
  </programlisting>
6552
  Note that these are declared with C linkage as above. 
6553
  </para>
6554
  <para>
6555
  In <code>stdstrbufs.cc</code>, the <A HREF="lib.html#other">correct
6556
  definitions</A> for the virtual function table names are as follows:
6557
  <programlisting>
6558
  	#define filebuf_vtable		__vt__7filebuf
6559
  	#define stdiobuf_vtable		__vt__8stdiobuf
6560
  </programlisting>
6561
  Note that the <code>_G_VTABLE_LABEL_PREFIX</code> macro is incorrectly
6562
  defined by the configuration process (it should be <code>__vt__</code>),
6563
  but the <code>##</code> directives in which it is used don't work
6564
  on an ISO compliant preprocessor anyway (token concatenation takes
6565
  place after replacement of macro parameters, but before further macro
6566
  expansion). The dummy virtual function tables should also be declared
6567
  with C linkage to suppress name mangling. 
6568
  </para>
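  <para>
  The ordering of concatenation and macro expansion can be demonstrated
  with a small, self-contained example.  The macro names are invented
  (<code>VT_PREFIX</code> stands in for <code>_G_VTABLE_LABEL_PREFIX</code>);
  the results are turned into strings so that they can be inspected
  without declaring any such identifiers.
  </para>

```cpp
#include <cassert>
#include <string>

#define VT_PREFIX __vt__

// Direct use of ##: the operands are pasted before VT_PREFIX is expanded.
#define PASTE_DIRECT(a, b) a##b
// Extra level of indirection: the arguments are expanded first, then pasted.
#define PASTE_EXPANDED(a, b) PASTE_DIRECT(a, b)

#define STRINGISE_DIRECT(x) #x
#define STRINGISE(x) STRINGISE_DIRECT(x)

std::string direct_paste() { return STRINGISE(PASTE_DIRECT(VT_PREFIX, 7filebuf)); }
std::string expanded_paste() { return STRINGISE(PASTE_EXPANDED(VT_PREFIX, 7filebuf)); }
```

  <para>
  Here <code>direct_paste</code> yields the unexpanded macro name joined
  to the suffix, whereas <code>expanded_paste</code> yields the intended
  table name.
  </para>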
6569
  <para>
  In addition, the initialisation of the standard streams relies on
  the file pointers <code>stdout</code> etc. being constant expressions,
  which in general they are not.  The directive:
  <programlisting>
  	#pragma TenDRA++ rvalue token as const allow
  </programlisting>
  will cause the C++ producer to assume that all
  <A HREF="token.html#exp">tokenised rvalue expressions</A> are constant.
  </para>
6577
  <para>
6578
  In <code>streambuf.cc</code>, if <code>errno</code> is to be explicitly
6579
  declared it should have C linkage or be declared in the <code>std</code>
6580
  namespace. 
6581
  </para>
6582
  <para>
6583
  In <code>iomanip.cc</code>, the explicit template instantiations should
6584
  be prefixed by <code>template</code>.  The corresponding template
6585
  declarations in <code>iomanip.h</code> should be declared using 
6586
  <A HREF="pragma.html#template"><code>export</code></A> (note that
6587
  the <code>__GNUG__</code> version uses <code>extern</code>, which
6588
  may yet win out over <code>export</code>). 
6589
  </para>
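  <para>
  For reference, the ISO form of an explicit instantiation looks as
  follows.  The template <code>twice</code> is a made-up example and is
  not taken from <code>iomanip.h</code>; <code>export</code> is not
  shown because it was later removed from the language.
  </para>

```cpp
#include <cassert>

// A hypothetical function template.
template <class T>
T twice(T value) {
    return value + value;
}

// ISO explicit instantiation: the declaration is prefixed by `template`.
template int twice<int>(int);
```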
6590
  </sect3>
6591
  </sect2>
6592
  </sect1>
6593
 
6594
  <sect1>
6595
  <title>
6596
  C++ Producer Guide: Style guide 
6597
  </title>
6598
 
6599
  <sect2>
6600
    <title>3.1. Source code organisation</title>
6601
  <para>
6602
  This section describes the basic organisation of the source code for
6603
  the C++ producer.  This includes the coding conventions applied, the
6604
  application programming interface (API) observed and the division
6605
  of the code into separate modules. 
6606
  </para>
6607
 
6608
 
6609
  <sect3 id="language">
6610
    <title>3.1.1. C coding standard</title>
6611
  <para>
6612
  The C++ producer is written in a subset of C which is compatible with
6613
  C++ (it compiles with most C compilers, but also bootstraps itself).
6614
  It has been written to conform to the local (OSSG) 
6615
  <A HREF="index.html#cstyle">C coding standard</A>; most of the conformance
6616
  checking being automated by use of a 
6617
  <A HREF="pragma.html#usr">user-defined compilation profile</A>, 
6618
  <code>ossg_std.h</code>.  The standard macros described in the coding
6619
  standard are defined in the standard header <code>ossg.h</code>. This
6620
  is included from the header <code>config.h</code> which is included
6621
  by all source files.  The default definitions for these macros, set
6622
  according to the value of <code>__STDC__</code> and other compiler-defined
6623
  macros, should be correct, but they can be overridden by defining
6624
  the <code>FS_*</code> macros, described in the header, as command-line
6625
  options. 
6626
  </para>
6627
  <para>
6628
  The most important of these macros are those used to handle function
6629
  prototypes, enabling both ISO and pre-ISO C compilers to be accommodated.
6630
  Simple function definitions take the form: 
6631
  <programlisting>
  	ret function
  	    PROTO_N ( ( p1, p2, ...., pn ) )
  	    PROTO_T ( par1 p1 X par2 p2 X .... X parn pn )
  	{
  	    ....
  	}
  </programlisting>
6639
  with the <code>PROTO_N</code> macro being used to list the parameter
6640
  names (note the double bracket) and the <code>PROTO_T</code> macro
6641
  being used to list the parameter types using <code>X</code> (cartesian
6642
  product) as a separator.  The corresponding function declaration will
6643
  have the form: 
6644
  <programlisting>
6645
  	ret function PROTO_S ( ( par1, par2, ...., parn ) ) ;
6646
  </programlisting>
6647
  The case where there are no parameter types is defined using: 
6648
  <programlisting>
  	ret function
  	    PROTO_Z ()
  	{
  	    ....
  	}
  </programlisting>
6655
  and declared as: 
6656
  <programlisting>
6657
  	ret function PROTO_S ( ( void ) ) ;
6658
  </programlisting>
6659
  Functions with ellipses are defined using: 
6660
  <programlisting>
  	#if FS_STDARG
  	#include &lt;stdarg.h&gt;
  	#else
  	#include &lt;varargs.h&gt;
  	#endif

  	ret function
  	    PROTO_V ( ( par1 p1, par2 p2, ...., parn pn, ... ) )
  	{
  	    va_list args ;
  	    ....
  	#if FS_STDARG
  	    va_start ( args, pn ) ;
  	#else
  	    par1 p1 ;
  	    par2 p2 ;
  	    ....
  	    parn pn ;
  	    va_start ( args ) ;
  	    p1 = va_arg ( args, par1 ) ;
  	    p2 = va_arg ( args, par2 ) ;
  	    ....
  	    pn = va_arg ( args, parn ) ;
  	#endif
  	    ....
  	    va_end ( args ) ;
  	    ....
  	}
  </programlisting>
6690
  and declared as: 
6691
  <programlisting>
6692
  	ret function PROTO_W ( ( par1, par2, ...., parn, ... ) ) ;
6693
  </programlisting>
  Note that <code>&lt;varargs.h&gt;</code> does not allow for named parameters
  preceding <code>va_alist</code>, so the fixed parameters must be
  extracted explicitly from <code>args</code> using <code>va_arg</code>.
  </para>
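  <para>
  As an illustration of the scheme just described, the following is a
  minimal sketch of how such prototype macros could be defined and used.
  It is not copied from <code>ossg.h</code>: the switch
  <code>FS_PROTOTYPES</code> and the functions <code>clamp_index</code>
  and <code>default_index</code> are names assumed for this example only,
  and the real header may define the macros differently.
  </para>
  <programlisting><![CDATA[
```c
#include <assert.h>

/* Assumption: FS_PROTOTYPES is a hypothetical switch for this sketch;
   it stands in for the FS_* flags and is not taken from ossg.h. */
#ifndef FS_PROTOTYPES
#define FS_PROTOTYPES 1
#endif

#if FS_PROTOTYPES
/* ISO C: keep the typed parameter list, drop the bare name list. */
#define PROTO_S( types )	types
#define PROTO_N( names )
#define PROTO_T( params )	( params )
#define PROTO_Z()		( void )
#define X			,
#else
/* Pre-ISO C: keep the name list, then declare each parameter. */
#define PROTO_S( types )	()
#define PROTO_N( names )	names
#define PROTO_T( params )	params ;
#define PROTO_Z()		()
#define X			;
#endif

/* Declarations, written once for both dialects. */
int clamp_index PROTO_S ( ( int, int ) ) ;
int default_index PROTO_S ( ( void ) ) ;

/* Clamp i to the range 0 .. n - 1. */
int clamp_index
    PROTO_N ( ( i, n ) )
    PROTO_T ( int i X int n )
{
    assert ( n > 0 ) ;
    if ( i < 0 ) return 0 ;
    if ( i >= n ) return n - 1 ;
    return i ;
}

/* A function with no parameters. */
int default_index
    PROTO_Z ()
{
    return 0 ;
}
```
  ]]></programlisting>
  <para>
  With the ISO definitions the first function expands to
  <code>int clamp_index ( int i , int n )</code>; with the pre-ISO
  definitions it expands to the old-style form
  <code>int clamp_index ( i, n ) int i ; int n ;</code>.  The double
  bracket lets a comma-separated list travel as a single macro argument.
  </para>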
  <para>
  The following <A HREF="pragma.html#keyword">TenDRA keywords</A> are
  defined (with suitable default values for non-TenDRA compilers):
  <programlisting>
  	#pragma TenDRA keyword SET for set
  	#pragma TenDRA keyword UNUSED for discard variable
  	#pragma TenDRA keyword IGNORE for discard value
  	#pragma TenDRA keyword EXHAUSTIVE for exhaustive
  	#pragma TenDRA keyword REACHED for set reachable
  	#pragma TenDRA keyword UNREACHED for set unreachable
  	#pragma TenDRA keyword FALL_THROUGH for fall into case
  </programlisting>
  </para>
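  <para>
  The "suitable default values" are not spelled out here.  The sketch
  below shows one plausible set of fallbacks together with a small function
  that uses the keywords.  The fallback definitions, the use of
  <code>__TenDRA__</code> as the test for the TenDRA compiler, and the
  function <code>levels_enabled</code> are assumptions made for this
  example; the definitions in <code>ossg.h</code> may differ.
  </para>
  <programlisting><![CDATA[
```c
#include <assert.h>
#include <stdio.h>

#ifdef __TenDRA__
/* Keyword pragmas as listed in the text. */
#pragma TenDRA keyword SET for set
#pragma TenDRA keyword UNUSED for discard variable
#pragma TenDRA keyword IGNORE for discard value
#pragma TenDRA keyword EXHAUSTIVE for exhaustive
#pragma TenDRA keyword REACHED for set reachable
#pragma TenDRA keyword UNREACHED for set unreachable
#pragma TenDRA keyword FALL_THROUGH for fall into case
#else
/* Hypothetical fallbacks: each keyword expands to harmless C. */
#define SET( name )
#define UNUSED( name )	( ( void ) ( name ) )
#define IGNORE		( void )
#define EXHAUSTIVE
#define REACHED
#define UNREACHED
#define FALL_THROUGH
#endif

/* Count how many of the levels 0, 1 and 2 are enabled at 'level'. */
int levels_enabled ( int level, int verbose )
{
    int n = 0 ;
    UNUSED ( verbose ) ;		/* parameter deliberately unused */
    assert ( level >= 0 && level <= 2 ) ;
    switch ( level ) EXHAUSTIVE {
	case 2 : n++ ;
	    FALL_THROUGH
	case 1 : n++ ;
	    FALL_THROUGH
	case 0 : n++ ;
	    break ;
    }
    IGNORE printf ( "%d level(s) enabled\n", n ) ;	/* result discarded */
    return n ;
}
```
  ]]></programlisting>
  <para>
  With the fallbacks the keywords vanish or reduce to a
  <code>void</code> cast, so the same source compiles unchanged while
  still documenting intent for a checker that understands them.
  </para>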
  <para>
  Various flags giving properties of the compiler being used are defined
  in <code>ossg.h</code>.  Among the most useful are <code>FS_STDARG</code>,
  which is true if the compiler supports ellipsis functions (see above),
  and <code>FS_STDC_HASH</code>, which is true if the preprocessor supports
  the ISO stringising and concatenation operators.  The macros
  <code>CONST</code> and <code>VOLATILE</code>, to be used in place of
  <code>const</code> and <code>volatile</code>, are also defined.
  </para>
  <para>
  A policy of rigorous static program checking is enforced.  The TenDRA
  C producer is applied with the user-defined compilation mode
  <code>ossg_std.h</code> and intermodule checks enabled.  Checking
  is applied with both the C and <code>#pragma token</code>
  <A HREF="../utilities/calc.html"><code>calculus</code> output files</A>.
  The C++ producer itself is applied with the same checks.  <code>gcc
  -Wall</code> and various versions of <code>lint</code> are also periodically
  applied.
  </para>
  </sect3>

  <sect3 id="api">
    <title>3.1.2. API usage and target dependencies</title>
  <para>
  Most of the API features used in the C++ producer are to be found
  in the ISO C API, with just a couple of extensions from POSIX required.
  These POSIX features can be disabled with minimal loss of functionality
  by defining the macro <code>FS_POSIX</code> to be false.
  </para>
  <para>
  The following features are used from the ISO <code>&lt;stdio.h&gt;</code>
  header:
  <programlisting>
  	BUFSIZ		EOF		FILE		SEEK_SET
  	fclose		fflush		fgetc		fgets
  	fopen		fprintf		fputc		fputs
  	fread		fseek		fwrite		rewind
  	sprintf		stderr		stdin		stdout
  	vfprintf
  </programlisting>
  from the ISO <code>&lt;stdlib.h&gt;</code> header:
  <programlisting>
  	EXIT_SUCCESS	EXIT_FAILURE	NULL		abort
  	exit		free		malloc		realloc
  	size_t
  </programlisting>
  and from the ISO <code>&lt;string.h&gt;</code> header:
  <programlisting>
  	memcmp		memcpy		strchr		strcmp
  	strcpy		strlen		strncmp		strrchr
  </programlisting>
  The three headers just mentioned are included in all source files
  via the <code>ossg_api.h</code> header file (included by
  <code>config.h</code>).
  The remaining headers are only included as and when they are needed.
  The following features are used from the ISO <code>&lt;ctype.h&gt;</code>
  header:
  <programlisting>
  	isalpha		isprint
  </programlisting>
  from the ISO <code>&lt;limits.h&gt;</code> header:
  <programlisting>
  	UCHAR_MAX	UINT_MAX	ULONG_MAX
  </programlisting>
  from the ISO <code>&lt;stdarg.h&gt;</code> header:
  <programlisting>
  	va_arg		va_end		va_list		va_start
  </programlisting>
  (note that if <code>FS_STDARG</code> is false the XPG3
  <code>&lt;varargs.h&gt;</code> header is used instead); and from the
  ISO <code>&lt;time.h&gt;</code> header:
  <programlisting>
  	localtime	time		time_t		struct tm
  	tm::tm_hour	tm::tm_mday	tm::tm_min	tm::tm_mon
  	tm::tm_sec	tm::tm_year
  </programlisting>
  The following features are used from the POSIX
  <code>&lt;sys/stat.h&gt;</code> header:
  <programlisting>
  	stat		struct stat	stat::st_dev	stat::st_ino
  	stat::st_mtime
  </programlisting>
  The <code>&lt;sys/types.h&gt;</code> header is also included to provide
  the necessary types for <code>&lt;sys/stat.h&gt;</code>.
  </para>
  <para>
  There are a couple of target dependencies in the producer which can
  be overridden using command-line options:
  <itemizedlist>
  <listitem>It assumes that if a count of the number of characters read from
  an input file is maintained, then that count value can be used as
  an argument to <code>fseek</code>.  This may not be true on machines
  where the end of line marker consists of both a newline and a carriage
  return.  In this case the <code>-m-f</code> command-line option can
  be used to switch to a slower, but more portable, algorithm for setting
  file positions.
  </listitem>
  <listitem>It assumes that a file is uniquely determined by the
  <code>st_dev</code> and <code>st_ino</code> fields of its corresponding
  <code>stat</code> value.  This is used when processing
  <code>#include</code> directives to prevent a file being read more
  than once if this is not necessary.  This assumption may not be true
  on machines with a small <code>ino_t</code> type which have file systems
  mounted from machines with a larger <code>ino_t</code> type.  In this
  case the <code>-m-i</code> command-line option can be used to disable
  this check.
  </listitem>
  </itemizedlist>
  </para>
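  <para>
  The second assumption can be made concrete with a short sketch of a
  file identity test based on the POSIX <code>stat</code> function.  The
  function name <code>same_file</code> is an assumption for this example;
  it is not the routine used in the producer.
  </para>
  <programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Return 1 if both paths name the same file, and 0 if they differ or
   if either path cannot be examined.  Two paths are taken to name the
   same file when their device and inode numbers both agree. */
int same_file ( const char *a, const char *b )
{
    struct stat sa, sb ;
    assert ( a != NULL && b != NULL ) ;
    if ( stat ( a, &sa ) != 0 ) return 0 ;
    if ( stat ( b, &sb ) != 0 ) return 0 ;
    return ( sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino ) ;
}
```
  ]]></programlisting>
  <para>
  A test of this kind recognises a header reached by two different
  spellings of its path, but it fails in the way described above if inode
  numbers from a remote file system are truncated to fit a smaller
  <code>ino_t</code>.
  </para>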
  </sect3>

  <sect3 id="src">
    <title>3.1.3. Source code modules</title>
  <para>
  For convenience, the source code is divided between a number of directories:
  <itemizedlist>

  <listitem>The base directory contains only the module containing the
  <code>main</code> function, the basic type descriptions and the
  <code>Makefile</code>.
  </listitem>
  <listitem>The directories <code>obj_c</code> and <code>obj_tok</code> contain
  respectively the C and <code>#pragma token</code> headers generated
  from the type algebra by
  <A HREF="../utilities/calc.html"><code>calculus</code></A>.  The
  directory <code>obj_templ</code> contains certain <code>calculus</code>
  template files.
  </listitem>
  <listitem>The directory <code>utility</code> contains routines for such
  utility operations as memory allocation and error reporting, including
  the <A HREF="error.html">error catalogue</A>.
  </listitem>
  <listitem>The directory <code>parse</code> contains routines concerned with
  parsing and preprocessing the input, including the
  <A HREF="../utilities/sid.html"><code>sid</code> grammar</A>.
  </listitem>
  <listitem>The directory <code>construct</code> contains routines for building
  up and analysing the internal representation of the parsed code.
  </listitem>
  <listitem>The directory <code>output</code> contains routines for outputting
  the internal representation in various formats including as a
  <A HREF="tdf.html">TDF capsule</A>, a <A HREF="link.html">C++ spec
  file</A>, or a <A HREF="dump.html">symbol table dump file</A>.
  </listitem>
  </itemizedlist>
  </para>
  <para>
  Each module consists of a C source file, <code><I>file</I>.c</code>
  say, containing function definitions, and a corresponding header file
  <code><I>file</I>.h</code> containing the declarations of these functions.
  The header is included within its corresponding source file to check
  these declarations; it is protected against multiple inclusions by
  a macro of the form <code><I>FILE</I>_INCLUDED</code>. The header
  contains a brief comment describing the purpose of the module; each
  function in the source file contains a comment describing its purpose,
  its inputs and its output.
  </para>
  <para>
  The following table lists all the source modules in the C++ producer
  with a brief description of the purpose of each:
  </para>
  <para>

  <table>
  <tr><th>Module</th> <th>Directory</th> <th>Purpose</th></tr>
  <tr><td>access</td> <td>construct</td> <td>member access control</td></tr>
  <tr><td>allocate</td> <td>construct</td> <td><code>new</code> and <code>delete</code> expressions</td></tr>
  <tr><td>assign</td> <td>construct</td> <td>assignment expressions</td></tr>
  <tr><td>basetype</td> <td>construct</td> <td>basic type operations</td></tr>
  <tr><td>buffer</td> <td>utility</td> <td>buffer reading and writing routines</td></tr>
  <tr><td>c_class</td> <td>obj_c</td> <td><code>calculus</code> support routines</td></tr>
  <tr><td>capsule</td> <td>output</td> <td>top-level TDF encoding routines</td></tr>
  <tr><td>cast</td> <td>construct</td> <td>cast expressions</td></tr>
  <tr><td>catalog</td> <td>utility</td> <td>error catalogue definition</td></tr>
  <tr><td>char</td> <td>parse</td> <td>character sets</td></tr>
  <tr><td>check</td> <td>construct</td> <td>expression checking</td></tr>
  <tr><td>chktype</td> <td>construct</td> <td>type checking</td></tr>
  <tr><td>class</td> <td>construct</td> <td>class and enumeration definitions</td></tr>
  <tr><td>compile</td> <td>output</td> <td>TDF tag definition encoding routines</td></tr>
  <tr><td>constant</td> <td>parse</td> <td>integer constant evaluation</td></tr>
  <tr><td>construct</td> <td>construct</td> <td>constructors and destructors</td></tr>
  <tr><td>convert</td> <td>construct</td> <td>standard type conversions</td></tr>
  <tr><td>copy</td> <td>construct</td> <td>expression copying</td></tr>
  <tr><td>debug</td> <td>utility</td> <td>development aids</td></tr>
  <tr><td>declare</td> <td>construct</td> <td>variable and function declarations</td></tr>
  <tr><td>decode</td> <td>output</td> <td>bitstream reading routines</td></tr>
  <tr><td>derive</td> <td>construct</td> <td>base class graphs; inherited members</td></tr>
  <tr><td>destroy</td> <td>construct</td> <td>garbage collection routines</td></tr>
  <tr><td>diag</td> <td>output</td> <td>TDF diagnostic output routines</td></tr>
  <tr><td>dump</td> <td>output</td> <td>symbol table dump routines</td></tr>
  <tr><td>encode</td> <td>output</td> <td>bitstream writing routines</td></tr>
  <tr><td>error</td> <td>utility</td> <td>error output routines</td></tr>
  <tr><td>exception</td> <td>construct</td> <td>exception handling</td></tr>
  <tr><td>exp</td> <td>output</td> <td>TDF expression encoding routines</td></tr>
  <tr><td>expression</td> <td>construct</td> <td>expression processing</td></tr>
  <tr><td>file</td> <td>parse</td> <td>low-level I/O routines</td></tr>
  <tr><td>function</td> <td>construct</td> <td>function definitions and calls</td></tr>
  <tr><td>hash</td> <td>parse</td> <td>hash table and identifier name routines</td></tr>
  <tr><td>identifier</td> <td>construct</td> <td>identifier expressions</td></tr>
  <tr><td>init</td> <td>output</td> <td>TDF initialiser expression encoding routines</td></tr>
  <tr><td>initialise</td> <td>construct</td> <td>variable initialisers</td></tr>
  <tr><td>instance</td> <td>construct</td> <td>template instances and specialisations</td></tr>
  <tr><td>inttype</td> <td>construct</td> <td>integer and floating point type routines</td></tr>
  <tr><td>label</td> <td>construct</td> <td>labels and jumps</td></tr>
  <tr><td>lex</td> <td>parse</td> <td>lexical analysis</td></tr>
  <tr><td>literal</td> <td>parse</td> <td>integer and string literals</td></tr>
  <tr><td>load</td> <td>output</td> <td>C++ spec reading routines</td></tr>
  <tr><td>macro</td> <td>parse</td> <td>macro expansion</td></tr>
  <tr><td>main</td> <td>-</td> <td>main routine; command-line arguments</td></tr>
  <tr><td>mangle</td> <td>output</td> <td>identifier name mangling</td></tr>
  <tr><td>member</td> <td>construct</td> <td>member selector expressions</td></tr>
  <tr><td>merge</td> <td>construct</td> <td>intermodule merge routines</td></tr>
  <tr><td>namespace</td> <td>construct</td> <td>namespaces; name look-up</td></tr>
  <tr><td>operator</td> <td>construct</td> <td>overloaded operators</td></tr>
  <tr><td>option</td> <td>utility</td> <td>compiler options</td></tr>
  <tr><td>overload</td> <td>construct</td> <td>overload resolution</td></tr>
  <tr><td>parse</td> <td>parse</td> <td>low-level parser routines</td></tr>
  <tr><td>pragma</td> <td>parse</td> <td><code>#pragma</code> directives</td></tr>
  <tr><td>predict</td> <td>parse</td> <td>parser look-ahead routines</td></tr>
  <tr><td>preproc</td> <td>parse</td> <td>preprocessing directives</td></tr>
  <tr><td>print</td> <td>utility</td> <td>error argument printing routines</td></tr>
  <tr><td>quality</td> <td>construct</td> <td>extra expression checks</td></tr>
  <tr><td>redeclare</td> <td>construct</td> <td>variable and function redeclarations</td></tr>
  <tr><td>rewrite</td> <td>construct</td> <td>inline member function definitions</td></tr>
  <tr><td>save</td> <td>output</td> <td>C++ spec writing routines</td></tr>
  <tr><td>shape</td> <td>output</td> <td>TDF shape encoding routines</td></tr>
  <tr><td>statement</td> <td>construct</td> <td>statement processing</td></tr>
  <tr><td>stmt</td> <td>output</td> <td>TDF statement encoding routines</td></tr>
  <tr><td>struct</td> <td>output</td> <td>TDF structure encoding routines</td></tr>
  <tr><td>syntax[0-9]*</td> <td>parse</td> <td><code>sid</code> parser output</td></tr>
  <tr><td>system</td> <td>utility</td> <td>system dependent routines</td></tr>
  <tr><td>table</td> <td>parse</td> <td>portability table reading</td></tr>
  <tr><td>template</td> <td>construct</td> <td>template declarations and checks</td></tr>
  <tr><td>throw</td> <td>output</td> <td>TDF exception handling encoding routines</td></tr>
  <tr><td>tok</td> <td>output</td> <td>TDF standard tokens encoding</td></tr>
  <tr><td>tokdef</td> <td>construct</td> <td>token definitions</td></tr>
  <tr><td>token</td> <td>construct</td> <td>token declarations and expansion</td></tr>
  <tr><td>typeid</td> <td>construct</td> <td>run-time type information</td></tr>
  <tr><td>unmangle</td> <td>output</td> <td>identifier name unmangling</td></tr>
  <tr><td>variable</td> <td>construct</td> <td>variable analysis</td></tr>
  <tr><td>virtual</td> <td>construct</td> <td>virtual functions</td></tr>
  <tr><td>xalloc</td> <td>utility</td> <td>memory allocation routines</td></tr>
  </table>
  </para>
  </sect3>
  </sect2>

  <sect2>
    <title>3.2. Type system</title>
  <para>
  This section describes the type system used in the C++ producer. Unless
  otherwise stated the types are declared using the
  <A HREF="../utilities/calc.html"><code>calculus</code> tool</A> as
  part of the algebra, <code>c_class.alg</code>.  The design of this
  type algebra was, naturally, based largely on the concepts underlying
  the C++ language; however, TDF provided an important influence, not
  merely as the intended target language, but also because of its clear
  presentation of essential language features.
  </para>


  <sect3 id="primitive">
    <title>3.2.1. Primitive types</title>
  <para>
  The primitive types used within the algebra <code>c_class</code> are
  defined as follows:
  <programlisting>
  	int = &quot;int&quot; ;
  	unsigned = &quot;unsigned&quot; ;
  	string = &quot;character *&quot; ;
  	ulong_type (ulong) = &quot;unsigned long&quot; ;
  	BITSTREAM_P (bits) = &quot;BITSTREAM *&quot; ;
  	PPTOKEN_P (pptok) = &quot;PPTOKEN *&quot; ;
  </programlisting>
  The integral types are self-explanatory.  All string literals used
  in the C++ producer are based on the character type:
  <programlisting>
  	typedef unsigned char character ;
  </programlisting>
  hence the definition of <code>string</code>.  The remaining primitives
  give links to those portions of the type system which are defined
  outside of the algebra.  The types <A HREF="#bits"><code>BITSTREAM</code></A>
  and <A HREF="#pptok"><code>PPTOKEN</code></A> are described below.
  </para>
  </sect3>

  <sect3 id="cv">
    <title>3.2.2. <code>CV_SPEC</code></title>
  <para>
  The enumeration type <code>CV_SPEC</code> (short name <code>cv</code>)
  is used to represent a C++ type qualifier.  It takes the form of a
  bitfield, the elements of which can be or-ed together to represent
  combinations of type qualifiers.  The cv-qualifiers are represented
  by <code>cv_const</code> and <code>cv_volatile</code> in the obvious
  manner.  The value <code>cv_lvalue</code> is used as a qualifier to
  indicate whether a type is an lvalue or an rvalue.  Other values are
  used in function types to represent the function language linkage.
  </para>
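  <para>
  The bitfield style shared by <code>CV_SPEC</code> and the other
  enumeration types in this section can be sketched as follows.  The
  numeric values, the mask <code>cv_qual</code> and the two helper
  functions are assumptions made for this example; the real values are
  generated from the algebra and are not shown in this document.
  </para>
  <programlisting><![CDATA[
```c
#include <assert.h>

/* Hypothetical bit values for three of the CV_SPEC members. */
typedef unsigned CV_SPEC ;
#define cv_none		( ( CV_SPEC ) 0x00 )
#define cv_const	( ( CV_SPEC ) 0x01 )
#define cv_volatile	( ( CV_SPEC ) 0x02 )
#define cv_lvalue	( ( CV_SPEC ) 0x04 )
#define cv_qual		( cv_const | cv_volatile )

/* Is the qualifier 'a' at least as cv-qualified as 'b'?  Only the
   const and volatile bits take part in the comparison. */
int cv_includes ( CV_SPEC a, CV_SPEC b )
{
    /* This sketch models only the values above, not linkage. */
    assert ( ( ( a | b ) & ~( cv_qual | cv_lvalue ) ) == 0 ) ;
    return ( ( a & b & cv_qual ) == ( b & cv_qual ) ) ;
}

/* Does the qualifier mark an lvalue? */
int cv_is_lvalue ( CV_SPEC cv )
{
    return ( ( cv & cv_lvalue ) != 0 ) ;
}
```
  ]]></programlisting>
  <para>
  A <code>const</code> lvalue would then be written
  <code>cv_const | cv_lvalue</code>, and each property is recovered by
  masking with the corresponding member.
  </para>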
  </sect3>

  <sect3 id="ntype">
    <title>3.2.3. <code>BUILTIN_TYPE</code></title>
  <para>
  The enumeration type <code>BUILTIN_TYPE</code> (<code>ntype</code>)
  is used to represent the built-in C++ types (<code>char</code>,
  <code>float</code>, <code>void</code> etc.).  It is used chiefly as
  an index into tables of type information.
  </para>
  </sect3>

  <sect3 id="btype">
    <title>3.2.4. <code>BASE_TYPE</code></title>
  <para>
  The enumeration type <code>BASE_TYPE</code> (<code>btype</code>) is
  used to represent a C++ simple type specifier such as <code>signed</code>,
  <code>short</code> or <code>int</code>.  It takes the form of a bitfield,
  the elements of which can be or-ed together to represent combinations
  of type specifiers.  Its chief use is when reading a type from the
  input file; the various simple type specifiers are combined to give
  a value of this type, which is then mapped to an actual <A HREF="#type">C++
  type</A>.
  </para>
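  <para>
  The two-step process described here, accumulating specifiers and then
  mapping the result to a type, can be sketched as below.  The bit values,
  the members chosen for the two enumerations, and the functions
  <code>btype_add</code> and <code>btype_to_ntype</code> are assumptions
  for this example and cover only a few integral types.
  </para>
  <programlisting><![CDATA[
```c
#include <assert.h>

/* Hypothetical bit values for a few simple type specifiers. */
typedef unsigned BASE_TYPE ;
#define btype_none	0x00u
#define btype_signed	0x01u
#define btype_unsigned	0x02u
#define btype_short	0x04u
#define btype_long	0x08u
#define btype_int	0x10u
#define btype_char	0x20u

/* Hypothetical subset of the built-in types. */
typedef enum {
    ntype_none, ntype_char, ntype_schar, ntype_uchar,
    ntype_sshort, ntype_ushort, ntype_sint, ntype_uint,
    ntype_slong, ntype_ulong
} BUILTIN_TYPE ;

/* Record one more specifier as it is read from the input. */
BASE_TYPE btype_add ( BASE_TYPE bt, BASE_TYPE spec )
{
    assert ( spec != 0 && ( spec & ( spec - 1 ) ) == 0 ) ; /* one bit */
    return ( bt | spec ) ;
}

/* Map a combination of specifiers to a built-in type; ntype_none
   marks a combination which does not name a type. */
BUILTIN_TYPE btype_to_ntype ( BASE_TYPE bt )
{
    switch ( bt ) {
	case btype_char : return ntype_char ;
	case btype_signed | btype_char : return ntype_schar ;
	case btype_unsigned | btype_char : return ntype_uchar ;
	case btype_short :
	case btype_short | btype_int :
	case btype_signed | btype_short :
	case btype_signed | btype_short | btype_int : return ntype_sshort ;
	case btype_unsigned | btype_short :
	case btype_unsigned | btype_short | btype_int : return ntype_ushort ;
	case btype_int :
	case btype_signed :
	case btype_signed | btype_int : return ntype_sint ;
	case btype_unsigned :
	case btype_unsigned | btype_int : return ntype_uint ;
	case btype_long :
	case btype_long | btype_int :
	case btype_signed | btype_long :
	case btype_signed | btype_long | btype_int : return ntype_slong ;
	case btype_unsigned | btype_long :
	case btype_unsigned | btype_long | btype_int : return ntype_ulong ;
    }
    return ntype_none ;
}
```
  ]]></programlisting>
  <para>
  Because the specifiers are or-ed together, the order in which they
  appear in the source does not matter: <code>unsigned short int</code>
  and <code>short unsigned int</code> give the same value.
  </para>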
  </sect3>

  <sect3 id="itype">
    <title>3.2.5. <code>INT_TYPE</code></title>
  <para>
  The union type <code>INT_TYPE</code> (<code>itype</code>) is used
  to represent an integral or bitfield C++ type.  The basic integral
  types are given by the <code>basic</code> field.  Bitfield types are
  represented by the <code>bitfield</code> field.  There are also fields
  representing target dependent integral promotion, arithmetic and integer
  literal types, plus <code>VARIETY</code> tokens.  Only one <code>INT_TYPE</code>
  object is created for each integral type.
  </para>
  </sect3>

  <sect3 id="ftype">
    <title>3.2.6. <code>FLOAT_TYPE</code></title>
  <para>
  The union type <code>FLOAT_TYPE</code> (<code>ftype</code>) is used
  to represent a floating point C++ type.  The basic floating point
  types are given by the <code>basic</code> field.  There are also fields
  representing target dependent argument promotion and arithmetic types,
  plus <code>FLOAT</code> tokens.  Only one <code>FLOAT_TYPE</code>
  object is created for each floating point type.
  </para>
  </sect3>

  <sect3 id="cinfo">
    <title>3.2.7. <code>CLASS_INFO</code></title>
  <para>
  The enumeration type <code>CLASS_INFO</code> (<code>cinfo</code>)
  is used to represent information relating to a class or enumeration
  definition.  It takes the form of a bitfield, the elements of which
  can be or-ed together to represent various combinations of properties.
  </para>
  </sect3>

  <sect3 id="cusage">
    <title>3.2.8. <code>CLASS_USAGE</code></title>
  <para>
  The enumeration type <code>CLASS_USAGE</code> (<code>cusage</code>)
  is used to represent information relating to the way a class is used.
  It takes the form of a bitfield, the elements of which can be or-ed
  together to represent various combinations of properties.
  </para>
  </sect3>

  <sect3 id="ctype">
    <title>3.2.9. <code>CLASS_TYPE</code></title>
  <para>
  The union type <code>CLASS_TYPE</code> (<code>ctype</code>) is used
  to represent a C++ class or union.  The main components are an
  <A HREF="#id">identifier</A> giving the class name,
  <A HREF="#cinfo">class information</A> and <A HREF="#cusage">class
  usage</A> fields, a <A HREF="#nspace">namespace</A> giving the class
  members, a <A HREF="#graph">graph</A> representing the base class
  structure, and a <A HREF="#virt">virtual function table</A>.  Only
  one <code>CLASS_TYPE</code> object is created for each class or union.
  </para>
  <para>
  Each class maintains a list, <code>pals</code>, of class and function
  identifiers which are declared as friends of that class.  It also
  maintains a list, <code>chums</code>, of those class types which declare
  it to be a friend (this is what is actually used in the access checks).
  Similarly each function identifier maintains a list,
  <code>chums</code>, of those class types which declare it to be a
  friend.
  </para>
  <para>
  Each class maintains a list of its constructors, destructors and conversion
  functions (including inherited conversion functions).  It also maintains
  a list of its virtual base classes.  This information can be obtained
  by other means but it is more convenient to record it within the class
  type itself.
  </para>
  </sect3>

  <sect3 id="graph">
    <title>3.2.10. <code>GRAPH</code></title>
  <para>
  The union type <code>GRAPH</code> (<code>graph</code>) is used to
  represent a directed acyclic graph arising from the base classes of
  a class.  Each node of the graph has a <code>head</code> which is a
  <A HREF="#ctype">class type</A>, and several <code>tails</code> which
  give the base class graphs for that class.  Each node has pointers,
  <code>top</code>, to the top of the graph (i.e. the most derived class),
  and <code>up</code>, to the node of which the current node is a direct
  base.  Each node also has an <code>access</code> field which gives
  information on the base access, whether it is virtual or not, and
  so on, in the form of a <A HREF="#dspec"><code>DECL_SPEC</code></A>.
  Virtual bases are handled by the <code>equal</code> field which defines
  an equivalence relation on the graph which identifies equivalent virtual
  bases.
  </para>
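  <para>
  To make the roles of these fields concrete, here is a minimal sketch of
  a base class graph for the usual diamond
  (<code>D</code> derives from <code>B</code> and <code>C</code>, each
  of which has a virtual base <code>A</code>).  The structure
  <code>GRAPH_NODE</code>, its fixed-size arrays, the use of a class name
  string for <code>head</code> and a plain flag in place of
  <code>access</code>, and all of the functions are assumptions for this
  example; they are not the producer's own representation.
  </para>
  <programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BASES 4		/* arbitrary limits for this sketch */
#define MAX_NODES 16

typedef struct graph_tag {
    const char *head ;				/* class at this node */
    struct graph_tag *tails [ MAX_BASES ] ;	/* direct base graphs */
    int no_tails ;
    struct graph_tag *top ;	/* node for the most derived class */
    struct graph_tag *up ;	/* node of which this is a direct base */
    int is_virtual ;		/* stands in for the access field */
    struct graph_tag *equal ;	/* representative of equivalent bases */
} GRAPH_NODE ;

void graph_init ( GRAPH_NODE *g, const char *name )
{
    g->head = name ;
    g->no_tails = 0 ;
    g->top = g ;
    g->up = NULL ;
    g->is_virtual = 0 ;
    g->equal = g ;
}

/* Propagate a new top node through a subgraph. */
static void set_top ( GRAPH_NODE *g, GRAPH_NODE *top )
{
    int i ;
    g->top = top ;
    for ( i = 0 ; i < g->no_tails ; i++ ) set_top ( g->tails [i], top ) ;
}

/* Attach 'base' as a direct base of 'derived'. */
void graph_add_base ( GRAPH_NODE *derived, GRAPH_NODE *base, int is_virtual )
{
    assert ( derived->no_tails < MAX_BASES ) ;
    derived->tails [ derived->no_tails++ ] = base ;
    base->up = derived ;
    base->is_virtual = is_virtual ;
    set_top ( base, derived->top ) ;
}

/* Point each virtual base at the first virtual base of the same class. */
static void link_virtual ( GRAPH_NODE *g, GRAPH_NODE **seen, int *no_seen )
{
    int i ;
    if ( g->is_virtual ) {
	for ( i = 0 ; i < *no_seen ; i++ ) {
	    if ( strcmp ( seen [i]->head, g->head ) == 0 ) break ;
	}
	if ( i < *no_seen ) {
	    g->equal = seen [i] ;
	} else {
	    assert ( *no_seen < MAX_NODES ) ;
	    seen [ ( *no_seen )++ ] = g ;
	}
    }
    for ( i = 0 ; i < g->no_tails ; i++ ) {
	link_virtual ( g->tails [i], seen, no_seen ) ;
    }
}

void graph_link_virtual ( GRAPH_NODE *top )
{
    GRAPH_NODE *seen [ MAX_NODES ] ;
    int no_seen = 0 ;
    link_virtual ( top, seen, &no_seen ) ;
}

/* Do two nodes denote the same base subobject? */
int graph_same_base ( const GRAPH_NODE *a, const GRAPH_NODE *b )
{
    return ( a->equal == b->equal ) ;
}
```
  ]]></programlisting>
  <para>
  In the diamond the class <code>A</code> appears as two nodes, one
  below <code>B</code> and one below <code>C</code>.  Both have
  <code>top</code> pointing at the node for <code>D</code>, they have
  different <code>up</code> pointers, and the equivalence relation
  identifies them as a single virtual base.
  </para>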
  </sect3>

  <sect3 id="virt">
    <title>3.2.11. <code>VIRTUAL</code></title>
  <para>
  The union type <code>VIRTUAL</code> (<code>virt</code>) is used to
  represent the virtual functions declared in a class.  The <code>table</code>
  field is used to represent a virtual function table, and consists
  primarily of a list of <code>VIRTUAL</code> objects giving the virtual
  functions for the associated class.  These virtual functions are of
  four kinds, each represented by a union field.  A virtual function
  first declared in a class is represented by the <code>simple</code>
  field; a virtual function in a class which overrides an inherited
  virtual function is represented by the <code>override</code> field;
  an inherited virtual function which is overridden neither in the class
  nor in any base class is represented by the <code>inherit</code> field;
  an inherited virtual function which is not overridden in the class but
  is overridden in some base class is represented by the
  <code>complex</code> field.
  </para>
  </sect3>

  <sect3 id="etype">
    <title>3.2.12. <code>ENUM_TYPE</code></title>
  <para>
  The union type <code>ENUM_TYPE</code> (<code>etype</code>) is used
  to represent a C++ enumeration type.  This consists primarily of an
  <A HREF="#id">identifier</A> giving the enumeration name, a
  <A HREF="#cinfo">class information</A> field, a <A HREF="#type">type</A>
  giving the underlying representation of the enumeration type, and
  a list of <A HREF="#id">identifiers</A> giving the enumerators comprising
  the enumeration.
  </para>
  </sect3>

  <sect3 id="type">
    <title>3.2.13. <code>TYPE</code></title>
  <para>
  The union type <code>TYPE</code> (<code>type</code>) is used to represent
  a C++ type.  Every type has an associated <A HREF="#cv">type qualifier</A>,
  <code>qual</code>, which determines whether the type is
  <code>const</code>, <code>volatile</code> or an lvalue.  A type may
  also have an associated <A HREF="#id">identifier</A>, <code>name</code>,
  giving the corresponding type name (the null identifier being used
  for unnamed types).  The other type components are determined by the
  union tag.  Each of the type constructs above has a corresponding
  field in the <code>TYPE</code> union:
  <code>integer</code> for <A HREF="#itype">integral types</A>,
  <code>floating</code> for <A HREF="#ftype">floating point types</A>,
  <code>bitfield</code> for <A HREF="#itype">bitfield types</A>,
  <code>compound</code> for <A HREF="#ctype">class or union types</A>,
  and <code>enumerate</code> for <A HREF="#etype">enumeration types</A>.
  There are also fields <code>top</code> and <code>bottom</code>
  corresponding to <code>void</code> and bottom (the type used to represent
  values which never return).
  </para>
  <para>
  Other fields of the <code>TYPE</code> union represent composite types;
  for example, the <code>array</code> field, representing array types,
  comprises a base type, <code>sub</code>, and an <A HREF="#nat">integer
  constant</A> giving the array bound, <code>size</code>.  These are
  generally simple, apart from <code>func</code>, representing a function
  type.  This has the obvious components: a return type, <code>ret</code>,
  a list of parameter types, <code>ptypes</code>, and a flag indicating
  ellipsis functions, <code>ellipsis</code>.  It also has an associated
  <A HREF="#nspace">namespace</A>, <code>pars</code>, in which the function
  parameters are declared.  The parameter identifiers are extracted
  from this as a list, <code>pids</code>.  Member function qualifiers
  and language linkage information are represented by a
  <A HREF="#cv"><code>CV_SPEC</code></A>, <code>mqual</code>.  The implicit
  extra parameter for member functions is recorded in the list
  <code>mtypes</code>, which adds this extra type to the start of
  <code>ptypes</code>.  Finally <code>except</code> gives any exception
  specifiers; the case where the exception specifier is absent is
  represented by the special value, <code>univ_type_set</code>.
  </para>
  </sect3>

  <sect3 id="dspec">
    <title>3.2.14. <code>DECL_SPEC</code></title>
  <para>
  The enumeration type <code>DECL_SPEC</code> (<code>dspec</code>) is
  used to represent information on the declaration and usage of an identifier.
  It takes the form of a bitfield, the elements of which can be or-ed
  together to represent various combinations of properties.  The 32
  bits in this bitfield (the maximum which can be represented portably)
  are a significant restriction: the same member of
  <code>DECL_SPEC</code> is often used to mean different things in different
  contexts, which can prove confusing on occasion.
  </para>
  </sect3>
7379
 
7380
  <sect3 id="hashid">
7381
    <title>3.2.15. <code>HASHID</code></title>
7382
  <para>
7383
  The union type <code>HASHID</code> (<code>hashid</code>) is used to
7384
  represent a C++ identifier name.  The simplest form of identifier
7385
  name, 
7386
  <code>name</code>, consists of just a string of characters, such as
7387
  <code>foo</code>.  Extended identifier names, <code>ename</code>,
7388
  are similar, but may contain Unicode characters.  There are however
7389
  other forms of identifier name in C++: conversion function names (<code>conv
7390
  </code>) such as <code>operator int</code>, overloaded operator names
7391
  (<code>op</code>) such as <code>operator+</code>, constructor names
7392
  (<code>constr</code>), and destructor names (<code>destr</code>).
7393
  There are also names which are used for anonymous identifiers (<code>anon</code>).
7394
  </para>
  <para>
  Note the distinction between an identifier name and an actual
  <A HREF="#identifier">identifier</A>.  The latter is a meaning associated
  with a name in a particular context.  Every identifier name has an
  associated underlying meaning, <code>id</code>.  This is used to handle
  keywords and macros, but for most identifier names this will be a
  dummy identifier.  Nested underlying meanings (such as a macro hiding
  a keyword) are handled by linking the <code>alias</code> fields of
  the corresponding identifiers.  Every identifier name also has a
  <code>cache</code> field which is used to record the look-up of this name as
  an unqualified identifier.  This may be set to the null identifier
  to indicate that the look-up needs to be re-evaluated.
  </para>
  <para>
  Identifier names are stored in one of a small number of hash tables,
  linked using their <code>next</code> field.  Each name has only one
  entry in these tables, allowing equality of names to be implemented
  as <code>EQ_hashid</code>.
  </para>
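  <para>
  The value of keeping a single table entry per name can be illustrated
  with a small interning table.  This is only a sketch under simplifying
  assumptions: the structure, the hash function, the table size and the
  function names are all hypothetical, and only plain character-string
  names are handled.
  </para>
  <programlisting><![CDATA[
```c
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 64   /* illustrative; not the producer's table size */

/* Hypothetical stand-in for a HASHID: one node per distinct spelling. */
typedef struct name_sketch {
    char *text;
    struct name_sketch *next;   /* chains names in the same bucket */
} NAME_SKETCH;

static NAME_SKETCH *table[TABLE_SIZE];

static unsigned long hash_text(const char *s)
{
    unsigned long h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h % TABLE_SIZE;
}

/* Return the unique node for s, creating it on first use. */
static NAME_SKETCH *lookup_name(const char *s)
{
    unsigned long h = hash_text(s);
    NAME_SKETCH *p;
    for (p = table[h]; p != NULL; p = p->next) {
        if (strcmp(p->text, s) == 0) return p;
    }
    p = malloc(sizeof *p);
    if (p == NULL) return NULL;
    p->text = malloc(strlen(s) + 1);
    if (p->text == NULL) { free(p); return NULL; }
    strcpy(p->text, s);
    p->next = table[h];
    table[h] = p;
    return p;
}

/* Because each spelling has one node, equality is a pointer comparison. */
static int eq_name(const NAME_SKETCH *a, const NAME_SKETCH *b)
{
    return a == b;
}
```
]]></programlisting>
  <para>
  With this arrangement the cost of comparing characters is paid once, at
  look-up time; every later comparison of two names is a single pointer test.
  </para>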
  </sect3>

  <sect3 id="qual">
    <title>3.2.16. <code>QUALIFIER</code></title>
  <para>
  The enumeration type <code>QUALIFIER</code> (<code>qual</code>) is
  used to represent the various ways in which an identifier name can
  be qualified.  For example, <code>::A::a</code> is represented by
  <code>qual_full</code>.  The value <code>qual_mark</code> is used
  in the representation of function identifier expressions to indicate
  that overload resolution has been performed.
  </para>
  </sect3>

  <sect3 id="identifier">
    <title>3.2.17. <code>IDENTIFIER</code></title>
  <para>
  The union type <code>IDENTIFIER</code> (<code>id</code>) is used to
  represent the various kinds of C++ identifiers.  Every identifier
  has an associated <A HREF="#hashid">identifier name</A>, a parent
  <A HREF="#nspace">namespace</A>, a <A HREF="#dspec">declaration information</A>
  field, and a <A HREF="#location">location</A> for its declaration or definition.
  Each identifier also has an <code>alias</code> field which is normally
  used to represent the aliasing which can occur in inheritance or
  <code>using</code> declarations.
  </para>
  <para>
  The various fields of the <code>IDENTIFIER</code> union correspond
  to the various kinds of identifier which can arise in C++: class
  names, functions, variables, class members, macros, keywords etc.
  Each field has appropriate components giving its type, its definition
  or whatever other information is required.  For example, the
  <code>variable</code> field has a <A HREF="#type">type</A> and two
  <A HREF="#exp">expressions</A>,
  giving the constructor and destructor values for the object.
  </para>
  <para>
  Most of these identifier components are self-explanatory; however,
  the treatment of overloaded functions bears discussion.  The various
  fields representing functions have an <code>over</code> component
  which is used to link overloaded functions together.  A set of overloaded
  functions is treated as if it were a single <code>IDENTIFIER</code>
  (the first in the list) for the purposes of storing in a
  <A HREF="#member">namespace member</A>; the other overloaded meanings
  are accessed by chasing down the <code>over</code> components.  In other
  situations, whether a function identifier represents a single function
  or a set of overloaded functions can be worked out from the context.
  For example, in identifier expressions the
  <A HREF="#qual">identifier qualifier</A> is used to
  mark whether overload resolution has taken place.
  </para>
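  <para>
  Chasing the <code>over</code> components amounts to walking a singly
  linked list.  The sketch below assumes a much reduced, hypothetical
  function record; only the name <code>over</code> is taken from the
  description above.
  </para>
  <programlisting><![CDATA[
```c
#include <stddef.h>

/* Hypothetical stand-in for a function IDENTIFIER. */
typedef struct func_sketch {
    const char *signature;
    struct func_sketch *over;   /* next overload, or NULL at the end */
} FUNC_SKETCH;

/* Count the functions reachable from the first one via the over links. */
static size_t count_overloads(const FUNC_SKETCH *first)
{
    size_t n = 0;
    const FUNC_SKETCH *f;
    for (f = first; f != NULL; f = f->over) n++;
    return n;
}

/* Prepend a new overload; the result is the identifier that would be
   stored in the namespace member for the whole set. */
static FUNC_SKETCH *add_overload(FUNC_SKETCH *fn, FUNC_SKETCH *set)
{
    fn->over = set;
    return fn;
}
```
]]></programlisting>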
  </sect3>

  <sect3 id="member">
    <title>3.2.18. <code>MEMBER</code></title>
  <para>
  The union type <code>MEMBER</code> (<code>member</code>) is used to
  represent a member of a <A HREF="#nspace">namespace</A>.  Each member
  contains two identifiers, <code>id</code> and <code>alt</code>.  The
  <code>id</code> field gives the meaning associated with a particular
  name in this namespace; the <code>alt</code> field is used to represent
  a type name which may be hidden by a non-type name.
  </para>
  <para>
  There are two kinds of member, <code>small</code> and <code>large</code>,
  corresponding to whether the namespace holds its members in a simple
  linked list or in a hash table.
  </para>
  </sect3>

  <sect3 id="nspace">
    <title>3.2.19. <code>NAMESPACE</code></title>
  <para>
  The union type <code>NAMESPACE</code> (<code>nspace</code>) is used
  to represent the set of identifiers declared in a particular scope.
  For example, the members declared in a C++ class or namespace, the
  parameters declared in a function declarator and the local variables
  declared in a block all form scopes.  The various kinds of scope are
  distinguished as different fields of the union, but there are basically
  two categories.  Scopes in the first category, such as function blocks,
  have relatively small numbers of elements and store their members in a
  simple linked list.  Scopes in the second, such as classes, have larger
  numbers of elements and store their members in hash tables.  In both
  cases the elements are stored using the
  <A HREF="#member"><code>MEMBER</code></A> type.
  </para>
  <para>
  The key operation on a namespace is to look up a particular
  <A HREF="#hashid">identifier name</A> in its linked list or hash table
  of members to find the meaning, if any, associated with that name
  in the namespace.  This can be a complex operation because of the
  need to take base classes and <code>using</code> directives (as stored
  in the <code>use</code> component) into account.
  </para>
  </sect3>

  <sect3 id="nat">
    <title>3.2.20. <code>NAT</code></title>
  <para>
  The union type <code>NAT</code> (<code>nat</code>) is used to represent
  an integer constant expression.  Values are represented as lists of
  16-bit 'digits'.  Values which fit into a single digit are represented
  by the <code>small</code> field; larger values by the <code>large</code>
  field.  Negated values can be represented by the <code>neg</code>
  field.  Folding of integer constant expressions is performed in the
  producer; however, the result can only be represented as described
  above if its value is target independent.  Target dependent values
  are represented by the <code>calc</code> field which contains an
  <A HREF="#exp">expression</A> describing how to calculate the value.
  The <code>token</code> field is used to represent <code>NAT</code>
  tokens.
  </para>
  <para>
  Objects representing small integer constants are created at the start
  of the program and stored in a table for ease of access.  Larger constants
  are created as and when they are required.
  </para>
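  <para>
  The digit representation can be sketched as follows.  The function names
  are hypothetical, a fixed array stands in for the list, and the ordering
  of the digits (least significant first) is an assumption made for the
  example rather than a statement about the producer.
  </para>
  <programlisting><![CDATA[
```c
#include <stddef.h>
#include <stdint.h>

#define DIGIT_BITS 16
#define DIGIT_MASK 0xFFFFu

/* Split value into 16-bit digits, least significant first.
   Returns the number of digits written (at least 1, at most 4). */
static size_t to_digits(uint64_t value, uint16_t digits[4])
{
    size_t n = 0;
    do {
        digits[n++] = (uint16_t)(value & DIGIT_MASK);
        value >>= DIGIT_BITS;
    } while (value != 0);
    return n;
}

/* Rebuild the value from n digits stored least significant first. */
static uint64_t from_digits(const uint16_t *digits, size_t n)
{
    uint64_t value = 0;
    while (n > 0) value = (value << DIGIT_BITS) | digits[--n];
    return value;
}
```
]]></programlisting>
  <para>
  A result of one digit corresponds to the <code>small</code> case and
  anything longer to the <code>large</code> case.
  </para>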
  </sect3>

  <sect3 id="flt">
    <title>3.2.21. <code>FLOAT</code></title>
  <para>
  The union type <code>FLOAT</code> (<code>flt</code>) is used to represent
  a floating point constant expression.  There is only one field,
  <code>simple</code>, which corresponds to a floating point literal.  No folding
  of floating point constant expressions is attempted in the producer
  (it is virtually impossible to do so in a target independent manner).
  </para>
  <para>
  Objects representing useful floating point constants (0.0, 1.0 etc.)
  are created for each floating point type and stored as part of the
  corresponding <A HREF="#ftype"><code>FLOAT_TYPE</code></A>.  Other
  values are created as and when they are required.
  </para>
  </sect3>

  <sect3 id="str">
    <title>3.2.22. <code>STRING</code></title>
  <para>
  The union type <code>STRING</code> (<code>str</code>) is used to represent
  a string constant expression.  There is only one field,
  <code>simple</code>, which corresponds to a character string literal;
  however, the <code>kind</code> field can be used to modify the interpretation
  put on the characters appearing in the <code>text</code>
  field.  By default, each character in <code>text</code> corresponds
  to a single character in the literal; however, an alternative representation,
  in which <code>text</code> consists of a sequence of multibyte characters
  (one control character plus four value characters), is used in more
  complex cases.
  </para>
  <para>
  All strings are stored in a hash table intended to ensure that the
  same <code>STRING</code> object is used for equal string literals.
  This not only saves space during the processing of the input file,
  but also facilitates the output of shared string literals in the TDF
  capsule.
  </para>
  <para>
  Note that the terminal zero character does not form part of the
  <code>STRING</code> object.  Instead information on this is stored
  as part of the type of a <A HREF="#exp">string literal expression</A>.
  The text of the string literal is either truncated or padded with
  zeros until its length matches the size of the array bound in the
  type of the corresponding literal expression.
  </para>
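  <para>
  The truncate-or-pad rule can be illustrated with a short helper.  The
  function name and its parameters are hypothetical; the sketch only shows
  how a text of a given length is fitted to an array bound.
  </para>
  <programlisting><![CDATA[
```c
#include <stddef.h>
#include <string.h>

/* Copy len characters of text into an array of bound elements,
   truncating if the text is too long and padding with zeros if short. */
static void fit_to_bound(char *out, size_t bound, const char *text, size_t len)
{
    size_t n = len < bound ? len : bound;
    memcpy(out, text, n);
    memset(out + n, 0, bound - n);
}
```
]]></programlisting>
  <para>
  For the literal <code>"abc"</code>, a bound of 4 yields the usual
  zero-terminated array, a bound of 3 drops the terminator, and a bound of
  6 appends three zeros.
  </para>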
  </sect3>

  <sect3 id="ntest">
    <title>3.2.23. <code>NTEST</code></title>
  <para>
  The enumeration type <code>NTEST</code> (<code>ntest</code>) is used
  to represent the various C++ relational operators (<code>==</code>,
  <code>!=</code>, <code>&gt;</code> etc.).  The values correspond to
  the encoding of the TDF <code>NTEST</code> sort, which facilitates
  code generation.  The values also have the property that the values
  for complementary operators (such as <code>&lt;</code> and
  <code>&gt;=</code>) always add up to the same value,
  <code>ntest_negate</code>, allowing operators to be complemented in
  a straightforward manner.
  </para>
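  <para>
  The complement property can be demonstrated with a small enumeration.
  The numeric values below are hypothetical, chosen only so that each
  operator and its complement sum to the same constant; they are not the
  actual TDF <code>NTEST</code> encoding.
  </para>
  <programlisting><![CDATA[
```c
/* Hypothetical values: each test and its complement sum to NT_NEGATE. */
enum ntest_sketch {
    NT_EQ = 0,   /* == */
    NT_LT = 1,   /* <  */
    NT_LE = 2,   /* <= */
    NT_GT = 3,   /* >  */
    NT_GE = 4,   /* >= */
    NT_NE = 5    /* != */
};

#define NT_NEGATE 5

/* Complement a test: the complement of a < b is a >= b, and so on. */
static enum ntest_sketch negate_test(enum ntest_sketch t)
{
    return (enum ntest_sketch)(NT_NEGATE - t);
}

/* Evaluate a test on two integers, for checking the complement rule. */
static int apply_test(enum ntest_sketch t, int a, int b)
{
    switch (t) {
    case NT_EQ: return a == b;
    case NT_LT: return a < b;
    case NT_LE: return a <= b;
    case NT_GT: return a > b;
    case NT_GE: return a >= b;
    case NT_NE: return a != b;
    }
    return 0;
}
```
]]></programlisting>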
  </sect3>

  <sect3 id="rmode">
    <title>3.2.24. <code>RMODE</code></title>
  <para>
  The enumeration type <code>RMODE</code> (<code>rmode</code>) is used
  to represent the various C++ rounding modes (towards zero, towards
  smaller etc.).  The values correspond to the encoding of the TDF
  <code>RMODE</code> sort, which facilitates code generation.
  </para>
  </sect3>

  <sect3 id="exp">
    <title>3.2.25. <code>EXP</code></title>
  <para>
  The union type <code>EXP</code> (<code>exp</code>) is used to represent
  a C++ expression or statement.  Each expression has an associated
  <A HREF="#type">type</A>, <code>type</code>, but most of the information
  about an expression is stored in one of the large number of fields
  of the <code>EXP</code> union.  Most of these fields are fairly simple.
  For example, there are fields corresponding to <A HREF="#nat">integer
  literals</A>, <A HREF="#flt">floating point literals</A>,
  <A HREF="#str">string literals</A> and <A HREF="#identifier">identifiers</A>.
  Composite expressions are formed in the normal way; for example, there
  are various binary operators comprising two argument expressions.
  The <code>EXP</code> fields corresponding to statements are slightly more
  complex.  They each have a <code>parent</code> field which points
  to the enclosing statement.  A couple of cases bear additional discussion.
  </para>
  <para>
  The <code>sequence</code> field represents a compound statement or
  block.  This contains a <A HREF="#nspace">namespace</A>, in which
  any local variables are declared, and a list of expressions, giving
  the statements comprising the block.  The null namespace is used if
  the block does not constitute a scope.  The first statement in the
  list is always a dummy to enable <code>first</code> and <code>last</code>
  pointers to be maintained to the start and end of the list without
  having to worry about null lists.
  </para>
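  <para>
  The dummy first element is the familiar sentinel technique for linked
  lists.  The following sketch uses hypothetical structure and function
  names, with an integer standing in for each statement; only the names
  <code>first</code> and <code>last</code> come from the text above.
  </para>
  <programlisting><![CDATA[
```c
#include <stddef.h>

typedef struct stmt_sketch {
    int value;                  /* stands in for a statement */
    struct stmt_sketch *next;
} STMT_SKETCH;

typedef struct {
    STMT_SKETCH dummy;          /* always present, never a real statement */
    STMT_SKETCH *first;         /* always points at dummy */
    STMT_SKETCH *last;          /* last node, or dummy if the block is empty */
} BLOCK_SKETCH;

static void block_init(BLOCK_SKETCH *b)
{
    b->dummy.value = 0;
    b->dummy.next = NULL;
    b->first = &b->dummy;
    b->last = &b->dummy;
}

/* Appending needs no empty-list special case, because last is never NULL. */
static void block_append(BLOCK_SKETCH *b, STMT_SKETCH *s)
{
    s->next = NULL;
    b->last->next = s;
    b->last = s;
}

/* Count the real statements, skipping the dummy at the head. */
static size_t block_length(const BLOCK_SKETCH *b)
{
    size_t n = 0;
    const STMT_SKETCH *s;
    for (s = b->first->next; s != NULL; s = s->next) n++;
    return n;
}
```
]]></programlisting>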
  <para>
  <A id="solve">The <code>solve_stmt</code> field corresponds to the
  TDF <code>labelled</code> construct</A> (in early versions of TDF
  this construct was called <code>solve</code>, hence the terminology).
  The problem is that C and C++ labels and <code>goto</code> statements are totally
  unstructured, whereas the TDF label constructs are structured.  Any
  statement which contains unstructured labels is enclosed in a
  <code>solve_stmt</code> construct, enclosing both the labelled statement
  and all jumps to it (in general this cannot be done until the end
  of the function).  Any labels or variables which are bypassed by such
  unstructured jumps also need to be pulled out to the <code>solve_stmt</code>
  construct.  It is not just explicit labels which can cause such problems;
  complex <code>switch</code> statements have the same effect.
  </para>
  </sect3>

  <sect3 id="off">
    <title>3.2.26. <code>OFFSET</code></title>
  <para>
  The union type <code>OFFSET</code> (<code>off</code>) is used to represent
  an offset expression.  This is used as an adjunct to the normal
  <A HREF="#exp">expression</A> representation.  The <code>OFFSET</code>
  union has fields corresponding to a type offset (used in pointer arithmetic),
  the offset of a member of a class and the offset of a base class.
  There are also simple operations on offsets, such as multiplication
  by an expression.
  </para>
  </sect3>

  <sect3 id="tok">
    <title>3.2.27. <code>TOKEN</code></title>
  <para>
  The union type <code>TOKEN</code> (<code>tok</code>) is used to represent
  one of a number of different categories within the C++ language.
  It corresponds to the sort of a token declared using the
  <A HREF="token.html"><code>#pragma token</code> syntax</A>.  Thus
  there are fields corresponding to expression, statement, integer constant,
  type, function, member and procedure tokens.  The similarities between
  <code>PROC</code> tokens and templates have been remarked on above; for
  example, the parameters of the template:
  <programlisting>
        template &lt; class T, int n &gt; class A {
            T a [n] ;
            // ....
        } ;
  </programlisting>
  are essentially equivalent to those in the procedure token:
  <programlisting>
        PROC ( TYPE T, EXP const : int : n ) ....
  </programlisting>
  (recall that non-type template arguments are always constant expressions).
  Thus a field, <code>templ</code>, of the <code>TOKEN</code> union
  is used to represent lists of template parameters.  Note that a further
  field, <code>class</code>, is also required to represent template
  template parameters.  A <A HREF="#type">template type</A> is represented
  by a field, <code>templ</code>, of the union <code>TYPE</code>, which
  comprises a template sort and a sub-type expressed in terms of the
  template parameters.
  </para>
  <para>
  In addition to representing token and template sorts in this way,
  the <code>TOKEN</code> union is used to represent token and template arguments.
  Each of the parameter sorts listed above has an appropriate
  <code>value</code> component which can store a value of that sort.
  Many of the union types in the algebra, including <A HREF="#type">types</A>
  and <A HREF="#exp">expressions</A>, have a field of the form:
  <programlisting>
        token -&gt; {
            IDENTIFIER tok ;
            LIST TOKEN args ;
        }
  </programlisting>
  representing the given token <A HREF="#identifier">identifier</A> applied
  to the given list of arguments.
  </para>
  <para>
  <A id="form">Template instances are represented slightly differently
  from token applications</A>.  Each instance of a template class or
  a template function gives rise to a new class or function
  <A HREF="#identifier">identifier</A>.  This identifier has an underlying form
  giving the template identifier and the template arguments.  This is
  expressed as a <code>token</code> member of the
  <A HREF="#type"><code>TYPE</code></A> union (although it is not technically
  a type, this happens to be the most convenient representation).  Each
  such form has an associated
  <A HREF="#inst"><code>INSTANCE</code></A> component which gives further
  information about the template instance.  The form for a template
  function instance is stored in the <code>form</code> component of
  the corresponding <A HREF="#identifier">identifier</A>.  The form for a template
  class instance is stored in the <code>form</code> component of the
  corresponding <A HREF="#ctype">class type</A>.
  </para>
  <para>
  Members of instances of template classes also have a form type, but
  in this case the form is an <code>instance</code> type.  This gives
  a link back to the corresponding member of the template class.
  </para>
  </sect3>

  <sect3 id="inst">
    <title>3.2.28. <code>INSTANCE</code></title>
  <para>
  The union type <code>INSTANCE</code> (<code>inst</code>) is used to
  represent a particular instance of a template or token.  Each
  <A HREF="#tok">template sort</A> has an associated list of all the
  instances of that template, which is used to ensure that the same
  template applied with the same arguments always has the same value.
  Information on partial or explicit specialisations and usage information
  are stored as part of the corresponding
  <code>INSTANCE</code>.  Each template instance identifier has a link
  back to its corresponding <code>INSTANCE</code> via its
  <A HREF="#form"><code>form</code> component</A>.
  </para>
  </sect3>

  <sect3 id="err">
    <title>3.2.29. <code>ERROR</code></title>
  <para>
  The union type <code>ERROR</code> (<code>err</code>) is used to represent
  an error arising during the compilation of a C++ program.  Errors are
  first class objects within the producer and can be passed to and from
  procedures.  Each error has an associated <code>severity</code>
  (serious, warning, none etc.).  Simple errors are represented by the
  <code>simple</code> field, which consists of an index, <code>number</code>,
  into the error catalogue, plus a variable length list of error arguments.
  Errors can be combined into composite errors using the
  <code>compound</code> field, which represents the join of two errors:
  <code>head</code> followed by <code>tail</code>.
  </para>
  <para>
  The chief operation on an error after it has been built up is to report
  it.  Each error report consists of an error object and a
  <A HREF="#location">file location</A> indicating where the error occurred.
  </para>
  </sect3>

  <sect3 id="var">
    <title>3.2.30. <code>VARIABLE</code></title>
  <para>
  The structure type <code>VARIABLE</code> (<code>var</code>) is used
  to represent a variable state and is used in the variable analysis
  checks.
  </para>
  </sect3>

  <sect3 id="location">
    <title>3.2.31. <code>LOCATION</code></title>
  <para>
  The structure type <code>LOCATION</code> (<code>loc</code>) is used
  to represent a location in an input file.  It comprises a pointer,
  <code>posn</code>, to an <A HREF="#posn">input file position</A>,
  together with a line number, <code>line</code>, which takes
  <code>#line</code> directives into account.  Note that character
  positions within the line are not currently recorded.
  </para>
  </sect3>

  <sect3 id="posn">
    <title>3.2.32. <code>POSITION</code></title>
  <para>
  The structure type <code>POSITION</code> (<code>posn</code>) is used
  to represent a position in an input file.  It consists of two file
  names, <code>file</code>, which takes <code>#line</code> directives
  into account, and <code>input</code>, which gives the actual file name,
  plus a line number offset, <code>offset</code>, which gives the
  difference between the line number taking <code>#line</code> directives
  into account and the actual line number.  Other information stored
  includes the datestamp on the input file, <code>datestamp</code>, and
  a pointer to a <A HREF="#location">file location</A> which, for files
  included using <code>#include</code>, gives the location the file was
  included from.
  </para>
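  <para>
  The relationship between the two line numbers can be sketched with
  reduced, hypothetical versions of the two structures.  The sign
  convention (presented line minus actual line) and the helper functions
  are assumptions made for this example; only the component names
  <code>file</code>, <code>input</code>, <code>offset</code>,
  <code>line</code> and <code>posn</code> come from the descriptions above.
  </para>
  <programlisting><![CDATA[
```c
/* Hypothetical reductions of POSITION and LOCATION. */
typedef struct {
    const char *file;     /* name after #line directives */
    const char *input;    /* name of the file actually read */
    long offset;          /* presented line minus actual line (assumed) */
} POSN_SKETCH;

typedef struct {
    unsigned long line;   /* line number after #line directives */
    const POSN_SKETCH *posn;
} LOC_SKETCH;

/* Recover the physical line in posn->input from a recorded location. */
static unsigned long actual_line(const LOC_SKETCH *loc)
{
    return (unsigned long)((long)loc->line - loc->posn->offset);
}

/* Record a #line directive found on physical line at, which renames
   the file to name and numbers the following physical line as n. */
static void apply_line_directive(POSN_SKETCH *p, unsigned long at,
                                 unsigned long n, const char *name)
{
    p->file = name;
    p->offset = (long)n - (long)(at + 1);
}
```
]]></programlisting>
  <para>
  Keeping the offset in the position rather than in every location means
  that a location only needs one line number; the physical line can be
  recovered on demand, for instance when printing the offending source line.
  </para>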
  </sect3>

  <sect3 id="bits">
    <title>3.2.33. <code>BITSTREAM</code></title>
  <para>
  The structure <code>BITSTREAM</code> is not part of the
  <code>calculus</code> type system.  It is used to represent a sequence
  of bits such as is used, for example, in the encoding of TDF.
  </para>
  </sect3>

  <sect3 id="buff">
    <title>3.2.34. <code>BUFFER</code></title>
  <para>
  The structure <code>BUFFER</code> is not part of the <code>calculus</code>
  type system.  It is used to represent a sequence of characters.
  </para>
  </sect3>

  <sect3 id="opt">
    <title>3.2.35. <code>OPTIONS</code></title>
  <para>
  The structure <code>OPTIONS</code> is not part of the <code>calculus</code>
  type system.  It is used to represent the state of the
  <A HREF="pragma.html#low">compiler options</A> at a particular point
  in the input file.
  </para>
  </sect3>

  <sect3 id="pptok">
    <title>3.2.36. <code>PPTOKEN</code></title>
  <para>
  The structure <code>PPTOKEN</code> is not part of the <code>calculus</code>
  type system.  It is used to represent a linked list of preprocessing
  tokens.  Each token has an associated <code>sid</code> lexical token
  number, <code>tok</code>, plus additional data dependent on the token
  type.  Each token also records a pointer to the current
  <A HREF="#opt"><code>OPTIONS</code></A> value.
  </para>
  </sect3>
  </sect2>

  <sect2>
    <title>3.3. Error catalogue</title>
  <para>
  This section describes the error catalogue which lies at the heart
  of the C++ producer's error reporting routines.  The full
  <A HREF="error1.html">error catalogue syntax</A> is given as an annex.
  A typical entry in the catalogue is as follows:
  <programlisting>
        class_union_deriv ( CLASS_TYPE: ct )
        {
            USAGE:              serious
            PROPERTIES:         ansi
            KEY (ISO)           &quot;9.5&quot;
            KEY (STANDARD)      &quot;The union '&quot;ct&quot;' can't have base classes&quot;
        }
  </programlisting>
  This defines an error, <code>class_union_deriv</code>, which takes
  a single parameter <code>ct</code> of type <code>CLASS_TYPE</code>.
  The severity of this error is <code>serious</code>; that is to say,
  a constraint error.  The error property <code>ansi</code> indicates
  that the error arises from the ISO C++ standard, the associated
  <code>ISO</code> key indicating section 9.5.  Finally the text to
  be printed for this error, including a reference to <code>ct</code>,
  is given.  Looking up section 9.5 in the ISO C++ standard reveals
  the corresponding constraint in paragraph 1:
  <BLOCKQUOTE>
  <I>A union shall not have base classes.</I>
  </BLOCKQUOTE>
  Each constraint within the ISO C++ standard has a corresponding error
  in this way.  The errors are named in a systematic fashion using the
  section names used in the draft standard.  For example, section 9.5
  is called <code>class.union</code>, so all the constraint errors arising
  from this section have names of the form <code>class_union_*</code>.
  These error names can be used in the <A HREF="pragma.html#low">low
  level directives</A>, such as:
  <programlisting>
        #pragma TenDRA++ error &quot;class_union_deriv&quot; <I>allow</I>
  </programlisting>
  to modify the error severity.  The effect of reducing the severity
  of a constraint error in this way is undefined.
  </para>
  <para>
  In addition to the obvious error severity levels, <code>serious</code>,
  <code>warning</code> and <code>none</code>, the error catalogue specifies
  a list of optional severity levels along with their default values.
  For example, the entry:
  <programlisting>
        link_incompat = serious
  </programlisting>
  sets up an option named <code>link_incompat</code> which is a constraint
  error by default.  Errors with this severity, such as:
  <programlisting>
        dcl_stc_external ( LONG_ID: id, PTR_LOC: loc )
        {
            USAGE:              link_incompat
            PROPERTIES:         ansi
            KEY (ISO)           &quot;7.1.1&quot;
            KEY (STANDARD)      &quot;'&quot;id&quot;' previously declared with external
                                 linkage (at &quot;loc&quot;)&quot;
        }
  </programlisting>
  are therefore constraint errors.  The severity associated with
  <code>link_incompat</code> can be modified either
  <A HREF="pragma.html#low">directly</A>, using the directive:
  <programlisting>
        #pragma TenDRA++ option &quot;link_incompat&quot; <I>allow</I>
  </programlisting>
  or <A HREF="pragma.html#linkage">indirectly</A> using the directive:
  <programlisting>
        #pragma TenDRA incompatible linkage <I>allow</I>
  </programlisting>
  the effect being to modify the severity of the associated error messages.
  </para>
  <para>
  The error catalogue is processed by a simple tool,
  <code>make_err</code>, which generates C code which is compiled into
  the C++ producer.  Each error in the catalogue is assigned a number
  (there are currently 873 errors in the catalogue) which gives an index
  into an automatically generated table of error information.  It is
  this error number, together with a list of error arguments, which
  forms the associated <A HREF="alg.html#err"><code>ERROR</code> object</A>.
  <code>make_err</code> generates a macro for each error in the catalogue
  which takes arguments of the appropriate types (which may be statically
  checked) and creates an <code>ERROR</code> object.  For example, for
  the entry above this macro takes the form:
  <programlisting>
        ERROR ERR_class_union_deriv ( CLASS_TYPE ) ;
  </programlisting>
  These macros hide the error catalogue numbers from the rest of the
  C++ producer.
  </para>
  <para>
  It is also possible to join a number of simple <code>ERROR</code>
  objects to form a single composite <code>ERROR</code>.  The severity
  of the composite error is the maximum of the severities of the component
  errors.  For this purpose a dummy error severity level <code>whatever</code>
  is introduced which is less severe than any other level.  This is
  intended for use with error messages which are only ever used to add
  information to existing errors, and which inherit their severity level
  from the main error.
  </para>
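  <para>
  The rule for composite severities can be sketched as a maximum over an
  ordered enumeration.  The enumerator names and their numeric order are
  hypothetical; the only property relied on is that <code>whatever</code>
  is lower than every other level, and the placing of <code>none</code>
  below <code>warning</code> is an assumption.
  </para>
  <programlisting><![CDATA[
```c
/* Hypothetical ordering; only the relative order matters here. */
enum severity_sketch {
    SEV_WHATEVER = 0,   /* dummy level, below every other */
    SEV_NONE,
    SEV_WARNING,
    SEV_SERIOUS
};

/* Severity of the join of two errors: the more severe of the two. */
static enum severity_sketch join_severity(enum severity_sketch head,
                                          enum severity_sketch tail)
{
    return head > tail ? head : tail;
}
```
]]></programlisting>
  <para>
  Because <code>SEV_WHATEVER</code> is the identity for this operation, an
  informational message joined to a main error never changes the severity
  of the result.
  </para>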
  <para>
  The text of a simple error message can be found in the table of error
  information.  The text contains certain escape sequences indicating
  where the error arguments are to be printed.  For example,
  <code>%1</code> indicates the second argument (arguments are numbered
  from zero).  The error argument sorts (what is referred to as the error
  signature) are also stored in the table of error information as an
  array of characters, each corresponding to an
  <code>ERR_KEY_</code><I>type</I> macro.  The producer
  defines printing routines for each of the types given by these values,
  and calls the appropriate routine to print the argument.
  </para>
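  <para>
  A much simplified expansion of such escape sequences is sketched below.
  It assumes that every argument has already been converted to a string
  and that at most ten arguments are used, whereas the producer selects a
  printing routine according to the error signature; the function name and
  parameters are hypothetical.
  </para>
  <programlisting><![CDATA[
```c
#include <stddef.h>
#include <string.h>

/* Expand %0..%9 in fmt using args[0..nargs-1].  Writes at most size-1
   characters followed by a terminating zero; a reference to a missing
   argument is printed as "?". */
static void expand_message(char *out, size_t size, const char *fmt,
                           const char *const *args, size_t nargs)
{
    size_t used = 0;
    if (size == 0) return;
    while (*fmt != '\0' && used + 1 < size) {
        if (fmt[0] == '%' && fmt[1] >= '0' && fmt[1] <= '9') {
            size_t i = (size_t)(fmt[1] - '0');
            const char *a = i < nargs ? args[i] : "?";
            size_t n = strlen(a);
            if (n > size - 1 - used) n = size - 1 - used;
            memcpy(out + used, a, n);
            used += n;
            fmt += 2;
        } else {
            out[used++] = *fmt++;
        }
    }
    out[used] = '\0';
}
```
]]></programlisting>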
  <para>
  There are several command-line options which can be used to modify
  the form in which the error message is printed.  The default format
  is as follows:
  <programlisting>
        &quot;file.C&quot;, line 42: Error:
            [ISO 9.5]: The union 'U' can't have base classes.
  </programlisting>
  The ISO section number can be suppressed using <code>-m-s</code>.
  The <code>-mc</code> option causes the source code line giving rise
  to the error to be printed as part of the message, with <code>!!!!</code>
  marking the position of the error within the line.  The <code>-me</code>
  option causes the error name, <code>class_union_deriv</code>, to be
  printed as part of the message.  The <code>-ml</code> option causes
  the full file location, including the list of <code>#include</code>
  directives used in reaching the file, to be printed.  The <code>-mt</code>
  option causes <code>typedef</code> names to be used when printing
  types, rather than expanding to the type definition.
  </para>
  </sect2>

  <sect2>
    <title>3.4. Parsing C++</title>
  <para>
  The parser used in the C++ producer is generated using the
  <A HREF="../utilities/sid.html"><code>sid</code> tool</A>.  Because
  of the large size of the generated code (1.3MB), the <code>sid</code>
  output is run through a simple program, <code>sidsplit</code>, which
  splits the output into a number of more manageable modules.  It also
  transforms the code to use the <A HREF="style.html#language"><code>PROTO</code>
  macros</A> used in the rest of the program.
  </para>
  <para>
  <code>sid</code> is designed as a parser for grammars which can be
  transformed into LL(1) grammars.  The distinguishing feature of these
  grammars is that the parser can always decide what to do next based
  on the current terminal.  This is not the case in C++; in some circumstances
  a potentially unlimited look-ahead is required to distinguish, for
  example, declaration statements from expression statements.  In
  technical terms, the C++ grammar is not LL(1); it needs LL(k)-style
  look-ahead with no fixed bound on k.  Fortunately there are relatively
  few such situations, and <code>sid</code>
  provides a mechanism, <A HREF="../utilities/sid.html#predicate">predicates</A>,
  for bypassing the normal parsing mechanism in these cases.  Thus it
  is possible, although difficult, to express C++ as a <code>sid</code>
  grammar.
  </para>
8010
  <para>
8011
  The <code>sid</code> grammar file, <code>syntax.sid</code>, is closely
8012
  based on the ISO C++ grammar.  In particular, the same production
8013
  names have been used.  The grammar has been extended slightly to allow
8014
  common syntactic errors to be detected elegantly.  Other parsing errors
8015
  are handled by <code>sid</code>'s exception mechanism.  At present
8016
  there is only limited recovery after such errors. 
8017
  </para>
  <para>
  The lexical analysis routines in the C++ producer are hand-crafted,
  based on an initial version generated by the simple lexical analyser
  generator, <code>lexi</code>.  <code>lexi</code> has been used more directly
  to generate the lexical analysers for some of the other automatic
  code generating tools used in the producer, including <code>calculus</code>.
  </para>
  <para>
  The <code>sid</code> grammar contains a number of entry points.  The
  most important is <code>parse_file</code>, which is used to parse
  a complete C++ translation unit.  The syntax for the
  <A HREF="pragma.html"><code>#pragma TenDRA</code> directives</A> is
  included within the same grammar with two entry points:
  <code>parse_tendra</code> in normal use, and <code>parse_preproc</code>
  for use in preprocessing mode.  There are also entry points in the
  grammar for each of the kinds of <A HREF="token.html#args">token argument</A>.
  The parsing routines for token and template arguments are largely
  hand-crafted, based on these primitives.
  </para>
  <para>
  Certain parsing operations are performed before control passes to
  the <code>sid</code> grammar.  As mentioned above, these include the processing
  of token and template applications.  The other important case concerns
  nested name specifiers.  For example, in:
  <programlisting>
  	class A {
  	    class B {
  		static int c ;
  	    } ;
  	} ;
 
  	int A::B::c = 0 ;
  </programlisting>
  the qualified identifier <code>A::B::c</code> is split into two terminals,
  a nested name specifier, <code>A::B::</code>, and an identifier, <code>c</code>,
  which is looked up in the corresponding namespace.  Note that it is
  at this stage that name look-up occurs.  An identifier can be mapped
  to one of a number of terminals, including keywords, type names,
  namespace names and other identifiers, according to the result of
  this look-up.  If the look-up gives a macro then the macro is expanded
  at this stage.
  </para>
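  <para>
  The split described above can be pictured with a small C sketch.  It works
  on a plain string rather than on the producer's token stream, and the
  function name <code>split_qualified_id</code> is hypothetical; it only
  shows where the boundary between the nested name specifier and the final
  identifier falls.
  </para>
  <programlisting><![CDATA[
```c
#include <string.h>

/*
 * Return the offset at which the final identifier of a qualified name
 * begins.  Characters [0, offset) form the nested name specifier
 * (including its trailing "::"); an offset of 0 means the name is
 * unqualified.
 */
size_t split_qualified_id(const char *qid)
{
    size_t split = 0;
    const char *p = qid;

    /* Advance past every "::"; the last one found marks the boundary. */
    while ((p = strstr(p, "::")) != NULL) {
        p += 2;
        split = (size_t)(p - qid);
    }
    return split;
}
```
]]></programlisting>
  <para>
  For <code>A::B::c</code> the offset is 6, giving the specifier
  <code>A::B::</code> and the identifier <code>c</code>.
  </para>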
  </sect2>
 
  <sect2>
    <title>3.5. TDF generation</title>
  <para>
  The TDF encoding as a bitstream is expressed as a series of macros
  generated by the <code>make_tdf</code> tool from the TDF specification
  database.  Note that the version of the TDF database used differs
  from the standard version in two respects:
  <itemizedlist>
  <listitem><para>A construct <code>make_token_def</code> has been added to represent
  a token definition.</para>
  </listitem>
  <listitem><para>The sort <code>diag_tag</code> has been added to the edge constructors.</para>
  </listitem>
  </itemizedlist>
  The macros generated only handle the encoding of the construct itself; the
  construct parameters need to be encoded by hand (the C producer does
  something similar, but includes the construct parameters).  For example,
  <code>make_tdf</code> generates a macro:
  <programlisting>
  	void ENC_plus ( BITSTREAM * ) ;
  </programlisting>
  which encodes the <code>plus</code> construct (91 as 7 bits in extended
  format).  A typical use of this macro, for adding the expressions
  <code>a</code> and <code>b</code>, would be:
  <programlisting>
  	ENC_plus ( bs ) ;
  	ENC_impossible ( bs ) ;
  	bs = enc_exp ( bs, a ) ;
  	bs = enc_exp ( bs, b ) ;
  </programlisting>
  </para>
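  <para>
  As a rough illustration of what "91 as 7 bits in extended format" means,
  the sketch below follows the TDF extendable integer encoding as we
  understand it: a value that fits in <code>n</code> bits is written
  directly, otherwise <code>n</code> zero bits are written and the value is
  reduced by 2<superscript>n</superscript> - 1 before trying again.  The
  <code>bitbuf</code> type, the function names and the bit order (most
  significant bit first) are assumptions made for this example; this is not
  the producer's <code>BITSTREAM</code> code.
  </para>
  <programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit sink storing one bit per byte, for readability only. */
struct bitbuf {
    unsigned char bits[64];
    size_t len;
};

/* Append the low n bits of value, most significant bit first. */
static void put_bits(struct bitbuf *b, unsigned n, unsigned long value)
{
    while (n-- > 0) {
        assert(b->len < sizeof b->bits);
        b->bits[b->len++] = (unsigned char)((value >> n) & 1u);
    }
}

/* Append value (which must be non-zero) as an n-bit extended encoding. */
void put_extended(struct bitbuf *b, unsigned n, unsigned long value)
{
    unsigned long max;

    assert(n > 0 && n < 32 && value > 0);
    max = (1ul << n) - 1;

    /* n zero bits mean "add 2^n - 1 and keep reading". */
    while (value > max) {
        put_bits(b, n, 0);
        value -= max;
    }
    put_bits(b, n, value);
}
```
]]></programlisting>
  <para>
  Under these assumptions, <code>put_extended ( &amp;b, 7, 91 )</code> writes
  the seven bits 1011011, since 91 is below 127 and needs no extension.
  </para>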
  <para>
  Each function or variable is compiled to TDF as its definition is
  encountered.  For some definitions, such as inline functions, the
  compilation may be deferred until it is clear whether or not the identifier
  has been used.  A final pass over all identifiers, made during
  the variable analysis routines, incorporates this check.  Because
  of the organisation of a TDF capsule it is necessary to store all
  of the compiled TDF in memory until the end of the program, when the
  complete capsule, including external tag and token names and linkage
  information, is written to the output file.
  </para>
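  <para>
  The deferral scheme can be sketched in C as follows.  This is a simplified
  model, not the producer's code: the <code>ident</code> structure, its
  fields and the functions <code>define_ident</code> and
  <code>final_pass</code> are hypothetical, and "compiling" is reduced to
  setting a flag.
  </para>
  <programlisting><![CDATA[
```c
#include <stddef.h>

/* Hypothetical identifier record; the flags are 0 or 1. */
struct ident {
    const char *name;
    int is_inline;   /* definition may be deferred */
    int used;        /* set when a use of the identifier is seen */
    int compiled;    /* set once TDF has been generated */
};

/* Stand-in for TDF generation. */
static void compile_ident(struct ident *id)
{
    id->compiled = 1;
}

/* Called when a definition is encountered. */
void define_ident(struct ident *id)
{
    /* Defer inline definitions that have not been used so far. */
    if (id->is_inline && !id->used) return;
    compile_ident(id);
}

/*
 * Final pass over all identifiers: compile any deferred definition that
 * turned out to be used.  Returns the number compiled in this pass.
 */
size_t final_pass(struct ident *ids, size_t n)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        if (!ids[i].compiled && ids[i].used) {
            compile_ident(&ids[i]);
            count++;
        }
    }
    return count;
}
```
]]></programlisting>
  <para>
  In this model an unused inline function is never compiled, which matches
  the intent described above; only after the final pass would the complete
  capsule be written out.
  </para>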
  </sect2>
  </sect1>
 
  <sect1>
    <title>Annex A. <code>#pragma</code> directive syntax</title>
8112
  <para>
8113
  The following gives a summary of the syntax for the <code>#pragma</code>
8114
  directives used for <A HREF="pragma.html">compiler configuration</A>
8115
  and <A HREF="token.html">token specification</A>: 
8116
  <programlisting>
8117
 
8118
  	<I>pragma-directive</I> :
8119
  		<A HREF="#tendra"># pragma TenDRA ++<I><SUB>opt</SUB> tendra-directive</I></A>
8120
  		<A HREF="#token"># pragma <I>token-directive</I></A>
8121
 
8122
  	<A id="tendra"><I>tendra-directive</I></A> :
8123
  		<A HREF="#scope"><I>scope-directive</I></A>
8124
  		<A HREF="#low"><I>low-level-directive</I></A>
8125
  		<A HREF="#analysis"><I>analysis-directive on</I></A>
8126
  		<A HREF="#check"><I>check-directive allow</I></A>
8127
  		<A HREF="#keyword"><I>keyword-directive</I></A>
8128
  		<A HREF="#type"><I>type-directive</I></A>
8129
  		<A HREF="#linkage"><I>linkage-directive</I></A>
8130
  		<A HREF="#misc"><I>misc-directive</I></A>
8131
  		<A HREF="#token1"><I>tendra-token-directive</I></A>
8132
 
8133
  	<I>on</I> :
8134
  		on
8135
  		warning
8136
  		off
8137
 
8138
  	<I>allow</I> :
8139
  		allow
8140
  		warning
8141
  		disallow
8142
 
8143
 
8144
  	<A id="scope"><I>scope-directive</I></A> :
8145
  		<A HREF="pragma.html#scope">begin</A>
8146
  		<A HREF="pragma.html#scope">begin name environment <I>identifier</I></A>
8147
  		<A HREF="pragma.html#scope">end</A>
8148
  		<A HREF="pragma.html#scope">directory <I>identifier</I> use environment <I>identifier</I></A>
8149
  		<A HREF="pragma.html#scope">use environment <I>identifier</I></A>
8150
  		<A HREF="pragma.html#scope">use environment <I>identifier</I> reset <I>allow</I></A>
8151
 
8152
 
8153
  	<A id="low"><I>low-level-directive</I></A> :
8154
  		<A HREF="pragma.html#low">error <I>string-literal allow</I></A>
8155
  		<A HREF="pragma.html#low">error <I>string-literal on</I></A>
8156
  		<A HREF="pragma.html#low">error <I>string-literal</I> as option <I>string-literal</I></A>
8157
  		<A HREF="pragma.html#low">option <I>string-literal allow</I></A>
8158
  		<A HREF="pragma.html#low">option <I>string-literal on</I></A>
8159
  		<A HREF="pragma.html#limits">option value <I>string-literal integer-literal</I></A>
8160
  		<A HREF="pragma.html#low">use error <I>string-literal</I></A>
8161
 
8162
 
8163
  	<A id="analysis"><I>analysis-directive</I></A> :
8164
  		<A HREF="pragma.html#init">complete initialization analysis</A>
8165
  		<A HREF="pragma.html#elab">complete struct / union analysis</A>
8166
  		<A HREF="pragma.html#conv">conversion analysis <I>conversion-spec<SUB>opt</SUB></I></A>
8167
  		<A HREF="pragma.html#discard">discard analysis <I>discard-spec<SUB>opt</SUB></I></A>
8168
  		<A HREF="pragma.html#switch">enum switch analysis</A>
8169
  		<A HREF="pragma.html#linkage">external function linkage</A>
8170
  		<A HREF="pragma.html#for">for initialization block</A>
8171
  		<A HREF="pragma.html#elab">ignore struct / union / enum tag</A>
8172
  		<A HREF="pragma.html#template">implicit export template</A>
8173
  		<A HREF="pragma.html#impl_func">implicit function declaration</A>
8174
  		<A HREF="pragma.html#exp">integer operator analysis</A>
8175
  		<A HREF="pragma.html#exp">integer overflow analysis</A>
8176
  		<A HREF="pragma.html#comment">nested comment analysis</A>
8177
  		<A HREF="pragma.html#exp">operator precedence analysis</A>
8178
  		<A HREF="pragma.html#exp">pointer operator analysis</A>
8179
  		<A HREF="pragma.html#throw">throw analysis</A>
8180
  		<A HREF="pragma.html#linkage">unify external linkage</A>
8181
  		<A HREF="pragma.html#variable">variable analysis</A>
8182
  		<A HREF="pragma.html#hide">variable hiding analysis</A>
8183
  		<A HREF="pragma.html#weak">weak prototype analysis</A>
8184
 
8185
  	<I>conversion-spec</I> :
8186
  		( int - int <I>implicit-spec<SUB>opt</SUB></I> )
8187
  		( int - pointer <I>implicit-spec<SUB>opt</SUB></I> )
8188
  		( pointer - int <I>implicit-spec<SUB>opt</SUB></I> )
8189
  		( pointer - pointer <I>implicit-spec<SUB>opt</SUB></I> )
8190
  		( int - enum implicit )
8191
  		( pointer - void * implicit )
8192
  		( void * - pointer implicit )
8193
 
8194
  	<I>implicit-spec</I> :
8195
  		implicit
8196
  		explicit
8197
 
8198
  	<I>discard-spec</I> :
8199
  		( function return )
8200
  		( static )
8201
  		( value )
8202
 
8203
 
8204
  	<A id="check"><I>check-directive</I></A> :
8205
  		<A HREF="pragma.html#overload">ambiguous overload resolution</A>
8206
  		<A HREF="pragma.html#if">assignment as bool</A>
8207
  		<A HREF="pragma.html#bitfield">bitfield overflow</A>
8208
  		<A HREF="pragma.html#linkage">block function static</A>
8209
  		<A HREF="pragma.html#catch_all">catch all</A>
8210
  		<A HREF="pragma.html#escape">character escape overflow</A>
8211
  		<A HREF="token.html#tokdef">compatible token</A>
8212
  		<A HREF="pragma.html#include">complete file includes</A>
8213
  		<A HREF="pragma.html#target-if">conditional declaration</A>
8214
  		<A HREF="pragma.html#lvalue">conditional lvalue</A>
8215
  		<A HREF="pragma.html#overload">conditional overload resolution <I>overload-spec<SUB>opt</SUB></I></A>
8216
  		<A HREF="pragma.html#if">const conditional</A>
8217
  		<A HREF="pragma.html#macro">directive as macro argument</A>
8218
  		<A HREF="pragma.html#identifier">dollar as ident</A>
8219
  		<A HREF="pragma.html#elab">extra ,</A>
8220
  		<A HREF="pragma.html#decl_none">extra ;</A>
8221
  		<A HREF="pragma.html#if">extra ; after conditional</A>
8222
  		<A HREF="pragma.html#weak">extra ...</A>
8223
  		<A HREF="pragma.html#bitfield">extra bitfield int type</A>
8224
  		<A HREF="pragma.html#macro">extra macro definition</A>
8225
  		<A HREF="pragma.html#typedef">extra type definition</A>
8226
  		<A HREF="pragma.html#switch">fall into case</A>
8227
  		<A HREF="pragma.html#elab">forward enum declaration</A>
8228
  		<A HREF="pragma.html#conv">function pointer as pointer</A>
8229
  		<A HREF="pragma.html#ellipsis">ident ...</A>
8230
  		<A HREF="pragma.html#implicit">implicit int type <I>inttype-spec<SUB>opt</SUB></I></A>
8231
  		<A HREF="token.html#tokdef">implicit token definition</A>
8232
  		<A HREF="token.html#spec">incompatible interface declaration</A>
8233
  		<A HREF="token.html#member">incompatible member declaration</A>
8234
  		<A HREF="pragma.html#linkage">incompatible linkage</A>
8235
  		<A HREF="pragma.html#weak">incompatible promoted function argument</A>
8236
  		<A HREF="pragma.html#compatible">incompatible type qualifier</A>
8237
  		<A HREF="pragma.html#return">incompatible void return</A>
8238
  		<A HREF="pragma.html#complete">incomplete type as object type</A>
8239
  		<A HREF="pragma.html#ppdir">indented # directive</A>
8240
  		<A HREF="pragma.html#ppdir">indented directive after #</A>
8241
  		<A HREF="pragma.html#init">initialization of struct / union ( auto )</A>
8242
  		<A HREF="pragma.html#longlong">longlong type</A>
8243
  		<A HREF="pragma.html#ppdir">no directive / nline after ident</A>
8244
  		<A HREF="pragma.html#empty">no external declaration</A>
8245
  		<A HREF="pragma.html#macro">no ident after #</A>
8246
  		<A HREF="pragma.html#lex">no nline after file end</A>
8247
  		<A HREF="token.html#tokdef">no token definition</A>
8248
  		<A HREF="pragma.html#overload">overload resolution</A>
8249
  		<A HREF="pragma.html#weak">prototype</A>
8250
  		<A HREF="pragma.html#weak">prototype ( weak )</A>
8251
  		<A HREF="token.html#exp">rvalue token as const</A>
8252
  		<A HREF="pragma.html#ppdir">text after directive</A>
8253
  		<A HREF="pragma.html#lvalue">this lvalue</A>
8254
  		<A HREF="pragma.html#string">unify incompatible string literal</A>
8255
  		<A HREF="pragma.html#ppdir">unknown directive</A>
8256
  		<A HREF="pragma.html#escape">unknown escape</A>
8257
  		<A HREF="pragma.html#ppdir">unknown pragma</A>
8258
  		<A HREF="pragma.html#decl_none">unknown struct / union</A>
8259
  		<A HREF="pragma.html#string">unmatched quote</A>
8260
  		<A HREF="pragma.html#reach">unreachable code</A>
8261
  		<A HREF="pragma.html#init">variable initialization</A>
8262
  		<A HREF="pragma.html#macro">weak macro equality</A>
8263
  		<A HREF="pragma.html#string">writeable string literal</A>
8264
 
8265
  	<I>inttype-spec</I> :
8266
  		for const / volatile
8267
  		for external declaration
8268
  		for function return
8269
 
8270
  	<I>overload-spec</I> :
8271
  		( complete )
8272
  		( incomplete )
8273
 
8274
 
8275
  	<A id="keyword"><I>keyword-directive</I></A> :
8276
  		<A HREF="#keyword">keyword <I>identifier</I> for <I>keyword-spec</I></A>
8277
  		<A HREF="pragma.html#keyword-spec">undef keyword <I>identifier</I></A>
8278
 
8279
  	<A id="keyword-spec"><I>keyword-spec</I></A> :
8280
  		<A HREF="pragma.html#discard">discard value</A>
8281
  		<A HREF="pragma.html#variable">discard variable</A>
8282
  		<A HREF="pragma.html#switch">exhaustive</A>
8283
  		<A HREF="pragma.html#switch">fall into case</A>
8284
  		<A HREF="pragma.html#keyword">keyword <I>identifier</I></A>
8285
  		<A HREF="pragma.html#keyword">operator <I>operator</I></A>
8286
  		<A HREF="pragma.html#variable">set</A>
8287
  		<A HREF="pragma.html#reach">set reachable</A>
8288
  		<A HREF="pragma.html#reach">set unreachable</A>
8289
  		<A HREF="pragma.html#conv">type representation</A>
8290
  		<A HREF="pragma.html#weak">weak</A>
8291
 
8292
 
8293
  	<A id="type-directive"><I>type-directive</I></A> :
8294
  		<A HREF="pragma.html#reach">bottom <I>identifier</I></A>
8295
  		<A HREF="pragma.html#char">character <I>character-sign</I></A>
8296
  		<A HREF="pragma.html#identifier">character <I>character-literal character-mapping</I></A>
8297
  		<A HREF="pragma.html#identifier">character <I>string-literal character-mapping</I></A>
8298
  		<A HREF="lib.html#arith">compute promote <I>identifier</I></A>
8299
  		<A HREF="pragma.html#escape">escape <I>character-literal character-mapping</I></A>
8300
  		<A HREF="pragma.html#int">integer literal <I>literal-spec</I></A>
8301
  		<A HREF="lib.html#arith">promoted <I>type-id</I> : <I>type-id</I></A>
8302
  		<A HREF="pragma.html#char">set character literal : <I>type-id</I></A>
8303
  		<A HREF="pragma.html#longlong">set longlong type : <I>longlong-spec</I></A>
8304
  		<A HREF="pragma.html#char">set ptrdiff_t : <I>type-id</I></A>
8305
  		<A HREF="pragma.html#char">set size_t : <I>type-id</I></A>
8306
  		<A HREF="pragma.html#char">set wchar_t : <I>type-id</I></A>
8307
  		<A HREF="pragma.html#string">set string literal : <I>string-const</I></A>
8308
  		<A HREF="pragma.html#std">set std namespace : <I>scope-name</I></A>
8309
  		<A HREF="#type-spec">type <I>identifier</I> for <I>type-spec</I></A>
8310
 
8311
  	<I>character-sign</I> :
8312
  		signed
8313
  		unsigned
8314
  		either
8315
 
8316
  	<I>character-mapping</I> :
8317
  		as <I>character-literal</I> allow
8318
  		disallow
8319
 
8320
  	<I>literal-spec</I> :
8321
  		<I>literal-base literal-suffix<SUB>opt</SUB> literal-type-list</I>
8322
 
8323
  	<I>literal-base</I> :
8324
  		decimal
8325
  		octal
8326
  		hexadecimal
8327
 
8328
  	<I>literal-suffix</I> :
8329
  		unsigned
8330
  		long
8331
  		unsigned long
8332
  		long long
8333
  		unsigned long long
8334
 
8335
  	<I>literal-type-list</I> :
8336
  		* <I>literal-type-spec</I>
8337
  		<I>integer-literal literal-type-spec</I> | <I>literal-type-list</I>
8338
  		? <I>literal-type-spec</I> | <I>literal-type-list</I>
8339
 
8340
  	<I>literal-type-spec</I> :
8341
  		: <I>type-id</I>
8342
  		* <I>allow<SUB>opt</SUB></I> : <I>identifier</I>
8343
  		* * <I>allow<SUB>opt</SUB></I> :
8344
 
8345
  	<I>longlong-spec</I> :
8346
  		long
8347
  		long long
8348
 
8349
  	<I>string-const</I> :
8350
  		const
8351
  		no const
8352
 
8353
  	<I>scope-name</I> :
8354
  		<I>identifier</I>
8355
  		::
8356
 
8357
  	<A id="type-spec"><I>type-spec</I></A> :
8358
  		<A HREF="pragma.html#reach">bottom</A>
8359
  		<A HREF="pragma.html#char">ptrdiff_t</A>
8360
  		<A HREF="pragma.html#char">size_t</A>
8361
  		<A HREF="pragma.html#char">wchar_t</A>
8362
  		<A HREF="pragma.html#printf">... printf</A>
8363
  		<A HREF="pragma.html#printf">... scanf</A>
8364
 
8365
 
8366
  	<A id="linkage"><I>linkage-directive</I></A> :
8367
  		<A HREF="pragma.html#linkage">const linkage <I>linkage</I></A>
8368
  		<A HREF="pragma.html#linkage">external linkage <I>string-literal</I></A>
8369
  		<A HREF="pragma.html#linkage">external volatile_t</A>
8370
  		<A HREF="pragma.html#linkage">inline linkage <I>linkage</I></A>
8371
  		<A HREF="pragma.html#linkage">linkage resolution : <I>linkage-spec</I></A>
8372
 
8373
  	<I>linkage</I> :
8374
  		external
8375
  		internal
8376
 
8377
  	<I>linkage-spec</I> :
8378
  		( <I>linkage</I> ) on
8379
  		( <I>linkage</I> ) warning
8380
  		off
8381
 
8382
 
8383
  	<A id="misc"><I>misc-directive</I></A> :
8384
  		<A HREF="pragma.html#weak">argument <I>type-id</I> as ...</A>
8385
  		<A HREF="pragma.html#weak">argument <I>type-id</I> as <I>type-id</I></A>
8386
  		<A HREF="pragma.html#compatible">compatible type : <I>type-id</I> == <I>type-id</I> : <I>allow</I></A>
8387
  		<A HREF="pragma.html#conv">conversion <I>identifier-list</I> allow</A>
8388
  		<A HREF="dump.html#scope">declaration block <I>identifier</I> begin</A>
8389
  		<A HREF="dump.html#scope">declaration block end</A>
8390
  		<A HREF="pragma.html#ppdir">directive <I>directive-spec directive-state</I></A>
8391
  		<A HREF="pragma.html#variable">discard <I>expression</I></A>
8392
  		<A HREF="pragma.html#switch">exhaustive</A>
8393
  		<A HREF="pragma.html#cast">explicit cast <I>cast-spec<SUB>opt</SUB> allow</I></A>
8394
  		<A HREF="pragma.html#include">includes depth <I>integer-literal</I></A>
8395
  		<A HREF="pragma.html#static">preserve <I>preserve-list</I></A>
8396
  		<A HREF="pragma.html#variable">set <I>expression</I></A>
8397
  		<A HREF="pragma.html#limits">set error limit <I>integer-literal</I></A>
8398
  		<A HREF="pragma.html#identifier">set name limit <I>integer-literal</I> warning<I><SUB>opt</SUB></I></A>
8399
  		<A HREF="pragma.html#discard">suspend static <I>identifier-list</I></A>
8400
 
8401
  	<I>directive-spec</I> :
8402
  		assert
8403
  		file
8404
  		ident
8405
  		import
8406
  		include_next
8407
  		unassert
8408
  		warning
8409
  		weak
8410
 
8411
  	<I>directive-state</I> :
8412
  		allow
8413
  		warning
8414
  		disallow
8415
  		( ignore ) allow
8416
  		( ignore ) warning
8417
 
8418
  	<I>cast-operator</I> :
8419
  		static_cast
8420
  		const_cast
8421
  		reinterpret_cast
8422
 
8423
  	<I>cast-spec</I> :
8424
  		as <I>cast-operator</I>
8425
  		<I>cast-spec</I> | <I>cast-operator</I>
8426
 
8427
  	<I>preserve-list</I> :
8428
  		<I>identifier-list</I>
8429
  		*
8430
 
8431
  	<I>identifier-list</I> :
8432
  		<I>identifier identifier-list<SUB>opt</SUB></I>
8433
 
8434
 
8435
  	<A id="token"><I>token-directive</I></A> :
8436
  		<A HREF="token.html#spec">token <I>token-spec</I></A>
8437
  		<A HREF="token.html#tokdef">no_def <I>token-list</I></A>
8438
  		<A HREF="token.html#tokdef">define <I>token-list</I></A>
8439
  		<A HREF="token.html#tokdef">ignore <I>token-list</I></A>
8440
  		<A HREF="token.html#tokdef">interface <I>token-list</I></A>
8441
  		<A HREF="token.html#tokdef">undef token <I>token-list</I></A>
8442
  		<A HREF="token.html#tokdef">extend interface <I>header-name</I></A>
8443
  		<A HREF="token.html#tokdef">implement interface <I>header-name</I></A>
8444
 
8445
  	<A id="token1"><I>tendra-token-directive</I></A> :
8446
  		<A HREF="token.html#spec">token <I>token-spec</I></A>
8447
  		<A HREF="token.html#tokdef">no_def <I>token-list</I></A>
8448
  		<A HREF="token.html#tokdef">define <I>token-list</I></A>
8449
  		<A HREF="token.html#tokdef">reject <I>token-list</I></A>
8450
  		<A HREF="token.html#tokdef">interface <I>token-list</I></A>
8451
  		<A HREF="token.html#tokdef">undef token <I>token-list</I></A>
8452
  		<A HREF="token.html#tokdef">extend <I>header-name</I></A>
8453
  		<A HREF="token.html#tokdef">implement <I>header-name</I></A>
8454
  		<A HREF="token.html#tokdef">member definition <I>type-id</I> : <I>identifier member-offset</I></A>
8455
 
8456
  	<I>member-offset</I> :
8457
  		::<I><SUB>opt</SUB> id-expression</I>
8458
  		<I>member-offset</I> . ::<I><SUB>opt</SUB> id-expression</I>
8459
  		<I>member-offset</I> [ <I>constant-expression</I> ]
8460
 
8461
  	<I>token-list</I> :
8462
  		<I>token-id token-list<SUB>opt</SUB></I>
8463
  		# <I>preproc-token-list</I>
8464
 
8465
  	<I>token-id</I> :
8466
  		<I>token-namespace<SUB>opt</SUB> identifier</I>
8467
  		<I>type-id</I> . <I>identifier</I>
8468
 
8469
 
8470
  	<I>token-spec</I> :
8471
  		<I>token-introduction token-identification</I>
8472
 
8473
  	<I>token-introduction</I> :
8474
  		<I>exp-token</I>
8475
  		<I>statement-token</I>
8476
  		<I>type-token</I>
8477
  		<I>member-token</I>
8478
  		<I>procedure-token</I>
8479
 
8480
  	<I>token-identification</I> :
8481
  		<I>token-namespace<SUB>opt</SUB> identifier</I> # <I>external-identifier<SUB>opt</SUB></I>
8482
 
8483
  	<I>token-namespace</I> :
8484
  		TAG
8485
 
8486
  	<I>external-identifier</I> :
8487
  		-
8488
  		<I>preproc-token-list</I>
8489
 
8490
  	<I>exp-token</I> :
8491
  		EXP <I>exp-storage<SUB>opt</SUB></I> : <I>type-id</I> :
8492
  		NAT
8493
  		INTEGER
8494
 
8495
  	<I>exp-storage</I> :
8496
  		lvalue
8497
  		rvalue
8498
  		const
8499
 
8500
  	<I>statement-token</I> :
8501
  		STATEMENT
8502
 
8503
  	<I>type-token</I> :
8504
  		TYPE
8505
  		VARIETY
8506
  		VARIETY signed
8507
  		VARIETY unsigned
8508
  		FLOAT
8509
  		ARITHMETIC
8510
  		SCALAR
8511
  		CLASS
8512
  		STRUCT
8513
  		UNION
8514
 
8515
  	<I>member-token</I> :
8516
  		MEMBER <I>access-specifier<SUB>opt</SUB> member-type-id</I> : <I>type-id</I> :
8517
 
8518
  	<I>member-type-id</I> :
8519
  		<I>type-id</I>
8520
  		<I>type-id</I> % <I>constant-expression</I>
8521
 
8522
  	<I>access-specifier</I> :
8523
  		public
8524
  		protected
8525
  		private
8526
 
8527
  	<I>procedure-token</I> :
8528
  		<I>general-procedure</I>
8529
  		<I>simple-procedure</I>
8530
  		<I>function-procedure</I>
8531
 
8532
  	<I>general-procedure</I> :
8533
  		PROC { <I>bound-toks<SUB>opt</SUB></I> | <I>prog-pars<SUB>opt</SUB></I> } <I>token-introduction</I>
8534
 
8535
  	<I>bound-toks</I> :
8536
  		<I>bound-token</I>
8537
  		<I>bound-token</I> , <I>bound-toks</I>
8538
 
8539
  	<I>bound-token</I> :
8540
  		<I>token-introduction token-namespace<SUB>opt</SUB> identifier</I>
8541
 
8542
  	<I>prog-pars</I> :
8543
  		<I>program-parameter</I>
8544
  		<I>program-parameter</I> , <I>prog-pars</I>
8545
 
8546
  	<I>program-parameter</I> :
8547
  		EXP <I>identifier</I>
8548
  		STATEMENT <I>identifier</I>
8549
  		TYPE <I>type-id</I>
8550
  		MEMBER <I>type-id</I> : <I>identifier</I>
8551
  		PROC <I>identifier</I>
8552
 
8553
  	<I>simple-procedure</I> :
8554
  		PROC ( <I>simple-toks<SUB>opt</SUB></I> ) <I>token-introduction</I>
8555
 
8556
  	<I>simple-toks</I> :
8557
  		<I>simple-token</I>
8558
  		<I>simple-token</I> , <I>simple-toks</I>
8559
 
8560
  	<I>simple-token</I> :
8561
  		<I>token-introduction token-namespace<SUB>opt</SUB> identifier<SUB>opt</SUB></I>
8562
 
8563
  	<I>function-procedure</I> :
8564
  		FUNC <I>type-id</I> :
8565
  </programlisting>
8566
  </para>
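  <para>
  To show how the productions above combine, the fragment below spells out a
  few directives derived mechanically from the grammar: a
  <emphasis>scope-directive</emphasis> pair, an
  <emphasis>analysis-directive</emphasis> followed by <emphasis>on</emphasis>,
  a <emphasis>check-directive</emphasis> followed by
  <emphasis>allow</emphasis>, and a <emphasis>token-directive</emphasis>.
  The token name <code>zero</code> is hypothetical, and the comments describe
  the intended effect only as we understand it from the linked sections.
  </para>
  <programlisting><![CDATA[
```c
/* scope-directive: open a checking scope. */
#pragma TenDRA begin

/* analysis-directive on: "variable analysis" with the setting "warning". */
#pragma TenDRA variable analysis warning

/* check-directive allow: "unreachable code" with the setting "disallow". */
#pragma TenDRA unreachable code disallow

/* token-directive: exp-token "EXP rvalue : int :" introducing the
   hypothetical token zero, with an empty external identifier after '#'. */
#pragma token EXP rvalue : int : zero #

/* scope-directive: close the scope opened above. */
#pragma TenDRA end
```
]]></programlisting>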
8567
  </sect1>
8568
 
8569
  <sect1>
8570
    <title>Annex B. Symbol table dump syntax</title>
8571
  <para>
8572
  The following gives a summary of the syntax for the 
8573
  <A HREF="dump.html">symbol table dump file</A> (version 1.1): 
8574
  <programlisting>
8575
 
8576
  	<I>dump-file</I> :
8577
  		<I>command-list<SUB>opt</SUB></I>
8578
 
8579
  	<I>command-list</I> :
8580
  		<I>command command-list<SUB>opt</SUB></I>
8581
 
8582
  	<I>command</I> :
8583
  		<I>version-command</I>
8584
  		<I>identifier-command</I>
8585
  		<I>scope-command</I>
8586
  		<I>override-command</I>
8587
  		<I>base-command</I>
8588
  		<I>api-command</I>
8589
  		<I>template-command</I>
8590
  		<I>promotion-command</I>
8591
  		<I>error-command</I>
8592
  		<I>path-command</I>
8593
  		<I>file-command</I>
8594
  		<I>include-command</I>
8595
  		<I>string-command</I>
8596
 
8597
  	<I>version-command</I> :
8598
  		V <I>number number string</I>
8599
 
8600
 
8601
  	<I>location</I> :
8602
  		<I>number number number string string</I>
8603
  		<I>number number number string</I> *
8604
  		<I>number number number</I> *
8605
  		<I>number number</I> *
8606
  		<I>number</I> *
8607
  		*
8608
 
8609
 
8610
  	<I>identifier</I> :
8611
  		<I>number</I> = <I>identifier-name access<SUB>opt</SUB> scope-identifier</I>
8612
  		<I>number</I>
8613
 
8614
  	<I>identifier-name</I> :
8615
  		<I>string</I>
8616
  		C <I>type</I>
8617
  		D <I>type</I>
8618
  		O <I>string</I>
8619
  		T <I>type</I>
8620
 
8621
  	<I>access</I> :
8622
  		N
8623
  		B
8624
  		P
8625
 
8626
  	<I>scope-identifier</I> :
8627
  		<I>identifier</I>
8628
  		*
8629
 
8630
  	<I>identifier-command</I> :
8631
  		D <I>identifier-info type-info</I>
8632
  		M <I>identifier-info type-info</I>
8633
  		T <I>identifier-info type-info</I>
8634
  		Q <I>identifier-info</I>
8635
  		U <I>identifier-info</I>
8636
  		L <I>identifier-info</I>
8637
  		C <I>identifier-info</I>
8638
  		W <I>identifier-info type-info</I>
8639
  		I <I>identifier-command</I>
8640
 
8641
  	<I>identifier-info</I> :
8642
  		<I>identifier-key location identifier</I>
8643
 
8644
  	<I>identifier-key</I> :
8645
  		K
8646
  		MO
8647
  		MF
8648
  		MB
8649
  		TC
8650
  		TS
8651
  		TU
8652
  		TE
8653
  		TA
8654
  		NN
8655
  		NA
8656
  		VA
8657
  		VP
8658
  		VE
8659
  		VS
8660
  		FE <I>function-key<SUB>opt</SUB></I>
8661
  		FS <I>function-key<SUB>opt</SUB></I>
8662
  		FB <I>function-key<SUB>opt</SUB></I>
8663
  		CF <I>function-key<SUB>opt</SUB></I>
8664
  		CS <I>function-key<SUB>opt</SUB></I>
8665
  		CV <I>function-key<SUB>opt</SUB></I>
8666
  		CM
8667
  		CD
8668
  		E
8669
  		L
8670
  		XO
8671
  		XF
8672
  		XP
8673
  		XT
8674
 
8675
  	<I>function-key</I> :
8676
  		C <I>function-key<SUB>opt</SUB></I>
8677
  		I <I>function-key<SUB>opt</SUB></I>
8678
 
8679
  	<I>type-info</I> :
8680
  		<I>type identifier<SUB>opt</SUB></I>
8681
  		<I>sort</I>
8682
  		<I>scope-identifier</I>
8683
  		*
8684
 
8685
 
8686
  	<I>scope-command</I> :
8687
  		SS <I>scope-key location identifier</I>
8688
  		SE <I>scope-key location identifier</I>
8689
 
8690
  	<I>scope-key</I> :
8691
  		N
8692
  		S
8693
  		B
8694
  		D
8695
  		H
8696
  		CT
8697
  		CF
8698
  		CC
8699
 
8700
 
8701
  	<I>override-command</I> :
8702
  		O <I>identifier identifier</I>
8703
 
8704
 
8705
  	<I>base-command</I> :
8706
  		B <I>identifier-key identifier base-graph</I>
8707
 
8708
  	<I>base-graph</I> :
8709
  		<I>base-class</I>
8710
  		<I>base-class</I> ( <I>base-list</I> )
8711
 
8712
  	<I>base-class</I> :
8713
  		<I>number</I> = V<I><SUB>opt</SUB> access<SUB>opt</SUB> type-name</I>
8714
  		<I>number</I> :
8715
 
8716
  	<I>base-list</I> :
8717
  		<I>base-graph base-list<SUB>opt</SUB></I>
8718
 
8719
  	<I>base-number</I> :
8720
  		<I>number</I> : <I>type-name</I>
8721
 
8722
 
8723
  	<I>api-command</I> :
8724
  		X <I>identifier-key identifier string</I>
8725
 
8726
 
8727
  	<I>template-command</I> :
8728
  		Z <I>identifier-key identifier token-application specialise-info</I>
8729
 
8730
  	<I>specialise-info</I> :
8731
  		<I>identifier</I>
8732
  		<I>token-application</I>
8733
  		*
8734
 
8735
 
8736
  	<I>type</I> :
8737
  		<I>type-name</I>
8738
  		c
8739
  		s
8740
  		i
8741
  		l
8742
  		x
8743
  		b
8744
  		w
8745
  		y
8746
  		z
8747
  		f
8748
  		d
8749
  		r
8750
  		v
8751
  		u
8752
  		Sc
8753
  		Uc
8754
  		Us
8755
  		Ui
8756
  		Ul
8757
  		Ux
8758
  		C <I>type</I>
8759
  		V <I>type</I>
8760
  		P <I>type</I>
8761
  		R <I>type</I>
8762
  		M <I>type-name</I> : <I>type</I>
8763
  		F <I>type parameter-types</I>
8764
  		A <I>nat<SUB>opt</SUB></I> : <I>type</I>
8765
  		B <I>nat</I> : <I>type</I>
8766
  		t <I>parameter-list<SUB>opt</SUB></I> : <I>type</I>
8767
  		p <I>type</I>
8768
  		a <I>type</I> : <I>type</I>
8769
  		n <I>lit-base<SUB>opt</SUB> lit-suffix<SUB>opt</SUB></I>
8770
  		W <I>type parameter-types</I>
8771
  		q <I>type</I>
8772
  		Q <I>string</I>
8773
  		*
8774
 
8775
  	<I>type-name</I> :
8776
  		<I>identifier</I>
8777
  		<I>token-application</I>
8778
 
8779
  	<I>parameter-types</I> :
8780
  		: <I>exception-spec<SUB>opt</SUB> func-qualifier<SUB>opt</SUB></I> :
8781
  		. <I>exception-spec<SUB>opt</SUB> func-qualifier<SUB>opt</SUB></I> :
8782
  		. <I>exception-spec<SUB>opt</SUB> func-qualifier<SUB>opt</SUB></I> .
8783
  		, <I>type parameter-types</I>
8784
 
8785
  	<I>func-qualifier</I> :
8786
  		C <I>func-qualifier<SUB>opt</SUB></I>
8787
  		V <I>func-qualifier<SUB>opt</SUB></I>
8788
 
8789
  	<I>exception-spec</I> :
8790
  		( <I>exception-list<SUB>opt</SUB></I> )
8791
 
8792
  	<I>exception-list</I> :
8793
  		<I>type</I>
8794
  		<I>type</I> , <I>exception-list</I>
8795
 
8796
  	<I>nat</I> :
8797
  		+ <I>number</I>
8798
  		- <I>number</I>
8799
  		<I>identifier</I>
8800
  		<I>token-application</I>
8801
  		<I>string</I>
8802
 
8803
  	<I>parameter-list</I> :
8804
  		<I>identifier</I>
8805
  		<I>identifier</I> , <I>parameter-list</I>
8806
 
8807
  	<I>lit-base</I> :
8808
  		O
8809
  		X
8810
 
8811
  	<I>lit-suffix</I> :
8812
  		U
8813
  		l
8814
  		Ul
8815
  		x
8816
  		Ux
8817
 
8818
 
8819
  	<I>promotion-command</I> :
8820
  		P <I>type</I> : <I>type</I>
8821
 
8822
 
8823
  	<I>sort</I> :
8824
  		<I>expression-sort</I>
8825
  		<I>statement-sort</I>
8826
  		<I>type-sort</I>
8827
  		<I>tag-type-sort</I>
8828
  		<I>member-sort</I>
8829
  		<I>proc-sort</I>
8830
  		<I>func-sort</I>
8831
  		<I>template-sort</I>
8832
  		<I>macro-sort</I>
8833
 
8834
  	<I>expression-sort</I> :
8835
  		ZEL <I>type</I>
8836
  		ZER <I>type</I>
8837
  		ZEC <I>type</I>
8838
  		ZN
8839
 
8840
  	<I>statement-sort</I> :
8841
  		ZS
8842
 
8843
  	<I>type-sort</I> :
8844
  		ZTO
8845
  		ZTI
8846
  		ZTF
8847
  		ZTA
8848
  		ZTP
8849
  		ZTS
8850
  		ZTU
8851
 
8852
  	<I>tag-type-sort</I> :
8853
  		ZTTS
8854
  		ZTTU
8855
 
8856
  	<I>member-sort</I> :
8857
  		ZM <I>type</I> : <I>type-name</I>
8858
 
8859
  	<I>proc-sort</I> :
8860
  		ZPG <I>parameter-list<SUB>opt</SUB></I> ; <I>parameter-list<SUB>opt</SUB></I> : <I>sort</I>
8861
  		ZPS <I>parameter-list<SUB>opt</SUB></I> : <I>sort</I>
8862
 
8863
  	<I>func-sort</I> :
8864
  		ZF <I>type</I>
8865
 
8866
  	<I>template-sort</I> :
8867
  		ZTt <I>parameter-list<SUB>opt</SUB></I> :
8868
 
8869
  	<I>macro-sort</I> :
8870
  		ZUO
8871
  		ZUF <I>number</I>
8872
 
8873
  	<I>token-application</I> :
8874
  		T <I>identifier</I> , <I>token-argument-list</I> :
8875
 
8876
  	<I>token-argument-list</I> :
8877
  		<I>token-argument</I>
8878
  		<I>token-argument</I> , <I>token-argument-list</I>
8879
 
8880
  	<I>token-argument</I> :
8881
  		E <I>expression</I>
8882
  		N <I>nat</I>
8883
  		S <I>statement</I>
8884
  		T <I>type</I>
8885
  		M <I>member</I>
8886
  		F <I>identifier</I>
8887
  		C <I>identifier</I>
8888
 
8889
  	<I>expression</I> :
8890
  		<I>nat</I>
8891
 
8892
  	<I>statement</I> :
8893
  		<I>expression</I>
8894
 
8895
  	<I>member</I> :
8896
  		<I>identifier</I>
8897
  		<I>string</I>
8898
 
8899
 
  	<I>error-name</I> :
  		<I>number</I> = <I>string</I>
  		<I>number</I>

  	<I>error-command</I> :
  		ES <I>location error-info</I>
  		EW <I>location error-info</I>
  		EI <I>location error-info</I>
  		EF <I>location error-info</I>
  		EC <I>error-info</I>
  		EA <I>error-argument</I>

  	<I>error-info</I> :
  		<I>error-name number number</I>

  	<I>error-argument</I> :
  		B <I>base-number</I>
  		C <I>scope-identifier</I>
  		E <I>expression</I>
  		H <I>identifier-name</I>
  		I <I>identifier</I>
  		L <I>location</I>
  		N <I>nat</I>
  		S <I>string</I>
  		T <I>type</I>
  		V <I>number</I>
  		V - <I>number</I>


  	<I>path-command</I> :
  		FD <I>number</I> = <I>string string<SUB>opt</SUB></I>

  	<I>directory</I> :
  		<I>number</I>
  		*

  	<I>file-command</I> :
  		FS <I>location directory</I>
  		FE <I>location</I>

  	<I>include-command</I> :
  		FIA <I>location string</I>
  		FIQ <I>location string</I>
  		FIN <I>location string</I>
  		FIS <I>location string</I>
  		FIE <I>location string</I>
  		FIR <I>location</I>


  	<I>string-command</I> :
  		A <I>location string</I>
  		AC <I>location string</I>
  		AL <I>location string</I>
  		ACL <I>location string</I>
  </programlisting>
  </para>
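  <para>The sort keywords listed above can be told apart by their leading
  characters alone.  The following is a minimal sketch, not part of the
  producer, of how a reader of the dump format might map a keyword to the
  nonterminal that introduces it.  The helper name
  <code>classify_sort</code> is hypothetical, and the sketch assumes that
  the keyword appears at the very start of the text it is given; whatever
  follows the keyword is returned unparsed.</para>

```python
# Sort keywords copied from the grammar above, grouped by the
# nonterminal that introduces them.
SORT_KEYWORDS = {
    "expression-sort": ["ZEL", "ZER", "ZEC", "ZN"],
    "statement-sort": ["ZS"],
    "type-sort": ["ZTO", "ZTI", "ZTF", "ZTA", "ZTP", "ZTS", "ZTU"],
    "tag-type-sort": ["ZTTS", "ZTTU"],
    "member-sort": ["ZM"],
    "proc-sort": ["ZPG", "ZPS"],
    "func-sort": ["ZF"],
    "template-sort": ["ZTt"],
    "macro-sort": ["ZUO", "ZUF"],
}

# Flatten to (keyword, nonterminal) pairs, longest keyword first.  No
# keyword in the table is a prefix of another, so the ordering is purely
# defensive, but it keeps the lookup correct if the table is extended.
_BY_LENGTH = sorted(
    ((kw, name) for name, kws in SORT_KEYWORDS.items() for kw in kws),
    key=lambda pair: len(pair[0]),
    reverse=True,
)


def classify_sort(text):
    """Return (nonterminal, keyword, rest) for the sort starting `text`.

    Hypothetical helper: it only recognises the leading keyword and
    returns the remainder unparsed.  Raises ValueError if none matches.
    """
    for keyword, name in _BY_LENGTH:
        if text.startswith(keyword):
            return name, keyword, text[len(keyword):]
    raise ValueError("not a sort: %r" % (text,))
```

  <para>The comparison is case sensitive, which matters for
  <code>ZTt</code> (a template sort) as opposed to <code>ZTTS</code> and
  <code>ZTTU</code> (tag type sorts).</para>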
  </sect1>

  <sect1>
    <title>Annex C. Error catalogue syntax</title>
  <para>
  The following summarises the syntax of the
  <A HREF="error.html">error catalogue</A> accepted by the
  <code>make_err</code> tool.  Identifiers are normal C-style identifiers;
  strings consist of any sequence of characters enclosed in
  <code>&quot;....&quot;</code>.  Within a string, the escape sequences
  <code>\&quot;</code> and <code>\\</code> are recognised; all other
  characters (including newline characters) map to themselves.  C-style
  comments are allowed.
  <programlisting>

  	<I>error-database</I> :
  		<I>header types<SUB>opt</SUB> properties<SUB>opt</SUB> keys<SUB>opt</SUB> usages<SUB>opt</SUB> entries<SUB>opt</SUB></I>

  	<I>header</I> :
  		<I>database-name<SUB>opt</SUB> rig-name<SUB>opt</SUB> prefixes<SUB>opt</SUB></I>


  	<I>database-name</I> :
  		DATABASE_NAME : <I>identifier</I>

  	<I>rig-name</I> :
  		RIG : <I>identifier</I>


  	<I>prefixes</I> :
  		PREFIX : <I>output-prefix<SUB>opt</SUB> compiler-prefix<SUB>opt</SUB> error-prefix<SUB>opt</SUB></I>

  	<I>output-prefix</I> :
  		compiler_output -&gt; <I>identifier</I>

  	<I>compiler-prefix</I> :
  		from_compiler -&gt; <I>identifier</I>

  	<I>error-prefix</I> :
  		from_database -&gt; <I>identifier</I>


  	<I>types</I> :
  		TYPES : <I>name-list<SUB>opt</SUB></I>

  	<I>properties</I> :
  		PROPERTIES : <I>name-list<SUB>opt</SUB></I>

  	<I>keys</I> :
  		KEYS : <I>name-list<SUB>opt</SUB></I>

  	<I>usages</I> :
  		USAGE : <I>name-list<SUB>opt</SUB></I>

  	<I>name</I> :
  		<I>identifier</I>
  		<I>identifier</I> = <I>identifier</I>
  		<I>identifier</I> = <I>identifier</I> | <I>identifier</I>

  	<I>name-list</I> :
  		<I>name</I>
  		<I>name</I> , <I>name-list</I>


  	<I>type-name</I> :
  		<I>identifier</I>

  	<I>property-name</I> :
  		<I>identifier</I>

  	<I>key-name</I> :
  		<I>identifier</I>

  	<I>usage-name</I> :
  		<I>identifier</I>


  	<I>entries</I> :
  		ENTRIES : <I>entry-list<SUB>opt</SUB></I>

  	<I>entry-list</I> :
  		<I>entry entry-list<SUB>opt</SUB></I>

  	<I>entry</I> :
  		<I>identifier</I> ( <I>param-list<SUB>opt</SUB></I> ) { <I>entry-body</I> }

  	<I>entry-body</I> :
  		<I>alt-name<SUB>opt</SUB> entry-usage<SUB>opt</SUB> entry-properties<SUB>opt</SUB> map-list<SUB>opt</SUB></I>


  	<I>parameter</I> :
  		<I>type-name</I> : <I>identifier</I>

  	<I>param-list</I> :
  		<I>parameter</I>
  		<I>parameter</I> , <I>param-list</I>

  	<I>param-name</I> :
  		<I>identifier</I>


  	<I>alt-name</I> :
  		ALT_NAME : <I>identifier</I>

  	<I>entry-usage</I> :
  		USAGE : <I>usage-name</I>
  		USAGE : <I>usage-name</I> | <I>usage-name</I>

  	<I>entry-properties</I> :
  		PROPERTIES : <I>property-list<SUB>opt</SUB></I>

  	<I>property-list</I> :
  		<I>property-name</I>
  		<I>property-name</I> , <I>property-list</I>


  	<I>map</I> :
  		KEY ( <I>key-name</I> ) <I>message-list<SUB>opt</SUB></I>
  		KEY ( <I>key-name</I> ) <I>message-list<SUB>opt</SUB></I> | <I>message-list<SUB>opt</SUB></I>

  	<I>map-list</I> :
  		<I>map map-list<SUB>opt</SUB></I>

  	<I>message-list</I> :
  		<I>string message-list<SUB>opt</SUB></I>
  		<I>param-name message-list<SUB>opt</SUB></I>
        </programlisting>
      </para>
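      <para>The lexical rules given at the start of this annex can be
        illustrated by a small tokenizer.  This is a sketch only and is not
        taken from the <code>make_err</code> sources: the function name
        <code>tokenize</code>, the token kinds and the sample text are all
        hypothetical.  It assumes that a backslash followed by any character
        other than <code>&quot;</code> or <code>\</code> stands for itself,
        which is one reading of the rule that other characters map to
        themselves, and it takes the punctuation set from the grammar
        above.</para>

```python
import string

# Punctuation appearing in the grammar above; "->" is listed first so
# that it is matched as a single token.
PUNCTUATION = ("->", ":", ",", "=", "|", "(", ")", "{", "}")
IDENT_START = set(string.ascii_letters + "_")
IDENT_CHARS = IDENT_START | set(string.digits)

# Hypothetical catalogue fragment used only to exercise the tokenizer.
SAMPLE = '''
/* hypothetical catalogue fragment */
DATABASE_NAME : ERRORS
ENTRIES :
example_error ( ) {
    KEY ( STD ) "say \\"hi\\""
}
'''


def tokenize(text):
    """Split error catalogue text into (kind, value) pairs."""
    tokens = []
    i, n = 0, len(text)
    while i != n:
        ch = text[i]
        if ch.isspace():
            i += 1
        elif text.startswith("/*", i):
            # C-style comment: skip to the closing delimiter.
            end = text.find("*/", i + 2)
            if end == -1:
                raise ValueError("unterminated comment")
            i = end + 2
        elif ch == '"':
            chars = []
            i += 1
            while True:
                if i == n:
                    raise ValueError("unterminated string")
                ch = text[i]
                if ch == '"':
                    i += 1
                    break
                if ch == "\\" and text[i + 1:i + 2] in ('"', "\\"):
                    # Recognised escape: keep only the escaped character.
                    ch = text[i + 1]
                    i += 1
                # Anything else, including a newline, maps to itself.
                chars.append(ch)
                i += 1
            tokens.append(("string", "".join(chars)))
        elif ch in IDENT_START:
            start = i
            while i != n and text[i] in IDENT_CHARS:
                i += 1
            tokens.append(("identifier", text[start:i]))
        else:
            for punct in PUNCTUATION:
                if text.startswith(punct, i):
                    tokens.append(("punct", punct))
                    i += len(punct)
                    break
            else:
                raise ValueError("unexpected character %r" % ch)
    return tokens
```

      <para>Calling <code>tokenize(SAMPLE)</code> yields the identifier
        <code>DATABASE_NAME</code> first and skips the comment entirely; the
        quoted message is returned as a single string token with its two
        escaped quotes reduced to plain quote characters.</para>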
    </sect1>
  </chapter>
</book>