<?xml version="1.0" standalone="no"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
  "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">

<!--
  $Id$
-->

<book>
  <bookinfo>
    <title>tspec - An API Specification Tool</title>

    <corpauthor>The TenDRA Project</corpauthor>

    <author>
      <firstname>Jeroen</firstname>
      <surname>Ruigrok van der Werven</surname>
    </author>
    <authorinitials>JRvdW</authorinitials>
    <pubdate>2004</pubdate>

    <copyright>
      <year>2004</year>
      <year>2005</year>

      <holder>The TenDRA Project</holder>
    </copyright>

    <copyright>
      <year>1998</year>

      <holder>DERA</holder>
    </copyright>
  </bookinfo>
 
  <chapter id="Intro">
    <title>Introduction</title>

      <para>As explained in reference 1, TDF may be regarded as an abstract
        target machine which can be used to facilitate the separation of
        target-independent and target-dependent code which characterises
        portable programs. An important aspect of this separation is the
        Application Programming Interface, or API, of the program. Just as,
        for a conventional machine, the API needs to be implemented on that
        machine before the program can be ported to it, so for that program to
        be ported to the abstract TDF machine, an "abstract implementation" of
        the API needs to be provided.</para>

      <para>But of course, an "abstract implementation" is precisely what is
        provided by the API specification - it is an abstraction of all the
        possible API implementations. Therefore the TDF representation of an
        API must reflect the API specification. As a consequence, compiling a
        program to the abstract TDF machine checks it against the API
        specification rather than, as when compiling to a conventional
        machine, against at best a particular implementation of that
        API.</para>

      <para>In this document we address the problem of how to translate a
        standard API specification into its TDF representation, by describing
        a tool, <command>tspec</command>, which has been developed for
        this purpose.</para>

      <para>The low level form which is used to represent APIs to the C to TDF
        producer is the <code>#pragma token</code> syntax described in
        reference 3. However, this is not a suitable form in which to describe
        API specifications. The <code>#pragma token</code> syntax is
        necessarily complex, and can only be checked through extensive testing
        using the producer. Instead, an alternative form, close to C, has been
        developed for this purpose. API specifications in this form are
        transformed by <command>tspec</command> into the corresponding
        <code>#pragma token</code> statements, while it applies various
        internal checks to the API description.</para>

      <para>Another reason for introducing <code>tspec</code> is that the
        <code>#pragma token</code> syntax is currently limited in some areas.
        For example, at present it has very limited support for expressing
        constancy of expressions. By allowing the <code>tspec</code> syntax to
        express this information, the API description will contain all the
        information which may be needed in future upgrades to the
        <code>#pragma token</code> syntax. Thus describing an API using
        <code>tspec</code> is hopefully a one-off process, whereas describing
        it directly in the <code>#pragma token</code> syntax could require
        periodic reworkings. Improvements in the <code>#pragma token</code>
        syntax will be reflected in the translations produced by future
        versions of <command>tspec</command>.</para>

      <para>The <code>tspec</code> syntax is not designed to be a formal
        specification language. Instead it is a pragmatic attempt to capture
        the common specification idioms of standard API specifications. A
        glance at these specifications shows that they are predominantly C
        based, but with an added layer of abstraction - instead of saying that
        <code>t</code> is a specific C type, they say that there exists a type
        <code>t</code>, and so on. The <code>tspec</code> syntax is designed
        to reflect this.</para>
  </chapter>
 
  <chapter id="Overview">
    <title>Overview of tspec</title>

      <sect1 id="Levels">
        <title>Specification Levels</title>

        <para>Let us begin by examining the various levels of specification
          with which <command>tspec</command> is concerned. At the
          lowest level it is concerned with objects - the types, expressions,
          constants, etc., which comprise the API - and indeed most of this
          document is concerned with how <command>tspec</command>
          describes these objects. At the highest level,
          <command>tspec</command> is concerned with APIs. We could
          just describe an API as a set of objects, but this would
          ignore the internal structure of APIs.</para>

        <para>At the most obvious level the objects in an API are spread over
          a number of different system headers. For example, in ANSI, the
          objects concerned with file input and output are grouped in
          <code>stdio.h</code>, whereas those concerned with string
          manipulation are in <code>string.h</code>. But a further level of
          refinement is also required. For example, ANSI specifies that the
          type <code>size_t</code> is defined in both <code>stdio.h</code> and
          <code>string.h</code>. Therefore <code>tspec</code> needs to be able
          to represent subsets of headers in order to express this
          intersection relation.</para>

        <para>To conclude, <code>tspec</code> distinguishes four levels of
          specification - APIs (which are sets of headers), headers (which are
          sets of objects), subsets of headers, and objects. It identifies
          APIs by an identifying name chosen by the person performing the API
          description. The (purely arbitrary) convention is for short, lower
          case names, for example:

          <itemizedlist>
            <listitem>
              <para><code>ansi</code> refers to ANSI C (X3.159),</para>
            </listitem>

            <listitem>
              <para><code>posix</code> refers to POSIX 1003.1,</para>
            </listitem>

            <listitem>
              <para><code>xpg3</code> refers to X/Open Portability Guide
                3.</para>
            </listitem>
          </itemizedlist></para>

        <para>In this document, headers are identified by the API they belong
          to and the header name. Thus <code>ansi:stdio.h</code> refers to the
          <code>stdio.h</code> header of the ANSI API. Finally, subsets of
          headers are identified by the header and the subset name. If, for
          example, the <code>stdio.h</code> header of ANSI has a subset named
          <code>file</code>, then this is referred to as
          <code>ansi:stdio.h:file</code>.</para>
      </sect1>
 
      <sect1 id="Input">
        <title>Input Layout</title>

        <para>The <code>tspec</code> representation of an API is arranged as a
          directory with the same name as the API, containing a number of
          files, one for each API header. For example, the ANSI API is
          represented by a directory <code>ansi</code> containing files
          <code>ansi/stdio.h</code>, <code>ansi/string.h</code> etc. In
          addition each API directory contains a master file (for ANSI it
          would be called <code>ansi/MASTER</code>) which lists all the
          headers comprising that API.</para>

        <para>When <code>tspec</code> needs to find an API directory it does
          so by searching along its input directory path. This is a
          colon-separated list of directories to be searched. The path may be
          specified in a number of ways. A default search list is built into
          <code>tspec</code>; however, this may be overridden by the system
          variable <code>TSPEC_INPUT</code>. Directories may be added to the
          start of the path using the
          <option>-I</option><filename>dir</filename> command-line option (see
          <link linkend="Options">section 2.5</link> for a complete list of
          options). The current working directory is always added to the start
          of the path.</para>
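
        <para>As a minimal sketch of how these settings combine, assume a
          POSIX shell and two purely hypothetical directories,
          <filename>/opt/apis</filename> and
          <filename>/home/me/apis</filename>. Going by the description above,
          the invocation below would replace the built-in default list with
          <filename>/opt/apis</filename>, and search both the current working
          directory and <filename>/home/me/apis</filename> ahead of it:</para>

        <programlisting>
# /opt/apis and /home/me/apis are hypothetical; substitute real directories.
# TSPEC_INPUT overrides the default search list, -I adds to the start of it.
TSPEC_INPUT=/opt/apis tspec -I/home/me/apis ansi
        </programlisting>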
      </sect1>

      <sect1 id="Output">
        <title>Output Layout</title>

        <para><code>tspec</code> produces two sets of output files,
          the include output files, containing the <code>#pragma token</code>
          directives corresponding to the input API, and the source output
          files, which provide a rig for TDF library building (see
          <link linkend="Libraries">section 6.4</link>). These output files and
          directories are built up under two standard output directories - the
          include output directory, <filename>incl_dir</filename> say, and the
          source output directory, <filename>src_dir</filename> say.
          <code>tspec</code> has default values for these directories built
          in, but these may be overridden in a number of ways. Firstly, if the
          system variable <code>TSPEC_OUTPUT</code> is defined to be
          <filename>dir</filename>, say, then <filename>incl_dir</filename> is
          <filename>dir/include</filename> and <filename>src_dir</filename> is
          <filename>dir/src</filename>. Secondly,
          <filename>incl_dir</filename> and <filename>src_dir</filename> can
          be set independently using the system variables
          <code>TSPEC_INCL_OUTPUT</code> and <code>TSPEC_SRC_OUTPUT</code>
          respectively. Finally, they may also be set using the
          <option>-O</option><filename>dir</filename> and
          <option>-S</option><filename>dir</filename> command-line options
          respectively.</para>

        <para>As an example of the mapping from input files to output files,
          the header <code>ansi:stdio.h</code> is mapped to the include output
          file <filename>incl_dir/ansi.api/stdio.h</filename> and the source
          output file <filename>src_dir/ansi.api/stdio.c</filename>. The
          header subset <code>ansi:stdio.h:file</code> is mapped to its own
          pair of output files,
          <filename>incl_dir/shared/ansi.api/file.h</filename> and
          <filename>src_dir/ansi.api/file.c</filename>.</para>
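
        <para>To make the mapping concrete, the following sketch assumes a
          POSIX shell and a hypothetical output root
          <filename>/tmp/out</filename>. The file names in the comments are
          simply the rules above applied to that root; they are not captured
          <code>tspec</code> output:</para>

        <programlisting>
# /tmp/out is a hypothetical directory. With these settings, ansi:stdio.h
# would be expected to map to:
#   /tmp/out/include/ansi.api/stdio.h   (include output file)
#   /tmp/out/src/ansi.api/stdio.c       (source output file)
tspec -O/tmp/out/include -S/tmp/out/src ansi stdio.h

# The same two directories, selected through the TSPEC_OUTPUT variable:
TSPEC_OUTPUT=/tmp/out tspec ansi stdio.h
        </programlisting>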
 
        <para>The default output file names can be overridden by means of the
          <code>INCLNAME</code> and <code>SOURCENAME</code> file properties
          described in <link linkend="Properties">section 5.4</link>.</para>

        <para>By default, <code>tspec</code> only creates an output file if
          the date stamps on all the input files it depends on indicate that
          it needs updating. In effect, <code>tspec</code> creates an internal
          makefile from the dependencies it deduces. This behaviour can be
          overridden by means of the <option>-f</option> command-line option,
          which forces all output files to be created.</para>

        <para>In addition, <code>tspec</code> only creates the source output
          file if it is needed for TDF library building. If the corresponding
          include output file does not contain any token specifications then
          the source output file is suppressed (see
          <link linkend="Libraries">section 6.4</link>).</para>
      </sect1>

      <sect1 id="Copyright">
        <title>Copyright Messages</title>

        <para><code>tspec</code> will optionally add a copyright message to
          the start of each include output file. This message is copied from a
          file which may be specified either using the
          <code>TSPEC_COPYRIGHT</code> system variable, or by the
          <option>-C</option><filename>file</filename> command-line
          option.</para>
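
        <para>For illustration only, and assuming a POSIX shell and a
          hypothetical message file
          <filename>/home/me/copyright.txt</filename>, either of the
          following forms would name that file as the source of the
          message:</para>

        <programlisting>
# /home/me/copyright.txt is a hypothetical file holding the message text.
tspec -C/home/me/copyright.txt ansi
TSPEC_COPYRIGHT=/home/me/copyright.txt tspec ansi
        </programlisting>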
      </sect1>

      <sect1 id="Options">
        <title>Command-line Options</title>

        <para>There are three main forms for invoking <code>tspec</code> on
          the command-line, depending on whether it is desired to process an
          entire API, a single header from that API, or only a subset of that
          header. These are given respectively as:

          <programlisting>
tspec [options] api
tspec [options] api header
tspec [options] api header subset
          </programlisting></para>

        <para>The valid options include:
          <itemizedlist>
            <listitem>
              <para>The option <option>-C</option><filename>file</filename>
                specifies the copyright message file (see
                <link linkend="Copyright">section 2.4</link>).</para>
            </listitem>

            <listitem>
              <para>The option <option>-I</option><filename>dir</filename>
                adds a directory to the input directory search path (see
                <link linkend="Input">section 2.2</link>).</para>
            </listitem>

            <listitem>
              <para>The option <option>-O</option><filename>dir</filename>
                specifies the include output directory (see
                <link linkend="Output">section 2.3</link>).</para>
            </listitem>

            <listitem>
              <para>The option <option>-S</option><filename>dir</filename>
                specifies the source output directory (see
                <link linkend="Output">section 2.3</link>).</para>
            </listitem>

            <listitem>
              <para>The <option>-c</option> option causes <code>tspec</code>
                to only check the input files and not to generate any output
                files.</para>
            </listitem>

            <listitem>
              <para>The <option>-e</option> option causes <code>tspec</code>
                only to run its preprocessor phase, writing the result to the
                standard output.</para>
            </listitem>

            <listitem>
              <para>The <option>-f</option> option forces <code>tspec</code>
                to create all output files regardless of date
                stamps.</para>
            </listitem>

            <listitem>
              <para>The <option>-i</option> option causes <code>tspec</code>
                to print an index of all the objects in the input files (see
                <link linkend="Index">section 6.3</link>).</para>
            </listitem>

            <listitem>
              <para>The <option>-p</option> option indicates to
                <code>tspec</code> that its input has already been
                preprocessed (i.e. it is the output of a previous
                <option>-e</option> option).</para>
            </listitem>

            <listitem>
              <para>The <option>-r</option> option causes <code>tspec</code>
                to only produce output for implemented objects, and not used
                objects (see <link linkend="Impl">section 3.2</link>).</para>
            </listitem>

            <listitem>
              <para>The <option>-s</option> option causes <code>tspec</code>
                to check all the headers in an API separately rather than, as
                with the <option>-c</option> option, all at once.</para>
            </listitem>

            <listitem>
              <para>The <option>-u</option> option causes <code>tspec</code>
                to generate unique token names for the specified objects (see
                <link linkend="Names">section 4.1.1</link>).</para>
            </listitem>

            <listitem>
              <para>The <option>-v</option> option causes <code>tspec</code>
                to enter verbose mode, in which it reports on the output files
                it creates. If two <option>-v</option> options are given then
                <code>tspec</code> enters very verbose mode, in which it gives
                more information on its activities.</para>
            </listitem>

            <listitem>
              <para>The <option>-V</option> option causes <code>tspec</code>
                to print its current version number (this document refers to
                version 2.0).</para>
            </listitem>
          </itemizedlist></para>
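
        <para>The options above combine with the three invocation forms in
          the obvious way. The following lines are a sketch of some typical
          combinations, using only the options and API names already
          mentioned in this document; the comments restate the documented
          effect of each option and are not captured output:</para>

        <programlisting>
tspec -c ansi              # check the whole ansi API without generating output
tspec -s ansi              # check each header of the ansi API separately
tspec -f -v ansi stdio.h   # recreate the output for one header, reporting the files created
tspec -v -v posix          # very verbose mode: two -v options
        </programlisting>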
 
        <para>In addition, <code>tspec</code> has a local input mode for
          translating single headers which are not part of an API into the
          corresponding <code>#pragma token</code> statements. The form:

          <programlisting>
tspec [options] -l file
          </programlisting>

          processes the input file <code>file</code>, writing the include
          output file to the standard output.
        </para>
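
        <para>Since the result goes to the standard output, it would normally
          be redirected to a file. In the sketch below, a POSIX shell is
          assumed, and both <filename>mylib.spec</filename> and
          <filename>mylib.h</filename> are hypothetical names chosen for
          illustration:</para>

        <programlisting>
# mylib.spec is a hypothetical stand-alone description, not part of any API.
tspec -l mylib.spec &gt; mylib.h
        </programlisting>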
      </sect1>
  </chapter>

  <chapter id="Structure">
    <title>Specifying API Structure</title>

      <para>The basic form of the <code>tspec</code> description of an API has
        already been explained in <link linkend="Input">section 2.2</link> -
        it is a directory containing a set of files corresponding to the
        headers in that API. Each file basically consists of a list of the
        objects declared in that header. Each object specification is part of
        a <code>tspec</code> construct. These constructs are identified by
        keywords. These keywords always begin with <code>+</code> to avoid
        conflict with C identifiers. Comments may be inserted at any point.
        These are prefixed by <code>#</code> and run to the end of the
        line.</para>

      <para>In addition to the basic object specification constructs,
        <code>tspec</code> also has constructs for imposing structure on the
        API description. It is these constructs that we consider first.</para>

      <sect1 id="Subset">
        <title>+SUBSET</title>

        <para>A list of <code>tspec</code> constructs within a header can be
          grouped into a named subset by enclosing them within:

          <programlisting>
+SUBSET "name" := {
    ....
} ;
          </programlisting>

          where <code>name</code> is the subset name. These named subsets can
          be nested, but are still regarded as subsets of the parent
          header.</para>

        <para>Subsets are intended to give a layer of resolution beyond that
          of the entire header (see <link linkend="Levels">section
          2.1</link>). Each subset is mapped onto a separate pair of output
          files, so unwary use of subsets is discouraged.</para>
      </sect1>
 
      <sect1 id="Impl">
        <title>+IMPLEMENT and +USE</title>

        <para><code>tspec</code> has two import constructs which allow one
          API, or header, or subset of a header to be included in another.
          The first construct is used to indicate that the given set of
          objects is also declared in the including header, and takes one of
          the forms:

          <programlisting>
+IMPLEMENT "api" ;
+IMPLEMENT "api", "header" ;
+IMPLEMENT "api", "header", "subset" ;
          </programlisting></para>

        <para>The second construct is used to indicate that the objects are
          only used in the including header, and takes one of the forms:

          <programlisting>
+USE "api" ;
+USE "api", "header" ;
+USE "api", "header", "subset" ;
          </programlisting></para>

        <para>For example, <code>posix:stdio.h</code> is an extension of
          <code>ansi:stdio.h</code>, so, rather than duplicate all the object
          specifications from the latter in the former, it is easier and
          clearer to use the construct:

          <programlisting>
+IMPLEMENT "ansi", "stdio.h" ;
          </programlisting>

          and just add the extra objects specified by POSIX. Note that this
          makes the relationship between the APIs <code>ansi</code> and
          <code>posix</code> absolutely explicit. <code>tspec</code> is as
          much concerned with the relationships between APIs as their actual
          contents.</para>

        <para>Objects which are specified as being declared in more than one
          header of an API should also be treated using
          <code>+IMPLEMENT</code>. For example, the type <code>size_t</code>
          is declared in a number of <code>ansi</code> headers, namely
          <code>stddef.h</code>, <code>stdio.h</code>, <code>string.h</code>
          and <code>time.h</code>. This can be handled by declaring
          <code>size_t</code> as part of a named subset of, say,
          <code>ansi:stddef.h</code>:

          <programlisting>
+SUBSET "size_t" := {
    +TYPE (unsigned) size_t ;
} ;
          </programlisting>

          and including this in each of the other headers:

          <programlisting>
+IMPLEMENT "ansi", "stddef.h", "size_t" ;
          </programlisting></para>

        <para>Another use of <code>+IMPLEMENT</code> is in the
          <code>MASTER</code> file used to list the headers in an API (see
          <link linkend="Input">section 2.2</link>). This basically consists
          of a list of <code>+IMPLEMENT</code> commands, one per header. For
          example, with <code>ansi</code> it consists of:

          <programlisting>
+IMPLEMENT "ansi", "assert.h" ;
+IMPLEMENT "ansi", "ctype.h" ;
....
+IMPLEMENT "ansi", "time.h" ;
          </programlisting></para>

        <para>To illustrate <code>+USE</code>, <code>posix:sys/stat.h</code>
          uses some types from <code>posix:sys/types.h</code> but does not
          define them. To avoid the user having to include both headers it
          makes sense for the description to include the latter in the former
          (provided there are no namespace restrictions imposed by the API).
          This would be done using the construct:

          <programlisting>
+USE "posix", "sys/types.h" ;
          </programlisting></para>

        <para>On the command-line <code>tspec</code> is given one set of
          objects, be it an API, a header, or a subset of a header. This
          causes it to read that set, which may contain
          <code>+IMPLEMENT</code> or <code>+USE</code> commands. It then reads
          the sets indicated by these commands, which again may contain
          <code>+IMPLEMENT</code> or <code>+USE</code> commands, and so on. It
          is possible for this process to lead to infinite cycles, but in this
          case <code>tspec</code> raises an error and aborts. In the legal
          case, the collection of sets read by <code>tspec</code> is the
          closure of the set given on the command-line under
          <code>+IMPLEMENT</code> and <code>+USE</code>. Some of these sets
          will be implemented - that is to say, connected to the top level by
          a chain of <code>+IMPLEMENT</code> commands - others will merely be
          used. By default <code>tspec</code> produces output for all these
          sets, but specifying the <option>-r</option> command-line option
          restricts it to the implemented sets.</para>
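
        <para>As a sketch of the difference, suppose that the description of
          <code>posix:sys/stat.h</code> reaches
          <code>posix:sys/types.h</code> only through the
          <code>+USE</code> command shown earlier (an assumption made for
          this example). The two invocations below would then differ in
          whether output is produced for the used header:</para>

        <programlisting>
# Output for sys/stat.h and for every set it implements or uses,
# which under the assumption above includes posix:sys/types.h.
tspec posix sys/stat.h

# Output restricted to the implemented sets; posix:sys/types.h is
# merely used here, so it would be left out.
tspec -r posix sys/stat.h
        </programlisting>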
 
        <para>For further information on the <code>+IMPLEMENT</code> and
          <code>+USE</code> commands see
          <link linkend="FineImpl">section 6.1</link>.</para>
      </sect1>
  </chapter>

  <chapter id="Objects">
    <title>Specifying Objects</title>

      <para>The main body of any <code>tspec</code> description of an API
      consists of a list of object specifications. Most of this chapter
      is concerned with the various <code>tspec</code> constructs for
      specifying objects of various kinds; however, we start with a few
      remarks on object names.</para>

      <sect1 id="S41">
        <title>Object Names</title>

        <sect2 id="Names">
          <title>Internal and External Names</title>

          <para>All objects specified using <code>tspec</code> actually have
            two names. The first is the internal name by which it is
            identified within the program, the second is the external name by
            which the TDF construct (actually a token) representing this
            object is referred to for the purposes of TDF linking. The
            internal names are normal C identifiers and obey the normal C
            namespace rules (indeed one of the roles of <code>tspec</code> is
            to keep track of these namespaces). The external token name is
            constructed by <code>tspec</code> from the internal name.</para>

          <para><code>tspec</code> has two strategies for making up these
            token names. The first, which is the default, is to use the
            internal name as the external name (there is an exception to this
            simple rule, namely field selectors - see
            <link linkend="Field">section 4.9</link>). The second, which is preferred
            for standard APIs, is to construct a "unique name" from the API
            name, the header and the internal name. For example, under the
            first strategy, the external name of the type <code>FILE</code>
            specified in <code>ansi:stdio.h</code> would be <code>FILE</code>,
            whereas under the second it would be <code>ansi.stdio.FILE</code>.
            The unique name strategy may be specified by passing the
            <option>-u</option> command-line option to <code>tspec</code> (see
            <link linkend="Options">section 2.5</link>) or by setting the
            <code>UNIQUE</code> property to 1 (see
            <link linkend="Properties">section 5.4</link>).</para>
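
          <para>As a brief sketch, the two strategies correspond to the
            following invocations. The token names in the comments are those
            given in the example above, not captured output:</para>

          <programlisting>
tspec ansi stdio.h      # default strategy: the type FILE gets the token name FILE
tspec -u ansi stdio.h   # unique names: the type FILE gets the token name ansi.stdio.FILE
          </programlisting>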
 
549
          <para>Both strategies involve flattening the several C namespaces
550
            into the single TDF token namespace, which can lead to clashes.
551
            For example, in <code>posix:sys/stat.h</code> both a structure,
552
            <code>struct stat</code>, and a procedure, <code>stat</code>, are
553
            specified. In C the two uses of <code>stat</code> are in different
554
            namespaces and so present no difficulty, however they are mapped
555
            onto the same name in the TDF token namespace. To work round such
556
            difficulties, <code>tspec</code> allows an alternative external
557
            form to be specified. When the object is specified the form:
558
 
559
            <programlisting>
560
iname | ename
561
            </programlisting>
562
 
563
            may be used to specify the internal name <code>iname</code> and
564
            the external name <code>ename</code>.</para>
565
 
566
          <para>For example, in the <code>stat</code> case above we could
567
            distinguish between the two uses as follows:
568
 
569
            <programlisting>
570
+TYPE struct stat | struct_stat ;
571
+FUNC int stat ( const char *, struct stat * ) ;
572
            </programlisting></para>
573
 
574
          <para>With simple token names the token corresponding to the
575
            structure would be called <code>struct_stat</code>, whereas that
576
            corresponding to the procedure would still be <code>stat</code>.
577
            With unique token names the names would be
578
            <code>posix.stat.struct_stat</code> and
579
            <code>posix.stat.stat</code> respectively.</para>
580
 
581
          <para>Very occasionally it may be necessary to precisely specify an
582
            external token name. This can be done using the form:
583
 
584
            <programlisting>
585
iname | "ename"
586
            </programlisting>

            which makes the object <code>iname</code> have external name
            <code>ename</code> regardless of the naming strategy used.</para>
        </sect2>

        <sect2 id="Identifiers">
          <title>More on Object Names</title>

          <para>The legal identifiers in <code>tspec</code> (for both
            internal and external names) are the same as those in C:
            strings of upper and lower case letters, decimal digits or
            underscores, which do not begin with a decimal digit. However,
            there is a second class of identifiers, the local identifiers,
            which consist of a tilde followed by any number of letters,
            digits or underscores. These are intended to indicate objects
            which are local to the API description and should not be visible
            to any application using the API. For example, to express the
            specification that <code>t</code> is a pointer type, we could say
            that there is a locally named type to which <code>t</code> is a
            pointer:

            <programlisting>
+TYPE ~t ;
+TYPEDEF ~t *t ;
            </programlisting></para>

          <para>Finally, it is possible to cheat the <code>tspec</code>
            namespaces. It may actually be legal to have two objects of the
            same name in an API: they may lie in different branches of a
            conditional compilation, or not be allowed to coexist. To allow
            for this, <code>tspec</code> allows version numbers, consisting of
            a decimal point plus a number of digits, to be appended to an
            identifier name when it is first introduced. These version numbers
            serve purely to tell <code>tspec</code> that this version of the
            object is different from a previous version with a different
            version number (or indeed without any version number). If more
            than one version of an object is specified then which version is
            retrieved by <code>tspec</code> in any look-up operation is
            undefined.</para>
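
          <para>As a hedged illustration of version numbers, the sketch
            below introduces two incompatible versions of one type in
            different branches of a conditional compilation. The names
            <code>handle_t</code> and <code>__EXAMPLE_WIDE</code> are
            hypothetical and are not taken from any real API:

            <programlisting>
# Hypothetical names, for illustration only.
+IFDEF __EXAMPLE_WIDE
+TYPE (struct) handle_t.1 ;
+ELSE
+TYPE (int) handle_t.2 ;
+ENDIF
            </programlisting>

            The suffixes <code>.1</code> and <code>.2</code> only tell
            <code>tspec</code> that the two introductions are distinct
            objects.</para>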
        </sect2>
      </sect1>

      <sect1 id="Func">
        <title>+FUNC</title>

        <para>The simplest form of object to specify is a procedure. This is
          done by means of:

          <programlisting>
+FUNC prototype ;
          </programlisting>

          where <code>prototype</code> is the full C prototype of the
          procedure being declared. For example, <code>ansi:string.h</code>
          contains:

          <programlisting>
+FUNC char *strcpy ( char *, const char * ) ;
+FUNC int strcmp ( const char *, const char * ) ;
+FUNC size_t strlen ( const char * ) ;
          </programlisting></para>

        <para>Strictly speaking, <code>+FUNC</code> means that the procedure
          may be implemented by a macro, but that there is an underlying
          library function with the same effect. The exception is for
          procedures which take a variable number of arguments, such as:

          <programlisting>
+FUNC int fprintf ( FILE *, const char *, ... ) ;
          </programlisting>

          which cannot be implemented by macros. Occasionally it may be
          necessary to specify that a procedure is only a library function,
          and cannot be implemented by a macro. In this case the form:

          <programlisting>
+FUNC (extern) prototype ;
          </programlisting>

          should be used. Thus:

          <programlisting>
+FUNC (extern) char *strcpy ( char *, const char * ) ;
          </programlisting>

          would mean that <code>strcpy</code> was only a library function and
          not a macro.</para>

        <para>Increasingly, standard APIs are using prototypes to express
          their procedures. However, it may still be necessary on occasion to
          specify procedures declared using old style declarations. In most
          cases these can be easily transcribed into prototype declarations,
          but things are not always that simple. For example,
          <code>xpg3:stdlib.h</code> declares <code>malloc</code> by the old
          style declaration:

          <programlisting>
void *malloc ( sz )
size_t sz ;
          </programlisting>

          which is in general different from the prototype:

          <programlisting>
void *malloc ( size_t ) ;
          </programlisting></para>

        <para>In the first case the argument is passed as the integral
          promotion of <code>size_t</code>, whereas in the second it is passed
          as a <code>size_t</code>. In general we only know that
          <code>size_t</code> is an unsigned integral type, so we cannot
          assert that it is its own integral promotion. One possible solution
          would be to use the C to TDF producer's weak prototypes (see
          reference 3). The form:

          <programlisting>
+FUNC (weak) void *malloc ( size_t ) ;
          </programlisting>

          means that <code>malloc</code> is a library function returning
          <code>void *</code> which is declared using an old style declaration
          with a single argument of type <code>size_t</code>. (For an
          alternative approach see
          <link linkend="Typedef">section 4.8</link>.)</para>
      </sect1>

      <sect1 id="Exp">
        <title>+EXP and +CONST</title>

        <para>Expressions correspond to constants, identities and variables.
          They are specified by:

          <programlisting>
+EXP type exp1, ..., expn ;
          </programlisting>

          where <code>type</code> is the base type of the expressions
          <code>expi</code> as in a normal C declaration list. For example, in
          <code>ansi:stdio.h</code>:

          <programlisting>
+EXP FILE *stdin, *stdout, *stderr ;
          </programlisting>

          specifies three expressions of type <code>FILE *</code>.</para>

        <para>By default all expressions are rvalues, that is, values which
          cannot be assigned to. If an lvalue (assignable) expression is
          required its type should be qualified using the keyword
          <code>lvalue</code>. This is an extension to the C type syntax which
          is used in a similar fashion to <code>const</code>. For example,
          <code>ansi:errno.h</code> says that <code>errno</code> is an
          assignable lvalue of type <code>int</code>. This is expressed as
          follows:

          <programlisting>
+EXP lvalue int errno ;
          </programlisting></para>

        <para>On the other hand, <code>posix:errno.h</code> states that
          <code>errno</code> is an external value of type <code>int</code>.
          As with procedures the <code>(extern)</code> qualifier may be used
          to express this as:

          <programlisting>
+EXP (extern) int errno ;
          </programlisting>

          Note that this automatically means that <code>errno</code> is an
          lvalue, so the <code>lvalue</code> qualifier is optional in this
          case.</para>

        <para>If all the expressions are guaranteed to be literal constants
          then one of the equivalent forms:

          <programlisting>
+EXP (const) type exp1, ..., expn ;
+CONST type exp1, ..., expn ;
          </programlisting>

          should be used. For example, in <code>ansi:errno.h</code> we have:

          <programlisting>
+CONST int EDOM, ERANGE ;
          </programlisting></para>
      </sect1>

      <sect1 id="Macro">
        <title>+MACRO</title>

        <para>The <code>+MACRO</code> construct is similar in form to the
          <code>+FUNC</code> construct, except that it means that only a macro
          exists, and no underlying library function. For example, in
          <code>xpg3:ctype.h</code> we have:

          <programlisting>
+MACRO int _toupper ( int ) ;
+MACRO int _tolower ( int ) ;
          </programlisting>

          since these are explicitly stated to be macros and not functions. Of
          course the <code>(extern)</code> qualifier cannot be used with
          <code>+MACRO</code>.</para>

        <para>One thing which macros can do which functions cannot is to
          return assignable values or to assign to their arguments. Thus it is
          legitimate for <code>+MACRO</code> constructs to have their return
          type or argument types qualified by <code>lvalue</code>, whereas
          this is not allowed for <code>+FUNC</code> constructs. For example,
          in <code>svid3:curses.h</code>, a macro <code>getyx</code> is
          specified which takes a pointer to a window and two integer
          variables and assigns the cursor position of the window to those
          variables. This may be expressed by:

          <programlisting>
+MACRO void getyx ( WINDOW *win, lvalue int y, lvalue int x ) ;
          </programlisting></para>
      </sect1>

      <sect1 id="Statement">
        <title>+STATEMENT</title>

        <para>The <code>+STATEMENT</code> construct is very similar to the
          <code>+MACRO</code> construct except that, instead of being a C
          expression, it is a C statement (i.e. something ending in a
          semicolon). As such it does not have a return type and so takes one
          of the forms:

          <programlisting>
+STATEMENT stmt ;
+STATEMENT stmt ( arg1, ..., argn ) ;
          </programlisting>

          depending on whether or not it takes any arguments. (A
          <code>+MACRO</code> without any arguments is an <code>+EXP</code>,
          so the no argument form does not exist for <code>+MACRO</code>.) As
          with <code>+MACRO</code>, the argument types <code>argi</code> can
          be qualified using <code>lvalue</code>.</para>
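
        <para>As a sketch of both forms, the following uses statement names
          which are hypothetical and not taken from any real API:

          <programlisting>
# Hypothetical statements, for illustration only.
+STATEMENT example_yield ;
+STATEMENT example_swap ( lvalue int, lvalue int ) ;
          </programlisting>

          Under this description an application would presumably write
          <code>example_yield ;</code> for the first, and
          <code>example_swap ( a, b ) ;</code> for the second, where
          <code>a</code> and <code>b</code> are assignable <code>int</code>
          expressions.</para>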
      </sect1>

      <sect1 id="Define">
        <title>+DEFINE</title>

        <para>It is possible to insert macro definitions directly into
          <code>tspec</code> using the <code>+DEFINE</code> construct. This
          has two forms depending on whether the macro has arguments:

          <programlisting>
+DEFINE name %% text %% ;
+DEFINE name ( arg1, ..., argn ) %% text %% ;
          </programlisting></para>

        <para>These translate directly into:

          <programlisting>
#define name text
#define name( arg1, ..., argn ) text
          </programlisting></para>

        <para>The macro definition, <code>text</code>, consists of any string
          of characters delimited by double percents. If <code>text</code> is
          a simple number or a single identifier then the double percents may
          be omitted. Thus in <code>ansi:stddef.h</code> we have:

          <programlisting>
+DEFINE NULL 0 ;
          </programlisting></para>
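
        <para>As a further sketch, using hypothetical macro names, a macro
          whose text is more than a single number or identifier needs the
          double percents:

          <programlisting>
# Hypothetical macros, for illustration only.
+DEFINE EXAMPLE_NAME_MAX 64 ;
+DEFINE example_min ( a, b ) %% ( ( a ) &lt; ( b ) ? ( a ) : ( b ) ) %% ;
          </programlisting>

          By the translation rule given above, these would be expected to
          produce the following, although the exact spacing of the real
          output may differ:

          <programlisting>
#define EXAMPLE_NAME_MAX 64
#define example_min( a, b ) ( ( a ) &lt; ( b ) ? ( a ) : ( b ) )
          </programlisting></para>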
      </sect1>

      <sect1 id="Type">
        <title>+TYPE</title>

        <para>New types may be specified using the <code>+TYPE</code>
          construct. This has the form:

          <programlisting>
+TYPE type1, ..., typen ;
          </programlisting>

          where each <code>typei</code> has one of the forms:

          <itemizedlist>
            <listitem>
              <para><code>name</code> for a general type (about which we know
                nothing more),</para>
            </listitem>

            <listitem>
              <para><code>(struct) name</code> for a structure
                type,</para>
            </listitem>

            <listitem>
              <para><code>(union) name</code> for a union type,</para>
            </listitem>

            <listitem>
              <para><code>struct name</code> for a structure tag,</para>
            </listitem>

            <listitem>
              <para><code>union name</code> for a union tag,</para>
            </listitem>

            <listitem>
              <para><code>(int) name</code> for an integral type,</para>
            </listitem>

            <listitem>
              <para><code>(signed) name</code> for a signed integral
                type,</para>
            </listitem>

            <listitem>
              <para><code>(unsigned) name</code> for an unsigned integral
                type,</para>
            </listitem>

            <listitem>
              <para><code>(float) name</code> for a floating type,</para>
            </listitem>

            <listitem>
              <para><code>(arith) name</code> for an arithmetic (integral or
                floating) type,</para>
            </listitem>

            <listitem>
              <para><code>(scalar) name</code> for a scalar (arithmetic or
                pointer) type.</para>
            </listitem>
          </itemizedlist></para>

        <para>To make clear the distinction between structure types and
          structure tags, if we have in C:

          <programlisting>
typedef struct tag { int x, y ; } type ;
          </programlisting>

          then <code>type</code> is a structure type and <code>tag</code> is a
          structure tag.</para>

        <para>For example, in <code>ansi</code> we have:

          <programlisting>
+TYPE FILE ;
+TYPE struct lconv ;
+TYPE (struct) div_t ;
+TYPE (signed) ptrdiff_t ;
+TYPE (unsigned) size_t ;
+TYPE (arith) time_t ;
+TYPE (int) wchar_t ;
          </programlisting></para>
      </sect1>

      <sect1 id="Typedef">
        <title>+TYPEDEF</title>

        <para>It is also possible to define new types in terms of existing
          types. This is done using the <code>+TYPEDEF</code> construct, which
          is identical in form to the C <code>typedef</code> construct. This
          construct can be used to define pointer, procedure and array types,
          but not compound structure and union types. For these see
          <link linkend="Field">section 4.9</link> below.</para>

        <para>For example, in <code>xpg3:search.h</code> we have:

          <programlisting>
+TYPE struct entry ;
+TYPEDEF struct entry ENTRY ;
          </programlisting></para>

        <para>There are a couple of special forms. To understand the first,
          note that C uses <code>void</code> function returns for two
          purposes: firstly to indicate that the function does not return a
          value, and secondly to indicate that the function does not return at
          all (<code>exit</code> is an example of this second usage). In TDF
          terms, in the first case the function returns <code>TOP</code>, in
          the second it returns <code>BOTTOM</code>. <code>tspec</code>
          allows types to be introduced which have the second meaning. For
          example, we could have:

          <programlisting>
+TYPEDEF ~special ( "bottom" ) ~bottom ;
+FUNC ~bottom exit ( int ) ;
          </programlisting>

          meaning that the local type <code>~bottom</code> is the
          <code>BOTTOM</code> form of <code>void</code>. The procedure
          <code>exit</code>, which never returns, can then be declared to
          return <code>~bottom</code> rather than <code>void</code>. Other
          such special types may be added in future.</para>

        <para>The second special form:

          <programlisting>
+TYPEDEF ~promote ( x ) y ;
          </programlisting>

          means that <code>y</code> is an integral type which is the integral
          promotion of <code>x</code>. <code>x</code> must have previously
          been declared as an integral type. This gives an alternative
          approach to the old style procedure declaration problem described in
          <link linkend="Func">section 4.2</link>. Recall that:

          <programlisting>
void *malloc ( sz )
size_t sz ;
          </programlisting>

          means that <code>malloc</code> has one argument which is passed as
          the integral promotion of <code>size_t</code>. This could be
          expressed as follows:

          <programlisting>
+TYPEDEF ~promote ( size_t ) ~size_t ;
+FUNC void *malloc ( ~size_t ) ;
          </programlisting>

          introducing a local type to stand for the integral promotion of
          <code>size_t</code>.</para>
      </sect1>

      <sect1 id="Field">
        <title>+FIELD</title>

        <para>Having specified a structure or union type, or a structure or
          union tag, we may wish to specify certain fields of this structure
          or union. This is done using the <code>+FIELD</code> construct. This
          takes the form:

          <programlisting>
+FIELD type {
    ftype field1, ..., fieldn ;
    ....
} ;
          </programlisting>

          where <code>type</code> is the structure or union type and
          <code>field1</code>, ..., <code>fieldn</code> are field selectors
          derived from the base type <code>ftype</code> as in a normal C
          structure definition. <code>type</code> may have one of the forms:

          <itemizedlist>
            <listitem>
              <para><code>(struct) name</code> for a structure type,</para>
            </listitem>

            <listitem>
              <para><code>(union) name</code> for a union type,</para>
            </listitem>

            <listitem>
              <para><code>struct name</code> for a structure tag,</para>
            </listitem>

            <listitem>
              <para><code>union name</code> for a union tag,</para>
            </listitem>

            <listitem>
              <para><code>name</code> for a previously declared structure or
                union type.</para>
            </listitem>
          </itemizedlist></para>

        <para>Except in the final case (where it is not clear if
          <code>type</code> is a structure or a union), it is not necessary to
          have previously introduced <code>type</code> using a
          <code>+TYPE</code> construct, since this declaration is implicit in
          the <code>+FIELD</code> construct.</para>

        <para>For example, in <code>ansi:time.h</code> we have:

          <programlisting>
+FIELD struct tm {
    int tm_sec ;
    int tm_min ;
    int tm_hour ;
    int tm_mday ;
    int tm_mon ;
    int tm_year ;
    int tm_wday ;
    int tm_yday ;
    int tm_isdst ;
} ;
          </programlisting>

          meaning that there exists a structure with tag <code>tm</code> with
          various fields of type <code>int</code>. Any implementation must
          have these corresponding fields, but they need not be in the given
          order, nor do they have to comprise the whole structure.</para>

        <para>As was mentioned above (in
          <link linkend="Names">section 4.1.1</link>), field selectors form a
          special case when <code>tspec</code> is making up external token
          names. For example, in the case above, the token name for the
          <code>tm_sec</code> field is either <code>tm.tm_sec</code> or
          <code>ansi.time.tm.tm_sec</code>, depending on whether or not
          unique token names are used.</para>

        <para>It is possible to have several <code>+FIELD</code> constructs
          referring to the same structure or union. For example,
          <code>posix:dirent.h</code> declares a structure with tag
          <code>dirent</code> and one field, <code>d_name</code>, of this
          structure. <code>xpg3:dirent.h</code> extends this by adding another
          field, <code>d_ino</code>.</para>
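
        <para>A sketch of how the <code>xpg3:dirent.h</code> extension
          mentioned above might be written is given below. The field type
          <code>ino_t</code> is an assumption made for illustration; the text
          above states only the field name:

          <programlisting>
# The field type ino_t is assumed, not taken from the actual description.
+FIELD struct dirent {
    ino_t d_ino ;
} ;
          </programlisting>

          Because this form of <code>+FIELD</code> does not claim to list the
          whole structure, the new field is added alongside any fields
          already specified for <code>struct dirent</code>.</para>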

        <para>There is a second form of the <code>+FIELD</code> construct
          which has more in common with the <code>+TYPEDEF</code> construct.
          The form:

          <programlisting>
+FIELD type := {
    ftype field1, ..., fieldn ;
    ....
} ;
          </programlisting>

          means that the type <code>type</code> is defined to be exactly the
          given structure or union type, with precisely the given fields in
          the given order.</para>
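
        <para>For instance, as a sketch using the hypothetical structure tag
          <code>example_point</code>, and assuming that the type forms listed
          earlier in this section are also accepted here:

          <programlisting>
# Hypothetical structure, for illustration only.
+FIELD struct example_point := {
    int x, y ;
} ;
          </programlisting>

          This would define <code>struct example_point</code> as exactly the
          structure with the two <code>int</code> fields <code>x</code> and
          <code>y</code>, in that order. The first form, by contrast, leaves
          the field order and any further fields to the implementation.</para>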
      </sect1>

      <sect1 id="Nat">
        <title>+NAT</title>

        <para>In the example given in
          <link linkend="Field">section 4.9</link>,
          <code>posix:dirent.h</code> specifies that the <code>d_name</code>
          field of <code>struct dirent</code> is a fixed sized array of
          characters, but that the size of this array is implementation
          dependent. We therefore have to introduce a value to stand for the
          size of this array using the <code>+NAT</code> construct. This has
          the form:

          <programlisting>
+NAT nat1, ..., natn ;
          </programlisting>

          where <code>nat1</code>, ..., <code>natn</code> are the array sizes
          to be declared. The example thus becomes:

          <programlisting>
+NAT ~dirent_d_name_size ;
+FIELD struct dirent {
    char d_name [ ~dirent_d_name_size ] ;
} ;
          </programlisting>

          Note the use of a local identifier to stand for a value, namely the
          array size, which is invisible to the user (see
          <link linkend="Identifiers">section 4.1.2</link>).</para>

        <para>As another example, in <code>ansi:setjmp.h</code> we know that
          <code>jmp_buf</code> is an array type. We therefore introduce
          objects to stand for the type which it is an array of and for the
          size of the array, and define <code>jmp_buf</code> by a
          <code>+TYPEDEF</code> command:

          <programlisting>
+NAT ~jmp_buf_size ;
+TYPE ~jmp_buf_elt ;
+TYPEDEF ~jmp_buf_elt jmp_buf [ ~jmp_buf_size ] ;
          </programlisting></para>

        <para>Again, local identifiers have been used for the introduced
          objects.</para>
      </sect1>

      <sect1 id="Enum">
        <title>+ENUM</title>

        <para>Currently <code>tspec</code> only has limited support for
          enumeration types. A <code>+ENUM</code> construct is translated
          directly into a C definition of an enumeration type. The
          <code>+ENUM</code> construct has the form:

          <programlisting>
+ENUM etype := {
    entry,
    ....
} ;
          </programlisting>

          where <code>etype</code> is the enumeration type being defined -
          either a type name or <code>enum etag</code> for some enumeration
          tag <code>etag</code> - and each <code>entry</code> has one of the
          forms:

          <programlisting>
name
name = number
          </programlisting>

          as in a C enumeration type. For example, in
          <code>xpg3:search.h</code> we have:

          <programlisting>
+ENUM ACTION := { FIND, ENTER } ;
          </programlisting></para>
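
        <para>A sketch using both entry forms, with a hypothetical
          enumeration tag and hypothetical entry names:

          <programlisting>
# Hypothetical enumeration, for illustration only.
+ENUM enum example_colour := {
    EXAMPLE_RED = 1,
    EXAMPLE_GREEN = 2,
    EXAMPLE_BLUE
} ;
          </programlisting>

          Since the construct is translated directly into a C enumeration,
          <code>EXAMPLE_BLUE</code> would take the value 3 under the usual C
          rules.</para>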
      </sect1>

      <sect1 id="Token">
        <title>+TOKEN</title>

        <para>As was mentioned in <link linkend="Intro">section 1</link>, the
          <code>#pragma token</code> syntax is highly complex, and the token
          descriptions output by <code>tspec</code> form only a small subset
          of those possible. It is possible to directly access the full
          <code>#pragma token</code> syntax from <code>tspec</code> using the
          construct:

          <programlisting>
+TOKEN name %% text %% ;
          </programlisting>

          where the token <code>name</code> is defined by the sequence of
          characters <code>text</code>, which is delimited by double percents.
          This is turned into the token description:

          <programlisting>
#pragma token text name #
          </programlisting></para>

        <para>No checks are applied to <code>text</code>. A more sophisticated
          mechanism for defining complex tokens may be introduced in a later
          version of <code>tspec</code>.</para>

        <para>For example, in <code>ansi:stdarg.h</code> a token
          <code>va_arg</code> is defined which takes a variable of type
          <code>va_list</code> and a type <code>t</code> and returns a value
          of type <code>t</code>. This is given by:

          <programlisting>
+TOKEN va_arg %% PROC ( EXP lvalue : va_list : e, TYPE t ) EXP rvalue : t : %% ;
          </programlisting></para>
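
        <para>Applying the translation rule given above to this example, the
          token description produced would be expected to read as follows,
          although the spacing of the actual output may differ:

          <programlisting>
#pragma token PROC ( EXP lvalue : va_list : e, TYPE t ) EXP rvalue : t : va_arg #
          </programlisting></para>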

        <para>See reference 3 for more details on the token syntax.</para>
      </sect1>
  </chapter>

  <chapter id="Others">
    <title>Other tspec Constructs</title>

      <para>Although most <code>tspec</code> constructs are concerned either
        with specifying new objects or imposing structure upon various sets of
        objects, there are a few which do not fall into these
        categories.</para>

      <sect1 id="If">
        <title>+IF, +ELSE and +ENDIF</title>

        <para>It is possible to introduce conditional compilation into the API
          description by means of the constructs:

          <programlisting>
+IF %% text %%
+IFDEF %% text %%
+IFNDEF %% text %%
+ELSE
+ENDIF
          </programlisting>

          which are translated into:

          <programlisting>
#if text
#ifdef text
#ifndef text
#else /* text */
#endif /* text */
          </programlisting>

          respectively. If <code>text</code> is just a simple number or a
          single identifier the double percent delimiters may be
          omitted.</para>

        <para>A couple of special <code>+IFDEF</code> (and also
          <code>+IFNDEF</code>) forms are available which are useful on
          occasion. These are:

          <programlisting>
+IFDEF ~building_libs
+IFDEF ~protect ( "api", "header" )
          </programlisting>

          The macros in these constructs expand respectively to
          <code>__BUILDING_LIBS</code> which, by convention, is defined if
          and only if TDF library building is taking place (see
          <link linkend="Libraries">section 6.4</link>), and the protection
          macro <code>tspec</code> makes up to protect the file
          <code>api:header</code> against multiple inclusion (see
          <link linkend="Protect">section 6.2</link>).</para>
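
        <para>As a sketch, with hypothetical macro and object names, a pair
          of conditional sections might read:

          <programlisting>
# Hypothetical names, for illustration only.
+IFDEF _EXAMPLE_EXTENSIONS
+FUNC int example_extra ( int ) ;
+ENDIF

+IF %% EXAMPLE_LEVEL &gt;= 2 %%
+CONST int EXAMPLE_LIMIT ;
+ELSE
+CONST int EXAMPLE_SMALL_LIMIT ;
+ENDIF
          </programlisting>

          The first condition is a single identifier and so needs no
          delimiters, whereas the second is an expression and must be
          enclosed in double percents.</para>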
      </sect1>

      <sect1 id="Text">
        <title>Quoted Text</title>

        <para>It is sometimes desirable to include text in the specification
          file which will be copied directly into one of the output files -
          for example, sections of C. This can be done by enclosing the text
          for copying into the include output file in double percents:

          <programlisting>
%% text %%
          </programlisting>

          and text for copying into the source output file in triple percents:

          <programlisting>
%%% text %%%
          </programlisting></para>

        <para>In fact more percents may be used. An even number always
          indicates text for the include output file, and an odd number the
          source output file. Note that any <code>#</code> characters in
          <code>text</code> are copied as normal, and not treated as comments.
          This also applies to the other cases where percent delimiters are
          used.</para>
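
        <para>As a sketch, the first line below would copy a hypothetical C
          declaration into the include output file, and the second would copy
          a C comment into the source output file:

          <programlisting>
%% typedef int example_handle ; %%
%%% /* example text for the source output file */ %%%
          </programlisting></para>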
1306
      </sect1>
1307
 
1308
      <sect1 id="Comment">
1309
        <title>C Comments</title>
1310
 
1311
        <para>A special case of quoted text are C style comments:
1312
 
1313
          <programlisting>
1314
/* text */
1315
          </programlisting>
1316
 
1317
          which are copied directly into the include output file.</para>
1318
      </sect1>
1319
 
1320
      <sect1 id="Properties">
1321
        <title>File Properties</title>
1322
 
1323
        <para>Various properties of individual sets of objects or global
1324
          properties can be set using file properties. These take the
1325
          form:
1326
 
1327
          <programlisting>
1328
$property = number ;
1329
          </programlisting>
1330
 
1331
          for numeric (or boolean) properties, and:
1332
 
1333
          <programlisting>
1334
$property = "string" ;
1335
          </programlisting>
1336
 
1337
          for string properties.</para>
1338
 
1339
        <para>The valid property names are as follows:
1340
          <itemizedlist>
1341
            <listitem>
1342
              <para><code>APINAME</code> is a string property which may be
1343
                used to override the API name of the current set of
1344
                objects.</para>
1345
            </listitem>
1346
 
1347
            <listitem>
1348
              <para><code>FILE</code> is a string property which is used by
1349
                the <code>tspec</code> preprocessor to indicate the current
1350
                input file name.</para>
1351
            </listitem>
1352
 
1353
            <listitem>
1354
              <para><code>FILENAME</code> is a string property which may be
1355
                used to override the header name of the current set of
1356
                objects.</para>
1357
            </listitem>
1358
 
1359
            <listitem>
1360
              <para><code>INCLNAME</code> is a string property which may be
1361
                used to set the name of the include output file in place of
                the default name given in
                <link linkend="Output">section 2.3</link>.  Setting the
                property to the empty string suppresses the output of this
                file.</para>
            </listitem>

            <listitem>
              <para><code>INTERFACE</code> is a numeric property which may be
                set to force the creation of the source output file and
                cleared to suppress it.</para>
            </listitem>

            <listitem>
              <para><code>LINE</code> is a numeric property which is used by
                the <code>tspec</code> preprocessor to indicate the current
                input file line number.</para>
            </listitem>

            <listitem>
              <para><code>METHOD</code> is a string property which may be used
                to specify alternative construction methods for TDF library
                building (see
                <link linkend="Libraries">section 6.4</link>).</para>
            </listitem>

            <listitem>
              <para><code>PREFIX</code> is a string property which may be used
                as a prefix to unique token names in place of the API and
                header names (see
                <link linkend="Names">section 4.1.1</link>).</para>
            </listitem>

            <listitem>
              <para><code>PROTECT</code> is a string property which may be
                used to set the macro used by <code>tspec</code> to protect
                the include output file against multiple inclusions (see
                <link linkend="Protect">section 6.2</link>). Setting the
                property to the empty string suppresses this macro.</para>
            </listitem>
 
            <listitem>
              <para><code>SOURCENAME</code> is a string property which may be
                used to set the name of the source output file in place of the
                default name given in
                <link linkend="Output">section 2.3</link>.  Setting the
                property to the empty string suppresses the output of this
                file.</para>
            </listitem>

            <listitem>
              <para><code>SUBSETNAME</code> is a string property which may be
                used to override the subset name of the current set of
                objects.</para>
            </listitem>

            <listitem>
              <para><code>UNIQUE</code> is a numeric property which may be
                used to switch the unique token name flag on and off (see
                <link linkend="Names">section 4.1.1</link>). For standard APIs
                it is recommended that this property be set to 1 in the API
                <code>MASTER</code> file.</para>
            </listitem>

            <listitem>
              <para><code>VERBOSE</code> is a numeric property which may be
                used to set the level of the verbose option (see
                <link linkend="Options">section 2.5</link>).</para>
            </listitem>

            <listitem>
              <para><code>VERSION</code> is a string property which may be
                used to assign a version number or other identification to a
                <code>tspec</code> description. This information is reproduced
                in the corresponding include output file.</para>
            </listitem>
          </itemizedlist></para>
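        <para>As an illustrative sketch only, a description might set a few
          of these properties as shown below. The assignments follow the form
          of the <code>$METHOD</code> example in
          <link linkend="Libraries">section 6.4</link>; the numeric form is
          assumed to mirror the string form, and the values themselves are
          hypothetical:

          <programlisting>
$UNIQUE = 1 ;
$VERSION = "1.0" ;
$SOURCENAME = "" ;
          </programlisting>

          Under these assumptions the unique token name flag is switched on,
          the string <code>1.0</code> is reproduced in the include output
          file, and the source output file is suppressed.</para>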
      </sect1>
  </chapter>
 
  <chapter id="S6">
    <title>Miscellaneous Topics</title>

      <para>This chapter rounds up a few miscellaneous topics.</para>
 
      <sect1 id="FineImpl">
        <title>Fine Control of Included Files</title>

        <para>The <code>+IMPLEMENT</code> and <code>+USE</code> commands
          described in <link linkend="Impl">section 3.2</link> are capable of
          further refinement. Normally each such command is translated into a
          corresponding inclusion command in both the include and source
          output files.  Occasionally this is not desirable - in particular
          the inclusion in the source output file can cause problems during
          TDF library building. For this reason the <code>tspec</code> syntax
          has been extended to allow for fine control of the output
          corresponding to <code>+IMPLEMENT</code> and <code>+USE</code>
          commands. The extended commands take the forms:

          <programlisting>
+IMPLEMENT "api" (key) ;
+IMPLEMENT "api", "header" (key) ;
+IMPLEMENT "api", "header", "subset" (key) ;
          </programlisting>

          with corresponding forms for <code>+USE</code>.  <code>key</code>
          specifies which output files the inclusion commands should appear
          in. It can be:

          <itemizedlist>
            <listitem>
              <para><code>??</code>, indicating neither output file,</para>
            </listitem>

            <listitem>
              <para><code>!?</code>, indicating the include output file
                only,</para>
            </listitem>

            <listitem>
              <para><code>?!</code>, indicating the source output file
                only,</para>
            </listitem>

            <listitem>
              <para><code>!!</code>, indicating both output files (this is the
                same as the normal form).</para>
            </listitem>
          </itemizedlist></para>
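        <para>For example, a description that needs a header in its include
          output file but not in its source output file might use the
          <code>!?</code> key as sketched below. The API and header names are
          chosen purely for illustration:

          <programlisting>
+USE "ansi", "stddef.h" (!?) ;
          </programlisting>

          Replacing <code>(!?)</code> by <code>(??)</code> would leave the
          inclusion command out of both output files.</para>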

        <para>The second refinement comes from the fact that APIs fall into
          two categories - the base APIs, such as <code>ansi</code>,
          <code>posix</code> and <code>xpg3</code>, and the extension APIs,
          such as <code>x11</code>, the X Window System API. The latter can be
          used to extend the former, so that we can form <code>ansi</code>
          plus <code>x11</code>, <code>posix</code> plus <code>x11</code>,
          and so on. Base APIs may be distinguished in <code>tspec</code>
          by including the command:

          <programlisting>
+BASE_API ;
          </programlisting>

          in their <code>MASTER</code> file. Occasionally, in an extension
          API, we may wish to include a version of a header from the base API
          but, because the base API is not fixed, cannot use a simple
          <code>+USE</code> command. Instead the special form:

          <programlisting>
+USE ( "api" ), "header" ;
          </programlisting>

          is provided for this purpose (this is the only permitted form). It
          indicates that <code>tspec</code> should use the <code>api</code>
          version of <code>header</code> for checking purposes, but allow the
          inclusion of the version from the base API in normal use.</para>
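        <para>As a sketch, an extension API such as <code>x11</code> might
          refer to a base API header in this way. The choice of
          <code>posix</code> and <code>sys/types.h</code> is an assumption
          made for illustration:

          <programlisting>
+USE ( "posix" ), "sys/types.h" ;
          </programlisting>

          Here <code>tspec</code> would check against the <code>posix</code>
          version of <code>sys/types.h</code>, while in normal use the version
          from whichever base API is being extended would be included.</para>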
      </sect1>

      <sect1 id="Protect">
        <title>Protection Macros</title>

        <para>Each include output file is surrounded by a construct of the
          form:

          <programlisting>
#ifndef MACRO
#define MACRO
....
#endif /* MACRO */
          </programlisting>

          to protect it against multiple inclusions. Normally
          <code>tspec</code> will generate the macro name, <code>MACRO</code>,
          but it can be set using the <code>PROTECT</code> file property (see
          <link linkend="Properties">section 5.4</link>). Setting
          <code>PROTECT</code> to the empty string suppresses the protection
          construct altogether.  (Also see
          <link linkend="If">section 5.1</link>.)</para>
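        <para>For instance, a description might choose its own protection
          macro as sketched below. The macro name is hypothetical, and the
          assignment is assumed to take the same form as the
          <code>$METHOD</code> example in
          <link linkend="Libraries">section 6.4</link>:

          <programlisting>
$PROTECT = "__EXAMPLE_HEADER_INCLUDED" ;
          </programlisting>

          The include output file would then be wrapped in a test of
          <code>__EXAMPLE_HEADER_INCLUDED</code>, whereas setting the property
          to <code>""</code> would remove the wrapper.</para>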
      </sect1>

      <sect1 id="Index">
        <title>Index Printing</title>

        <para>If it is invoked with the <option>-i</option> command-line
          option, instead of creating its output files, <code>tspec</code>
          prints an index of all the objects it has read to the standard
          output. This information includes the external token name associated
          with the object, whether the object is implemented or used, and
          where in the API description it is defined. It also includes a brief
          description of the object.  It is intended that these indexes should
          be usable as quick reference guides to the underlying APIs.</para>
      </sect1>

      <sect1 id="Libraries">
        <title>TDF Library Building</title>

        <para>As was explained in reference 1, the <code>#pragma token</code>
          headers output by <code>tspec</code> are used for two purposes -
          checking applications against the API during normal compilation and
          checking implementations against the API during TDF library
          building. This dual use necessitates some extra work for
          <code>tspec</code>. It is not always possible to use exactly the
          same code in the two cases (usually because the C rules on, for
          example, structure definitions get in the way during library
          building). <code>tspec</code> uses a standard macro,
          <code>__BUILDING_LIBS</code>, to distinguish between the two cases.
          It is assumed to be defined if and only if library building is
          taking place. <code>tspec</code> descriptions can access this macro
          directly using <code>~building_libs</code> (see
          <link linkend="If">section 5.1</link>).</para>
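        <para>As a minimal sketch, a declaration that should be visible when
          checking applications but not during library building might be
          guarded as follows. This assumes that the <code>+IFNDEF</code> and
          <code>+ENDIF</code> constructs of
          <link linkend="If">section 5.1</link> accept
          <code>~building_libs</code> as their condition; the function name
          is hypothetical:

          <programlisting>
+IFNDEF ~building_libs
+FUNC int example_func ( int ) ;
+ENDIF
          </programlisting></para>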

        <para>The actual library building process consists of compiling the
          <code>#pragma token</code> descriptions of the objects comprising
          the API along with the implementation of that API from the system
          headers (or wherever else it is provided). This creates the local
          token definitions for this API, which may be stored in a token
          library. To facilitate this process <code>tspec</code> creates a
          source output file for each implemented header
          <code>api:header</code> containing something like:

          <programlisting>
#pragma implement interface &lt;../api/header&gt;
#include &lt;header&gt;
          </programlisting>

          together with a makefile to compile all these programs to token
          definitions and to combine these token definitions into a token
          library. In fact two makefiles are created in the source output
          directory (see <link linkend="Output">section 2.3</link>). The first
          is called <code>M_api</code> and is designed for stand-alone library
          construction. The second is called <code>Makefile</code> and is
          designed for use with the library building script
          <code>MAKE_LIBS</code> provided with <code>tspec</code>.</para>

        <para>There are other methods whereby the source output file may be
          changed into a set of token definitions. For example, in
          <code>c:sys.h</code> the <code>METHOD</code> file property (see
          <link linkend="Properties">section 5.4</link>) is set to
          <code>TDP</code>, causing the <code>tdp</code> program to be invoked
          to produce the definitions for the basic C tokens for the system. As
          another example consider:

          <programlisting>
$METHOD = "TNC" ;
+MACRO double fl_abs ( double ) ;
%%%
    ( make_tokdef fl_abs ( exp x ) exp
        ( floating_abs impossible x ) )
%%%
          </programlisting></para>

        <para>The include output file will specify a token <code>fl_abs</code>
          which takes a <code>double</code> and returns a <code>double</code>.
          The <code>TNC</code> method tells <code>MAKE_LIBS</code> that the
          source output file, which will just contain the quoted text:

          <programlisting>
( make_tokdef fl_abs ( exp x ) exp
    ( floating_abs impossible x ) )
          </programlisting>

          is an input file for the TDF notation compiler, <code>tnc</code>
          (see reference 2). Thus we have defined a token which directly
          accesses the TDF <code>floating_abs</code> construct.</para>
      </sect1>
  </chapter>
 
  <chapter id="S7">
    <title>Changes in tspec 2.0</title>

      <para>This document describes <code>tspec</code> version 2.0.
        <code>tspec</code> 2.0 contains significant changes from previous
        releases. For convenience the main changes which are visible to the
        <code>tspec</code> user are listed here:
        <itemizedlist>
          <listitem>
            <para>An additional specification level, named subsets of
              headers, has been introduced (see
              <link linkend="Levels">section 2.1</link>).  This has been done
              by introducing the <code>+SUBSET</code> construct and extending
              the <code>+IMPLEMENT</code> and <code>+USE</code> constructs, as
              well as the command-line options. The previous method of dealing
              with such subsets - namely shared headers - is now obsolete and
              its use is discouraged.</para>
          </listitem>
 
          <listitem>
            <para>A number of new command-line options have been added, and
              some of the existing options have been modified slightly (see
              <link linkend="Options">section 2.5</link>).</para>
          </listitem>

          <listitem>
            <para>The suffix <code>.api</code> has been added to the output
              directories (see <link linkend="Output">section 2.3</link>) to
              avoid possible confusion with other include file
              directories.</para>
          </listitem>

          <listitem>
            <para>The use of identifiers beginning with <code>~</code> as
              local variables is new (see <link linkend="Identifiers">section
              4.1.2</link>).</para>
          </listitem>

          <listitem>
            <para>The <code>+STATEMENT</code> and <code>+DEFINE</code>
              constructs (see <link linkend="Statement">section 4.5</link> and
              <link linkend="Define">section 4.6</link>) are new.</para>
          </listitem>

          <listitem>
            <para>The <code>(extern)</code>, <code>(weak)</code> and
              <code>(const)</code> qualifiers for <code>+FUNC</code> and
              <code>+EXP</code> (see <link linkend="Func">section 4.2</link> and
              <link linkend="Exp">section 4.3</link>) are new.</para>
          </listitem>
 
          <listitem>
            <para>The <code>(signed)</code> and <code>(unsigned)</code>
              qualifiers for <code>+TYPE</code> (see
              <link linkend="Type">section 4.7</link>) are new.</para>
          </listitem>

          <listitem>
            <para>The <code>~special</code> type constructor (see
              <link linkend="Typedef">section 4.8</link>) is new.</para>
          </listitem>

          <listitem>
            <para>The <code>~abstract</code> type constructor has been
              abandoned.</para>
          </listitem>

          <listitem>
            <para>The <code>+BASE_API</code> command described in
              <link linkend="FineImpl">section 6.1</link> is new.</para>
          </listitem>

          <listitem>
            <para>The indexing routines (see
              <link linkend="Index">section 6.3</link>) have been greatly
              improved.</para>
          </listitem>
        </itemizedlist></para>
  </chapter>
 
  <chapter id="S8">
    <title>References</title>

      <para><remark>"TDF and Portability"</remark>, DRA, 1993.</para>

      <para><remark>"The TDF Notation Compiler"</remark>, DRA, 1993.</para>

      <para><remark>"The C to TDF Producer"</remark>, DRA, 1993.</para>
  </chapter>
</book>