<!-- Crown Copyright (c) 1998 -->
<HTML>
<HEAD>
<TITLE>C Checker Reference Manual: API checking</TITLE>
</HEAD>
<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000FF" VLINK="#400080" ALINK="#FF0000">
<A NAME=S164>
<H1>C Checker Reference Manual</H1>
<H3>January 1998</H3>
<A HREF="tdfc22.html"><IMG SRC="../images/next.gif" ALT="next section"></A>
<A HREF="tdfc20.html"><IMG SRC="../images/prev.gif" ALT="previous section"></A>
<A HREF="tdfc1.html"><IMG SRC="../images/top.gif" ALT="current document"></A>
<A HREF="../index.html"><IMG SRC="../images/home.gif" ALT="TenDRA home page">
</A>
<IMG SRC="../images/no_index.gif" ALT="document index"><P>
<HR>
<DL>
<DT><A HREF="#S165"><B>G.1 </B> - Introduction</A><DD>
<DT><A HREF="#S166"><B>G.2 </B> - Specifying APIs to tcc</A><DD>
<DT><A HREF="#S167"><B>G.3 </B> - API Checking Examples</A><DD>
<DT><A HREF="#S168"><B>G.4 </B> - Redeclaring Objects in APIs</A><DD>
<DT><A HREF="#S169"><B>G.5 </B> - Defining Objects in APIs</A><DD>
<DT><A HREF="#S170"><B>G.6 </B> - Stepping Outside an API</A><DD>
<DT><A HREF="#S171"><B>G.7 </B> - Using the System Headers</A><DD>
<DT><A HREF="#S172"><B>G.8 </B> - Abstract API headers and API usage analysis</A><DD>
</DL>

<HR>
<H1>G  API checking</H1>
<A NAME=S165>
<HR><H2>G.1  Introduction</H2>
The token syntax described in the previous annex provides the means
of describing an API specification independently of any particular
implementation of the API. Every object in the API specification is
described using the appropriate #pragma token statement. These statements
are arranged in TenDRA header files corresponding to the headers comprising
the API. Each API consists of a separate set of header files. For
example, if the ANSI API is used, the statement:<P>
<PRE>
	#include &lt;sys/types.h&gt;
</PRE>
will lead to a &quot;header not found&quot; error, whereas the header
will be found in the POSIX API. <P>
Where relationships exist between APIs these have been made explicit
in the headers. For example, the POSIX version of stdio.h consists
of the ANSI version plus some extra objects. This is implemented by
making the TenDRA header describing the POSIX version of stdio.h include
the ANSI version of stdio.h.<P>
<A NAME=S166>
<HR><H2>G.2  Specifying APIs to tcc</H2>
The API against which a program is to be checked is specified to tchk
by means of a command-line option of the form -Yapi where api is the
API name. For example, ANSI X3.159 is specified by -Yansi (this is
the default API) and POSIX 1003.1 is specified by -Yposix (for a full
list of the supported APIs see Chapter 2). <P>
Extension APIs, such as X11, require special attention. The API for
a program is never just X11, but X11 plus some base API, for example,
X11 plus POSIX or X11 plus XPG3. These composite APIs may be specified
by, for example, passing the options -Yposix -Yx5_lib (in that order)
to tcc to specify POSIX 1003.1 plus X11 (Release 5) Xlib. The rule
is that base APIs, such as POSIX, override the existing API, whereas
extension APIs, such as X11, extend it. The command-line option -info
causes tcc to print the API currently in use. For example:<P>
<PRE>
	&gt; tcc -Yposix -Yx5_lib -info file.c
</PRE>
will result in the message:<P>
<PRE>
	tcc: Information: API is X11 Release 5 Xlib plus POSIX (1003.1).
</PRE>
<A NAME=S167>
<HR><H2>G.3  API Checking Examples</H2>
As an example of the TenDRA compiler's API checking capabilities, consider
the following program which prints the names and inode numbers of
all the files in the current directory:<P>
<PRE>
	#include &lt;stdio.h&gt;
	#include &lt;sys/types.h&gt;
	#include &lt;dirent.h&gt;
	int main ()
	{
		DIR *d = opendir ( &quot;.&quot; );
		struct dirent *e;
		if (d = NULL) return ( 1 );
		while(e=readdir(d),e!=NULL) 
		{
			printf ( &quot;%s %lu\n&quot;, e-&gt;d_name, e-&gt;d_ino );
		}
		closedir ( d );
		return ( 0 );
	}
</PRE>
A first attempted compilation using strict checking:<P>
<PRE>
	&gt; tcc -Xs a.c 
</PRE>
results in messages to the effect that the headers &lt;sys/types.h&gt;
and &lt;dirent.h&gt; cannot be found, plus a number of consequential
errors. This is because tcc is checking the program against the default
API, that is against the ANSI API, and the program is certainly not
ANSI compliant. It does look as if it might be POSIX compliant however,
so a second attempted compilation might be:<P>
<PRE>
	&gt; tcc -Xs -Yposix a.c
</PRE>
This results in one error and three warnings. Dealing with the warnings
first, the returns of the calls of printf and closedir are being discarded
and the variable d has been set and not used. The discarded function
returns are deliberate, so they can be made explicit by casting them
to void. The discarded assignment to d requires a little more thought
- it is due to the mistyping d = NULL instead of d == NULL on line
9. The error is more interesting. In full the error message reads:<P>
<PRE>
	&quot;a.c&quot;:11
		printf ( &quot;%s %lu\n&quot;, e-&gt;d_name, e-&gt;d_ino!!!! );
	Error:ISO[6.5.2.1]{ANSI[3.5.2.1]}: The identifier 'd_ino' is not a member of 
	'struct/union posix.dirent.dirent'.
	ISO[6.3.2.3]{ANSI[3.3.2.3]}: The second operand of '-&gt;' must be a member of 
	the struct/union pointed to by the first.
</PRE>
That is, struct dirent does not have a field called d_ino. In fact
this is true; while the d_name field of struct dirent is specified
in POSIX, the d_ino field is an XPG3 extension (this example shows
that the TenDRA representation of APIs is able to differentiate between
APIs at a very fine level). Therefore a third attempted compilation
might be:<P>
<PRE>
	&gt; tcc -Xs -Yxpg3 a.c
</PRE>
This leads to another error message concerning the printf statement,
that the types unsigned long and (the promotion of) ino_t are incompatible.
This is due to a mismatch between the printf format string &quot;%lu&quot;
and the type of e-&gt;d_ino. POSIX only says that ino_t is an arithmetic
type, not a specific type like unsigned long. The TenDRA representation
of POSIX reflects this abstract nature of ino_t, so that the potential
portability error is detected. In fact it is impossible to give a
printf string which works for all possible implementations of ino_t.
The best that can be done is to cast e-&gt;d_ino to some fixed type
like unsigned long and print that.<P>
Hence the corrected, XPG3 conformant program reads:<P>
<PRE>
	#include &lt;stdio.h&gt;
	#include &lt;sys/types.h&gt;
	#include &lt;dirent.h&gt;
	int main ()
	{
		DIR *d = opendir ( &quot;.&quot; );
		struct dirent *e;
		if ( d == NULL ) return (1);
		while(e=readdir(d),e!=NULL)
		{
			( void ) printf ( &quot;%s %lu\n&quot;, e-&gt;d_name,( unsigned long ) e-&gt;d_ino );
		}
		( void ) closedir ( d );
		return ( 0 );
	}
</PRE>
<A NAME=S168>
<HR><H2>G.4  <A NAME=1>Redeclaring Objects in APIs</H2>
Of course, it is possible to redeclare the functions declared in the
TenDRA API descriptions within the program, provided they are consistent.
However, what constitutes a consistent redeclaration in the fully
abstract TenDRA machine is not as straightforward as it might seem;
an interesting example is malloc in the ANSI API. This is defined
by the prototype:<P>
<PRE>
	void *malloc ( size_t ); 
</PRE>
where size_t is a target dependent unsigned integral type. The redeclaration:<P>
<PRE>
	void *malloc (); 
</PRE>
is only correct if size_t is its own integral promotion, and therefore
is not correct in general.<P>
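The promotion condition can be probed on a given implementation with a small
test. The following is only a sketch: the function names are invented for
this illustration, and comparing sizes is an approximation of "is its own
integral promotion" (a type could be promoted to a different type of the same
size). It relies on the fact that unary plus applies the integral promotions
to its operand.<P>

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 when a size_t value keeps its size
   after the integral promotions (applied here by unary plus), 0 if
   it is widened. Equal size is used as an approximation of "same type". */
int size_t_is_own_promotion(void)
{
    return sizeof(+(size_t)0) == sizeof(size_t);
}

/* The same probe for unsigned short, which is widened to int or
   unsigned int wherever short is narrower than int, and so shows
   the failing case. */
int ushort_is_own_promotion(void)
{
    return sizeof(+(unsigned short)0) == sizeof(unsigned short);
}
```

On an implementation where the first function returns 1 the unprototyped
redeclaration happens to be harmless, but a checker that treats size_t
abstractly cannot assume this.<P>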
Since it is not always desirable to remove these redeclarations (some
machines may not have all the necessary functions declared in their
system headers) the TenDRA compiler has a facility to accept inconsistent
redeclarations of API functions, which can be enabled by using the pragma:<P>
<PRE>
	#pragma TenDRA incompatible interface declaration allow
</PRE>
This pragma suppresses the consistency checking of redeclarations
of API functions. Replacing <CODE>allow</CODE> by <CODE>warning</CODE>
causes a warning to be printed. In both cases the TenDRA API description
of the function takes precedence. The normal behaviour of flagging
inconsistent redeclarations as errors can be restored by replacing
<CODE>allow</CODE> by <CODE>disallow</CODE> in the pragma above. (There
are also equivalent command-line options to tcc of the form
-X:interface_decl=<EM>status</EM>, where <EM>status</EM> can be check, warn or dont.)<P>
<A NAME=S169>
<HR><H2>G.5  Defining Objects in APIs</H2>
Since the program API is meant to define the interface between what
the program defines and what the target machine defines, the TenDRA
compiler normally raises an error if any attempt is made to define
an object from the API in the program itself. A subtle example of
this is given by compiling the program:<P>
<PRE>
	#include &lt;errno.h&gt;
	extern int errno;
</PRE>
with the ANSI API. ANSI states that errno is an assignable lvalue
of type int, and the TenDRA description of the API therefore states
precisely that. The declaration
of errno as an extern int is therefore an inconsistent specification
of errno, but a consistent implementation. Accepting the lesser of
two evils, the error reported is therefore that an attempt has been
made to define errno despite the fact that it is part of the API.<P>
Note that if this same program is compiled using the POSIX API, in
which errno is explicitly specified to be an extern int, the program
merely contains a consistent redeclaration of errno and so does not
raise an error.<P>
The neatest workaround for the ANSI case, which preserves the declaration
for those machines which need it, is as follows: if errno is anything
other than an extern int it must be defined by a macro. Therefore:<P>
<PRE>
	#include &lt;errno.h&gt;
	#ifndef errno
	extern int errno;
	#endif
</PRE>
should always work.<P>
In most other examples, the definitions are more obvious. For example,
a programmer might provide a memory allocator containing versions
of malloc, free etc.:<P>
<PRE>
	#include &lt;stdlib.h&gt;
	void *malloc ( size_t sz )
	{
		....
	}
	void free ( void *ptr )
	{
		....
	}
</PRE>
If this is deliberate then the TenDRA compiler needs to be told to
ignore the API definitions of these objects and to use those provided
instead. This is done by listing the objects to be ignored using the
pragma:<P>
<PRE>
	#pragma ignore malloc free ....
</PRE>
(also see section G.10). This should be placed between the API specification
and the object definitions. The provided definitions are checked for
conformance with the API specifications. There are special forms of
this pragma to enable field selectors and objects in the tag namespace
to be defined. For example, if we wish to provide a definition of
the type div_t from stdlib.h we need to ignore three objects - the
type itself and its two field selectors - quot and rem. The definition
would therefore take the form:<P>
<PRE>
	#include &lt;stdlib.h&gt;
	#pragma ignore div_t div_t.quot div_t.rem
	typedef struct {
		int quot;
		int rem;
	} div_t;
</PRE>
Similarly if we wish to define struct lconv from locale.h the definition
would take the form:<P>
<PRE>
	#include &lt;locale.h&gt;
	#pragma ignore TAG lconv TAG lconv.decimal_point 
	....
	struct lconv {
		char *decimal_point;
		....
	};
</PRE>
to take into account that lconv lies in the tag name space. By defining
objects in the API in this way, we are actually constructing a less
general version of the API. This will potentially restrict the portability
of the resultant program, and so should not be done without good reason.<P>
<A NAME=S170>
<HR><H2>G.6  Stepping Outside an API</H2>
Using the TenDRA compiler to check a program against a standard API
will only be effective if the appropriate API description is available
to the program being tested (just as a program can only be compiled
on a conventional machine if the program API is implemented on that
machine). What can be done for a program whose API is not supported
depends on the degree to which the program API differs from an existing
TenDRA API description. If the program API is POSIX with a small extension,
say, then it may be possible to express that extension to the TenDRA
compiler. For large unsupported program APIs it may be possible to
use the system headers on a particular machine to allow for partial
program checking (see section G.7).<P>
For small API extensions the ideal method would be to use the token
syntax described in the previous annex to express the program API to the
TenDRA compiler; however, this is not currently encouraged because the syntax
of such API descriptions is not yet firmly fixed. For the time being
it may be possible to use C to express much of the information the
TenDRA compiler needs to check the program. For example, POSIX specifies
that sys/stat.h contains a number of macros, S_ISDIR, S_ISREG, and
so on, which are used to test whether a file is a directory, a regular
file, etc. Suppose that a program is basically POSIX conformant, but
uses the additional macro S_ISLNK to test whether the file is a symbolic
link (this is in COSE and AES, but not POSIX). A proper TenDRA description
of S_ISLNK would contain the information that it was a macro taking
a mode_t and returning an int; however, for checking purposes it is
sufficient to merely give the types. This can be done by pretending
that S_ISLNK is a function:<P>
<PRE>
	#ifdef __TenDRA__
	/* For TenDRA checking purposes only */
	extern int S_ISLNK ( mode_t ); 
	/* actually a macro */
	#endif
</PRE>
More complex examples might require an object in the API to be defined
in order to provide more information about it (see G.5). For example,
suppose that a program is basically ANSI compliant, but assumes that
FILE is a structure with a field file_no of type int (representing
the file number), rather than a generic type. This might be expressed
by:<P>
<PRE>
	#ifdef __TenDRA__
	/* For TenDRA checking purposes only */
	#pragma ignore FILE
	typedef struct {
	/* there may be other fields here */
		int file_no;
	/* there may be other fields here */
	} FILE;
	#endif
</PRE>
The methods of API description above are what might be called &quot;example
implementations&quot; rather than the &quot;abstract implementations&quot;
of the actual TenDRA API descriptions. They should only be used as
a last resort, when there is no alternative way of expressing the
program within a standard API. For example, there may be no need to
access the file_no field of a FILE directly, since POSIX provides
a function, fileno, for this purpose. Extending an API in general
reduces the number of potential target machines for the corresponding
program.<P>
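As an illustration of staying within a standard API, the file number can be
obtained through fileno without assuming anything about the layout of FILE.
This is a minimal sketch: the wrapper name stream_file_number is invented for
the example, and the feature-test macro is an assumption about what a strict
compilation environment may need in order to expose the POSIX declaration.<P>

```c
/* Request POSIX declarations; some strict environments hide fileno otherwise. */
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Hypothetical wrapper: returns the file number of a stream using the
   POSIX function fileno, rather than reading an implementation-specific
   field such as file_no from the FILE structure. */
int stream_file_number(FILE *stream)
{
    return fileno(stream);
}
```

A program written this way needs the POSIX API rather than an extended ANSI
API, and no definition of FILE has to be supplied for checking.<P>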
<A NAME=S171>
<HR><H2>G.7  Using the System Headers</H2>
One possibility if a program API is not supported by the TenDRA compiler
is to use the set of system headers on the particular machine on which
tcc happens to be running. Of course, this means that the API checking
facilities of the TenDRA compiler will not be effective, but it is
possible that the other program checking aspects will be of use.<P>
The system headers are not, and indeed are not intended to be, portable.
A simple-minded approach to portability checking with the system headers
could lead to more portability problems being found in the system
headers than in the program itself. A more sophisticated approach
involves applying different compilation modes to the system headers
and to the program. The program itself can be checked very rigorously,
while the system headers have very lax checks applied.<P>
This could be done directly, by putting a wrapper around each system
header describing the mode to be applied to that header. However the
mechanism of named compilation modes (see 2.2) provides an alternative
solution. In addition to the normal -Idir command-line option, tcc
also supports the option -Nname:dir, which is identical except that
it also associates the identifier name with the directory dir. Once
a directory has been named in this way, the name can be used in a
directive:<P>
<PRE>
	#pragma TenDRA directory <EM>name</EM> use environment <EM>mode</EM>
</PRE>
which tells tcc to apply the named compilation mode, mode, to any
files included from the directory, name. This is the mechanism used
to specify the checks to be applied to the system headers.<P>
The system headers may be specified to tcc using the -Ysystem command-line
option. This specifies /usr/include as the directory to search for
headers and passes a system start-up file to tcc. This system start-up
file contains any macro definitions which are necessary for tcc to
navigate the system headers correctly, plus a description of the compilation
mode to be used in compiling the system headers.<P>
In fact, before searching /usr/include, tcc searches another directory
for system headers. This is intended to hold modified versions of
any system headers which cause particular problems or require extra
information. For example:<P>
<UL>
<LI>A version of stdio.h is provided for all systems, which contains
the declarations of printf and similar functions necessary for tcc
to apply its printf-string checks (see 3.3.2).<P>
<LI>A version of stdlib.h is provided for all systems which includes
the declarations of exit and similar functions necessary for tcc to
apply its flow analysis correctly (see 5.7).<P>
<LI>Versions of stdarg.h and varargs.h are provided for all systems
which work with tcc. Most system headers contain built-in functions
which are recognised by cc (but not tcc) to deal with these.<P>
</UL>
The user can also use this directory to modify any system headers
which cause problems. For example, not all system headers declare
all the functions they should, so it might be desirable to add these
declarations.<P>
It should be noted that the system headers and the TenDRA API headers
do not mix well. Both are parts of coherent systems of header files,
and unless the intersection is very small, it is not usually possible
to combine parts of these systems sensibly.<P>
Even a separation, such as compiling some modules of a program using
a TenDRA API description and others using the system headers, can
lead to problems in the intermodular linking phase (see Chapter 9).
There will almost certainly be type inconsistency errors since the
TenDRA headers and the system headers will have different representations
of the same object.<P>
<A NAME=S172>
<HR><H2>G.8  <A NAME=6>Abstract API headers and API usage analysis</H2>
The abstract standard headers provided with the tool are the basis
for the API usage analysis checking on dump files described in Chapter
9. The declarations in each abstract header file are enclosed by the
following pragmas:<P>
<PRE>
	#pragma TenDRA declaration block <EM>API_name</EM> begin
	#pragma TenDRA declaration block end
</PRE>
<CODE>API_name</CODE> has a standard form, e.g. <EM>api__ansi__stdio</EM>
for stdio.h in the ANSI API.<P>
This information is output in the dump format as the start and end
of a header scope, i.e.<P>
<PRE>
	SSH	position	ref_no = &lt;API_name&gt;
	SEH	position	ref_no
</PRE>
The first occurrence of each identifier in the dump output contains
scope information; in the case of an identifier declared in the abstract
headers, this scope information will normally refer to a header scope.
Since each use of the identifier can be traced back to its declaration,
this provides a means of tracking API usage within the application
when the abstract headers are used. The disadvantage of this method
is that only APIs for which abstract headers are available can be
used. Objects which are not part of the standard APIs are not available,
and if an application requires such an identifier (or indeed attempts
to use a standard API identifier for which the appropriate header
has not been included) the resulting errors may distort or even completely
halt the dump output, resulting in incomplete or incorrect analysis.<P>
The second method of API analysis allows compilation of the application
against the system headers, thereby overcoming the problems of non-standard
API usage mentioned above. The dump of the application can be scanned
to determine the identifiers which are used but not defined within
the application itself. These identifiers form the program's external
API with the system headers and libraries, and can be compared with
API reference information, provided by dump output files produced
from the abstract standard headers, to determine the application's
API usage.<P>
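The core of this second method is a set difference: the identifiers used,
minus those the application defines itself. The sketch below shows only that
step, on plain arrays of names. It does not read the dump format, and the
function names contains and external_identifiers are invented for this
illustration.<P>

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if name occurs among the n entries of list, 0 otherwise. */
static int contains(const char *const *list, size_t n, const char *name)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (strcmp(list[i], name) == 0) return 1;
    }
    return 0;
}

/* Hypothetical helper: copies into out every identifier that is used
   but not defined by the application, in order of use, and returns how
   many were copied. out must have room for n_used entries. */
size_t external_identifiers(const char *const *used, size_t n_used,
                            const char *const *defined, size_t n_defined,
                            const char **out)
{
    size_t i, count = 0;
    for (i = 0; i < n_used; i++) {
        if (!contains(defined, n_defined, used[i])) {
            out[count++] = used[i];
        }
    }
    return count;
}
```

The resulting list is what would then be compared against the reference
information derived from the abstract standard headers.<P>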
<!-- FM pgf ignored -->
<HR>
<P><I>Part of the <A HREF="../index.html">TenDRA Web</A>.<BR>Crown
Copyright &copy; 1998.</I></P>
</BODY>
</HTML>