<!-- Crown Copyright (c) 1998 -->
<HTML>
<HEAD>
<TITLE>C Checker Reference Manual: API checking</TITLE>
</HEAD>
<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000FF" VLINK="#400080" ALINK="#FF0000">
<A NAME=S164>
<H1>C Checker Reference Manual</H1>
<H3>January 1998</H3>
<A HREF="tdfc22.html"><IMG SRC="../images/next.gif" ALT="next section"></A>
<A HREF="tdfc20.html"><IMG SRC="../images/prev.gif" ALT="previous section"></A>
<A HREF="tdfc1.html"><IMG SRC="../images/top.gif" ALT="current document"></A>
<A HREF="../index.html"><IMG SRC="../images/home.gif" ALT="TenDRA home page">
</A>
<IMG SRC="../images/no_index.gif" ALT="document index"><P>
<HR>
<DL>
<DT><A HREF="#S165"><B>G.1 </B> - Introduction</A><DD>
<DT><A HREF="#S166"><B>G.2 </B> - Specifying APIs to tcc</A><DD>
<DT><A HREF="#S167"><B>G.3 </B> - API Checking Examples</A><DD>
<DT><A HREF="#S168"><B>G.4 </B> - Redeclaring Objects in APIs</A><DD>
<DT><A HREF="#S169"><B>G.5 </B> - Defining Objects in APIs</A><DD>
<DT><A HREF="#S170"><B>G.6 </B> - Stepping Outside an API</A><DD>
<DT><A HREF="#S171"><B>G.7 </B> - Using the System Headers</A><DD>
<DT><A HREF="#S172"><B>G.8 </B> - Abstract API headers and API usage
analysis</A><DD>
</DL>
<HR>
<H1>G API checking</H1>
<A NAME=S165>
<HR><H2>G.1 Introduction</H2>
The token syntax described in the previous annex provides the means
of describing an API specification independently of any particular
implementation of the API. Every object in the API specification is
described using the appropriate #pragma token statement. These statements
are arranged in TenDRA header files corresponding to the headers comprising
the API. Each API consists of a separate set of header files. For
example, if the ANSI API is used, the statement:<P>
<PRE>
#include <sys/types.h>
</PRE>
will lead to a "header not found" error, whereas the header
will be found in the POSIX API. <P>
Where relationships exist between APIs these have been made explicit
in the headers. For example, the POSIX version of stdio.h consists
of the ANSI version plus some extra objects. This is implemented by
making the TenDRA header describing the POSIX version of stdio.h include
the ANSI version of stdio.h.<P>
<A NAME=S166>
<HR><H2>G.2 Specifying APIs to tcc</H2>
The API against which a program is to be checked is specified to tcc
by means of a command-line option of the form -Yapi where api is the
API name. For example, ANSI X3.159 is specified by -Yansi (this is
the default API) and POSIX 1003.1 is specified by -Yposix (for a full
list of the supported APIs see Chapter 2). <P>
Extension APIs, such as X11, require special attention. The API for
a program is never just X11, but X11 plus some base API, for example,
X11 plus POSIX or X11 plus XPG3. These composite APIs may be specified
by, for example, passing the options -Yposix -Yx5_lib (in that order)
to tcc to specify POSIX 1003.1 plus X11 (Release 5) Xlib. The rule
is that base APIs, such as POSIX, override the existing API, whereas
extension APIs, such as X11, extend it. The command-line option -info
causes tcc to print the API currently in use. For example:<P>
<PRE>
> tcc -Yposix -Yx5_lib -info file.c
</PRE>
will result in the message:<P>
<PRE>
tcc: Information: API is X11 Release 5 Xlib plus POSIX (1003.1).
</PRE>
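The override-or-extend rule can be modelled in a few lines of C. This is only a sketch of the rule as stated above, not a description of how tcc is implemented: the type api_selection and the functions is_extension and apply_api_option are hypothetical, and the sketch assumes that x5_lib is the only extension API.<P>

```c
#include <string.h>

#define MAX_EXT 4

/* Hypothetical model of an API selection: one base API plus a list of
   extension APIs.  This is not tcc's internal representation. */
struct api_selection {
    const char *base;           /* current base API, e.g. "posix" */
    const char *ext[MAX_EXT];   /* extension APIs added on top of it */
    int n_ext;
};

/* Assumption for this sketch: only x5_lib is an extension API; every
   other name is treated as a base API. */
static int is_extension(const char *api)
{
    return strcmp(api, "x5_lib") == 0;
}

/* Apply one -Yapi option: a base API replaces everything selected so
   far, whereas an extension API is appended to the selection. */
void apply_api_option(struct api_selection *sel, const char *api)
{
    if (is_extension(api)) {
        if (sel->n_ext < MAX_EXT) sel->ext[sel->n_ext++] = api;
    } else {
        sel->base = api;
        sel->n_ext = 0;
    }
}
```

In this model, applying posix and then x5_lib leaves both selected, whereas applying them in the opposite order would leave only posix, which is why the order of the options matters.<P>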
<A NAME=S167>
<HR><H2>G.3 API Checking Examples</H2>
As an example of the TenDRA compiler's API checking capabilities, consider
the following program which prints the names and inode numbers of
all the files in the current directory:<P>
<PRE>
#include <stdio.h>
#include <sys/types.h>
#include <dirent.h>

int main ()
{
    DIR *d = opendir ( "." );
    struct dirent *e;
    if ( d = NULL ) return ( 1 );
    while ( e = readdir ( d ), e != NULL ) {
        printf ( "%s %lu\n", e->d_name, e->d_ino );
    }
    closedir ( d );
    return ( 0 );
}
</PRE>
A first attempted compilation using strict checking:<P>
<PRE>
> tcc -Xs a.c
</PRE>
results in messages to the effect that the headers <sys/types.h>
and <dirent.h> cannot be found, plus a number of consequential
errors. This is because tcc is checking the program against the default
API, that is against the ANSI API, and the program is certainly not
ANSI compliant. It does look as if it might be POSIX compliant however,
so a second attempted compilation might be:<P>
<PRE>
> tcc -Xs -Yposix a.c
</PRE>
This results in one error and three warnings. Dealing with the warnings
first, the returns of the calls of printf and closedir are being discarded
and the variable d has been set and not used. The discarded function
returns are deliberate, so they can be made explicit by casting them
to void. The discarded assignment to d requires a little more thought
- it is due to the mistyping d = NULL instead of d == NULL on line
9. The error is more interesting. In full the error message reads:<P>
<PRE>
"a.c":11
printf ( "%s %lu\n", e->d_name, e->d_ino!!!! );
Error:ISO[6.5.2.1]{ANSI[3.5.2.1]}: The identifier 'd_ino' is not a member of
'struct/union posix.dirent.dirent'.
ISO[6.3.2.3]{ANSI[3.3.2.3]}: The second operand of '->' must be a member of
the struct/union pointed to by the first.
</PRE>
That is, struct dirent does not have a field called d_ino. In fact
this is true; while the d_name field of struct dirent is specified
in POSIX, the d_ino field is an XPG3 extension. (This example shows
that the TenDRA representation of APIs is able to differentiate between
APIs at a very fine level.) Therefore a third attempted compilation
might be:<P>
<PRE>
> tcc -Xs -Yxpg3 a.c
</PRE>
This leads to another error message concerning the printf statement,
that the types unsigned long and (the promotion of) ino_t are incompatible.
This is due to a mismatch between the printf format string "%lu"
and the type of e->d_ino. POSIX only says that ino_t is an arithmetic
type, not a specific type like unsigned long. The TenDRA representation
of POSIX reflects this abstract nature of ino_t, so that the potential
portability error is detected. In fact it is impossible to give a
printf string which works for all possible implementations of ino_t.
The best that can be done is to cast e->d_ino to some fixed type
like unsigned long and print that.<P>
Hence the corrected, XPG3 conformant program reads:<P>
<PRE>
#include <stdio.h>
#include <sys/types.h>
#include <dirent.h>

int main ()
{
    DIR *d = opendir ( "." );
    struct dirent *e;
    if ( d == NULL ) return ( 1 );
    while ( e = readdir ( d ), e != NULL ) {
        ( void ) printf ( "%s %lu\n", e->d_name, ( unsigned long ) e->d_ino );
    }
    ( void ) closedir ( d );
    return ( 0 );
}
</PRE>
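The cast can also be isolated in a small helper so that the assumption about ino_t is made in one place only. The following is a minimal sketch: the name format_ino is hypothetical, and it assumes a C99 library for snprintf. If ino_t is wider than unsigned long on some target, the cast discards the high-order bits.<P>

```c
#include <stdio.h>
#include <sys/types.h>

/* Format an inode number into buf.  ino_t is only known to be an
   arithmetic type, so it is converted to a fixed type which has a
   matching printf conversion.  Returns the result of snprintf. */
int format_ino(char *buf, size_t size, ino_t ino)
{
    return snprintf(buf, size, "%lu", (unsigned long) ino);
}
```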
<A NAME=S168>
<HR><H2>G.4 <A NAME=1>Redeclaring Objects in APIs</H2>
Of course, it is possible to redeclare the functions declared in the
TenDRA API descriptions within the program, provided they are consistent.
However, what constitutes a consistent redeclaration in the fully
abstract TenDRA machine is not as straightforward as it might seem;
an interesting example is malloc in the ANSI API. This is defined
by the prototype:<P>
<PRE>
void *malloc ( size_t );
</PRE>
where size_t is a target dependent unsigned integral type. The redeclaration:<P>
<PRE>
void *malloc ();
</PRE>
is only correct if size_t is its own integral promotion, and therefore
is not correct in general.<P>
Since it is not always desirable to remove these redeclarations (some
machines may not have all the necessary functions declared in their
system headers), the TenDRA compiler has a facility to accept inconsistent
redeclarations of API functions, which can be enabled by using the pragma:<P>
<PRE>
#pragma TenDRA incompatible interface declaration allow
</PRE>
This pragma suppresses the consistency checking of redeclarations
of API functions. Replacing <CODE>allow</CODE> by <CODE>warning</CODE>
causes a warning to be printed. In both cases the TenDRA API description
of the function takes precedence. The normal behaviour of flagging
inconsistent redeclarations as errors can be restored by replacing
<CODE>allow</CODE> by <CODE>disallow</CODE> in the pragma above. (There
are also equivalent command-line options to tcc of the form
-X:interface_decl=<EM>status</EM>, where <EM>status</EM> can be check, warn or dont.)<P>
<A NAME=S169>
<HR><H2>G.5 Defining Objects in APIs</H2>
Since the program API is meant to define the interface between what
the program defines and what the target machine defines, the TenDRA
compiler normally raises an error if any attempt is made to define
an object from the API in the program itself. A subtle example of
this is given by compiling the program:<P>
<PRE>
#include <errno.h>
extern int errno;
</PRE>
with the ANSI API. ANSI states that errno is an assignable lvalue
of type int, and the TenDRA description of the API therefore states
precisely that. The declaration
of errno as an extern int is therefore an inconsistent specification
of errno, but a consistent implementation. Accepting the lesser of
two evils, the error reported is therefore that an attempt has been
made to define errno despite the fact that it is part of the API.<P>
Note that if this same program is compiled using the POSIX API, in
which errno is explicitly specified to be an extern int, the program
merely contains a consistent redeclaration of errno and so does not
raise an error.<P>
The neatest workaround for the ANSI case, which preserves the declaration
for those machines which need it, is as follows: if errno is anything
other than an extern int it must be defined by a macro. Therefore:<P>
<PRE>
#include <errno.h>
#ifndef errno
extern int errno;
#endif
</PRE>
should always work.<P>
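As an illustration, the guarded declaration can be followed by ordinary code which assigns to and reads errno. This is a sketch only: the function parse_long is hypothetical, and it relies solely on the ANSI guarantee that strtol sets errno to ERANGE when the value is out of range.<P>

```c
#include <errno.h>
#include <stdlib.h>

#ifndef errno
extern int errno;   /* only for implementations which need it */
#endif

/* Parse a decimal long.  Returns 0 on success and -1 if the value is
   out of range for long, which strtol reports through errno. */
int parse_long(const char *text, long *out)
{
    errno = 0;
    *out = strtol(text, NULL, 10);
    return errno == ERANGE ? -1 : 0;
}
```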
In most other examples, the definitions are more obvious. For example,
a programmer might provide a memory allocator containing versions
of malloc, free etc.:<P>
<PRE>
#include <stdlib.h>
void *malloc ( size_t sz )
{
....
}
void free ( void *ptr )
{
....
}
</PRE>
If this is deliberate then the TenDRA compiler needs to be told to
ignore the API definitions of these objects and to use those provided
instead. This is done by listing the objects to be ignored using the
pragma:<P>
<PRE>
#pragma ignore malloc free ....
</PRE>
(also see section G.10). This should be placed between the API specification
and the object definitions. The provided definitions are checked for
conformance with the API specifications. There are special forms of
this pragma to enable field selectors and objects in the tag namespace
to be defined. For example, if we wish to provide a definition of
the type div_t from stdlib.h we need to ignore three objects - the
type itself and its two field selectors - quot and rem. The definition
would therefore take the form:<P>
<PRE>
#include <stdlib.h>
#pragma ignore div_t div_t.quot div_t.rem
typedef struct {
int quot;
int rem;
} div_t;
</PRE>
Similarly if we wish to define struct lconv from locale.h the definition
would take the form:<P>
<PRE>
#include <locale.h>
#pragma ignore TAG lconv TAG lconv.decimal_point
....
struct lconv {
char *decimal_point;
....
};
</PRE>
to take into account that lconv lies in the tag name space. By defining
objects in the API in this way, we are actually constructing a less
general version of the API. This will potentially restrict the portability
of the resultant program, and so should not be done without good reason.<P>
<A NAME=S170>
<HR><H2>G.6 Stepping Outside an API</H2>
Using the TenDRA compiler to check a program against a standard API
will only be effective if the appropriate API description is available
to the program being tested (just as a program can only be compiled
on a conventional machine if the program API is implemented on that
machine). What can be done for a program whose API is not supported
depends on the degree to which the program API differs from an existing
TenDRA API description. If the program API is POSIX with a small extension,
say, then it may be possible to express that extension to the TenDRA
compiler. For large unsupported program APIs it may be possible to
use the system headers on a particular machine to allow for partial
program checking (see section G.7).<P>
For small API extensions the ideal method would be to use the token
syntax described in the previous annex to express the program API to
the TenDRA compiler; however, this is not currently encouraged because
the syntax of such API descriptions is not yet firmly fixed. For the time being
it may be possible to use C to express much of the information the
TenDRA compiler needs to check the program. For example, POSIX specifies
that sys/stat.h contains a number of macros, S_ISDIR, S_ISREG, and
so on, which are used to test whether a file is a directory, a regular
file, etc. Suppose that a program is basically POSIX conformant, but
uses the additional macro S_ISLNK to test whether the file is a symbolic
link (this is in COSE and AES, but not POSIX). A proper TenDRA description
of S_ISLNK would contain the information that it was a macro taking
a mode_t and returning an int; however, for checking purposes it is
sufficient merely to give the types. This can be done by pretending
that S_ISLNK is a function:<P>
<PRE>
#ifdef __TenDRA__
/* For TenDRA checking purposes only */
extern int S_ISLNK ( mode_t );
/* actually a macro */
#endif
</PRE>
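With that declaration in place, code which uses S_ISLNK alongside the POSIX macros can be checked in the usual way. The following sketch shows such a use. The function mode_kind is hypothetical, and when the code is built with an ordinary compiler it assumes that the system's sys/stat.h provides S_ISLNK as a macro.<P>

```c
#include <sys/types.h>
#include <sys/stat.h>

#ifdef __TenDRA__
/* For TenDRA checking purposes only: actually a macro */
extern int S_ISLNK ( mode_t );
#endif

/* Classify a file mode: 'd' for a directory, 'f' for a regular file,
   'l' for a symbolic link and '?' for anything else. */
char mode_kind(mode_t mode)
{
    if (S_ISDIR(mode)) return 'd';
    if (S_ISREG(mode)) return 'f';
    if (S_ISLNK(mode)) return 'l';
    return '?';
}
```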
More complex examples might require an object in the API to be defined
in order to provide more information about it (see G.5). For example,
suppose that a program is basically ANSI compliant, but assumes that
FILE is a structure with a field file_no of type int (representing
the file number), rather than a generic type. This might be expressed
by:<P>
<PRE>
#ifdef __TenDRA__
/* For TenDRA checking purposes only */
#pragma ignore FILE
typedef struct {
/* there may be other fields here */
int file_no;
/* there may be other fields here */
} FILE;
#endif
</PRE>
The methods of API description above are what might be called "example
implementations" rather than the "abstract implementations"
of the actual TenDRA API descriptions. They should only be used as
a last resort, when there is no alternative way of expressing the
program within a standard API. For example, there may be no need to
access the file_no field of a FILE directly, since POSIX provides
a function, fileno, for this purpose. Extending an API in general
reduces the number of potential target machines for the corresponding
program.<P>
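For the file number example, the portable alternative might look like the sketch below, which stays within POSIX and so needs no extension to the API. The wrapper name stream_file_number is hypothetical.<P>

```c
#include <stdio.h>

/* Return the file number of a stream through the POSIX interface,
   without assuming anything about the layout of FILE. */
int stream_file_number(FILE *stream)
{
    return fileno(stream);
}
```

A program written this way can be checked against the POSIX API description without the FILE definition shown above.<P>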
<A NAME=S171>
<HR><H2>G.7 Using the System Headers</H2>
One possibility if a program API is not supported by the TenDRA compiler
is to use the set of system headers on the particular machine on which
tcc happens to be running. Of course, this means that the API checking
facilities of the TenDRA compiler will not be effective, but it is
possible that the other program checking aspects will be of use.<P>
The system headers are not, and indeed are not intended to be, portable.
A simple-minded approach to portability checking with the system headers
could lead to more portability problems being found in the system
headers than in the program itself. A more sophisticated approach
involves applying different compilation modes to the system headers
and to the program. The program itself can be checked very rigorously,
while the system headers have very lax checks applied.<P>
This could be done directly, by putting a wrapper around each system
header describing the mode to be applied to that header. However the
mechanism of named compilation modes (see 2.2) provides an alternative
solution. In addition to the normal -Idir command-line option, tcc
also supports the option -Nname:dir, which is identical except that
it also associates the identifier name with the directory dir. Once
a directory has been named in this way, the name can be used in a
directive:<P>
<PRE>
#pragma TenDRA directory <EM>name</EM> use environment <EM>mode</EM>
</PRE>
which tells tcc to apply the named compilation mode, mode, to any
files included from the directory, name. This is the mechanism used
to specify the checks to be applied to the system headers.<P>
The system headers may be specified to tcc using the -Ysystem command-line
option. This specifies /usr/include as the directory to search for
headers and passes a system start-up file to tcc. This system start-up
file contains any macro definitions which are necessary for tcc to
navigate the system headers correctly, plus a description of the compilation
mode to be used in compiling the system headers.<P>
In fact, before searching /usr/include, tcc searches another directory
for system headers. This is intended to hold modified versions of
any system headers which cause particular problems or require extra
information. For example:<P>
<UL>
<LI>A version of stdio.h is provided for all systems, which contains
the declarations of printf and similar functions necessary for tcc
to apply its printf-string checks (see 3.3.2).<P>
<LI>A version of stdlib.h is provided for all systems which includes
the declarations of exit and similar functions necessary for tcc to
apply its flow analysis correctly (see 5.7).<P>
<LI>Versions of stdarg.h and varargs.h are provided for all systems
which work with tcc. Most system headers contain built-in functions
which are recognised by cc (but not tcc) to deal with these.<P>
</UL>
The user can also use this directory to modify any system headers
which cause problems. For example, not all system headers declare
all the functions they should, so it might be desirable to add these
declarations.<P>
It should be noted that the system headers and the TenDRA API headers
do not mix well. Both are parts of coherent systems of header files,
and unless the intersection is very small, it is not usually possible
to combine parts of these systems sensibly.<P>
Even a separation, such as compiling some modules of a program using
a TenDRA API description and others using the system headers, can
lead to problems in the intermodular linking phase (see Chapter 9).
There will almost certainly be type inconsistency errors since the
TenDRA headers and the system headers will have different representations
of the same object.<P>
<A NAME=S172>
<HR><H2>G.8 <A NAME=6>Abstract API headers and API usage analysis</H2>
The abstract standard headers provided with the tool are the basis
for the API usage analysis checking on dump files described in Chapter
9. The declarations in each abstract header file are enclosed by the
following pragmas:<P>
<PRE>
#pragma TenDRA declaration block <EM>API_name</EM> begin
#pragma TenDRA declaration block end
</PRE>
<CODE>API_name</CODE> has a standard form e.g. <EM>api__ansi__stdio</EM>
for stdio.h in the ANSI API.<P>
This information is output in the dump format as the start and end
of a header scope, i.e.<P>
<PRE>
SSH position ref_no = <API_name>
SEH position ref_no
</PRE>
The first occurrence of each identifier in the dump output contains
scope information; in the case of an identifier declared in the abstract
headers, this scope information will normally refer to a header scope.
Since each use of the identifier can be traced back to its declaration,
this provides a means of tracking API usage within the application
when the abstract headers are used. The disadvantage of this method
is that only APIs for which abstract headers are available can be
used. Objects which are not part of the standard APIs are not available,
and if an application requires such an identifier (or indeed attempts
to use a standard API identifier for which the appropriate header
has not been included) the resulting errors may distort or even completely
halt the dump output, resulting in incomplete or incorrect analysis.<P>
The second method of API analysis allows compilation of the application
against the system headers, thereby overcoming the problems of non-standard
API usage mentioned above. The dump of the application can be scanned
to determine the identifiers which are used but not defined within
the application itself. These identifiers form the program's external
API with the system headers and libraries, and can be compared with
API reference information, provided by dump output files produced
from the abstract standard headers, to determine the application's
API usage.<P>
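The comparison in this second method amounts to a set difference followed by a membership test. The sketch below illustrates the idea on plain arrays of identifier names. It does not read the dump format, and the functions contains and external_api are hypothetical.<P>

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if name occurs in list[0..n-1], otherwise 0. */
static int contains(const char *const *list, size_t n, const char *name)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (strcmp(list[i], name) == 0) return 1;
    }
    return 0;
}

/* Count the identifiers which are used but not defined by the
   application (its external API).  The number of those which are
   absent from the reference API is stored in *outside. */
size_t external_api(const char *const *used, size_t n_used,
                    const char *const *defined, size_t n_defined,
                    const char *const *api, size_t n_api,
                    size_t *outside)
{
    size_t i, external = 0;
    *outside = 0;
    for (i = 0; i < n_used; i++) {
        if (contains(defined, n_defined, used[i])) continue;
        external++;
        if (!contains(api, n_api, used[i])) (*outside)++;
    }
    return external;
}
```

A non-zero value in *outside indicates that the application steps outside the reference API for at least one identifier.<P>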
<HR>
<P><I>Part of the <A HREF="../index.html">TenDRA Web</A>.<BR>Crown
Copyright © 1998.</I></P>
</BODY>
</HTML>