<!-- Crown Copyright (c) 1998 -->
<HTML>
<HEAD>
<TITLE>TDF and Portability: Portability</TITLE>
</HEAD>
<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000FF" VLINK="#400080" ALINK="#FF0000">
<A NAME=S2>
<H1>TDF and Portability</H1>
<H3>January 1998</H3>
<A HREF="port4.html"><IMG SRC="../images/next.gif" ALT="next section"></A>
<A HREF="port1.html"><IMG SRC="../images/prev.gif" ALT="previous section"></A>
<A HREF="port1.html"><IMG SRC="../images/top.gif" ALT="current document"></A>
<A HREF="../index.html"><IMG SRC="../images/home.gif" ALT="TenDRA home page">
</A>
<IMG SRC="../images/no_index.gif" ALT="document index"><P>
<HR>
<DL>
<DT><A HREF="#S3"><B>2.1</B> - Portable Programs</A><DD>
<DL>
<DT><A HREF="#S4"><B>2.1.1</B> - Definitions and Preliminary Discussion</A><DD>
<DT><A HREF="#S5"><B>2.1.2</B> - Separation and Combination of Code</A><DD>
<DT><A HREF="#S6"><B>2.1.3</B> - Application Programming Interfaces</A><DD>
<DT><A HREF="#S7"><B>2.1.4</B> - Compilation Phases</A><DD>
</DL>
<DT><A HREF="#S8"><B>2.2</B> - Portability Problems</A><DD>
<DL>
<DT><A HREF="#S9"><B>2.2.1</B> - Programming Problems</A><DD>
<DT><A HREF="#S10"><B>2.2.2</B> - Code Transformation Problems</A><DD>
<DT><A HREF="#S11"><B>2.2.3</B> - Code Combination Problems</A><DD>
<DT><A HREF="#S12"><B>2.2.4</B> - API Problems</A><DD>
<DL>
<DT><A HREF="#S13"><B>2.2.4.1</B> - API Checking</A><DD>
<DT><A HREF="#S14"><B>2.2.4.2</B> - API Implementation Errors</A><DD>
<DT><A HREF="#S15"><B>2.2.4.3</B> - System Header Problems</A><DD>
<DT><A HREF="#S16"><B>2.2.4.4</B> - System Library Problems</A><DD>
</DL>
</DL>
<DT><A HREF="#S17"><B>2.3</B> - APIs and Portability</A><DD>
<DL>
<DT><A HREF="#S18"><B>2.3.1</B> - Target Dependent Code</A><DD>
<DT><A HREF="#S19"><B>2.3.2</B> - Making APIs Explicit</A><DD>
<DT><A HREF="#S20"><B>2.3.3</B> - Choosing an API</A><DD>
<DT><A HREF="#S21"><B>2.3.4</B> - Alternative Program Versions</A><DD>
</DL>
</DL>
<HR>
<H1>2. Portability</H1>
We start by examining some of the problems involved in the writing
of portable programs. Although the discussion is very general, and
makes no mention of TDF, many of the ideas introduced are of importance
in the second half of the paper, which deals with TDF.<P>
<A NAME=S3>
<HR><H2>2.1. Portable Programs</H2>
<A NAME=S4>
<H3>2.1.1. Definitions and Preliminary Discussion</H3>
Let us first say what we mean by a portable program. A program is
portable to a number of machines if it can be compiled to give the
same functionality on all those machines. Note that this does not
mean that exactly the same source code is used on all the machines.
One could envisage a program written in, say, 68020 assembly code
for a certain machine which has been translated into 80386 assembly
code for some other machine to give a program with exactly equivalent
functionality. This would, under our definition, be a program which
is portable to these two machines. At the other end of the scale,
the C program:<P>
<PRE>
#include &lt;stdio.h&gt;

int main ()
{
    fputs ( "Hello world\n", stdout ) ;
    return ( 0 ) ;
}
</PRE>
which prints the message, "Hello world", onto the standard
output stream, will be portable to a vast range of machines without
any need for rewriting. Most of the portable programs we shall be
considering fall closer to the latter end of the spectrum - they will
largely consist of target independent source with small sections of
target dependent source for those constructs for which target independent
expression is either impossible or of inadequate efficiency.<P>
Note that we are defining portability in terms of a set of target
machines and not as some universal property. The act of modifying
an existing program to make it portable to a new target machine is
called porting. Clearly, in the examples above, porting the first program
would be a highly complex task involving almost an entire rewrite,
whereas in the second case it should be trivial.<P>
<A NAME=S5>
<H3>2.1.2. Separation and Combination of Code</H3>
So why is the second example above more portable (in the sense of
more easily ported to a new machine) than the first? The first, obvious,
point to be made is that it is written in a high-level language, C,
rather than the low-level languages, 68020 and 80386 assembly codes,
used in the first example. By using a high-level language we have
abstracted out the details of the processor to be used and expressed
the program in an architecture neutral form. It is one of the jobs
of the compiler on the target machine to transform this high-level
representation into the appropriate machine dependent low-level representation.
<P>
The second point is that the second example program is not in itself
complete. The objects <CODE>fputs</CODE> and <CODE>stdout</CODE>,
representing the procedure to output a string and the standard output
stream respectively, are left undefined. Instead the header <CODE>stdio.h</CODE>
is included on the understanding that it contains the specification
of these objects.<P>
A version of this file is to be found on each target machine. On a
particular machine it might contain something like:<P>
<PRE>
typedef struct {
    int __cnt ;
    unsigned char *__ptr ;
    unsigned char *__base ;
    short __flag ;
    char __file ;
} FILE ;

extern FILE __iob [60] ;
#define stdout ( &amp;__iob [1] )

extern int fputs ( const char *, FILE * ) ;
</PRE>
meaning that the type <CODE>FILE</CODE> is defined by the given structure,
<CODE>__iob</CODE> is an external array of 60 <CODE>FILE</CODE> objects,
<CODE>stdout</CODE> is a pointer to the second element of this array,
and that <CODE>fputs</CODE> is an external procedure which takes a
<CODE>const char *</CODE> and a <CODE>FILE *</CODE> and returns an
<CODE>int</CODE>. On a different machine, the details may be different
(exactly what we can, or cannot, assume is the same on all target
machines is discussed below).<P>
These details are fed into the program by the pre-processing phase
of the compiler. (The various compilation phases are discussed in
more detail later - see Fig. 1.) This is a simple, preliminary textual
substitution. It provides the definitions of the type <CODE>FILE</CODE>
and the value <CODE>stdout</CODE> (in terms of <CODE>__iob</CODE>),
but leaves the precise definitions of <CODE>__iob</CODE> and
<CODE>fputs</CODE> still unresolved (although we do know their types).
The definitions of these values are not provided until the final phase
of the compilation - linking - where they are linked in from the precompiled
system libraries.<P>
Note that, even after just the pre-processing phase, our portable program
has already been transformed into a target dependent form, because of the
substitution of the target dependent values from <CODE>stdio.h</CODE>.
If we had also included the definitions of <CODE>__iob</CODE> and,
more particularly, <CODE>fputs</CODE>, things would have been even
worse - the procedure for outputting a string to the screen is likely
to be highly target dependent.<P>
To conclude, we have, by including <CODE>stdio.h</CODE>, been able
to effectively separate the target independent part of our program
(the main program) from the target dependent part (the details of
<CODE>stdout</CODE> and <CODE>fputs</CODE>). It is one of the jobs
of the compiler to recombine these parts to produce a complete program.<P>
<A NAME=S6>
<H3>2.1.3. Application Programming Interfaces</H3>
As we have seen, the separation of the target dependent sections of
a program into the system headers and system libraries greatly facilitates
the construction of portable programs. What has been done is to define
an interface between the main program and the existing operating system
on the target machine in abstract terms. The program should then be
portable to any machine which implements this interface correctly.<P>
The interface for the "Hello world" program above might
be described as follows: defined in the header <CODE>stdio.h</CODE>
are a type <CODE>FILE</CODE> representing a file, an object <CODE>stdout</CODE>
of type <CODE>FILE *</CODE> representing the standard output file,
and a procedure <CODE>fputs</CODE> with prototype:<P>
<PRE>
int fputs ( const char *s, FILE *f ) ;
</PRE>
which prints the string <CODE>s</CODE> to the file <CODE>f</CODE>.
This is an example of an Application Programming Interface (API).
Note that it can be split into two aspects, the syntactic (what the
objects are) and the semantic (what they mean). On any machine which implements
this API our program is both syntactically correct and does what we
expect it to.<P>
The benefit of describing the API at this fairly high level is that
it leaves scope for a range of implementations (and thus more machines
which implement it) while still encapsulating the main program's requirements.
<P>
In the example implementation of <CODE>stdio.h</CODE> above we see
that this machine implements this API correctly syntactically, but
not necessarily semantically. One would have to read the documentation
provided on the system to be sure of the semantics.<P>
Another way of defining an API for this program would be to note that
the given API is a subset of the ANSI C standard. Thus we could take
ANSI C as an "off the shelf" API. It is then clear that
our program should be portable to any ANSI-compliant machine.<P>
It is worth emphasising that all programs have an API, even if it
is implicit rather than explicit. However it is probably fair to say
that programs without an explicit API are only portable by accident.
We shall have more to say on this subject later.<P>
<A NAME=S7>
<H3>2.1.4. Compilation Phases</H3>
The general plan for how to write the extreme example of a portable
program, namely one which contains no target dependent code, is now
clear. It is shown in the compilation diagram in Fig. 1 which represents
the traditional compilation process. This diagram is divided into
four sections. The left half of the diagram represents the actual
program and the right half the associated API. The top half of the
diagram represents target independent material - things which only
need to be done once - and the bottom half target dependent material
- things which need to be done on every target machine.<P>
FIGURE 1. Traditional Compilation Phases
<BR>
<CENTER>
<IMG SRC="../images/trad_scheme.gif">
</CENTER>
<BR>
So, we write our target independent program (top left), conforming
to the target independent API specification (top right). All the compilation
actually takes place on the target machine. This machine must have
the API correctly implemented (bottom right). This implementation
will in general be in two parts - the system headers, providing type
definitions, macros, procedure prototypes and so on, and the system
libraries, providing the actual procedure definitions. Another way
of characterising this division is between syntax (the system headers)
and semantics (the system libraries).<P>
The compilation is divided into three main phases. Firstly the system
headers are inserted into the program by the pre-processor. This produces,
in effect, a target dependent version of the original program. This
is then compiled into a binary object file. During the compilation
process the compiler inserts all the information it has about the
machine - including the Application Binary Interface (ABI) - the sizes
of the basic C types, how they are combined into compound types, the
system procedure calling conventions and so on. This ensures that
in the final linking phase the binary object file and the system libraries
are obeying the same ABI, thereby producing a valid executable. (On
a dynamically linked system this final linking phase takes place partially
at run time rather than at compile time, but this does not really
affect the general scheme.)<P>
The compilation scheme just described consists of a series of phases
of two types: code combination (the pre-processing and system linking
phases) and code transformation (the actual compilation phases). The
existence of the combination phases allows for the effective separation
of the target independent code (in this case, the whole program) from
the target dependent code (in this case, the API implementation),
thereby aiding the construction of portable programs. These ideas
on the separation, combination and transformation of code underlie
the TDF approach to portability.<P>
<A NAME=S8>
<HR><H2>2.2. Portability Problems</H2>
We have set out a scheme whereby it should be possible to write portable
programs with a minimum of difficulties. So why, in reality, does
writing them cause so many problems? Recall that we are still primarily concerned
with programs which contain no target dependent code, although most
of the points raised apply by extension to all programs.<P>
<A NAME=S9>
<H3>2.2.1. Programming Problems</H3>
A first, obvious class of problems concerns the program itself. It
is to be assumed that as many bugs as possible have been eliminated
by testing and debugging on at least one platform before a program
is considered as a candidate for being a portable program. But for
even the most self-contained program, working on one platform is no
guarantee of working on another. The program may use undefined behaviour
- using uninitialised values or dereferencing null pointers, for example
- or have built-in assumptions about the target machine - whether
it is big-endian or little-endian, or what the sizes of the basic
integer types are, for example. This latter point is going to become
increasingly important over the next couple of years as 64-bit architectures
begin to be introduced. How many existing programs implicitly assume
a 32-bit architecture?<P>
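As an illustration of avoiding such built-in assumptions, the following sketch stores a 32-bit quantity in a fixed byte order using only shifts and masks. It therefore depends neither on the endianness of the target machine nor on <CODE>long</CODE> being exactly 32 bits wide. The names <CODE>put_be32</CODE> and <CODE>get_be32</CODE> are invented for this example:<P>

```c
/* Store the low 32 bits of v in buf, most significant byte first.
   This works whatever the byte order of the host and whatever the
   width of unsigned long (ANSI guarantees at least 32 bits). */
void put_be32 ( unsigned char *buf, unsigned long v )
{
    buf [0] = ( unsigned char ) ( ( v >> 24 ) & 0xff ) ;
    buf [1] = ( unsigned char ) ( ( v >> 16 ) & 0xff ) ;
    buf [2] = ( unsigned char ) ( ( v >> 8 ) & 0xff ) ;
    buf [3] = ( unsigned char ) ( v & 0xff ) ;
    return ;
}

/* Recover the value written by put_be32 */
unsigned long get_be32 ( const unsigned char *buf )
{
    return ( ( ( unsigned long ) buf [0] << 24 ) |
             ( ( unsigned long ) buf [1] << 16 ) |
             ( ( unsigned long ) buf [2] << 8 ) |
             ( unsigned long ) buf [3] ) ;
}
```

Copying the bytes of a <CODE>long</CODE> directly would be shorter, but would bake in both the byte order and the size of the type on the machine where the program was first written.<P>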
Many of these built-in assumptions may arise because of the conventional
porting process. A program is written on one machine, modified slightly
to make it work on a second machine, and so on. This means that the
program is "biased" towards the existing set of target machines,
and most particularly to the original machine it was written on. This
applies not only to assumptions about endianness, say, but also to
the questions of API conformance which we will be discussing below.<P>
Most compilers will pick up some of the grosser programming errors,
particularly by type checking (including procedure arguments if prototypes
are used). Some of the subtler errors can be detected using the <B>-Wall</B>
option to the Free Software Foundation's GNU C Compiler (<CODE>gcc</CODE>)
or separate program checking tools such as <CODE>lint</CODE>, for
example, but this remains a very difficult area.<P>
<A NAME=S10>
<H3>2.2.2. Code Transformation Problems</H3>
We now move on from programming problems to compilation problems.
As we mentioned above, compilation may be regarded as a series of
phases of two types: combination and transformation. Transformation
of code - translating a program in one form into an equivalent program
in another form - may lead to a variety of problems. The code may
be transformed wrongly, so that the equivalence is broken (a compiler
bug), or in an unexpected manner (differing compiler interpretations),
or not at all, because it is not recognised as legitimate code (a
compiler limitation). The last two problems are most likely when
the input is a high-level language, with complex syntax and semantics.<P>
Note that in Fig. 1 all the actual compilation takes place on the
target machine. So, to port the program to <I>n</I> machines, we need
to deal with the bugs and limitations of <I>n</I>, potentially different,
compilers. For example, if you have written your program using prototypes,
it is going to be a large and rather tedious job porting it to a compiler
which does not have prototypes (this particular example can be automated;
not all such jobs can). Other compiler limitations can be surprising
- not understanding the <CODE>L</CODE> suffix for long numeric literals
and not allowing members of enumeration types as array indexes are
among the problems drawn from my personal experience.<P>
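For the prototype case, one long-standing idiom is to hide the argument list behind a macro keyed on <CODE>__STDC__</CODE>, so that a single declaration serves both kinds of compiler. The sketch below is an illustration of that idiom, not a description of any particular tool; the macro name <CODE>PROTO</CODE> and the routine <CODE>count_char</CODE> are invented for this example:<P>

```c
/* PROTO is a hypothetical helper macro: it keeps the argument list
   when the compiler supports prototypes and drops it otherwise. */
#ifdef __STDC__
#define PROTO( args ) args
#else
#define PROTO( args ) ()
#endif

/* One declaration, usable by both kinds of compiler */
extern int count_char PROTO ( ( const char *, int ) ) ;

/* The definition still needs both forms of procedure header */
#ifdef __STDC__
int count_char ( const char *s, int c )
#else
int count_char ( s, c )
char *s ;
int c ;
#endif
{
    int n = 0 ;
    while ( *s ) {
        if ( *s == c ) n++ ;
        s++ ;
    }
    return ( n ) ;
}
```

The double parentheses in the declaration are what allow a whole argument list, commas included, to be passed as the single macro argument <CODE>args</CODE>.<P>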
The differing compiler interpretations may be more subtle. For example,
there are differences between ANSI and "traditional" C which
may trap the unwary. Examples are the promotion of integral types
and the resolution of the linkage of static objects.<P>
Many of these problems may be reduced by using the "same"
compiler on all the target machines. For example, <CODE>gcc</CODE>
has a single front end (C -&gt; RTL) which may be combined with an
appropriate back end (RTL -&gt; target) to form a suitable compiler
for a wide range of target machines. The existence of a single front
end virtually eliminates the problems of differing interpretation
of code and compiler quirks. It also reduces the exposure to bugs.
Instead of being exposed to the bugs in <I>n</I> separate compilers,
we are now only exposed to bugs in one half-compiler (the front end)
plus <I>n</I> half-compilers (the back ends) - a total of <I>( n +
1 ) / 2</I>. (This calculation is not meant totally seriously, but
it is true in principle.) Front end bugs, when tracked down, also
only require a single workaround.<P>
<A NAME=S11>
<H3>2.2.3. Code Combination Problems</H3>
If code transformation problems may be regarded as a time-consuming
irritation, involving the rewriting of sections of code or using a
different compiler, the second class of problems, those concerned
with the combination of code, is far more serious.<P>
The first code combination phase is the pre-processor pulling in the
system headers. These can contain some nasty surprises. For example,
consider a simple ANSI compliant program which contains a linked list
of strings arranged in alphabetical order. This might also contain
a routine:<P>
<PRE>
void index ( char * ) ;
</PRE>
which adds a string to this list in the appropriate position, using
<CODE>strcmp</CODE> from <CODE>string.h</CODE> to find it. This works
fine on most machines, but on some it gives the error:<P>
<PRE>
Only 1 argument to macro 'index'
</PRE>
The reason for this is that the system version of <CODE>string.h</CODE>
contains the line:<P>
<PRE>
#define index( s, c ) strchr ( s, c )
</PRE>
But this has nothing to do with ANSI; the macro is defined for compatibility
with BSD.<P>
In reality the system headers on any given machine are a hodgepodge
of implementations of different APIs, and it is often virtually impossible
to separate them (feature test macros such as <CODE>_POSIX_SOURCE</CODE>
are of some use, but are not always implemented and do not always
produce a complete separation; they are only provided for "standard"
APIs anyway). The problem above arose because there is no transitivity
rule of the form: if program <I>P</I> conforms to API <I>A</I>, and
API <I>B</I> extends <I>A</I>, then <I>P</I> conforms to <I>B</I>.
The only reason this rule fails is namespace problems of this kind.<P>
A second example demonstrates a slightly different point. The POSIX
standard states that <CODE>sys/stat.h</CODE> contains the definition
of the structure <CODE>struct stat</CODE>, which includes several
members, amongst them:<P>
<PRE>
time_t st_atime ;
</PRE>
representing the access time for the corresponding file. So the program:<P>
<PRE>
#include &lt;sys/types.h&gt;
#include &lt;sys/stat.h&gt;

time_t st_atime ( struct stat *p )
{
    return ( p-&gt;st_atime ) ;
}
</PRE>
should be perfectly valid - the procedure name <CODE>st_atime</CODE>
and the field selector <CODE>st_atime</CODE> occupy different namespaces
(see however the appendix on namespaces and APIs below). However at
least one popular operating system has the implementation:<P>
<PRE>
struct stat {
    ....
    union {
        time_t st__sec ;
        timestruc_t st__tim ;
    } st_atim ;
    ....
} ;
#define st_atime st_atim.st__sec
</PRE>
This seems like a perfectly legitimate implementation. In the program
above the field selector <CODE>st_atime</CODE> is replaced by
<CODE>st_atim.st__sec</CODE> by the pre-processor, as intended, but unfortunately so is
the procedure name <CODE>st_atime</CODE>, leading to a syntax error.<P>
The problem here is not with the program or the implementation, but
in the way they were combined. C does not allow individual field selectors
to be defined. Instead the indiscriminate sledgehammer of macro substitution
was used, leading to the problem described.<P>
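One way of sidestepping this particular clash, sketched below, is to keep the program's own identifiers out of the <CODE>st_</CODE> namespace altogether. Whether the implementation makes <CODE>st_atime</CODE> a genuine field or a macro then no longer matters to the procedure name. The name <CODE>access_time</CODE> is invented for this example:<P>

```c
#include <sys/types.h>
#include <sys/stat.h>

/* access_time is a hypothetical name chosen to avoid the st_ prefix.
   Only the field selector is exposed to any macro substitution. */
time_t access_time ( const struct stat *p )
{
    return ( p->st_atime ) ;
}
```

This does not cure the underlying problem, which lies in how the header and the program are combined; it merely moves the program out of the way of this one implementation choice.<P>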
Problems can also occur in the other combination phase of the traditional
compilation scheme, the system linking. Consider the ANSI compliant
routine:<P>
<PRE>
#include &lt;stdio.h&gt;

int open ( char *nm )
{
    int c, n = 0 ;
    FILE *f = fopen ( nm, "r" ) ;
    if ( f == NULL ) return ( -1 ) ;
    while ( c = getc ( f ), c != EOF ) n++ ;
    ( void ) fclose ( f ) ;
    return ( n ) ;
}
</PRE>
which opens the file <CODE>nm</CODE>, returning its size in bytes
if it exists and -1 otherwise. As a quick porting exercise, I compiled
it under six different operating systems. On three it worked correctly;
on one it returned -1 even when the file existed; and on two it crashed
with a segmentation error.<P>
The reason for this lies in the system linking. On those machines
which failed, the library routine <CODE>fopen</CODE> calls (either
directly or indirectly) the library routine <CODE>open</CODE> (which
is in POSIX, but not ANSI). The system linker, however, linked my
routine <CODE>open</CODE> instead of the system version, so the call
to <CODE>fopen</CODE> did not work correctly.<P>
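A hedged sketch of one defence is to give the routine internal linkage and a name less likely to collide with the system libraries. A <CODE>static</CODE> procedure is not visible to the linker, so it cannot be substituted for a library routine of the same name; the price is that it can only be called from the file in which it is defined. The name <CODE>file_size</CODE> is invented for this example:<P>

```c
#include <stdio.h>

/* Count the bytes in the file nm, returning -1 if it cannot be
   opened. The static keyword keeps the name out of the linker's
   external namespace. */
static long file_size ( const char *nm )
{
    int c ;
    long n = 0 ;
    FILE *f = fopen ( nm, "r" ) ;
    if ( f == NULL ) return ( -1 ) ;
    while ( c = getc ( f ), c != EOF ) n++ ;
    ( void ) fclose ( f ) ;
    return ( n ) ;
}
```

Renaming alone would also have avoided the clash described above, but only for the libraries one happens to know about; internal linkage removes the dependence on what other names the system libraries use.<P>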
So code combination problems are primarily namespace problems. The
task of combining the program with the API implementation on a given
platform is complicated by the fact that, because the system headers
and system libraries contain things other than the API implementation,
or even because of the particular implementation chosen, the various
namespaces in which the program is expected to operate become "polluted".
<P>
<A NAME=S12>
<H3>2.2.4. API Problems</H3>
We have said that the API defines the interface between the program
and the standard library provided with the operating system on the
target machine. There are three main problems concerned with APIs.
The first, how to choose the API in the first place, is discussed
separately. Here we deal with the compilation aspects: how to check
that the program conforms to its API, and what to do about incorrect
API implementations on the target machine(s).<P>
<A NAME=S13>
<H4>2.2.4.1. API Checking</H4>
The problem of whether or not a program conforms to its API - not
using any objects from the operating system other than those specified
in the API, and not making any unwarranted assumptions about these
objects - is one which does not always receive sufficient attention,
mostly because the necessary checking tools do not exist (or at least
are not widely available). Compiling the program on a number of API
compliant machines merely checks the program against the system headers
for these machines. For a genuine portability check we need to check
against the abstract API description, thereby in effect checking against
all possible implementations.<P>
Recall from above that the system headers on a given machine are an
amalgam of all the APIs it implements. This can cause programs which
should compile not to, because of namespace clashes; but it may also
cause programs to compile which should not, because they have used
objects which are not in their API, but which are in the system headers.
For example, the supposedly ANSI compliant program:<P>
<PRE>
#include &lt;signal.h&gt;
int sig = SIGKILL ;
</PRE>
will compile on most systems, despite the fact that <CODE>SIGKILL</CODE>
is not an ANSI signal, because <CODE>SIGKILL</CODE> is in POSIX, which
is also implemented in the system <CODE>signal.h</CODE>. Again, feature
test macros are of some use in trying to isolate the implementation
of a single API from the rest of the system headers. However they
are highly unlikely to detect the error in the following supposedly
POSIX compliant program which prints the entries of the directory
<CODE>nm</CODE>, together with their inode numbers:<P>
<PRE>
#include &lt;stdio.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;dirent.h&gt;

void listdir ( char *nm )
{
    struct dirent *entry ;
    DIR *dir = opendir ( nm ) ;
    if ( dir == NULL ) return ;
    while ( entry = readdir ( dir ), entry != NULL ) {
        printf ( "%s : %d\n", entry-&gt;d_name, ( int ) entry-&gt;d_ino ) ;
    }
    ( void ) closedir ( dir ) ;
    return ;
}
</PRE>
This is not POSIX compliant because, whereas the <CODE>d_name</CODE>
field of <CODE>struct dirent</CODE> is in POSIX, the <CODE>d_ino</CODE>
field is not. It is however in XPG3, so it is likely to be in many
system implementations.<P>
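A sketch of how the same listing might be produced without <CODE>d_ino</CODE>, on the assumption that the API provides <CODE>stat</CODE> and the <CODE>st_ino</CODE> field of <CODE>struct stat</CODE>, is given below. It is an illustration only: <CODE>stat</CODE> follows symbolic links, so on systems which have them the number printed is that of the file linked to, and the fixed buffer size of 1024 is an arbitrary choice made for this example:<P>

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>

/* Print the entries of directory nm with their inode numbers, taking
   the inode from stat rather than from the d_ino field. Returns the
   number of entries printed, or -1 if nm cannot be opened. */
int listdir ( const char *nm )
{
    char path [1024] ;
    struct stat st ;
    struct dirent *entry ;
    int n = 0 ;
    DIR *dir = opendir ( nm ) ;
    if ( dir == NULL ) return ( -1 ) ;
    while ( entry = readdir ( dir ), entry != NULL ) {
        /* Skip names which would not fit in the path buffer */
        if ( strlen ( nm ) + strlen ( entry->d_name ) + 2 > sizeof ( path ) ) {
            continue ;
        }
        ( void ) sprintf ( path, "%s/%s", nm, entry->d_name ) ;
        /* Skip entries which cannot be examined */
        if ( stat ( path, &st ) != 0 ) continue ;
        printf ( "%s : %ld\n", entry->d_name, ( long ) st.st_ino ) ;
        n++ ;
    }
    ( void ) closedir ( dir ) ;
    return ( n ) ;
}
```

The extra call to <CODE>stat</CODE> for every entry is the cost of staying within the smaller API.<P>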
The previous examples have been concerned with simply telling whether
or not a particular object is in an API. A more difficult, and in
a way more important, problem is that of assuming too much about the
objects which are in the API. For example, in the program:<P>
<PRE>
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;

div_t d = { 3, 4 } ;

int main ()
{
    printf ( "%d,%d\n", d.quot, d.rem ) ;
    return ( 0 ) ;
}
</PRE>
the ANSI standard specifies that the type <CODE>div_t</CODE> is a
structure containing two fields, <CODE>quot</CODE> and <CODE>rem</CODE>,
of type <CODE>int</CODE>, but it does not specify which order these
fields appear in, or indeed if there are other fields. Therefore the
initialisation of <CODE>d</CODE> is not portable. Again, the type
<CODE>time_t</CODE> is used to represent times in seconds since a
certain fixed date. On most systems this is implemented as <CODE>long</CODE>,
so it is tempting to use <CODE>( t &amp; 1 )</CODE> to determine for
a <CODE>time_t</CODE> <CODE>t</CODE> whether this number of seconds
is odd or even. But ANSI actually says that <CODE>time_t</CODE> is
an arithmetic, not an integer, type, so it would be possible for it
to be implemented as <CODE>double</CODE>. But in this case <CODE>(
t &amp; 1 )</CODE> is not even type correct, so it is not a portable
way of finding out whether <CODE>t</CODE> is odd or even.<P>
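Both assumptions can be avoided by using only what the API actually promises. The sketch below sets the fields of a <CODE>div_t</CODE> by name, and obtains a number of seconds from <CODE>difftime</CODE>, the ANSI routine for the difference of two <CODE>time_t</CODE> values, before testing its parity. The names <CODE>make_div</CODE> and <CODE>odd_seconds_between</CODE> are invented for this example, and the second routine assumes that the difference is small enough to fit in a <CODE>long</CODE>:<P>

```c
#include <stdlib.h>
#include <time.h>

/* Build a div_t by naming the fields rather than relying on the
   order in which the implementation happens to declare them. */
div_t make_div ( int quot, int rem )
{
    div_t d ;
    d.quot = quot ;
    d.rem = rem ;
    return ( d ) ;
}

/* Is the whole number of seconds from t0 to t1 odd? No assumption
   is made about the representation of time_t, but the difference
   is assumed to fit in a long. */
int odd_seconds_between ( time_t t1, time_t t0 )
{
    long s = ( long ) difftime ( t1, t0 ) ;
    return ( s % 2 != 0 ) ;
}
```

Neither routine mentions the layout of <CODE>div_t</CODE> or the underlying type of <CODE>time_t</CODE>, so both remain type correct on an implementation which makes different choices.<P>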
|
|
|
513 |
<A NAME=S14>
<H4>2.2.4.2. API Implementation Errors</H4>
Undoubtedly the problem which causes the writer of portable programs
the greatest headache (and heartache) is that of incorrect API implementations.
However carefully you have chosen your API and checked that your program
conforms to it, you are still reliant on someone (usually the system
vendor) having implemented this API correctly on the target machine.
Machines which do not implement the API at all do not enter the equation
(they are not suitable target machines); what causes problems is incorrect
implementations. As the implementation may be divided into two parts
- system headers and system libraries - we shall similarly divide
our discussion. Inevitably the choice of examples is personal; anyone
who has ever attempted to port a program to a new machine is likely
to have their own favourite examples.<P>
<A NAME=S15>
<H4>2.2.4.3. System Header Problems</H4>
Some header problems are immediately apparent because they are syntactic
and cause the program to fail to compile. For example, values may
not be defined or be defined in the wrong place (not in the header
prescribed by the API).<P>
A common example (one which I have to include a workaround for in
virtually every program I write) is that <CODE>EXIT_SUCCESS</CODE>
and <CODE>EXIT_FAILURE</CODE> are not always defined (ANSI specifies
that they should be in <CODE>stdlib.h</CODE>). It is tempting to change
<CODE>exit (EXIT_FAILURE)</CODE> to <CODE>exit (1)</CODE> because
"everyone knows" that <CODE>EXIT_FAILURE</CODE> is 1. But
this decreases the portability of the program because it ties
it to a particular class of implementations. A better workaround would
be:<P>
<PRE>
#include &lt;stdlib.h&gt;
#ifndef EXIT_FAILURE
#define EXIT_FAILURE 1
#endif
</PRE>
which assumes that anyone choosing a non-standard value for
<CODE>EXIT_FAILURE</CODE> is more likely to put it in <CODE>stdlib.h</CODE>.
Of course, if one subsequently came across a machine on which not only is
<CODE>EXIT_FAILURE</CODE> not defined, but also the value it should have
is not 1, then it would be necessary to resort to
<CODE>#ifdef machine_name</CODE> statements. The same is true of all the
API implementation problems we shall be discussing: non-conformant machines
require workarounds involving conditional compilation. As more machines
are considered, so these conditional compilations multiply.<P>
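The same pattern extends naturally to <CODE>EXIT_SUCCESS</CODE>. The following
is only a sketch: the fallback values 0 and 1 are assumptions about the target
(ANSI does guarantee that a status of 0 indicates success, but says nothing
about 1), and the helper <CODE>exit_status</CODE> is a hypothetical name used
for illustration.<P>

```c
#include <stdlib.h>

/* Fallbacks for headers which omit the ANSI exit codes.  The value 1 is
   an assumption about the target machine, not something ANSI promises. */
#ifndef EXIT_SUCCESS
#define EXIT_SUCCESS 0
#endif
#ifndef EXIT_FAILURE
#define EXIT_FAILURE 1
#endif

/* Map a boolean outcome onto the status to be passed to exit. */
int exit_status ( int ok )
{
    return ( ok ? EXIT_SUCCESS : EXIT_FAILURE ) ;
}
```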
As an example of things being defined in the wrong place, ANSI specifies
that <CODE>SEEK_SET</CODE>, <CODE>SEEK_CUR</CODE> and <CODE>SEEK_END</CODE>
should be defined in <CODE>stdio.h</CODE>, whereas POSIX specifies
that they should also be defined in <CODE>unistd.h</CODE>. It is not
uncommon to find machines on which they are defined in the latter
but not in the former. A possible workaround in this case would be:<P>
<PRE>
#include &lt;stdio.h&gt;
#ifndef SEEK_SET
#include &lt;unistd.h&gt;
#endif
</PRE>
Of course, by including "unnecessary" headers like
<CODE>unistd.h</CODE> the risk of namespace clashes such as those discussed
above is increased.<P>
A final syntactic problem, which perhaps should belong with the system
header problems above, concerns dependencies between the headers themselves.
For example, the POSIX header <CODE>unistd.h</CODE> declares functions
involving some of the types <CODE>pid_t</CODE>, <CODE>uid_t</CODE>
etc., defined in <CODE>sys/types.h</CODE>. Is it necessary to include
<CODE>sys/types.h</CODE> before including <CODE>unistd.h</CODE>, or
does <CODE>unistd.h</CODE> automatically include <CODE>sys/types.h</CODE>?
The approach of playing safe and including everything will normally
work, but this can lead to multiple inclusions of a header. This will
normally cause no problems because the system headers are protected
against multiple inclusions by means of macros, but it is not unknown
for certain headers to be left unprotected. Also, not all header dependencies
are as clear cut as the one given, so which headers need to be
included, and in what order, is in fact target dependent.<P>
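One common defence is to gather the order-sensitive inclusions into a single
project header which is itself protected against multiple inclusion. In this
sketch the guard macro <CODE>PORT_UNISTD_H</CODE> and the routine
<CODE>current_pid</CODE> are hypothetical names; placing
<CODE>sys/types.h</CODE> first is the cautious ordering for a POSIX target,
not something every machine requires.<P>

```c
/* Contents of a hypothetical wrapper header.  The guard ensures that a
   second inclusion of the wrapper does not pull in the system headers
   again, even if they are themselves unprotected. */
#ifndef PORT_UNISTD_H
#define PORT_UNISTD_H
#include <sys/types.h>    /* pid_t, uid_t etc. come first ... */
#include <unistd.h>       /* ... so that these declarations can use them */
#endif

/* Example use: the process identifier, via the POSIX routine getpid. */
pid_t current_pid ( void )
{
    return ( getpid () ) ;
}
```

The rest of the program then includes only the wrapper, so that any target
dependent ordering is recorded in one place.<P>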
There can also be semantic errors in the system headers, namely wrongly
defined values. The following two examples are taken from real operating
systems. Firstly the definition:<P>
<PRE>
#define DBL_MAX 1.797693134862316E+308
</PRE>
in <CODE>float.h</CODE> on an IEEE-compliant machine is subtly wrong
- the given value does not fit into a <CODE>double</CODE> - the correct
value is:<P>
<PRE>
#define DBL_MAX 1.7976931348623157E+308
</PRE>
Again, the type definition:<P>
<PRE>
typedef int size_t ; /* ??? */
</PRE>
(sic) is not compliant with ANSI, which says that <CODE>size_t</CODE>
is an unsigned integer type. (I'm not sure if this is better or worse
than another system which defines <CODE>ptrdiff_t</CODE> to be
<CODE>unsigned int</CODE> when it is meant to be signed. This would mean
that the difference between any two pointers is always positive.) These
particular examples are irritating because it would have cost nothing to get
things right, correcting the value of <CODE>DBL_MAX</CODE> and changing
the definition of <CODE>size_t</CODE> to <CODE>unsigned int</CODE>.
These corrections are so minor that the modified system headers would
still be a valid interface for the existing system libraries (we shall
have more to say about this later). However, it is not possible to
change the system headers, so it is necessary to build workarounds
into the program. Whereas in the first case it is possible to devise
such a workaround:<P>
<PRE>
#include &lt;float.h&gt;
#ifdef machine_name
#undef DBL_MAX
#define DBL_MAX 1.7976931348623157E+308
#endif
</PRE>
for example, in the second, because <CODE>size_t</CODE> is defined
by a <CODE>typedef</CODE> it is virtually impossible to correct in
a simple fashion. Thus any program which relies on the fact that
<CODE>size_t</CODE> is unsigned will require considerable rewriting before it
can be ported to this machine.<P>
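Although such a header cannot be repaired from within the program, the fault
can at least be detected. The following sketch converts -1 to each type and
tests its sign; the function names are hypothetical, and a test of this kind
reports the problem rather than working round it.<P>

```c
#include <stddef.h>

/* Non-zero if size_t is unsigned, as ANSI requires: converting -1 to an
   unsigned type gives its largest value, which is greater than zero. */
int size_t_is_unsigned ( void )
{
    return ( ( size_t ) -1 > ( size_t ) 0 ) ;
}

/* Non-zero if ptrdiff_t is signed, as ANSI requires. */
int ptrdiff_t_is_signed ( void )
{
    return ( ( ptrdiff_t ) -1 < ( ptrdiff_t ) 0 ) ;
}
```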
<A NAME=S16>
<H4>2.2.4.4. System Library Problems</H4>
The system header problems just discussed are primarily syntactic
problems. By contrast, system library problems are primarily semantic
- the provided library routines do not behave in the way specified
by the API. This makes them harder to detect. For example, consider
the routine:<P>
<PRE>
void *realloc ( void *p, size_t s ) ;
</PRE>
which reallocates the block of memory <CODE>p</CODE> to have size
<CODE>s</CODE> bytes, returning the new block of memory. The ANSI
standard says that if <CODE>p</CODE> is the null pointer, then the
effect of <CODE>realloc ( p, s )</CODE> is the same as
<CODE>malloc ( s )</CODE>, that is, to allocate a new block of memory of
size <CODE>s</CODE>. This behaviour is exploited in the following program,
in which the routine <CODE>add_char</CODE> adds a character to the expanding
array, <CODE>buffer</CODE>:<P>
<PRE>
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;

char *buffer = NULL ;
int buff_sz = 0, buff_posn = 0 ;

void add_char ( char c )
{
    if ( buff_posn &gt;= buff_sz ) {
        buff_sz += 100 ;
        buffer = ( char * ) realloc ( ( void * ) buffer, buff_sz * sizeof ( char ) ) ;
        if ( buffer == NULL ) {
            fprintf ( stderr, "Memory allocation error\n" ) ;
            exit ( EXIT_FAILURE ) ;
        }
    }
    buffer [ buff_posn++ ] = c ;
    return ;
}
</PRE>
On the first call of <CODE>add_char</CODE>, <CODE>buffer</CODE> is
set to a real block of memory (as opposed to <CODE>NULL</CODE>) by
a call of the form <CODE>realloc ( NULL, s )</CODE>. This is extremely
convenient and efficient - were it not for this behaviour we would
have to have an explicit initialisation of <CODE>buffer</CODE>, either
as a special case in <CODE>add_char</CODE> or in a separate initialisation
routine.<P>
Of course this all depends on the behaviour of
<CODE>realloc ( NULL, s )</CODE> having been implemented precisely as
described in the ANSI standard. The first indication that this is not so on
a particular target machine might be when the program is compiled and run on
that machine for the first time and does not perform as expected. Tracking
the problem down will demand time spent debugging the program.<P>
Once the problem has been identified as being with <CODE>realloc</CODE>
a number of workarounds are possible. Perhaps the most interesting
is to replace the inclusion of <CODE>stdlib.h</CODE> by the following:<P>
<PRE>
#include &lt;stdlib.h&gt;
#ifdef machine_name
#define realloc( p, s )\
    ( ( p ) ? ( realloc ) ( p, s ) : malloc ( s ) )
#endif
</PRE>
where <CODE>realloc ( p, s )</CODE> is redefined as a macro which
is the result of the procedure <CODE>realloc</CODE> if <CODE>p</CODE>
is not null, and <CODE>malloc ( s )</CODE> otherwise. (In fact this
macro will not always have the desired effect, although it does in
this case. Why (exercise)?)<P>
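An alternative workaround, sketched here under the assumption that the only
fault in the target's <CODE>realloc</CODE> is its treatment of a null pointer,
is to use a function rather than a macro. The name <CODE>port_realloc</CODE>
is hypothetical. A function evaluates each of its arguments exactly once,
whatever expressions the caller passes.<P>

```c
#include <stdlib.h>

/* Behave as ANSI says realloc should: a null p means "allocate afresh". */
void *port_realloc ( void *p, size_t s )
{
    if ( p == NULL ) {
        return ( malloc ( s ) ) ;
    }
    return ( realloc ( p, s ) ) ;
}
```

Calls of <CODE>realloc</CODE> in <CODE>add_char</CODE> would then be replaced
by calls of <CODE>port_realloc</CODE>, at the cost of one extra procedure
call.<P>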
The only alternative to this trial and error approach to finding API
implementation problems is the application of personal experience,
either of the particular target machine or of things that are implemented
wrongly by many machines and as such should be avoided. This sort
of detailed knowledge is not easily acquired. Nor can it ever be complete:
new operating system releases are becoming increasingly frequent and
are on occasions quite as likely to introduce new implementation errors
as to solve existing ones. It is in short a "black art".<P>
<A NAME=S17>
<HR><H2>2.3. APIs and Portability</H2>
We now return to our discussion of the general issues involved in
portability to more closely examine the role of the API.<P>
<A NAME=S18>
<H3>2.3.1. Target Dependent Code</H3>
So far we have been considering programs which contain no conditional
compilation, in which the API forms the basis of the separation of
the target independent code (the whole program) and the target dependent
code (the API implementation). But a glance at most large C programs
will reveal that they do contain conditional compilation. The code
is scattered with <CODE>#if</CODE>'s and <CODE>#ifdef</CODE>'s which,
in effect, cause the pre-processor to construct slightly different
programs on different target machines. So here we do not have a clean
division between the target independent and the target dependent code
- there are small sections of target dependent code spread throughout
the program.<P>
Let us briefly consider some of the reasons why it is necessary to
introduce this conditional compilation. Some have already been mentioned
- workarounds for compiler bugs, compiler limitations, and API implementation
errors; others will be considered later. However the most interesting
and important cases concern things which need to be done genuinely
differently on different machines. This can be because they really
cannot be expressed in a target independent manner, or because the
target independent way of doing them is unacceptably inefficient.<P>
Efficiency (either in terms of time or space) is a key issue in many
programs. The argument is often advanced that writing a program portably
means using the, often inefficient, lowest common denominator approach.
But under our definition of portability it is the functionality that
matters, not the actual source code. There is nothing to stop different
code being used on different machines for reasons of efficiency.<P>
To examine the relationship between target dependent code and APIs,
consider the simple program:<P>
<PRE>
#include &lt;stdio.h&gt;

int main ()
{
#ifdef mips
    fputs ( "This machine is a mips\n", stdout ) ;
#endif
    return ( 0 ) ;
}
</PRE>
which prints a message if the target machine is a mips. What is the
API of this program? Basically it is the same as in the "Hello
world" example discussed in sections <A HREF="#S4">2.1.1</A> and
<A HREF="#S5">2.1.2</A>,
but if we wish the API to fully describe the interface between the
program and the target machine, we must also say that whether or not
the macro <CODE>mips</CODE> is defined is part of the API. Like the
rest of the API, this has a semantic aspect as well as a syntactic
- in this case that <CODE>mips</CODE> is only defined on mips machines.
Where it differs is in its implementation. Whereas the main part of
the API is implemented in the system headers and the system libraries,
the implementation of either defining, or not defining, <CODE>mips</CODE>
ultimately rests with the person performing the compilation. (In this
particular example, the macro <CODE>mips</CODE> is normally built
into the compiler on mips machines, but this is only a convention.)<P>
So the API in this case has two components: a system-defined part
which is implemented in the system headers and system libraries, and
a user-defined part which ultimately relies on the person performing
the compilation to provide an implementation. The main point to be
made in this section is that introducing target dependent code is
equivalent to introducing a user-defined component to the API. The
actual compilation process in the case of programs containing target
dependent code is basically the same as that shown in Fig. 1. But
whereas previously the vertical division of the diagram also reflected
a division of responsibility - the left hand side was the responsibility
of the programmer (the person writing the program), and the right
hand side that of the API specifier (for example, a standards defining
body) and the API implementor (the system vendor) - now the right
hand side is partially the responsibility of the programmer and the
person performing the compilation. The programmer specifies the user-defined
component of the API, and the person compiling the program either
implements this API (as in the mips example above) or chooses between
a number of alternative implementations provided by the programmer
(as in the example below).<P>
Let us consider a more complex example. Consider the following program
which assumes, for simplicity, that an <CODE>unsigned int</CODE> contains
32 bits:<P>
<PRE>
#include &lt;stdio.h&gt;
#include "config.h"

#ifndef SLOW_SHIFT
#define MSB( a ) ( ( unsigned char ) ( ( a ) &gt;&gt; 24 ) )
#else
#ifdef BIG_ENDIAN
#define MSB( a ) ( *( ( unsigned char * ) &amp;( a ) ) )
#else
#define MSB( a ) ( *( ( unsigned char * ) &amp;( a ) + 3 ) )
#endif
#endif

unsigned int x = 100000000 ;

int main ()
{
    printf ( "%u\n", ( unsigned int ) MSB ( x ) ) ;
    return ( 0 ) ;
}
</PRE>
The intention is to print the most significant byte of <CODE>x</CODE>.
Three alternative definitions of the macro <CODE>MSB</CODE> used to
extract this value are provided. The first, if <CODE>SLOW_SHIFT</CODE>
is not defined, is simply to shift the value right by 24 bits. This
will work on all 32-bit machines, but may be inefficient (depending
on the nature of the machine's shift instruction). So two alternatives
are provided. An <CODE>unsigned int</CODE> is assumed to consist of
four <CODE>unsigned char</CODE>'s. On a big-endian machine, the most
significant byte is the first of these <CODE>unsigned char</CODE>'s;
on a little-endian machine it is the fourth. The second definition
of <CODE>MSB</CODE> is intended to reflect the former case, and the
third the latter.<P>
The person compiling the program has to choose between the three possible
implementations of <CODE>MSB</CODE> provided by the programmer. This
is done by either defining, or not defining, the macros <CODE>SLOW_SHIFT</CODE>
and <CODE>BIG_ENDIAN</CODE>. This could be done as command line options,
but we have chosen to reflect another commonly used device, the configuration
file. For each target machine, the programmer provides a version of
the file <CODE>config.h</CODE> which defines the appropriate combination
of the macros <CODE>SLOW_SHIFT</CODE> and <CODE>BIG_ENDIAN</CODE>.
The person performing the compilation simply chooses the appropriate
<CODE>config.h</CODE> for the target machine.<P>
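Because the choice of <CODE>config.h</CODE> rests with the person performing
the compilation, a wrong choice silently yields a wrong answer. A small
self-check can catch this. The sketch below keeps the assumption that an
<CODE>unsigned int</CODE> contains 32 bits; the function names are
hypothetical, and byte order is detected at run time purely for the purposes
of the check.<P>

```c
/* Non-zero if the lowest-addressed byte of an unsigned int is its most
   significant byte. */
int is_big_endian ( void )
{
    unsigned int one = 1 ;
    return ( *( ( unsigned char * ) &one ) == 0 ) ;
}

/* Most significant byte by shifting: right on any 32-bit unsigned int. */
unsigned char msb_by_shift ( unsigned int a )
{
    return ( ( unsigned char ) ( a >> 24 ) ) ;
}

/* Most significant byte by addressing, choosing the byte by endianness. */
unsigned char msb_by_address ( unsigned int a )
{
    unsigned char *bytes = ( unsigned char * ) &a ;
    return ( is_big_endian () ? bytes [ 0 ] : bytes [ sizeof ( a ) - 1 ] ) ;
}
```

A program could compare the two results once at start-up and refuse to run
if they differ, which would indicate that the wrong <CODE>config.h</CODE>
had been selected.<P>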
There are two possible ways of looking at what the user-defined API
of this program is. Possibly it is most natural to say that it is
<CODE>MSB</CODE>, but it could also be argued that it is the macros
<CODE>SLOW_SHIFT</CODE> and <CODE>BIG_ENDIAN</CODE>. The former more
accurately describes the target dependent code, but is only implemented
indirectly, via the latter.<P>
<A NAME=S19>
<H3>2.3.2. Making APIs Explicit</H3>
As we have said, every program has an API even if it is implicit rather
than explicit. Every system header included, every type or value used
from it, and every library routine used, adds to the system-defined
component of the API, and every conditional compilation adds to the
user-defined component. What making the API explicit does is to encapsulate
the set of requirements that the program has of the target machine
(including requirements like "I need to know whether or not the target
machine is big-endian", as well as "I need <CODE>fputs</CODE> to be
implemented as in the ANSI standard"). By making these requirements
explicit it is made absolutely clear what is needed on a target machine
if a program is to be ported to it. If the requirements are not explicit
this can only be found by trial and error. This is what we meant earlier
by saying that a program without an explicit API is only portable
by accident.<P>
Another advantage of specifying the requirements of a program is that
it may increase their chances of being implemented. We have spoken
as if porting is a one-way process: program writers porting their
programs to new machines. But there is also traffic the other way.
Machine vendors may wish certain programs to be ported to their machines.
If these programs come with a list of requirements then the vendor
knows precisely what to implement in order to make such a port possible.<P>
<A NAME=S20>
<H3>2.3.3. Choosing an API</H3>
So how does one go about choosing an API? In a sense the user-defined
component is easier to specify than the system-defined component because
it is less tied to particular implementation models. What is required
is to abstract out what exactly needs to be done in a target dependent
manner and to decide how best to separate it out. The most difficult
problem is how to make the implementation of this API as simple as
possible for the person performing the compilation, if necessary providing
a number of alternative implementations to choose between and a simple
method of making this choice (for example, the <CODE>config.h</CODE>
file above). With the system-defined component the question is more
likely to be, how do the various target machines I have in mind implement
what I want to do? The abstraction of this is usually to choose a
standard and widely implemented API, such as POSIX, which provides
all the necessary functionality.<P>
The choice of "standard" API is of course influenced by
the type of target machines one has in mind. Within the Unix world,
the increasing adoption of Open Standards, such as POSIX, means that
choosing a standard API which is implemented on a wide variety of Unix
boxes is becoming easier. Similarly, choosing an API which will work
on most MSDOS machines should cause few problems. The difficulty is
that these are disjoint worlds; it is very difficult to find a standard
API which is implemented on both Unix and MSDOS machines. At present
not much can be done about this; it reflects the disjoint nature of
the computer market.<P>
To develop a similar point: the drawback of choosing POSIX (for example)
as an API is that it restricts the range of possible target machines
to machines which implement POSIX. Other machines, for example, BSD
compliant machines, might offer the same functionality (albeit using
different methods), so they should be potential target machines, but
they have been excluded by the choice of API. One approach to the
problem is the "alternative API" approach. Both the POSIX
and the BSD variants are built into the program, but only one is selected
on any given target machine by means of conditional compilation. Under
our "equivalent functionality" definition of portability,
this is a program which is portable to both POSIX and BSD compliant
machines. But viewed in the light of the discussion above, if we regard
a program as a program-API pair, it could be regarded as two separate
programs combined on a single source code tree. A more interesting
approach would be to try to abstract out exactly what functionality
both POSIX and BSD offer and use that as the API. Then instead
of two separate APIs we would have a single API with two broad classes
of implementations. The advantage of this latter approach becomes
clear if we wished to port the program to a machine which implements
neither POSIX nor BSD, but provides the equivalent functionality in
a third way.<P>
As a simple example, both POSIX and BSD provide very similar methods
for scanning the entries of a directory. The main difference is that
the POSIX version is defined in <CODE>dirent.h</CODE> and uses a structure
called <CODE>struct dirent</CODE>, whereas the BSD version is defined
in <CODE>sys/dir.h</CODE> and calls the corresponding structure
<CODE>struct direct</CODE>. The actual routines for manipulating directories
are the same in both cases. So the only abstraction required to unify
these two APIs is to introduce an abstract type, <CODE>dir_entry</CODE>
say, which can be defined by:<P>
<PRE>
typedef struct dirent dir_entry ;
</PRE>
on POSIX machines, and:<P>
<PRE>
typedef struct direct dir_entry ;
</PRE>
on BSD machines. Note how this portion of the API crosses the system-user
boundary. The object <CODE>dir_entry</CODE> is defined in terms of
the objects in the system headers, but the precise definition depends
on a user-defined value (whether the target machine implements POSIX
or BSD).<P>
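Put together, the abstraction might look as follows. This is a sketch only:
the macro <CODE>USE_BSD_DIR</CODE> is a hypothetical user-defined value which
the person performing the compilation would define on BSD machines, and
<CODE>count_entries</CODE> is an illustrative routine, not part of either
API.<P>

```c
#include <stddef.h>
#include <sys/types.h>

/* Select the system header and structure name for the target machine.
   USE_BSD_DIR is a hypothetical macro supplied by the person compiling. */
#ifdef USE_BSD_DIR
#include <sys/dir.h>
typedef struct direct dir_entry ;
#else
#include <dirent.h>
typedef struct dirent dir_entry ;
#endif

/* Count the entries in the directory path, or return -1 if it cannot be
   opened.  Only the abstract type dir_entry is named from here on. */
long count_entries ( const char *path )
{
    DIR *dir = opendir ( path ) ;
    dir_entry *entry ;
    long n = 0 ;
    if ( dir == NULL ) return ( -1 ) ;
    while ( ( entry = readdir ( dir ) ) != NULL ) n++ ;
    closedir ( dir ) ;
    return ( n ) ;
}
```

The target dependent choice is confined to the first few lines; the routine
itself is written once.<P>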
<A NAME=S21>
<H3>2.3.4. Alternative Program Versions</H3>
Another reason for introducing conditional compilation which relates
to APIs is the desire to combine several programs, or versions of
programs, on a single source tree. There are several cases to be
distinguished. The reuse of code between genuinely different programs does
not really enter the argument: any given program will only use one
route through the source tree, so there is no real conditional compilation
per se in the program. What is more interesting is the use of conditional
compilation to combine several versions of the same program on the
same source tree to provide additional or alternative features.<P>
It could be argued that the macros (or whatever) used to select between
the various versions of the program are just part of the user-defined
API as before. But consider a simple program which reads in some numerical
input, say, processes it, and prints the results. This might, for
example, have POSIX as its API. We may wish to optionally enhance
this by displaying the results graphically rather than textually on
machines which have X Windows, the compilation being conditional on
some boolean value, <CODE>HAVE_X_WINDOWS</CODE>, say. What is the
API of the resultant program? The answer from the point of view of
the program is the union of POSIX, X Windows and the user-defined
value <CODE>HAVE_X_WINDOWS</CODE>. But from the implementation point
of view we can either implement POSIX and set <CODE>HAVE_X_WINDOWS</CODE>
to false, or implement both POSIX and X Windows and set
<CODE>HAVE_X_WINDOWS</CODE> to true. So what introducing
<CODE>HAVE_X_WINDOWS</CODE> does
is to allow flexibility in the API implementation.<P>
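In outline, such a program might select its output routine as follows. The
sketch treats <CODE>HAVE_X_WINDOWS</CODE> as a macro which is defined to a
non-zero value when X Windows is available; <CODE>show_result</CODE> and
<CODE>draw_result</CODE> are hypothetical names, and the graphical routine
itself is not shown.<P>

```c
#include <stdio.h>

#ifndef HAVE_X_WINDOWS
#define HAVE_X_WINDOWS 0    /* assume no X Windows unless told otherwise */
#endif

#if HAVE_X_WINDOWS
/* Hypothetical routine, written against the X Windows API elsewhere. */
extern int draw_result ( double value ) ;
#endif

/* Present a result: graphically where X Windows has been selected,
   otherwise as text on the given stream.  The textual branch returns a
   negative value if the output fails. */
int show_result ( FILE *out, double value )
{
#if HAVE_X_WINDOWS
    ( void ) out ;
    return ( draw_result ( value ) ) ;
#else
    return ( fprintf ( out, "%g\n", value ) ) ;
#endif
}
```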
This is very similar to the alternative APIs discussed above. However
the approach outlined will really only work for optional API extensions.
To work in the alternative API case, we would need to have the union
of POSIX, BSD and a boolean value, say, as the API. Although this
is possible in theory, it is likely to lead to namespace clashes between
POSIX and BSD.<P>
<HR><H2>Appendix: Namespaces and APIs</H2>
Namespace problems are amongst the most difficult faced by standard
defining bodies (for example, the ANSI and POSIX committees) and they
often go to great lengths to specify which names should, and should
not, appear when certain headers are included. (The position is set
out in D. F. Prosser, <I>Header and name space rules for UNIX systems</I>
(private communication), USL, 1993.)<P>
For example, the intention, certainly in ANSI, is that each header
should operate as an independent sub-API. Thus <CODE>va_list</CODE>
is prohibited from appearing in the namespace when <CODE>stdio.h</CODE>
is included (it is defined only in <CODE>stdarg.h</CODE>) despite
the fact that it appears in the prototype:<P>
<PRE>
int vprintf ( const char *, va_list ) ;
</PRE>
This seeming contradiction is worked round on most implementations
by defining a type <CODE>__va_list</CODE> in <CODE>stdio.h</CODE>
which has exactly the same definition as <CODE>va_list</CODE>, and
declaring <CODE>vprintf</CODE> as:<P>
<PRE>
int vprintf ( const char *, __va_list ) ;
</PRE>
This is only legal because <CODE>__va_list</CODE> is deemed not to
corrupt the namespace because of the convention that names beginning
with <CODE>__</CODE> are reserved for implementation use.<P>
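From the user's side the rule is simple: a program which names
<CODE>va_list</CODE> itself must include <CODE>stdarg.h</CODE>, whatever
<CODE>stdio.h</CODE> happens to do internally. A minimal sketch, in which
<CODE>format_into</CODE> is a hypothetical helper and the caller is assumed
to supply a large enough buffer:<P>

```c
#include <stdarg.h>    /* needed for va_list, va_start and va_end */
#include <stdio.h>     /* declares vsprintf, but need not define va_list */

/* Format the arguments into buf, returning the number of characters
   written (excluding the terminating null). */
int format_into ( char *buf, const char *fmt, ... )
{
    va_list args ;
    int n ;
    va_start ( args, fmt ) ;
    n = vsprintf ( buf, fmt, args ) ;
    va_end ( args ) ;
    return ( n ) ;
}
```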
This particular namespace convention is well-known, but there are
others defined in these standards which are not generally known (and
since no compiler I know tests them, not widely adhered to). For example,
the ANSI header <CODE>errno.h</CODE> reserves all names given by the
regular expression:<P>
<PRE>
E[0-9A-Z][0-9a-z_A-Z]+
</PRE>
against macros (i.e. in all namespaces). By prohibiting the user from
using names of this form, the intention is to protect against namespace
clashes with extensions of the ANSI API which introduce new error
numbers. It also protects against a particular implementation of these
extensions - namely that new error numbers will be defined as macros.<P>
A better example of protecting against particular implementations
comes from POSIX. If <CODE>sys/stat.h</CODE> is included, names of
the form:<P>
<PRE>
st_[0-9a-z_A-Z]+
</PRE>
are reserved against macros (as member names). The intention here
is not only to reserve field selector names for future extensions
to <CODE>struct stat</CODE> (which would only affect API implementors,
not ordinary users), but also to reserve against the possibility that
these field selectors might be implemented by macros. So our
<CODE>st_atime</CODE> example in section <A HREF="#S11">2.2.3</A> is
strictly illegal because the
procedure name <CODE>st_atime</CODE> lies in a restricted namespace.
Indeed the namespace is restricted precisely to disallow this program.<P>
As an exercise to the reader, how many of your programs use names
from the following restricted namespaces (all drawn from ANSI, all
applying to all namespaces)?<P>
<PRE>
is[a-z][0-9a-z_A-Z]+      (ctype.h)
to[a-z][0-9a-z_A-Z]+      (ctype.h)
str[a-z][0-9a-z_A-Z]+     (stdlib.h)
</PRE>
With the TDF approach of describing APIs in abstract terms using the
<CODE>#pragma token</CODE> syntax most of these namespace restrictions
are seen to be superfluous. When a target independent header is included,
precisely the objects defined in that header in that version of the
API appear in the namespace. There are no worries about what else
might happen to be in the header, because there is nothing else. Also
implementation details are separated off to the TDF library building,
so possible namespace pollution through particular implementations
does not arise.<P>
Currently TDF does not have a neat way of solving the <CODE>va_list</CODE>
problem. The present target independent headers use a similar workaround
to that described above (exploiting a reserved namespace). (See the
footnote in section 3.4.1.1.)<P>
None of this is intended as criticism of the ANSI or POSIX standards.
It merely shows some of the problems that can arise from the insufficient
separation of code.<P>
<HR>
<P><I>Part of the <A HREF="../index.html">TenDRA Web</A>.<BR>Crown
Copyright © 1998.</I></P>
</BODY>
</HTML>