<!-- Crown Copyright (c) 1998 -->
<HTML>
<HEAD>
<TITLE>C Checker Reference Manual: Integral Types</TITLE>
</HEAD>
<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000FF" VLINK="#400080" ALINK="#FF0000">
<A NAME=S31>
<H1>C Checker Reference Manual</H1>
<H3>January 1998</H3>
<A HREF="tdfc8.html"><IMG SRC="../images/next.gif" ALT="next section"></A>
<A HREF="tdfc6.html"><IMG SRC="../images/prev.gif" ALT="previous section"></A>
<A HREF="tdfc1.html"><IMG SRC="../images/top.gif" ALT="current document"></A>
<A HREF="../index.html"><IMG SRC="../images/home.gif" ALT="TenDRA home page">
</A>
<IMG SRC="../images/no_index.gif" ALT="document index"><P>
<HR>
<DL>
<DT><A HREF="#S32"><B>4.1 </B> - Introduction</A><DD>
<DT><A HREF="#S33"><B>4.2 </B> - Integer promotion rules</A><DD>
<DT><A HREF="#S34"><B>4.3 </B> - Arithmetic operations on integer
types</A><DD>
<DT><A HREF="#S35"><B>4.4 </B> - Interaction with the integer conversion
checks</A><DD>
<DT><A HREF="#S36"><B>4.5 </B> - Target dependent integral types</A><DD>
<DL>
<DT><A HREF="#S37"><B>4.5.1 </B> - Integer literals</A><DD>
<DT><A HREF="#S38"><B>4.5.2 </B> - Abstract API types</A><DD>
</DL>
<DT><A HREF="#S39"><B>4.6 </B> - Integer overflow checks</A><DD>
<DT><A HREF="#S40"><B>4.7 </B> - Integer operator checks</A><DD>
<DT><A HREF="#S41"><B>4.8 </B> - Support for 64 bit integer types
(long long)</A><DD>
</DL>

<HR>
<H1>4  Integral Types</H1>
<A NAME=S32>
<HR><H2>4.1  Introduction</H2>
The checks described in the previous chapter involved the detection
of conversions which could result in undefined values. Certain conversions
involving integral types, however, are defined in the ISO C standard
and so might be considered safe and unlikely to cause problems. This
unfortunately is not the case: some of these conversions may still
result in a change in value; the actual size of each integral type
is implementation-dependent; and the &quot;old-style&quot; integer
conversion rules which predate the ISO standard are still in common
use. The checker provides support for both ISO and traditional integer
promotion rules. The set of rules used may be specified independently
of the two integral range scenarios, 16 bit (default) and 32 bit, described
in section 2.1.2.<P>
The means of specifying alternative sets of promotion rules, their
interaction with the conversion checks described in section 3.2 and
the additional checks which may be performed on integers and integer
operations are described in the remainder of this chapter.<P>
<A NAME=S33>
<HR><H2>4.2  Integer promotion rules</H2>
<CODE>The ISO C standard rules</CODE> may be summarised as follows:
long integral types promote to themselves; other integral types promote
to whichever of int or unsigned int they fit into. In full the promotions
are:<P>
<UL>
<LI>char -&gt; int<P>
<LI>signed char -&gt; int<P>
<LI>unsigned char -&gt; int<P>
<LI>short -&gt; int<P>
<LI>unsigned short -&gt; int or unsigned int<P>
<LI>int -&gt; int<P>
<LI>unsigned int -&gt; unsigned int<P>
<LI>long -&gt; long<P>
<LI>unsigned long -&gt; unsigned long<P>
</UL>
Note that even with these simple built-in types, there is a degree
of uncertainty, namely concerning the promotion of unsigned short.
On most machines, int is strictly larger than short, so the promotion
of unsigned short is int. However, it is possible for short and int
to have the same size, in which case the promotion is unsigned int.
When using the ISO C promotion rules, the checker usually avoids making
assumptions about the implementation by treating the promotion of
unsigned short as an abstract integral type. If, however, the <CODE>-Y32bit</CODE>
option is specified, int is assumed to be strictly larger than
short, and unsigned short promotes to int. <P>
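The dependence of the unsigned short promotion on the implementation can be observed from within C itself. The following is a minimal sketch and not part of the checker; the function names are illustrative assumptions.<P>

```c
#include <limits.h>

/* Returns 1 if unsigned short promotes to int on this implementation
   (every unsigned short value fits into an int), and 0 if it promotes
   to unsigned int. */
int ushort_promotes_to_int ( void )
{
	return ( USHRT_MAX <= INT_MAX ) ;
}

/* The result of this comparison depends on the promotion of us.
   If us promotes to int, the comparison is made in int and -1 is
   less than any unsigned short value. If us promotes to unsigned int,
   -1 is converted to an unsigned value and the comparison is false. */
int minus_one_is_less ( unsigned short us )
{
	return ( -1 < us ) ;
}
```

On any one machine the two functions agree, but the value they agree on differs between machines, which is the uncertainty that the abstract integral type is meant to capture.<P>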
<CODE>The traditional C integer promotion rules</CODE>
are often referred to as the signed promotion rules. Under these rules,
long integral types promote to themselves, as in ISO C, but the other
integral types promote to unsigned int if they are qualified by unsigned,
and int otherwise. Thus the signed promotion rules may be represented
as follows:<P>
<UL>
<LI>char -&gt; int<P>
<LI>signed char -&gt; int<P>
<LI>unsigned char -&gt; unsigned int<P>
<LI>short -&gt; int<P>
<LI>unsigned short -&gt; unsigned int<P>
<LI>int -&gt; int<P>
<LI>unsigned int -&gt; unsigned int<P>
<LI>long -&gt; long<P>
<LI>unsigned long -&gt; unsigned long<P>
</UL>
The traditional promotion rules are applied in the <CODE>Xt</CODE>
built-in environment only. All of the other built-in environments
specify the ISO C promotion rules. Users may also specify their own
rules for integer promotions and minimum integer ranges; the methods
for doing this are described in Annex H.<P>
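The practical difference between the two sets of rules shows up when a small unsigned value takes part in a subtraction. The sketch below is illustrative only: an ISO C compiler always applies the ISO rules, so the traditional behaviour is emulated with an explicit cast, and the function names are assumptions.<P>

```c
/* ISO C rules: uc promotes to int (assuming int can hold every
   unsigned char value, as on most machines), so the subtraction
   is signed and may give a negative result. */
int diff_iso ( unsigned char uc )
{
	return ( uc - 2 ) ;
}

/* Emulation of the traditional rules: the cast stands in for the
   promotion of unsigned char to unsigned int, so the subtraction
   wraps around instead of giving a negative result. */
unsigned int diff_traditional ( unsigned char uc )
{
	return ( ( unsigned int ) uc - 2u ) ;
}
```

For an argument of 1 the first function yields -1, while the second yields the largest unsigned int value.<P>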
<A NAME=S34>
<HR><H2>4.3  Arithmetic operations on integer types</H2>
<CODE>The ISO C standard rules</CODE> for calculating the type of
an arithmetic operation involving two integer types are as follows:
work out the integer promotions of the types of the two operands,
then:<P>
<UL>
<LI>If either promoted type is unsigned long, the result type is unsigned
long;<P>
<LI>Otherwise, if one promoted type is long and the other is unsigned
int, then if a long int can represent all values of an unsigned int,
the result type is long; otherwise the result type is unsigned long;<P>
<LI>Otherwise, if either promoted type is long, the result type is
long;<P>
<LI>Otherwise, if either promoted type is unsigned int, the result
type is unsigned int;<P>
<LI>Otherwise the result type is int.<P>
</UL>
Both promoted values are converted to the result type, and the operation
is then applied.<P>
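The second rule is the one whose outcome varies between machines. As a hedged illustration, the sketch below uses the C11 <CODE>_Generic</CODE> selection (an assumption: it postdates this manual and is not a TenDRA feature) to ask the compiler for the type of a long plus unsigned int sum; the function names are hypothetical.<P>

```c
#include <limits.h>

/* Returns 1 if long + unsigned int has type long on this
   implementation, and 0 if it has type unsigned long. */
int long_plus_uint_is_long ( void )
{
	return _Generic ( 1L + 1U, long : 1, unsigned long : 0 ) ;
}

/* Returns 1 if a long can represent all values of an unsigned int. */
int long_holds_all_uint ( void )
{
	return ( LONG_MAX >= UINT_MAX ) ;
}

/* Returns 1 if int + unsigned int has type unsigned int. */
int int_plus_uint_is_uint ( void )
{
	return _Generic ( 1 + 1U, unsigned int : 1, default : 0 ) ;
}
```

The first two functions always agree, which is the rule stated in the list; the third does not depend on the machine.<P>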
<A NAME=S35>
<HR><H2>4.4  <A NAME=4>Interaction with the integer conversion checks</H2>
A simple-minded implementation of the integer conversion checks described
in 3.2 would interact badly with these rules. Consider, for example,
adding two values of type char:<P>
<PRE>
	char f ( char a, char b )
	{
		char c = a + b ;
		return ( c ) ;
	}
</PRE>
The various stages in the calculation of c are as follows: a and
b are converted to their promotion type, int, and added together to give
an int result, which is converted to a char and assigned to c. The
conversions of a and b from char to int are always safe, and so present
no difficulties to the integer conversion checks. The conversion of
the result from int to char, however, is precisely the type of
value-destroying conversion which these checks are designed to detect.<P>
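To make the representational change concrete, the following sketch separates the addition, which is carried out in int, from the question of whether the result can be stored back in a char unchanged. It assumes that char promotes to int, as on most machines, and the function names are illustrative.<P>

```c
#include <limits.h>

/* The sum of two chars is computed in the promoted type int. */
int sum_as_int ( char a, char b )
{
	return ( a + b ) ;
}

/* Returns 1 if v can be stored in a char without a change in value. */
int fits_in_char ( int v )
{
	return ( v >= CHAR_MIN && v <= CHAR_MAX ) ;
}
```

Adding 1 and 2 gives a result which fits in a char, whereas adding CHAR_MAX and 1 gives an int result which does not, so the assignment back to a char would change the value.<P>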
Obviously, an integer conversion check which flagged all char arithmetic
would never be used, thereby losing the potential to detect many subtle
portability errors. For this reason, the integer conversion checks
are more sophisticated. In all typed languages, the type is used for
two purposes: for static type checking and for expressing information
about the actual representation of data on the target machine. Essentially
it is a confusion between these two roles which leads to the problems
above. The C promotion and arithmetic rules are concerned with how
data is represented and manipulated, rather than the underlying abstract
types of this data. When a and b are promoted to int prior to being
added together, this is only a change in representation; at the conceptual
level they are still chars. Again, when they are added, the result
may be represented as an int, but conceptually it is a char. Thus
the assignment to c, an actual char, is just a change in representation,
not a change in conceptual type.<P>
So each expression may be regarded as having two types: a conceptual
type which stands for what the expression means, and a representational
type which stands for how the expression is to be represented as data
on the target machine. In the vast majority of expressions, these
types coincide; however, the integral promotion and arithmetic conversions
are changes of representational, not conceptual, types. The integer
conversion checks are concerned with detecting changes of conceptual
type, since it is these which are most likely to be due to actual
programming errors.<P>
It is possible to define integral types within the TenDRA extensions
to C in which the split between concept and representation is made
explicit. The pragma:<P>
<PRE>
	#pragma TenDRA keyword TYPE for type representation
</PRE>
may be used to introduce a keyword TYPE for this purpose (as with
all such pragmas, the precise keyword to be used is left to the user).
Once this has been done, TYPE ( r, t ) may be used to represent a
type which is conceptually of type t but is represented as data like
type r. Both t and r must be integral types. For example:<P>
<PRE>
	TYPE ( int, char ) a ;
</PRE>
declares a variable a which is represented as an int, but is conceptually
a char.<P>
In order to maintain compatibility with other compilers, it is necessary
to give TYPE a sensible alternative definition. For all but conversion
checking purposes, TYPE ( r, t ) is identical to r, so a suitable
definition is:<P>
<PRE>
	#ifdef __TenDRA__
	#pragma TenDRA keyword TYPE for type representation
	#else
	#define TYPE( r, t ) r
	#endif
</PRE>
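As a usage sketch, the portable definition can be combined with a local variable which holds the sum of two chars. The function name add_chars is hypothetical, and how the checker treats this particular function has not been verified here; with any other compiler TYPE ( int, char ) is simply int.<P>

```c
#ifdef __TenDRA__
#pragma TenDRA keyword TYPE for type representation
#else
#define TYPE( r, t ) r
#endif

/* sum is stored as an int, so no value is lost in the addition,
   but it is declared to be conceptually a char. */
int add_chars ( char a, char b )
{
	TYPE ( int, char ) sum = a + b ;
	return ( sum ) ;
}
```

The intent is that the declaration documents the conceptual type of sum while leaving its representation wide enough for the promoted result.<P>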
<A NAME=S36>
<HR><H2>4.5  Target dependent integral types</H2>
Since the checker uses only information about the minimum guaranteed
ranges of integral types, integer values for which the actual type
of the values is unknown may arise. Integer values of undetermined
type generally arise in one of two ways: through the use of integer
literals and from API types which are not completely specified.<P>
<A NAME=S37>
<H3>4.5.1  Integer literals</H3>
<CODE>The ISO C rules</CODE> on the type of integer literals are set
out as follows. For each class of integer literals a list of types
is given. The type of an integer literal is then the first type in
the appropriate list which is large enough to contain the value of
the integer literal. The class of the integer literal depends on whether
it is decimal, hexadecimal or octal, and whether it is qualified by
U (or u) or L (or l) or both. The rules may be summarised as follows:<P>
<UL>
<LI>decimal -&gt; int or long or unsigned long<P>
<LI>hex or octal -&gt; int or unsigned int or long or unsigned long<P>
<LI>any + U -&gt; unsigned int or unsigned long<P>
<LI>any + L -&gt; long or unsigned long<P>
<LI>any + UL -&gt; unsigned long<P>
</UL>
These rules are applied in all the built-in checking modes except
<CODE>Xt</CODE>. Traditional C does not have the U and L qualifiers,
so if the <CODE>Xt</CODE> mode is used, these qualifiers are ignored
and all integer literals are treated as int, long or unsigned long,
depending on the size of the number. <P>
If a number fits into the minimal range for the first type of the
appropriate list, then it is of that type; otherwise its type is undetermined
and is said to be target dependent. The checker treats target dependent
types as abstract integral types which may lead to integer conversion
problems. For example, in:<P>
<PRE>
	int f ( int n ) {
		return ( n &amp; 0xff00 ) ;
	}
</PRE>
the type of 0xff00 is target dependent, since it does not fit into
the minimal range for int specified by the ISO C standard (this is
detected by the integer overflow analysis described in section 4.6).
The arithmetic conversion resulting from the &amp; operation is detected
by the checker's conversion analysis. Note that if the -Y32bit option
is specified to tchk, an int is assumed to contain at least 32 bits.
In this case, 0xff00 fits into the type int, and so this is the type
of the integer literal. No invalid integer conversion is then detected.<P>
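The target dependence of a literal such as 0xff00 can be made visible by asking the compiler which type it actually chose. The sketch below is an illustration only; it relies on the C11 <CODE>_Generic</CODE> selection, which postdates this manual, and the enumeration, macro and function names are assumptions.<P>

```c
#include <limits.h>

enum lit_type { LT_INT, LT_UINT, LT_LONG, LT_ULONG, LT_OTHER } ;

/* Classify the type of an expression. */
#define LIT_TYPE( x ) _Generic ( ( x ), \
	int : LT_INT, unsigned int : LT_UINT, \
	long : LT_LONG, unsigned long : LT_ULONG, \
	default : LT_OTHER )

/* Hexadecimal list: int, unsigned int, long, unsigned long. */
enum lit_type hex_ff00_type ( void )
{
	return LIT_TYPE ( 0xff00 ) ;
}

/* Decimal list: int, long, unsigned long (no unsigned int). */
enum lit_type dec_65280_type ( void )
{
	return LIT_TYPE ( 65280 ) ;
}
```

Where int has more than 16 bits both literals have type int. Where int has only the guaranteed 16 bits, the same value is an unsigned int when written in hexadecimal and a long when written in decimal.<P>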
<A NAME=S38>
<H3>4.5.2  <A NAME=10>Abstract API types</H3>
Target dependent integral types also occur in API specifications and
may be encountered when checking against one of the implementation-independent
APIs provided with the checker. The most common example of this is
size_t, which is stated by the ISO C standard to be a target dependent
unsigned integral type, and which arises naturally within the language
as the type of a sizeof expression. <P>
The checker has its own internal version of size_t, wchar_t and ptrdiff_t
for evaluating static compile-time expressions. These internal types
are compatible with the ISO C specification of size_t, wchar_t and
ptrdiff_t, and thus are compatible with any conforming definitions
of these types found in included files. However, when checking the
following program against the system headers, a warning is produced
on some machines concerning the implicit conversion of an unsigned
int to type size_t: <P>
<PRE>
	#include &lt;stdlib.h&gt;
	int main() {
		size_t size;
		size = sizeof(int);
		return 0;
	}
</PRE>
The system header on the machine in question actually defines size_t
to be a signed int (this of course contravenes the ISO C standard)
but the compile-time sizeof operator returns the checker's internal
version of size_t, which is an abstract unsigned integral type. By
using the pragma: <P>
<PRE>
	#pragma TenDRA set size_t:signed int
</PRE>
the checker can be instructed to use a different internal definition
of size_t when evaluating the sizeof operator and the error does not
arise. Equivalent options are also available for the ptrdiff_t and
wchar_t types.<P>
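Whether a particular set of headers defines size_t as an unsigned type, as ISO C requires, can be probed with a one-line test. This is a small sketch under the assumption that the headers are otherwise usable; the function name is illustrative.<P>

```c
#include <stddef.h>

/* Returns 1 if size_t is an unsigned type: converting -1 to an
   unsigned type gives its largest value, which is greater than 0.
   With a (non-conforming) signed size_t the result would be 0. */
int size_t_is_unsigned ( void )
{
	return ( ( size_t ) -1 > 0 ) ;
}
```

A result of 0 would indicate headers of the kind described above, for which the size_t pragma is intended.<P>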
<A NAME=S39>
<HR><H2>4.6  <A NAME=14>Integer overflow checks</H2>
Given the complexity of the rules governing the types of integers
and results of integer operations, as well as the variation of integral
ranges with machine architecture, it is hardly surprising that unexpected
results of integer operations are at the root of many programming
problems. These problems can often be hard to track down and may suddenly
appear in an application which was previously considered &quot;safe&quot;,
when it is moved to a new system. Since the checker supports the concept
of a guaranteed minimum size of an integer, it is able to detect many
potential problems involving integer constants. The pragma:<P>
<PRE>
	#pragma TenDRA integer overflow analysis <EM>status</EM>
</PRE>
where <CODE>status</CODE> is <CODE>on</CODE>, <CODE>warning</CODE>
or <CODE>off</CODE>, controls a set of checks on arithmetic expressions
involving integer constants. These checks cover overflow, use of constants
exceeding the minimum guaranteed size for their type and division
by zero. They are not enabled in the default mode.<P>
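The kind of constant these checks are concerned with can be illustrated as follows. The sketch enables the analysis only when compiled by the checker, and otherwise models the idea of a minimum guaranteed range by hand; the macro and function names are assumptions, and the helper is not how the checker itself is implemented.<P>

```c
#ifdef __TenDRA__
#pragma TenDRA integer overflow analysis warning
#endif

/* ISO C guarantees only that int covers -32767 to 32767. */
#define MIN_GUARANTEED_INT_MAX 32767L

/* Returns 1 if v lies outside the range which every conforming
   implementation guarantees for int. */
int exceeds_guaranteed_int ( long v )
{
	return ( v > MIN_GUARANTEED_INT_MAX || v < -MIN_GUARANTEED_INT_MAX ) ;
}

/* 200 * 200 is 40000, which is representable where int has 32 bits
   but overflows where int has only the guaranteed 16 bits. Computing
   in long, whose guaranteed range is larger, is portable. */
long area_portable ( void )
{
	return ( 200L * 200L ) ;
}
```

Written as 200 * 200 in int arithmetic, the same product is a constant expression whose value exceeds the guaranteed range of its type.<P>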
There are two special cases of integer overflow for which checking
is controlled separately: <P>
<OL>
<LI><CODE>Bitfield sizes</CODE>. Obviously, the size of a bitfield
must be smaller than or equal to the minimum size of its integral
type. A bitfield which is too large is flagged as an error in the
default mode. The check on bitfield sizes is controlled by:
<PRE>
	#pragma TenDRA bitfield overflow <EM>permit</EM>
</PRE>
where <EM>permit</EM> is one of <CODE>allow</CODE>, <CODE>disallow</CODE>
or <CODE>warning</CODE>. <P>
<LI><CODE>Octal and hexadecimal escape sequences</CODE>. According
to the ISO C standard, the value of an octal or hexadecimal escape
sequence shall be in the range of representable values for the type
unsigned char for an integer character constant, or the unsigned type
corresponding to wchar_t for a wide character constant. The check
on escape sequence sizes is controlled by:
<PRE>
	#pragma TenDRA character escape overflow <EM>permit</EM>
</PRE>
where the options for <EM>permit</EM> are <CODE>allow</CODE>, <CODE>warning</CODE>
and <CODE>disallow</CODE>. The check is switched on by default.
</OL>
<P>
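Both special cases can be illustrated with declarations which stay within the guaranteed limits. The structure and function names below are illustrative assumptions; the comments note the variants which would exceed the limits.<P>

```c
struct packed {
	unsigned int id : 16 ;	/* within the 16 bit minimum size of int */
	unsigned int flag : 1 ;
	/* a member such as "unsigned int wide : 17 ;" would exceed it */
} ;

/* Stores id in a 16 bit bitfield and increments it; values are
   reduced modulo 65536 when stored in the field. */
unsigned int next_id ( unsigned int id )
{
	struct packed p ;
	p.flag = 0 ;
	p.id = id ;
	p.id = p.id + 1 ;
	return ( p.id ) ;
}

/* '\xff' is the largest hexadecimal escape guaranteed to be in the
   range of unsigned char; an escape such as '\x1ff' is out of that
   range where unsigned char has only the guaranteed 8 bits. */
int largest_portable_escape ( void )
{
	return ( ( unsigned char ) '\xff' ) ;
}
```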
<A NAME=S40>
<HR><H2>4.7  <A NAME=18>Integer operator checks</H2>
The results of some integer operations are undefined by the ISO C
standard for certain argument types. Others are implementation-defined
or simply likely to produce unexpected results. In the default mode
such operations are processed silently; however, a set of checks on
operations involving integer constants may be controlled using:<P>
<PRE>
	#pragma TenDRA integer operator analysis <EM>status</EM>
</PRE>
where <EM>status</EM> is replaced by <CODE>on</CODE>, <CODE>warning</CODE>
or <CODE>off</CODE>. This pragma enables checks on:<P>
<UL>
<LI>shift operations where an expression is shifted by a negative
number or by an amount greater than or equal to the width in bits
of the expression being shifted;<P>
<LI>right shift operation with a negative value of signed integral
type as the first argument;<P>
<LI>division operation with a negative operand;<P>
<LI>test for an unsigned value strictly greater than or less than
0;<P>
<LI>conversion of a negative constant value to an unsigned type;<P>
<LI>application of unary - operator to an unsigned value.<P>
</UL>
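A few of the listed operations are shown below in a form which is well defined but easy to misread. This is a sketch with hypothetical function names; the shift test assumes that int has no padding bits, so that its width is sizeof ( int ) * CHAR_BIT.<P>

```c
#include <limits.h>

/* A shift of an int is only defined if the count is non-negative
   and less than the width in bits of int. */
int shift_count_is_valid ( int count )
{
	int width = ( int ) ( sizeof ( int ) * CHAR_BIT ) ;
	return ( count >= 0 && count < width ) ;
}

/* Unary minus applied to an unsigned value wraps around rather
   than giving a negative result. */
unsigned int negate_unsigned ( unsigned int u )
{
	return ( -u ) ;
}

/* Conversion of a negative constant to an unsigned type gives a
   large positive value. */
unsigned int minus_one_as_unsigned ( void )
{
	return ( ( unsigned int ) -1 ) ;
}
```

Negating the unsigned value 1 and converting -1 to unsigned int both give UINT_MAX, which is rarely what the author of such code intended.<P>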
<A NAME=S41>
<HR><H2>4.8  <A NAME=24>Support for 64 bit integer types (long long)</H2>
Although the use of long long to specify a 64 bit integer type is
not supported by the ISO C standard, it is becoming increasingly popular
in programming use. By default, tchk does not support the use of
long long, but the checker can be configured to support the long long
type to different degrees using the following pragmas:<P>
<PRE>
	#pragma TenDRA longlong type <EM>permit</EM>
</PRE>
where <EM>permit</EM> is one of <CODE>allow</CODE> (long long type
accepted), <CODE>disallow</CODE> (errors produced when long long types
are detected) or <CODE>warning</CODE> (long long types are accepted
but a warning is raised).<P>
<PRE>
	#pragma TenDRA set longlong type : type_name
</PRE>
where <EM>type_name</EM> is <CODE>long</CODE> or <CODE>long long</CODE>.<P>
The first pragma determines the behaviour of the checker if the type
long long is encountered as a type specifier. In the disallow case,
an error is raised and the type specifier mapped to long; otherwise
the type is stored as long long, although a message alerting the user
to the use of long long is raised in the warning mode. The second
pragma determines the semantics of long long. If the type specified
is long long, then long long is treated as a separate integer type
and, if code generation is enabled, long long types appear in the
output. Otherwise the type is mapped to long and all objects declared
long long are output as if they had been declared long (a warning
is produced when this occurs). In either case, long long is treated
as a distinct integer type for the purpose of integer conversion checking.<P>
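A source file which relies on long long might combine the two pragmas as in the following sketch. The pragmas are guarded so that other compilers ignore them, the function name is hypothetical, and the example assumes a compiler which itself accepts long long.<P>

```c
#ifdef __TenDRA__
#pragma TenDRA longlong type allow
#pragma TenDRA set longlong type : long long
#endif

/* Multiplies two longs in long long arithmetic, so that a product
   which may not fit into a 32 bit long is still representable. */
long long widen_product ( long a, long b )
{
	return ( ( long long ) a * b ) ;
}
```

With the second pragma set to long instead, the checker would treat the declarations as long for output purposes, and the widening above could not be relied on.<P>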
Extensions to the integer promotion and arithmetic conversion rules
are required for the long long type. These have been implemented as
follows:<P>
<UL>
<LI>the types of integer arithmetic operations where neither argument
has long long type are unaffected;<P>
<LI>long long and unsigned long long both promote to themselves;<P>
<LI>the result type of arithmetic operations with one or more arguments
of type unsigned long long is unsigned long long;<P>
<LI>otherwise if either argument has type signed long long the overall
type is long long if both arguments can be represented in this form,
otherwise the type is unsigned long long.<P>
</UL>
There are now three cases where the type of an integer arithmetic
operation is not completely determined from the type of its arguments,
i.e.<P>
<OL>
<LI>signed long long + unsigned long = signed long long <EM>or</EM>
unsigned long long;<P>
<LI>signed long long + unsigned int = signed long long <EM>or</EM>
unsigned long long;<P>
<LI>signed int + unsigned short = signed int <EM>or</EM> unsigned
int ( as before ).<P>
</OL>
In these cases, the type of the operation is represented using an
abstract integral type as described in section 4.2.<P>
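The first of these undetermined cases can be observed on a compiler which supports long long. The sketch below is an illustration using the C11 <CODE>_Generic</CODE> selection and the limits of C99, both of which postdate this manual; the function names are assumptions.<P>

```c
#include <limits.h>

/* Returns 1 if long long + unsigned long has type long long on
   this implementation, and 0 if it has type unsigned long long. */
int llong_plus_ulong_is_signed ( void )
{
	return _Generic ( 0LL + 0UL, long long : 1, unsigned long long : 0 ) ;
}

/* Returns 1 if long long can represent all unsigned long values. */
int llong_holds_all_ulong ( void )
{
	return ( LLONG_MAX >= ULONG_MAX ) ;
}
```

The two functions agree on any one machine, but the answer is 1 where long has 32 bits and 0 where long and long long are both 64 bits wide.<P>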
<!-- FM pgf ignored -->
<HR>
<P><I>Part of the <A HREF="../index.html">TenDRA Web</A>.<BR>Crown
Copyright &copy; 1998.</I></P>
</BODY>
</HTML>