06-07-2012, 10:53 AM
Basic Types and Operators
EssentialC.pdf (Size: 85.06 KB / Downloads: 50)
Integer Types
The "integral" types in C form a family of integer types. They all behave like integers and
can be mixed together and used in similar ways. The differences are due to the different
number of bits ("widths") used to implement each type -- the wider types can store a
greater ranges of values.
char ASCII character -- at least 8 bits. Pronounced "car". As a practical matter
char is basically always a byte which is 8 bits which is enough to store a single
ASCII character. 8 bits provides a signed range of -128..127 or an unsigned range is
0..255. char is also required to be the "smallest addressable unit" for the machine --
each byte in memory has its own address.
short Small integer -- at least 16 bits which provides a signed range of
-32768..32767. Typical size is 16 bits. Not used so much.
int Default integer -- at least 16 bits, with 32 bits being typical. Defined to be
the "most comfortable" size for the computer. If you do not really care about the
range for an integer variable, declare it int since that is likely to be an appropriate
size (16 or 32 bit) which works well for that machine.
long Large integer -- at least 32 bits. Typical size is 32 bits which gives a signed
range of about -2 billion ..+2 billion. Some compilers support "long long" for 64 bit
ints.
The integer types can be preceded by the qualifier unsigned which disallows
representing negative numbers, but doubles the largest positive number representable. For
example, a 16 bit implementation of short can store numbers in the range
-32768..32767, while unsigned short can store 0..65535. You can think of pointers
as being a form of unsigned long on a machine with 4 byte pointers. In my opinion,
it's best to avoid using unsigned unless you really need to. It tends to cause more
misunderstandings and problems than it is worth.
Extra: Portability Problems
Instead of defining the exact sizes of the integer types, C defines lower bounds. This
makes it easier to implement C compilers on a wide range of hardware. Unfortunately it
occasionally leads to bugs where a program runs differently on a 16-bit-int machine than
it runs on a 32-bit-int machine. In particular, if you are designing a function that will be
implemented on several different machines, it is a good idea to use typedefs to set up
types like Int32 for 32 bit int and Int16 for 16 bit int. That way you can prototype a
function Foo(Int32) and be confident that the typedefs for each machine will be set so
that the function really takes exactly a 32 bit int. That way the code will behave the same
on all the different machines.
char Constants
A char constant is written with single quotes (') like 'A' or 'z'. The char constant 'A' is
really just a synonym for the ordinary integer value 65 which is the ASCII value for
uppercase 'A'. There are special case char constants, such as '\t' for tab, for characters
which are not convenient to type on a keyboard.
'A' uppercase 'A' character
'\n' newline character
'\t' tab character
'\0' the "null" character -- integer value 0 (different from the char digit '0')
'\012' the character with value 12 in octal, which is decimal 10
int Constants
Numbers in the source code such as 234 default to type int. They may be followed by
an 'L' (upper or lower case) to designate that the constant should be a long such as 42L.
An integer constant can be written with a leading 0x to indicate that it is expressed in
hexadecimal -- 0x10 is way of expressing the number 16. Similarly, a constant may be
written in octal by preceding it with "0" -- 012 is a way of expressing the number 10.
Type Combination and Promotion
The integral types may be mixed together in arithmetic expressions since they are all
basically just integers with variation in their width. For example, char and int can be
combined in arithmetic expressions such as ('b' + 5). How does the compiler deal
with the different widths present in such an expression? In such a case, the compiler
"promotes" the smaller type (char) to be the same size as the larger type (int) before
combining the values. Promotions are determined at compile time based purely on the
types of the values in the expressions. Promotions do not lose information -- they always
convert from a type to compatible, larger type to avoid losing information.
Pitfall -- int Overflow
I once had a piece of code which tried to compute the number of bytes in a buffer with
the expression (k * 1024) where k was an int representing the number of kilobytes
I wanted. Unfortunately this was on a machine where int happened to be 16 bits. Since
k and 1024 were both int, there was no promotion. For values of k >= 32, the product
was too big to fit in the 16 bit int resulting in an overflow. The compiler can do
whatever it wants in overflow situations -- typically the high order bits just vanish. One
way to fix the code was to rewrite it as (k * 1024L) -- the long constant forced the
promotion of the int. This was not a fun bug to track down -- the expression sure looked
reasonable in the source code. Only stepping past the key line in the debugger showed the
overflow problem. "Professional Programmer's Language." This example also
demonstrates the way that C only promotes based on the types in an expression. The
compiler does not consider the values 32 or 1024 to realize that the operation will
overflow (in general, the values don't exist until run time anyway). The compiler just
looks at the compile time types, int and int in this case, and thinks everything is fine.