This is a partial draft. I have not had a chance to finish these notes yet, but I thought what I do have might be useful.
Consider the following hashcode
function. (Note that this is a horrible implementation
for a variety of reasons. We’ll fix it in a bit. (Haha bit.))
hashcode.c
#include <math.h>    /* pow */
#include <stdint.h>  /* int8_t */
#include <string.h>  /* strlen, size_t */

int8_t hashcode(char *data) {
    int8_t h = 0;
    for (size_t i = 0; i < strlen(data); i++) {
        h += pow(31, i) * data[i]; /* pow returns a double—one of many problems here */
    }
    return h;
}
When you run this function on some inputs, the outputs are somewhat peculiar:
First, why is the second one smaller than the first? Second, why is the third one negative??
The answers to these questions lie in the specification of the `int8_t` type, but we would observe similar behavior for a more “normal” type like `int` or `long`.
Integer Types
The C specification provides quite…strange definitions of the “standard” integer types. All integer types are “fixed-width,” meaning they are represented by a fixed number of bits and cannot be expanded beyond that range. Specifically, a \(b\)-bit fixed-width integer can represent \(2^b\) distinct numbers; which numbers those are depends on the type!
What is an `int`?
An `int` is defined by the C specification to be at least 16 bits in width. Given that we work mostly with 64-bit machines, this is not a particularly useful definition for us. In fact, you’ll find that on most (but not all!) modern machines `int`s are 32 bits and `long`s are 64 bits. Instead of using these loosely defined types, we will use types with explicit sizes instead:
- If we want to work with an unsigned (non-negative) integer type, we will use `uintx_t` (found in `inttypes.h`) where `x` is the width we want (8, 16, 32, or 64).
- If we want to work with a signed (capable of representing negative numbers) integer type, we will use `intx_t` (found in `inttypes.h`) where `x` is the width we want (8, 16, 32, or 64).
- For indices into memory or sizes, we will use the `size_t` type (found in `stddef.h`), which has a range including all valid sizes for the corresponding machine.
Why not just use `int` and `long`?
In general, we want to use the special types to write portable code. However, in particular circumstances (e.g., the `jvm` project), certain fields are guaranteed to be exactly a particular width; if we want to represent these values correctly, we need to ensure that we are capable of representing exactly the right range.
Unexpected Results?
Now, we get back to the unexpected results from before. Unfortunately, a reality with fixed-width integer types is that we must define arithmetic to “work” even if the result is not storable in the number of bits we have. (More formally, we want \((\mathbb{Z}/n\mathbb{Z}, +, \times)\) with \(n = 2^b\) to form a ring to get the “normal” properties we expect.)
For unsigned integers, this works out to modular arithmetic. For signed integers, we use a representation called “two’s complement” which we explain below.
Unsigned Representation
Unsigned integers use modular arithmetic. This means if we overflow or underflow, the result wraps around: we take the true result modulo \(2^b\) to get back a representable answer. Here are three examples, each of which demonstrates a facet of how fixed-width unsigned integers work:
unsigned-overflow.c
uint8_t x = 100;
uint8_t y = 200;
uint8_t z = x + y;
printf("%d %d %d\n", x,y,z);
100 200 44
b/c \(300 \!\!\mod 2^8 = 44\)
unsigned-widening.c
uint8_t x = 100;
uint8_t y = 200;
uint16_t z = x + y;
printf("%d %d %d\n", x,y,z);
100 200 300
The pieces get “upcast”.
unsigned-narrowing.c
uint16_t x = 1000;
uint16_t y = 2000;
uint8_t z = x + y;
printf("%d %d %d\n", x,y,z);
1000 2000 184
b/c \(3000 \!\!\mod 2^8 = 184\)
Since the smallest number we can represent is \(0\), and there are \(2^b\) total distinct representable numbers, it follows that the largest number we can represent is \(2^b - 1\). A binary number can be converted to decimal using the formula \((x_{b - 1}\cdots x_1x_0)_2 = \displaystyle \sum_{i=0}^{b-1}{x_i 2^i}\) as described in the notes on representation.
Signed Representation (“Two’s Complement”)
The formula for a signed number is slightly different:
\[(x_{b - 1}\cdots x_1x_0)_2 = \displaystyle x_{b-1}(-2^{b-1}) + \sum_{i=0}^{b-2}{x_i 2^i}\]That is, the most significant bit has a negative weight (instead of a positive one like all the other bits). This definition seems a little strange; so, we’ll go through some theory on why it makes sense along with some examples of its use.
There are several reasons why this definition is convenient:
- We can use the same exact operations (i.e., circuits) for addition and multiplication in two’s complement as for unsigned numbers. This has implications for which assembly instructions and types are necessary; this representation makes the processor agnostic to whether the number is signed or unsigned!
- There is only one representation for each number in the range. In contrast, other representations (e.g., sign-magnitude) often end up with both a “positive zero” and a “negative zero”.
- We can immediately determine if a number is negative or non-negative by checking the highest bit (1 means negative; 0 means non-negative).
There are also some drawbacks to this definition:
- There are not the same number of positive numbers as negative numbers! There is one more representable negative number than positive number. (This can be quickly seen by noting that there are equally many bit patterns starting with 0 as starting with 1, and that 0 itself takes up one of the patterns starting with 0.)
- Inverting all the bits (written `~x`) does not give us the negation of the number. Instead, we have to add 1; that is, `-x = ~x + 1`.
Processor designers have almost universally decided that the convenience is more compelling than the drawbacks; so, we’re stuck with it! Might as well learn how it works. :)
Important Values
- `INTx_MAX` is 0b011…11
- `INTx_MIN` is 0b100…00
- `-1` is 0b111…11
Signed Overflow
signed-overflow.c
int8_t x = 100;
int8_t y = 99;
int8_t z = x + y;
printf("%d %d %d\n", x,y,z);
100 99 -57
b/c \(199 - 2^8 = -57\)