Wikipedia: History of Integral data type

Difference (from prior major revision)

Changed: 30,33c30,32

In addition to their interpretation as sizes of numerical values, three terms (bit, byte, and word) have other common usages. In particular, word was originally used to indicate the "most efficient size" of data for a processor--typically the size of its internal registers. Thus various families, or different models within families, of processors had different-sized words-- 8-, 12-, 16-, 32-, 36-, 60- and 64-bit words have all been used. Machines also exist with 9-bit words, and may use the term "byte" for them.
The term "octet" can be used for more clarity, and always refers to eight bits.
Popular usage has narrowed (sorry) the usual meaning of word to 16 bits, unless the context indicates
otherwise. The other terms are typically used only when the content is to be interpreted numerically.

In addition to their interpretation as sizes of numerical values, three terms (bit, byte, and word) have other common usages. Word is ambiguous: it often indicates the "most efficient size" of data for a processor--typically the size of its internal registers. Thus various families, or different models within families, of processors had different-sized words-- 8-, 12-, 16-, 32-, 36-, 60- and 64-bit words have all been used. Byte sometimes means a quantity of bits other than 8; 36-bit word architectures commonly had 9-bit bytes.
The term octet can be used for more clarity, and always refers to eight bits.
The other terms (in the table) are typically used only when the content is to be interpreted numerically.

Changed: 44c43,45

complement, one's-complement, two's-complement

Representing integers

complement, one's-complement, two's-complement, and so on.

Changed: 54,72c55,77

indicate the sign of the number, rather than contributing to its magnitude. With only seven
bits, the magnitude can range from 0000000 (0) to 1111111 (127). The MSB is set to 0
for a positive number and 1 for a negative number. Thus you can represent numbers from
-127₁₀ to +127₁₀.

However, negative integers aren't just a sign and an independent magnitude. Two conventions
are used to convert a positive integer to its negative counterpart.

The one's-complement (OC) representation of a negative number is created by taking the
complement of its positive representation. For example, negated 00101011 (43) becomes 11010100 (-43).
(Notice that the lower seven bits could be interpreted as a magnitude of 84, but that's not
the convention.)

In One's Complement (OC), there are two ways to represent zero: 00000000 (+0) and 11111111 (-0). To avoid this, and to also make integer addition simpler, the two's-complement (TC) representation is the one generally used. The Two's Complement (TC) representation is created by first complementing the positive number, then adding 1 to it. Thus 00101011 (43) becomes 11010101 (-43).

In TC, there is only one zero (00000000). Negating a negative number involves the same operation: complementing, then adding 1. The pattern 11111111 now represents -1₁₀ and 10000000 represents -128₁₀;
that is, the range of TC integers is -128₁₀ to +127₁₀.
To add two TC integers, treat them as unsigned numbers, add them, and ignore any potential carry-over. The
result will be the correct TC number, unless both summands were positive and the result is negative or both summands were negative and the result is non-negative. The latter cases are referred to as "overflow" or "wrap around"; the addition cannot be carried out in 8-bit TC in these cases. For example:

indicate the sign of the number, rather than contributing to its magnitude; three formats have been used for representing signed integers: sign-and-magnitude, one's complement, and two's complement, the last of which is by far the most common nowadays.

Sign-and-magnitude is the simplest format and the one most like the way people write numbers.
The MSB is set to 0 for a positive number and 1 for a negative number. The remaining bits indicate the (positive) magnitude. Hence in a byte, which has only seven
bits apart from the sign bit, the magnitude can range from 0000000 (0) to 1111111 (127). Thus you can represent numbers from
-127₁₀ to +127₁₀. For example, -43 encoded in a byte this way is 10101011.
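
The sign-and-magnitude encoding described above can be sketched in a few lines of Python. This is only an illustration: the function names `encode_sign_magnitude` and `decode_sign_magnitude` are made up for this example, and bit patterns are handled as strings for readability.

```python
def encode_sign_magnitude(value: int, bits: int = 8) -> str:
    """Return the sign-and-magnitude bit pattern of value as a string."""
    limit = (1 << (bits - 1)) - 1            # 127 for an 8-bit byte
    if not -limit <= value <= limit:
        raise ValueError(f"{value} does not fit in {bits} bits")
    sign = 1 if value < 0 else 0             # MSB: 0 = positive, 1 = negative
    pattern = (sign << (bits - 1)) | abs(value)
    return format(pattern, f"0{bits}b")      # zero-padded binary string


def decode_sign_magnitude(pattern: str) -> int:
    """Interpret a bit string as a sign-and-magnitude integer."""
    magnitude = int(pattern[1:], 2)          # every bit except the MSB
    return -magnitude if pattern[0] == "1" else magnitude
```

With these helpers, `encode_sign_magnitude(-43)` gives `"10101011"`, matching the example in the text.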

The one's-complement representation of a negative number is created by taking the
complement of its positive counterpart. For example, negated 00101011 (43) becomes 11010100 (-43).
(Notice how this differs from the sign-and-magnitude convention, where the same bit pattern would be -84.)
The PDP-1 uses one's-complement arithmetic.
The range of signed numbers using one's complement in a byte is -127₁₀ to +127₁₀.
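
As a rough sketch of the one's-complement rule, the following Python flips every bit to negate a number. The helper names are hypothetical and exist only for this illustration; patterns are again kept as bit strings.

```python
def negate_ones_complement(pattern: str) -> str:
    """Negate by complementing (flipping) every bit of the pattern."""
    return "".join("1" if bit == "0" else "0" for bit in pattern)


def decode_ones_complement(pattern: str) -> int:
    """Interpret a bit string as a one's-complement integer."""
    if pattern[0] == "0":                    # MSB clear: plain positive value
        return int(pattern, 2)
    # MSB set: the magnitude is the value of the complemented pattern
    return -int(negate_ones_complement(pattern), 2)
```

Here `negate_ones_complement("00101011")` yields `"11010100"`, which `decode_ones_complement` reads back as -43.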

Both one's-complement and sign-and-magnitude have two ways to represent zero: 00000000 (+0) in both, and a negative zero that is 11111111 in one's-complement and 10000000 in sign-and-magnitude. This is sometimes problematic, as hardware for adding and subtracting may be more complicated, as may testing for 0.

To avoid this, and to also make integer addition simpler, the two's-complement representation is the one generally used. The two's-complement representation is created by first complementing the positive number, then adding 1 to it. Thus 00101011 (43) becomes 11010101 (-43).

In two's-complement, there is only one zero (00000000). Negating a negative number involves the same operation: complementing, then adding 1. The pattern 11111111 now represents -1₁₀ and 10000000 represents -128₁₀;
that is, the range of two's-complement integers is -128₁₀ to +127₁₀.
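
The complement-then-add-1 rule and the resulting range can be illustrated with a short Python sketch. The function names are invented for this example, and patterns are held as unsigned integers masked to the chosen width (8 bits by default).

```python
def negate_twos_complement(pattern: int, bits: int = 8) -> int:
    """Negate a bit pattern: complement every bit, then add 1."""
    mask = (1 << bits) - 1                   # 0xFF for 8 bits
    return ((pattern ^ mask) + 1) & mask     # keep only the low `bits` bits


def decode_twos_complement(pattern: int, bits: int = 8) -> int:
    """Interpret an unsigned bit pattern as a two's-complement integer."""
    sign_bit = 1 << (bits - 1)
    return pattern - (1 << bits) if pattern & sign_bit else pattern
```

Under these assumptions, `negate_twos_complement(0b00101011)` returns `0b11010101`, and decoding `0b11111111` and `0b10000000` gives -1 and -128 respectively. Negating the all-zero pattern returns all zeros again, which is why there is only one zero.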

To add two two's-complement integers, treat them as unsigned numbers, add them, and ignore any potential carry-over (this is essentially the great advantage that two's-complement has over the other conventions). The
result will be the correct two's-complement number, unless both summands were positive and the result is negative or both summands were negative and the result is non-negative. The latter cases are referred to as "overflow" or "wrap around"; the addition cannot be carried out in 8-bit two's-complement in these cases. For example:
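
The addition rule and its overflow condition can also be sketched in Python. This is a minimal illustration, not a description of any particular processor: `add_twos_complement` is a hypothetical helper that takes ordinary Python integers, reinterprets them as 8-bit patterns, and reports whether the signed result wrapped around.

```python
def add_twos_complement(a: int, b: int, bits: int = 8) -> tuple:
    """Add two signed integers in `bits`-wide two's complement.

    Returns (wrapped signed result, overflow flag).
    """
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    ua, ub = a & mask, b & mask              # reinterpret as unsigned patterns
    total = (ua + ub) & mask                 # unsigned add, carry-over ignored
    # Overflow: both operands share a sign and the result's sign differs.
    overflow = ((ua ^ total) & (ub ^ total) & sign_bit) != 0
    result = total - (1 << bits) if total & sign_bit else total
    return result, overflow
```

For instance, `add_twos_complement(43, -43)` returns `(0, False)`, while `add_twos_complement(100, 100)` returns `(-56, True)` because 200 does not fit in the range -128 to +127.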

Changed: 96c101

Processor families that use little-endian format: Intel, VAX

Processor families that use little-endian format: Intel 386, VAX

Changed: 98c103

Processor families that use either (determined by software): MIPS, Alpha

Processor families that use either (determined by software): MIPS, DEC Alpha, PowerPC

Changed: 108c113,121

See also:
Kilobyte,
Megabyte,
Gigabyte,
Terabyte,
Petabyte,
Exabyte,
Zettabyte,
Yottabyte

	Revision 15 . . (edit) December 15, 2001 12:31 am by (logged).253.64.xxx
	Revision 14 . . (edit) September 27, 2001 3:48 pm by Bignose [formatting]
	Revision 11 . . (edit) September 16, 2001 5:54 am by (logged).68.87.xxx