Numeric Types
Numeric types represent real numbers in a computer. Because computers have limited storage, most numeric values are approximations of the real value they stand for.
Math in a computer combines stored data with a collection of operators. How the bits are interpreted is up to the operators. When we talk about the type of a number, we are usually referring to the operations we can perform on it, rather than how it is stored.
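To make the point about interpretation concrete, here is a minimal Java sketch in which one 32-bit pattern is read through several different operators. The class name and the two helper methods are hypothetical names chosen for illustration; the library calls (Integer.toUnsignedLong, Float.intBitsToFloat, Integer.divideUnsigned) are standard since Java 8.

```java
// Sketch: one stored 32-bit pattern, read through different operators.
class BitInterpretation {
    // Hypothetical helper: read the bits as an unsigned value, widened to long.
    static long asUnsigned(int bits) {
        return Integer.toUnsignedLong(bits);
    }

    // Hypothetical helper: read the same bits as an IEEE 754 single-precision float.
    static float asFloat(int bits) {
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        int bits = 0xC0000000; // the stored pattern: 1100 0000 ... 0000

        System.out.println(bits);             // signed reading:   -1073741824
        System.out.println(asUnsigned(bits)); // unsigned reading: 3221225472
        System.out.println(asFloat(bits));    // float reading:    -2.0

        // The same storage divided by two, using two different division operators.
        System.out.println(bits / 2);                        // -536870912
        System.out.println(Integer.divideUnsigned(bits, 2)); // 1610612736
    }
}
```

The storage never changes here; only the operator applied to it does, which is why the "type" is better thought of as the set of operations.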
Here are some Numeric Types that programming languages usually support.
- Signed Integer (32 bits) - These usually have fast support in the processor. Sometimes called an int. Signed integers are stored in one of the following representations:
  - Two's Complement - Represents numbers as a string of bits. Negation inverts the bits and adds one, so bit vector addition gives (inv(N) + 1) + N = 0.
  - One's Complement - Represents numbers as a string of bits. Negation inverts the bits, so bit vector addition gives inv(N) + N = all ones, which is read as negative zero.
  - Signed Magnitude - Represents numbers as one bit S for the sign, and the remaining bits as an unsigned magnitude M. N = (-1)^S x M
- Signed Integer (8 bits) - Sometimes called a byte. Sometimes called an octet in the context of networking.
- Signed Integer (16 bits) - Sometimes called a short.
- Signed Integer (24 bits) - Uncommon, sometimes called a medium. Some GPUs include support for these.
- Signed Integer (64 bits) - Sometimes called a long. 64 bits is the most common machine word size today.
- Signed Integer (128 bits) - Less common, sometimes used in C#.
- Unsigned Integer (32 bits) - Represents a magnitude from 0 to 2^B - 1, where B is the number of bits. Overflow usually wraps around.
- Unsigned Integer (8 bits) - Sometimes also called a byte, or an octet. C usually calls these unsigned char (a plain char may be signed or unsigned depending on the platform).
- Unsigned Integer (16 bits) - Sometimes called an unsigned short. Java calls this a char.
- Unsigned Integer (64 bits) - Occasionally called an unsigned long long in C.
- Float (32 bit) - Represents a number in binary scientific notation.
- Double (64 bit) - Also a floating point number, but bigger. Sometimes called a Float64. Double is short for double precision floating point number.
- Fixed point - Represents a fractional value with the radix point at a fixed position, typically as an integer with an implied scale factor. Simpler to implement than floating point, but it covers much less range. Most CPUs do not support this type natively.
- Arbitrary Precision Integer - Sometimes called big integers. Some languages use them as the native type.
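- Arbitrary Precision Decimal - Represents a decimal number with as many digits as needed, limited by memory rather than by a fixed width.
- Complex (x + jy) - Rectangular form, usually stored as a real part x and an imaginary part y.
- Complex (m * theta) - An alternative polar representation, with magnitude m and angle theta. Uncommon, but better for multiplication.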
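The integer representations in the list above can be checked with a short Java sketch. Java's int is two's complement, so the identities can be tested directly; the sign-magnitude decoder and the 8-bit unsigned add are hypothetical helpers written for illustration, since Java has no native types for either.

```java
// Sketch: the signed and unsigned integer representations from the list above.
class IntegerRepresentations {
    // Two's complement negation: invert every bit, then add one.
    static int negateTwosComplement(int n) {
        return ~n + 1;
    }

    // Hypothetical helper: decode 8 bits stored as a sign bit plus a 7-bit magnitude.
    static int decodeSignMagnitude(int bits) {
        int sign = (bits >> 7) & 1;
        int magnitude = bits & 0x7F;
        return sign == 0 ? magnitude : -magnitude;
    }

    // Hypothetical helper: unsigned 8-bit addition, keeping only the low 8 bits
    // so that results wrap around modulo 256.
    static int addUnsigned8(int a, int b) {
        return (a + b) & 0xFF;
    }

    public static void main(String[] args) {
        int n = 42;
        System.out.println(negateTwosComplement(n) + n); // 0
        // inv(N) + N sets every bit. Java prints that pattern as -1;
        // a one's complement machine would read it as negative zero.
        System.out.println(~n + n);                            // -1
        System.out.println(decodeSignMagnitude(0b1000_0101));  // -5
        System.out.println(addUnsigned8(250, 10));             // 4
        System.out.println(Byte.toUnsignedInt((byte) 0xFF));   // 255
    }
}
```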
There are also many less common types, often implemented by libraries rather than by the language itself.
- Rational - A ratio of two integers. Go supports this through the Rat type in its math/big package.
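As an illustration of a library-implemented type, a rational number can be sketched in Java on top of BigInteger. The Rational class below is a hypothetical minimal example, not a standard library class, and it only implements addition.

```java
import java.math.BigInteger;

// Hypothetical minimal rational type, kept in lowest terms with a positive denominator.
final class Rational {
    final BigInteger num;
    final BigInteger den;

    Rational(BigInteger num, BigInteger den) {
        if (den.signum() == 0) {
            throw new ArithmeticException("zero denominator");
        }
        // Move the sign to the numerator.
        if (den.signum() < 0) {
            num = num.negate();
            den = den.negate();
        }
        // Reduce to lowest terms; the gcd is non-zero because den is non-zero.
        BigInteger g = num.gcd(den);
        this.num = num.divide(g);
        this.den = den.divide(g);
    }

    static Rational of(long n, long d) {
        return new Rational(BigInteger.valueOf(n), BigInteger.valueOf(d));
    }

    // a/b + c/d = (a*d + c*b) / (b*d)
    Rational add(Rational o) {
        return new Rational(num.multiply(o.den).add(o.num.multiply(den)),
                            den.multiply(o.den));
    }

    @Override
    public String toString() {
        return num + "/" + den;
    }

    public static void main(String[] args) {
        System.out.println(Rational.of(1, 10).add(Rational.of(2, 10))); // 3/10
        System.out.println(0.1 + 0.2); // 0.30000000000000004 with doubles
    }
}
```

The trade-off is that numerators and denominators can grow without bound, so exactness is paid for in time and memory.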
As a consequence of their representation, most operations on numeric types do not behave exactly like the mathematical functions they are named after. For example, adding two Int32 numbers can overflow, resulting in a wrong answer. Also, floating point addition and multiplication are not associative: in general, (a + b) + c != a + (b + c). These cases are usually rare enough that they are not a problem, but care should be taken to program defensively.
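Both pitfalls are easy to reproduce. The Java sketch below shows silent integer overflow, one defensive alternative (Math.addExact, which throws instead of wrapping), and a floating point sum whose result depends on grouping. The class and method names are hypothetical, and the specific values are chosen only to make the effect visible.

```java
// Sketch: integer overflow and non-associative floating point addition.
class ArithmeticPitfalls {
    // Returns true if a + b does not fit in an int.
    // Math.addExact throws ArithmeticException on overflow instead of wrapping.
    static boolean addOverflows(int a, int b) {
        try {
            Math.addExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Integer.MAX_VALUE + 1);              // -2147483648 (silent wraparound)
        System.out.println(addOverflows(Integer.MAX_VALUE, 1)); // true

        // Near 1e16 adjacent doubles are 2 apart, so adding 1.0 can be rounded away.
        double a = 1e16, b = -1e16, c = 1.0;
        System.out.println((a + b) + c); // 1.0
        System.out.println(a + (b + c)); // 0.0
    }
}
```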
Support in Languages
Java
Main: Java Numeric.
