In computer science and mathematics, numbers are represented in various forms to enable computation and storage. Understanding the representation of integers and floating-point numbers is fundamental for studying how computers process numerical data. This educational page explains these representations in detail.
1. Integer Representation
Integers are whole numbers, both positive and negative, including zero. Computers represent integers using a fixed number of bits, depending on the architecture of the system (e.g., 8-bit, 16-bit, 32-bit, or 64-bit). The two main methods for representing integers are unsigned integers and signed integers.
Unsigned Integers
Unsigned integers can only represent non-negative numbers.
The range of values depends on the number of bits used. For example:
An 8-bit unsigned integer can represent values from 0 to 255.
A 16-bit unsigned integer can represent values from 0 to 65,535.
Representation: The value is stored directly in binary form. For example:
Decimal 10 is represented as 00001010 in 8-bit binary.
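The ranges and the bit pattern above can be reproduced with a minimal Python sketch. The helper names unsigned_range and to_unsigned_binary are illustrative only and are not part of any library.

```python
def unsigned_range(bits: int) -> tuple[int, int]:
    """Return the (min, max) values of an unsigned integer of the given width."""
    return 0, 2**bits - 1


def to_unsigned_binary(value: int, bits: int = 8) -> str:
    """Return the zero-padded binary string of a non-negative integer."""
    lo, hi = unsigned_range(bits)
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {bits} unsigned bits")
    return format(value, f"0{bits}b")


print(unsigned_range(8))       # (0, 255)
print(unsigned_range(16))      # (0, 65535)
print(to_unsigned_binary(10))  # 00001010
```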
Signed Integers
Signed integers can represent both positive and negative numbers.
The most common method for signed integer representation is Two’s Complement.
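In Two’s Complement, the most significant bit (MSB) indicates the sign: 0 for non-negative values (zero and positive) and 1 for negative values. A negative number −x is stored as the bit pattern of 2^n − x, which is the same as inverting the bits of x and adding 1.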
The range of values for an n-bit signed integer is from −(2^(n−1)) to (2^(n−1)) − 1. For example:
An 8-bit signed integer can represent values from −128 to 127.
Example: In an 8-bit system, −10 is represented as 11110110.
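As a rough illustration, the encoding of −10 can be checked in Python. The function names below are hypothetical helpers written for this sketch; the encoder relies on the fact that masking a negative Python integer with 2^n − 1 yields 2^n plus that value.

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Encode a signed integer as an n-bit two's complement bit string."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {bits} signed bits")
    # Masking maps a negative value v to 2**bits + v and leaves others unchanged.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Decode an n-bit two's complement bit string back to a signed integer."""
    raw = int(pattern, 2)
    # If the MSB is set, the pattern stands for raw - 2**n.
    return raw - (1 << len(pattern)) if pattern[0] == "1" else raw


print(to_twos_complement(-10))           # 11110110
print(to_twos_complement(-128))          # 10000000
print(to_twos_complement(127))           # 01111111
print(from_twos_complement("11110110"))  # -10
```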
Advantages of Two’s Complement
Simplifies arithmetic operations (e.g., addition and subtraction).
There is a single representation for zero.
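One way to see both advantages is to add raw 8-bit patterns with ordinary unsigned addition and discard the carry out of the top bit. The sketch below simulates this in Python; add8 and as_signed are illustrative names, and the 8-bit width is an assumption chosen to match the earlier examples.

```python
MASK = 0xFF  # keep only the low 8 bits


def add8(a_bits: int, b_bits: int) -> int:
    """Add two 8-bit patterns as unsigned numbers, dropping any carry out."""
    return (a_bits + b_bits) & MASK


def as_signed(bits: int) -> int:
    """Interpret an 8-bit pattern as a two's complement value."""
    return bits - 256 if bits & 0x80 else bits


minus_ten = -10 & MASK  # pattern 11110110
result = add8(minus_ten, 7)
print(format(result, "08b"), as_signed(result))  # 11111101 -3

# Adding a value to its negation gives the one and only zero pattern.
zero = add8(10, minus_ten)
print(format(zero, "08b"))  # 00000000
```

The same addition routine serves signed and unsigned operands; only the interpretation of the resulting bits differs.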
2. Floating-Point Representation
Floating-point numbers are used to approximate real numbers, including fractional values and values with very large or very small magnitudes. They are stored in a format similar to scientific notation, but in base 2 rather than base 10.
Structure of Floating-Point Numbers
A floating-point number is typically divided into three parts:
Sign (S): Indicates whether the number is positive (0) or negative (1).
Exponent (E): Represents the power of 2 to which the number is scaled.
Mantissa (M) or Significand: Represents the significant digits of the number.
The general formula for a floating-point number is:
Value = (−1)^S × M × 2^E
IEEE 754 Standard
The IEEE 754 standard is the most widely used format for floating-point representation. It defines two commonly used formats:
Single Precision (32-bit):
1 bit for the sign
8 bits for the exponent
23 bits for the mantissa
Range: Approximately ±3.4 × 10^38 (largest finite magnitude), with about 7 significant decimal digits of precision
Double Precision (64-bit):
1 bit for the sign
11 bits for the exponent
52 bits for the mantissa
Range: Approximately ±1.8 × 10^308 (largest finite magnitude), with about 15 to 16 significant decimal digits of precision
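The approximate range limits follow from the field widths. The sketch below derives the largest finite value of each format from its exponent and mantissa bit counts; max_finite is a hypothetical helper, and the comparison against sys.float_info assumes that Python floats are IEEE 754 double precision, which holds on common platforms.

```python
import sys


def max_finite(exponent_bits: int, mantissa_bits: int) -> float:
    """Largest finite value of a binary format with the given field widths."""
    bias = 2 ** (exponent_bits - 1) - 1
    # The all-ones exponent field is reserved for infinity and NaN,
    # so the largest usable actual exponent equals the bias.
    max_exponent = bias
    # Largest significand: 1.111...1 in binary, just below 2.
    largest_significand = 2 - 2.0 ** -mantissa_bits
    return largest_significand * 2.0 ** max_exponent


print(f"{max_finite(8, 23):.4e}")   # 3.4028e+38
print(f"{max_finite(11, 52):.4e}")  # 1.7977e+308
print(max_finite(11, 52) == sys.float_info.max)  # True
```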
Normalization and Bias
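For normal numbers, the mantissa is normalized so that it has a leading 1 (e.g., 1.xxxxxx). This leading 1 is implicit and not stored, which gives one extra bit of precision. Zero and very small (subnormal) values are exceptions and are marked by a reserved exponent field.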
The exponent is stored with a bias to handle both positive and negative exponents. For example, in single precision, the bias is 127, so an exponent of 0 is stored as 127.
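To make the implicit leading 1 and the bias concrete, the following sketch splits a single-precision value into its three stored fields and rebuilds the value from them. It uses Python's standard struct module; decode_single and value_from_fields are illustrative names, and the reconstruction handles normal numbers only.

```python
import struct


def decode_single(x: float) -> tuple[int, int, int]:
    """Return (sign, stored_exponent, fraction) of x as a 32-bit float."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31
    stored_exponent = (raw >> 23) & 0xFF
    fraction = raw & 0x7FFFFF
    return sign, stored_exponent, fraction


def value_from_fields(sign: int, stored_exponent: int, fraction: int) -> float:
    """Rebuild a normal single-precision value from its stored fields."""
    significand = 1 + fraction / 2**23  # restore the implicit leading 1
    return (-1) ** sign * significand * 2.0 ** (stored_exponent - 127)


print(decode_single(1.0))  # (0, 127, 0): an exponent of 0 is stored as 127
print(decode_single(0.5))  # (0, 126, 0): an exponent of -1 is stored as 126
print(decode_single(6.5))  # (0, 129, 5242880): 6.5 = 1.101 (binary) x 2^2
print(value_from_fields(*decode_single(6.5)))  # 6.5
```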
Example of Single-Precision Floating-Point Representation
To represent the decimal number −7.25:
Convert to binary: −7.25 = −111.01 (binary), since 7 = 111 and 0.25 = 0.01.
Normalize: −111.01 = −1.1101 × 2^2.
Determine the sign, exponent, and mantissa:
Sign = 1 (negative)
Exponent = 127 (bias) + 2 = 129 = 10000001 (binary)
Mantissa = 11010000000000000000000 (the bits after the implicit leading 1, padded to 23 bits)
Final representation: 1 10000001 11010000000000000000000
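The bit pattern of −7.25 can be checked mechanically. The sketch below packs the value as a big-endian 32-bit float with Python's standard struct module and prints the sign, exponent, and mantissa fields separated by spaces; single_precision_bits is an illustrative helper name.

```python
import struct


def single_precision_bits(x: float) -> str:
    """Return the 32-bit pattern of x as 'sign exponent mantissa'."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    bits = format(raw, "032b")
    return f"{bits[0]} {bits[1:9]} {bits[9:]}"


print(single_precision_bits(-7.25))
# 1 10000001 11010000000000000000000
```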
3. Key Differences Between Integer and Floating-Point Representation
4. Common Challenges and Errors
Integer Overflow
Occurs when a calculation produces a result outside the range representable by the number of bits.
Example: Adding 1 to the maximum value of an 8-bit unsigned integer (255) wraps around to 0.
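Python integers grow without limit, so the wraparound has to be simulated by masking to 8 bits. The sketch below is one way to do that; add_u8 and add_i8 are hypothetical helpers. The signed case shows two's complement wraparound, which is an assumption about the target system: some languages treat signed overflow differently (in C, for example, it is undefined behavior).

```python
def add_u8(a: int, b: int) -> int:
    """Add two values as 8-bit unsigned integers, wrapping on overflow."""
    return (a + b) & 0xFF


def add_i8(a: int, b: int) -> int:
    """Add two values as 8-bit two's complement integers, wrapping on overflow."""
    r = (a + b) & 0xFF
    return r - 256 if r & 0x80 else r


print(add_u8(255, 1))  # 0
print(add_i8(127, 1))  # -128
```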
Floating-Point Precision Errors
Floating-point numbers cannot precisely represent all decimal values due to limited mantissa bits.
Example: Representing 0.1 in binary results in an infinite repeating fraction, leading to rounding errors.
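A short Python sketch shows the effect. It assumes Python floats are IEEE 754 double precision, as on common platforms, and uses the standard decimal and math modules to expose the stored value and to compare with a tolerance.

```python
import math
from decimal import Decimal

# The stored double nearest to 0.1 is slightly larger than 0.1.
stored = Decimal(0.1)  # exact decimal expansion of the stored binary value
print(stored > Decimal("0.1"))  # True

# Rounding errors accumulate, so exact equality tests can fail.
total = 0.1 + 0.2
print(total == 0.3)  # False

# Comparing with a tolerance is the usual remedy.
print(math.isclose(total, 0.3))  # True
```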
Underflow and Overflow in Floating-Point Numbers
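Underflow: Occurs when a nonzero result is too close to zero to be represented as a normal number; it is stored with reduced precision as a subnormal value or rounded to zero.
Overflow: Occurs when the magnitude of a result is too large to be represented; in IEEE 754 arithmetic the result becomes positive or negative infinity.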
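Both effects can be observed with Python floats, assuming they are IEEE 754 double precision as on common platforms. The sketch uses sys.float_info from the standard library; the variable names are illustrative.

```python
import math
import sys

# Overflow: doubling the largest finite double gives infinity.
too_big = sys.float_info.max * 2
print(too_big)  # inf

# Gradual underflow: below the smallest normal value, results are
# still nonzero but are stored as subnormals with fewer mantissa bits.
smallest_normal = sys.float_info.min
subnormal = smallest_normal / 4
print(0.0 < subnormal < smallest_normal)  # True

# Underflow to zero: halving the smallest subnormal rounds to 0.0.
smallest_subnormal = 5e-324
print(smallest_subnormal / 2)  # 0.0
```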
5. Applications
Integer Representation: Used in counting, indexing, and addressing.
Floating-Point Representation: Used in scientific calculations, graphics rendering, and any application requiring fractional or very large/small numbers.
Understanding integer and floating-point representations is essential for working with numerical data in computers. Integers provide exact values for whole numbers, while floating-point numbers enable the representation of real numbers at the cost of precision. Awareness of their limitations, such as overflow and rounding errors, is crucial for designing robust numerical algorithms.