Representation of Numbers: Integer and Floating-Point Representation

In computer science and mathematics, numbers are represented in various forms to enable computation and storage. Understanding the representation of integers and floating-point numbers is fundamental for studying how computers process numerical data. This educational page explains these representations in detail.

1. Integer Representation

Integers are whole numbers, both positive and negative, including zero. Computers represent integers using a fixed number of bits, depending on the architecture of the system (e.g., 8-bit, 16-bit, 32-bit, or 64-bit). The two main methods for representing integers are unsigned integers and signed integers.

Unsigned Integers

  1. Unsigned integers can only represent non-negative numbers.

  2. The range of values depends on the number of bits used. For example:

    1. An 8-bit unsigned integer can represent values from 0 to 255.

    2. A 16-bit unsigned integer can represent values from 0 to 65,535.

  3. Representation: The value is stored directly in binary form. For example:

    1. Decimal 10 is represented as 00001010 in 8-bit binary.
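The ranges and the bit pattern above can be reproduced with a short sketch. The helper names `unsigned_range` and `to_unsigned_binary` are hypothetical, chosen only for this illustration.

```python
def unsigned_range(bits: int) -> tuple[int, int]:
    """Return the (min, max) values of an unsigned integer with `bits` bits."""
    return 0, (1 << bits) - 1


def to_unsigned_binary(value: int, bits: int) -> str:
    """Format a non-negative integer as a zero-padded binary string."""
    lo, hi = unsigned_range(bits)
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {bits} unsigned bits")
    return format(value, f"0{bits}b")


print(unsigned_range(8))          # (0, 255)
print(unsigned_range(16))         # (0, 65535)
print(to_unsigned_binary(10, 8))  # 00001010
```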

Signed Integers

  1. Signed integers can represent both positive and negative numbers.

  2. The most common method for signed integer representation is Two’s Complement.

    1. In Two’s Complement, the most significant bit (MSB) indicates the sign: 0 for non-negative values (zero and positive numbers) and 1 for negative values.

    2. The range of values for an n-bit signed integer is from −(2^(n−1)) to (2^(n−1)) − 1. For example:

      1. An 8-bit signed integer can represent values from −128 to 127.

    3. Example: In an 8-bit system, −10 is represented as 11110110.
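As a rough illustration, the encoding of −10 can be checked in Python by masking the value to the chosen width. The function names below are hypothetical, and the sketch assumes the bit width is passed in explicitly, since Python integers have no fixed size.

```python
def to_twos_complement(value: int, bits: int) -> str:
    """Encode a signed integer as an n-bit two's complement bit string."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {bits} signed bits")
    # Masking keeps the low `bits` bits, which is the two's complement pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Decode an n-bit two's complement bit string back to a signed integer."""
    raw = int(pattern, 2)
    # A leading 1 means the pattern stands for raw - 2^n.
    return raw - (1 << len(pattern)) if pattern[0] == "1" else raw


print(to_twos_complement(-10, 8))        # 11110110
print(from_twos_complement("11110110"))  # -10
print(to_twos_complement(-128, 8))       # 10000000
print(to_twos_complement(127, 8))        # 01111111
```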

Advantages of Two’s Complement

  1. Simplifies arithmetic operations (e.g., addition and subtraction).

  2. There is a single representation for zero.
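The first advantage can be seen in a small sketch: adding two 8-bit patterns as if they were unsigned, and discarding the carry out of the top bit, gives the correct signed result. The names `BITS`, `MASK`, and `as_signed` are assumptions made for this example.

```python
BITS = 8
MASK = (1 << BITS) - 1  # 0xFF, keeps only the low 8 bits


def as_signed(raw: int) -> int:
    """Interpret an 8-bit pattern (0..255) as a two's complement value."""
    return raw - (1 << BITS) if raw & (1 << (BITS - 1)) else raw


a = 5 & MASK    # 00000101
b = -10 & MASK  # 11110110

# Addition: ordinary unsigned addition, carry out of bit 7 is dropped.
total = (a + b) & MASK
print(format(total, "08b"), as_signed(total))  # 11111011 -5

# Subtraction: a - b is a + (invert b, then add 1), using the same adder.
ten = 10 & MASK
negated = ((~ten) + 1) & MASK
diff = (a + negated) & MASK
print(format(diff, "08b"), as_signed(diff))    # 11111011 -5
```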

2. Floating-Point Representation

Floating-point numbers are used to represent real numbers, which include fractions, decimals, and numbers with very large or very small magnitudes. These numbers are represented in a format similar to scientific notation.

Structure of Floating-Point Numbers

A floating-point number is typically divided into three parts:

  1. Sign (S): Indicates whether the number is positive (0) or negative (1).

  2. Exponent (E): Represents the power of 2 by which the mantissa is scaled.

  3. Mantissa (M) or Significand: Represents the significant digits of the number.

The general formula for a floating-point number is:

Value = (−1)^S × M × 2^E

IEEE 754 Standard

The IEEE 754 standard is the most widely used format for floating-point representation. It defines two commonly used formats:

  1. Single Precision (32-bit):

    1. 1 bit for the sign

    2. 8 bits for the exponent

    3. 23 bits for the mantissa

    4. Range: Approximately ±3.4 × 10^38

  2. Double Precision (64-bit):

    1. 1 bit for the sign

    2. 11 bits for the exponent

    3. 52 bits for the mantissa

    4. Range: Approximately ±1.8 × 10^308
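The sizes and approximate ranges listed above can be checked with Python's standard `struct` and `sys` modules. This is a sketch that assumes the Python float is an IEEE 754 double, which holds for CPython on common platforms; the variable names are illustrative only.

```python
import struct
import sys

# Storage size in bits of the two formats (standard-size packing).
print(struct.calcsize(">f") * 8)  # 32
print(struct.calcsize(">d") * 8)  # 64

# Largest finite single: sign 0, exponent 11111110, mantissa all ones.
max_single = struct.unpack(">f", bytes.fromhex("7f7fffff"))[0]
print(max_single)  # 3.4028234663852886e+38

# Largest finite double, as reported by the interpreter.
max_double = sys.float_info.max
print(max_double)  # 1.7976931348623157e+308
```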

Normalization and Bias

  1. For normal (non-zero, non-subnormal) values, the mantissa is normalized so that it always has a leading 1 (e.g., 1.xxxxxx). This leading 1 is implicit and not stored, which gains one extra bit of precision.

  2. The exponent is stored with a bias to handle both positive and negative exponents. For example, in single precision, the bias is 127, so an exponent of 0 is stored as 127.
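To make the implicit leading 1 and the bias concrete, the sketch below splits a single-precision value into its stored fields and rebuilds the number from them. The function names are hypothetical, and the rebuild step handles only normal values (it ignores zero, subnormals, infinities, and NaN).

```python
import struct


def decode_float32(x: float) -> tuple[int, int, int]:
    """Return (sign, stored exponent, stored fraction) of x as a float32."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31
    exponent = (raw >> 23) & 0xFF  # 8 bits, includes the bias of 127
    fraction = raw & 0x7FFFFF      # 23 bits, leading 1 is not stored
    return sign, exponent, fraction


def rebuild_normal(sign: int, exponent: int, fraction: int) -> float:
    """Rebuild a normal float32 value from its fields."""
    # Re-attach the implicit leading 1 and remove the bias.
    return (-1) ** sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)


print(decode_float32(1.0))                       # (0, 127, 0)
print(decode_float32(0.15625))                   # (0, 124, 2097152)
print(rebuild_normal(*decode_float32(0.15625)))  # 0.15625
```

The first line of output shows the bias directly: 1.0 has a true exponent of 0, and the stored exponent field is 127.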

Example of Single-Precision Floating-Point Representation

To represent the decimal number −7.25:

  1. Convert to binary: −7.25 = −111.01 (binary), since 7 = 111 and 0.25 = 0.01.

  2. Normalize: −111.01 = −1.1101 × 2^2.

  3. Determine the sign, exponent, and mantissa:

    1. Sign = 1 (negative)

    2. Exponent = 127 (bias) + 2 = 129 = 10000001 (binary)

    3. Mantissa = 11010000000000000000000 (the bits after the implicit leading 1, padded with zeros to 23 bits)

  4. Final representation: 1 10000001 11010000000000000000000
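The single-precision encoding of −7.25 can be verified with Python's `struct` module, which packs a value into IEEE 754 single precision. The helper name `float32_fields` is hypothetical.

```python
import struct


def float32_fields(x: float) -> tuple[str, str, str]:
    """Return the sign, exponent, and mantissa bit strings of x as a float32."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]


sign, exponent, mantissa = float32_fields(-7.25)
print(sign, exponent, mantissa)
# 1 10000001 11010000000000000000000
```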

3. Key Differences Between Integer and Floating-Point Representation


| Feature               | Integer                       | Floating-Point                              |
|-----------------------|-------------------------------|---------------------------------------------|
| Type of Numbers       | Whole numbers                 | Real numbers (including fractions)          |
| Precision             | Exact                         | Approximate (due to rounding errors)        |
| Range                 | Limited by the number of bits | Wider range with trade-off in precision     |
| Arithmetic Operations | Simple                        | More complex                                |
| Storage               | Fixed binary representation   | Sign, exponent, and mantissa representation |

4. Common Challenges and Errors

Integer Overflow

  1. Occurs when a calculation produces a result outside the range representable by the number of bits.

  2. Example: Adding 1 to the maximum value of an 8-bit unsigned integer (255) wraps around to 0.
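Python integers grow as needed and do not overflow, so the sketch below simulates 8-bit arithmetic by masking the result to 8 bits. The function names `add_u8` and `add_i8` are assumptions for this example. The signed case shows two's complement wraparound as hardware typically performs it; some languages (C, for example) leave signed overflow undefined.

```python
def add_u8(a: int, b: int) -> int:
    """Add two values as 8-bit unsigned integers, wrapping on overflow."""
    return (a + b) & 0xFF


def add_i8(a: int, b: int) -> int:
    """Add two values as 8-bit two's complement integers, wrapping on overflow."""
    raw = (a + b) & 0xFF
    return raw - 256 if raw >= 128 else raw


print(add_u8(255, 1))   # 0
print(add_u8(200, 100)) # 44
print(add_i8(127, 1))   # -128
```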

Floating-Point Precision Errors

  1. Floating-point numbers cannot precisely represent all decimal values due to limited mantissa bits.

  2. Example: Representing 0.1 in binary results in an infinite repeating fraction, leading to rounding errors.
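A brief sketch of the effect, using Python floats (IEEE 754 doubles on common platforms). `Decimal(0.1)` exposes the exact value actually stored, and `math.isclose` is one common way to compare with a tolerance instead of testing for exact equality.

```python
import math
from decimal import Decimal

# The stored double nearest to 0.1 is slightly larger than 0.1.
stored = Decimal(0.1)
print(stored)                   # exact decimal expansion of the stored value
print(stored > Decimal("0.1"))  # True

# The rounding error surfaces in ordinary arithmetic.
print(0.1 + 0.2 == 0.3)              # False
print(math.isclose(0.1 + 0.2, 0.3))  # True
```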

Underflow and Overflow in Floating-Point Numbers

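  1. Underflow: Occurs when a nonzero result is too close to zero to be represented; it is rounded to a subnormal value or to zero.

  2. Overflow: Occurs when a result's magnitude exceeds the largest representable finite value; under IEEE 754 the result is typically ±infinity.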
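Both effects can be observed with Python floats, assuming they are IEEE 754 doubles. The variable names are illustrative; 5e-324 is the smallest positive subnormal double.

```python
import math
import sys

# Overflow: multiplying the largest finite double yields infinity.
big = sys.float_info.max
overflowed = big * 10
print(overflowed)              # inf
print(math.isinf(overflowed))  # True

# Underflow: halving the smallest positive subnormal rounds to zero.
tiny = 5e-324
underflowed = tiny / 2
print(underflowed)             # 0.0

# Smallest positive normal double; values below it lose precision gradually.
print(sys.float_info.min)      # 2.2250738585072014e-308
```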

5. Applications

  1. Integer Representation: Used in counting, indexing, and addressing.

  2. Floating-Point Representation: Used in scientific calculations, graphics rendering, and any application requiring fractional or very large/small numbers.


Understanding integer and floating-point representations is essential for working with numerical data in computers. Integers provide exact values for whole numbers, while floating-point numbers enable the representation of real numbers at the cost of precision. Awareness of their limitations, such as overflow and rounding errors, is crucial for designing robust numerical algorithms.