
[AI from Scratch] Episode 278: Image Binarization


Recap and Today’s Theme

Hello! In the previous episode, we discussed image histograms and how to visualize the distribution of brightness and color in an image.

This time, we’ll cover image binarization, a fundamental operation in image processing. Binarization is the process of converting an image into two colors, black and white, and is widely used in tasks like edge detection, object recognition, and OCR (Optical Character Recognition). In this article, we’ll explain the basic concepts and methods of binarization and show how to implement them using OpenCV.

What is Image Binarization?

1. Definition of Binarization

Binarization refers to the operation of converting each pixel in an image to either white (255) or black (0). Binarization simplifies the separation of objects from the background, making it easier to analyze shapes and extract regions of interest.

2. Purpose of Binarization

Binarization is used to isolate specific areas of an image and to discard intensity detail that is irrelevant to the task. It’s commonly used in the following tasks:

  • Edge detection: Makes object contours clearer.
  • Object recognition: Helps recognize specific patterns or shapes.
  • OCR (Optical Character Recognition): Converts printed or handwritten text into digital text by distinguishing text from the background.

Basic Binarization Methods

1. Binarization with a Fixed Threshold

The most basic binarization method is fixed thresholding, where each pixel’s value is compared to a predefined threshold:

  • Greater than the threshold: Set to white (255).
  • Less than or equal to the threshold: Set to black (0).
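The comparison rule above can be sketched in a few lines of NumPy. This is only an illustration of the rule, not how OpenCV implements it; the function name fixed_threshold and the sample array are made up for this example.

```python
import numpy as np

def fixed_threshold(gray, thresh=127, max_value=255):
    """Return max_value where the pixel is strictly greater than thresh, else 0."""
    gray = np.asarray(gray, dtype=np.uint8)
    return np.where(gray > thresh, max_value, 0).astype(np.uint8)

# A tiny hypothetical 2x3 grayscale image
sample = np.array([[0, 127, 128],
                   [200, 50, 255]], dtype=np.uint8)

binary = fixed_threshold(sample)
print(binary)
```

Note that the pixel with value 127 becomes black, because only values strictly greater than the threshold are set to white.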

Implementation in OpenCV

Below is an example of how to perform binarization with a fixed threshold using OpenCV.

import cv2

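# Load the image (grayscale)
image = cv2.imread('example.jpg', cv2.IMREAD_GRAYSCALE)
if image is None:
    # cv2.imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError('Could not read example.jpg')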

# Apply fixed thresholding
_, binary_image = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)

# Display the results
cv2.imshow('Original Image', image)
cv2.imshow('Binary Image', binary_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

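In this code, the cv2.threshold() function performs the binarization. The second argument is the threshold value (127), the third argument is the value assigned to pixels above the threshold (255, white), and the fourth argument selects the thresholding type. The function returns two values: the threshold that was used and the binarized image.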

2. Otsu’s Binarization

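Otsu’s Binarization automatically determines a threshold by analyzing the histogram of the image. It chooses the threshold that minimizes the weighted variance within the two resulting classes (foreground and background), which is equivalent to maximizing the variance between them. It works best when the histogram has two clearly separated peaks.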
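To make the idea concrete, here is a minimal NumPy sketch that searches all 256 candidate thresholds for the one with the largest between-class variance. It is an illustrative implementation under the assumption of an 8-bit grayscale input, not OpenCV's own code; otsu_threshold and the sample data are hypothetical names chosen for this example.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    values = np.asarray(gray, dtype=np.uint8).ravel()
    hist = np.bincount(values, minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    w0 = np.cumsum(prob)                  # weight of the class with values <= t
    w1 = 1.0 - w0                         # weight of the class with values > t
    cum_mean = np.cumsum(prob * levels)   # partial mean up to t
    total_mean = cum_mean[-1]

    # Between-class variance: (total_mean * w0 - cum_mean)^2 / (w0 * w1)
    denom = w0 * w1
    between = np.zeros(256)
    valid = denom > 1e-12                 # skip thresholds that leave a class empty
    between[valid] = (total_mean * w0[valid] - cum_mean[valid]) ** 2 / denom[valid]
    return int(np.argmax(between))

# Hypothetical bimodal data: a dark cluster (40..80) and a bright cluster (170..210)
dark = np.arange(40, 81, dtype=np.uint8)
bright = np.arange(170, 211, dtype=np.uint8)
gray = np.concatenate([dark, bright]).reshape(2, -1)

t = otsu_threshold(gray)
binary = np.where(gray > t, 255, 0).astype(np.uint8)
print("threshold:", t)
```

For this data the selected threshold falls in the gap between the two clusters, so every dark pixel becomes black and every bright pixel becomes white.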

Implementation in OpenCV

To use Otsu’s method, add the cv2.THRESH_OTSU option.

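import cv2

# Load the image (grayscale); Otsu's method works on a single-channel 8-bit image
image = cv2.imread('example.jpg', cv2.IMREAD_GRAYSCALE)

# Apply Otsu's binarization; the first return value is the threshold Otsu selected
otsu_threshold, otsu_binary_image = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print('Otsu threshold:', otsu_threshold)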

# Display the results
cv2.imshow('Original Image', image)
cv2.imshow('Otsu Binary Image', otsu_binary_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

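Here, the threshold argument is set to 0 because it is ignored when cv2.THRESH_OTSU is specified. OpenCV computes the threshold from the histogram and returns it as the first return value of cv2.threshold().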

3. Adaptive Thresholding

Adaptive thresholding applies different threshold values to different regions of the image, making it useful for images with varying lighting conditions or contrast.
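The following NumPy sketch shows the mean variant of this idea on a synthetic image with a brightness gradient. It assumes that each pixel is compared against the mean of its neighborhood minus a constant, with edge padding at the borders; adaptive_mean_threshold and the test image are hypothetical and exist only for this illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def adaptive_mean_threshold(gray, block_size=11, c=2, max_value=255):
    """Compare each pixel with (mean of its block_size x block_size neighborhood) - c."""
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be an odd number >= 3")
    gray = np.asarray(gray, dtype=np.float64)
    pad = block_size // 2
    padded = np.pad(gray, pad, mode="edge")   # assumption: replicate border pixels
    windows = sliding_window_view(padded, (block_size, block_size))
    local_mean = windows.mean(axis=(-2, -1))  # one mean per pixel
    return np.where(gray > local_mean - c, max_value, 0).astype(np.uint8)

# Synthetic image: background brightens from 60 (left) to 160 (right),
# with a horizontal stripe that is 40 levels darker than its surroundings.
background = np.tile(np.arange(60, 161, dtype=np.float64), (41, 1))
image = background.copy()
image[18:22, :] -= 40

global_binary = np.where(image > 127, 255, 0).astype(np.uint8)
adaptive_binary = adaptive_mean_threshold(image, block_size=11, c=2)

# Count background pixels (outside the stripe) that were wrongly turned black
stripe = np.zeros(image.shape, dtype=bool)
stripe[18:22, :] = True
print("global errors:  ", int((global_binary[~stripe] == 0).sum()))
print("adaptive errors:", int((adaptive_binary[~stripe] == 0).sum()))
```

The global threshold turns the darker left part of the background black, while the local threshold follows the gradient and isolates only the stripe.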

Implementation in OpenCV

Here’s an example of adaptive thresholding in OpenCV.

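import cv2

# Load the image (grayscale); adaptiveThreshold requires an 8-bit single-channel image
image = cv2.imread('example.jpg', cv2.IMREAD_GRAYSCALE)

# Apply adaptive thresholding (mean method): block size 11, constant 2
adaptive_binary_image = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)

# Apply adaptive thresholding (Gaussian method): same block size and constant
adaptive_gaussian_binary_image = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)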

# Display the results
cv2.imshow('Original Image', image)
cv2.imshow('Adaptive Mean Binary Image', adaptive_binary_image)
cv2.imshow('Adaptive Gaussian Binary Image', adaptive_gaussian_binary_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

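In this code, cv2.ADAPTIVE_THRESH_MEAN_C and cv2.ADAPTIVE_THRESH_GAUSSIAN_C select two types of adaptive thresholding: the local threshold is based on the plain mean or on the Gaussian-weighted mean of each pixel's neighborhood, respectively. The block size (11) is the side length of that neighborhood in pixels and must be an odd number. The constant (2) is subtracted from the mean to obtain the final local threshold.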

Applications of Binarization

1. Edge Detection

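Binarized images are a common starting point for finding object outlines. Once the image is reduced to black and white, the boundary between the two regions traces the contour of each object, which makes it easier to identify objects and shapes in the image.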
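As a rough illustration of how outlines fall out of a binary image, the sketch below marks every white pixel that touches at least one black pixel above, below, left, or right of it. This is a simple boundary extraction written for this article, not a full edge detector; binary_boundary and the test square are hypothetical names.

```python
import numpy as np

def binary_boundary(binary):
    """Return an image where white pixels with a black 4-neighbor stay white (255)."""
    fg = np.asarray(binary) > 0
    # Pad with background so pixels on the image border count as boundary pixels
    padded = np.pad(fg, 1, mode="constant", constant_values=False)
    up = padded[:-2, 1:-1]
    down = padded[2:, 1:-1]
    left = padded[1:-1, :-2]
    right = padded[1:-1, 2:]
    interior = up & down & left & right   # all four neighbors are foreground
    return (fg & ~interior).astype(np.uint8) * 255

# A 5x5 binary image containing a 3x3 white square
square = np.zeros((5, 5), dtype=np.uint8)
square[1:4, 1:4] = 255

edges = binary_boundary(square)
print(edges)
```

Only the ring of pixels around the square's center remains white; the center pixel is surrounded by foreground on all four sides and is removed.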

2. OCR (Optical Character Recognition)

In OCR, binarization is commonly used to distinguish text from the background. This enhances the contrast between characters and the background, improving the accuracy of text recognition.

3. Medical Image Analysis

In the medical field, binarization is used to analyze X-rays or CT scans, helping to extract regions of interest like lesions or specific tissues.

Considerations and Challenges of Binarization

1. Impact of Lighting

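With fixed thresholding, uneven lighting can make a single threshold fail: a value that separates objects from the background in a bright region may turn a shadowed region entirely black. Adaptive thresholding mitigates this by computing the threshold from each pixel’s neighborhood. Otsu’s method chooses its threshold automatically from the image’s histogram, but it is still one global threshold, so it can also struggle when illumination varies strongly across the image.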

2. Impact of Noise

Noise in an image can negatively affect binarization results. It’s often recommended to apply a filtering technique, such as Gaussian blur, before binarization to reduce noise.
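The effect can be demonstrated on synthetic data. The sketch below uses a small NumPy Gaussian blur as a stand-in for OpenCV's filter (in OpenCV the usual call is cv2.GaussianBlur(image, (5, 5), 0)); gaussian_blur, the noise level, and the test image are assumptions made for this example.

```python
import numpy as np

def gaussian_blur(gray, ksize=5, sigma=1.0):
    """Separable Gaussian blur with edge padding (NumPy stand-in for cv2.GaussianBlur)."""
    r = ksize // 2
    x = np.arange(-r, r + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(np.asarray(gray, dtype=np.float64), r, mode="edge")
    # Filter along rows, then along columns; "valid" trims the padding again
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

# Clean image: dark left half (60), bright right half (190)
clean = np.full((64, 64), 60.0)
clean[:, 32:] = 190.0
truth = clean > 127

# Add Gaussian noise with a fixed seed so the run is repeatable
rng = np.random.default_rng(0)
noisy = np.clip(clean + rng.normal(0, 40, clean.shape), 0, 255)

raw_binary = noisy > 127
blurred_binary = gaussian_blur(noisy) > 127

errors_raw = int((raw_binary != truth).sum())
errors_blurred = int((blurred_binary != truth).sum())
print("misclassified without blur:", errors_raw)
print("misclassified with blur:   ", errors_blurred)
```

Blurring averages out much of the pixel-level noise before the comparison, so fewer pixels end up on the wrong side of the threshold. The trade-off is that fine detail near object boundaries is softened as well.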

3. Parameter Tuning

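Results also depend on parameters. For adaptive thresholding, the block size controls how local the threshold is, and the constant controls how far the threshold is offset from the local average. Otsu’s method has no such parameters, but its result still depends on preprocessing such as blurring and on how clearly the histogram separates into two groups. It’s important to choose appropriate settings based on the characteristics of the image.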

Summary

In this episode, we explored image binarization, a fundamental operation in image processing used in a wide range of fields such as edge detection, object recognition, and OCR. From simple fixed thresholding to Otsu’s method and adaptive thresholding, each approach has its own advantages and use cases. It’s essential to select the right method based on the task and the properties of the image.

Next Episode Preview

In the next episode, we will discuss template matching and learn how to detect specific patterns in an image.


Notes

  1. Threshold: The reference value used to classify pixel brightness as either black or white during binarization.
  2. Otsu’s Binarization: A method for automatically determining the optimal threshold by analyzing the histogram distribution.
  3. Adaptive Thresholding: A technique that sets different thresholds for different regions of an image, reducing the impact of lighting variations.

Author of this article

PROMPT Inc. provides a variety of information related to generative AI.
If there is a topic you would like us to write an article about or research, please contact us using the inquiry form.
