[AI from Scratch] Episode 282: Practical Data Augmentation — How to Increase Image Data

Recap and Today’s Theme

Hello! In the previous episode, we explained how to build image classification models using CNNs (Convolutional Neural Networks). Using the handwritten digit dataset (MNIST) as an example, we learned how to classify images with CNNs and introduced various techniques to improve model accuracy.

In this episode, we’ll focus on data augmentation. Data augmentation is a widely used technique for improving a model’s generalization performance (accuracy on data it was not trained on), and it is especially valuable when the amount of image data is limited. This article explains the basic concepts of data augmentation and how to implement it in practice.

What is Data Augmentation?

Data augmentation is the process of generating new variations of data by applying transformations to existing image data. Operations such as rotation, flipping, translation, scaling, and adding noise increase the diversity of the dataset and make the model less likely to overfit (become too specialized to the training data).

Benefits of Data Augmentation

  • Improved Generalization: Training the model on a more diverse set of data tends to improve accuracy on unseen test data.
  • Reduced Overfitting: The model is less likely to latch onto patterns that appear only in the training data, lowering the risk that it simply memorizes the training set.
  • Increased Dataset Size: The dataset size can be expanded without the need to collect new data.

Basic Data Augmentation Techniques

Here are several basic techniques for data augmentation. By combining these methods, you can generate various image variations from the original data.

1. Rotation

Rotate the image by a certain angle. For example, small rotations of 15 or 30 degrees generate new data while preserving the original content.
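To make the idea concrete, here is a minimal NumPy sketch of rotating a single grayscale image about its center. The function name rotate_image, the nearest-neighbor sampling, and the zero fill for uncovered corners are illustrative assumptions, not part of any library; real pipelines usually rely on a library routine with proper interpolation.

```python
import numpy as np

def rotate_image(image, angle_deg, fill=0.0):
    """Rotate a 2D image about its center with nearest-neighbor sampling."""
    h, w = image.shape
    theta = np.deg2rad(angle_deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.indices((h, w))
    # Inverse mapping: for every output pixel, locate the source pixel.
    x_src = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    y_src = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    x_idx = np.rint(x_src).astype(int)
    y_idx = np.rint(y_src).astype(int)
    # Pixels whose source falls outside the image keep the fill value.
    valid = (x_idx >= 0) & (x_idx < w) & (y_idx >= 0) & (y_idx < h)
    out = np.full_like(image, fill)
    out[valid] = image[y_idx[valid], x_idx[valid]]
    return out

rng = np.random.default_rng(0)
image = rng.random((28, 28))     # stand-in for one grayscale image
angle = rng.uniform(-15, 15)     # random angle in [-15, 15] degrees
rotated = rotate_image(image, angle)
```

Drawing a fresh random angle each time the image is used is what turns one training image into many slightly different ones.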

2. Translation

Move the image slightly in the horizontal or vertical direction. This enables the model to handle objects that appear in different positions within the image.
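As a rough illustration, the following NumPy sketch shifts an image by a whole number of pixels and pads the exposed border with a constant. The helper name translate_image and the choice of zero padding are assumptions made for this example.

```python
import numpy as np

def translate_image(image, dx, dy, fill=0.0):
    """Shift a 2D image dx pixels right and dy pixels down, padding with fill."""
    h, w = image.shape
    # Limit the shift to the image size so the slices below stay valid.
    dx = int(np.clip(dx, -w, w))
    dy = int(np.clip(dy, -h, h))
    out = np.full_like(image, fill)
    src_rows = slice(max(0, -dy), min(h, h - dy))
    dst_rows = slice(max(0, dy), min(h, h + dy))
    src_cols = slice(max(0, -dx), min(w, w - dx))
    dst_cols = slice(max(0, dx), min(w, w + dx))
    out[dst_rows, dst_cols] = image[src_rows, src_cols]
    return out

rng = np.random.default_rng(0)
image = rng.random((28, 28))            # stand-in for one grayscale image
max_shift = int(0.1 * image.shape[1])   # shift by at most about 10% of the width
dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
shifted = translate_image(image, dx, dy)
```

Keeping the maximum shift small matters: a large shift can push part of the object out of the frame.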

3. Scaling

By scaling the image up or down, you create data that accounts for object recognition at different sizes.
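One simple way to picture scaling is a zoom about the image center that keeps the output size fixed. The sketch below is an assumption-laden illustration (hypothetical helper zoom_image, nearest-neighbor sampling, zero fill) rather than how any particular library implements it.

```python
import numpy as np

def zoom_image(image, factor, fill=0.0):
    """Zoom a 2D image about its center; factor > 1 enlarges, factor < 1 shrinks."""
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.indices((h, w))
    # Inverse mapping: an output pixel at distance d from the center
    # reads the source pixel at distance d / factor.
    y_idx = np.rint((ys - cy) / factor + cy).astype(int)
    x_idx = np.rint((xs - cx) / factor + cx).astype(int)
    valid = (x_idx >= 0) & (x_idx < w) & (y_idx >= 0) & (y_idx < h)
    out = np.full_like(image, fill)
    out[valid] = image[y_idx[valid], x_idx[valid]]
    return out

rng = np.random.default_rng(0)
image = rng.random((28, 28))      # stand-in for one grayscale image
factor = rng.uniform(0.9, 1.1)    # random zoom between 90% and 110%
zoomed = zoom_image(image, factor)
```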

4. Flipping (Horizontal/Vertical)

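Flip the image horizontally or vertically. This is useful only when the flipped image is still a valid example of the same class (e.g., objects that look similar when mirrored); for data such as digits or text, flipping changes the meaning and should be avoided.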
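Flipping is the simplest transformation to sketch, since NumPy already provides np.fliplr and np.flipud. The wrapper random_flip and its probability parameter p are hypothetical names used here for illustration.

```python
import numpy as np

def random_flip(image, rng, horizontal=True, vertical=False, p=0.5):
    """Flip an image left-right and/or up-down, each with probability p."""
    if horizontal and rng.random() < p:
        image = np.fliplr(image)
    if vertical and rng.random() < p:
        image = np.flipud(image)
    return image

rng = np.random.default_rng(0)
image = rng.random((32, 32))   # stand-in for one grayscale image
flipped = random_flip(image, rng, horizontal=True, vertical=False)
```

Leaving vertical=False by default reflects the common case where upside-down images do not occur in the real data.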

5. Adding Noise

Add random noise to the image to improve robustness. Since real-world images often contain noise from cameras or sensors, training with noisy data increases noise tolerance.
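A minimal sketch of noise augmentation, assuming pixel values scaled to the range [0, 1] and Gaussian noise as the noise model (other models, such as salt-and-pepper noise, are equally possible). The function name add_gaussian_noise and the default sigma are illustrative choices.

```python
import numpy as np

def add_gaussian_noise(image, rng, sigma=0.05):
    """Add zero-mean Gaussian noise and clip back to the valid range [0, 1]."""
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.random((28, 28))   # stand-in for one normalized grayscale image
noisy = add_gaussian_noise(image, rng, sigma=0.05)
```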

6. Color Jittering

Randomly adjust the image’s color tone, saturation, brightness, or contrast to simulate varying lighting conditions and environments.
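The sketch below covers only two of the adjustments mentioned above, brightness and contrast, for an RGB image with values in [0, 1]. The function name, the parameter names, and the default ranges are assumptions for illustration; saturation and hue changes would need a color-space conversion and are left out.

```python
import numpy as np

def jitter_brightness_contrast(image, rng, max_brightness=0.2,
                               contrast_range=(0.8, 1.2)):
    """Randomly shift brightness and rescale contrast around the image mean."""
    delta = rng.uniform(-max_brightness, max_brightness)   # brightness offset
    factor = rng.uniform(*contrast_range)                  # contrast multiplier
    mean = image.mean()
    out = (image - mean) * factor + mean + delta
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))   # stand-in for one normalized RGB image
jittered = jitter_brightness_contrast(image, rng)
```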

Implementing Data Augmentation with Python and Keras

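Here, we’ll demonstrate how to implement data augmentation using Python and Keras. Keras provides a convenient ImageDataGenerator class, which makes data augmentation easy to apply. Note that the TensorFlow documentation marks ImageDataGenerator as deprecated and recommends Keras preprocessing layers for new code; it is used here because it keeps the example short.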

1. Installing Necessary Libraries

pip install tensorflow

2. Example of Data Augmentation Implementation

The following code shows how to train an image classification model with data augmentation using the MNIST dataset and Keras.

import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist

# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize and reshape the data
x_train = x_train.reshape((x_train.shape[0], 28, 28, 1)).astype('float32') / 255
x_test = x_test.reshape((x_test.shape[0], 28, 28, 1)).astype('float32') / 255

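# Set up data augmentation
datagen = ImageDataGenerator(
    rotation_range=15,       # random rotation within ±15 degrees
    width_shift_range=0.1,   # horizontal shift up to 10% of the width
    height_shift_range=0.1,  # vertical shift up to 10% of the height
    zoom_range=0.1,          # random zoom between 90% and 110%
    horizontal_flip=False    # flipping would distort most digits
)

# fit() computes dataset-wide statistics and is only required for options such as
# featurewise_center or zca_whitening; it is not needed for the settings above
datagen.fit(x_train)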

# Build a CNN model
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model with augmented data
# (the test set doubles as validation data here to keep the example short)
history = model.fit(datagen.flow(x_train, y_train, batch_size=32),
                    epochs=5, validation_data=(x_test, y_test))

# Display the training results
import matplotlib.pyplot as plt

plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

Key points of this code:

  • ImageDataGenerator: Keras provides this class for data augmentation, allowing you to easily increase data diversity.
  • Parameter Settings: Parameters like rotation_range and width_shift_range define the types and ranges of the random transformations.
  • flow method: Generates randomly augmented batches on the fly and feeds them to the model, so the model sees a slightly different version of each image in every epoch.

Running this code trains the model on augmented data. On a dataset as easy as MNIST, the gain in test accuracy may be small, so compare the result with a run trained without augmentation to measure the actual effect.
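The role of the flow method can be pictured with a small NumPy generator. This is a conceptual sketch under stated assumptions, not Keras's actual implementation: augmented_batches and noisy are hypothetical helpers, and random arrays stand in for the real images and labels.

```python
import numpy as np

def augmented_batches(x, y, batch_size, augment_fn, rng):
    """Yield shuffled (images, labels) batches forever, augmenting on the fly."""
    n = len(x)
    while True:
        order = rng.permutation(n)   # new shuffle at the start of every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            images = np.stack([augment_fn(img, rng) for img in x[idx]])
            yield images, y[idx]

def noisy(img, rng):
    """Example augmentation: small Gaussian noise, clipped to [0, 1]."""
    return np.clip(img + rng.normal(0.0, 0.05, size=img.shape), 0.0, 1.0)

rng = np.random.default_rng(0)
x = rng.random((100, 28, 28, 1)).astype("float32")   # stand-in images
y = rng.integers(0, 10, size=100)                    # stand-in labels
batches = augmented_batches(x, y, batch_size=32, augment_fn=noisy, rng=rng)
images, labels = next(batches)
```

Because the transformations are applied when each batch is drawn, the augmented images never need to be stored, and every epoch produces new variants of the same originals.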

The Effects and Best Practices of Data Augmentation

1. Prevention of Overfitting

Data augmentation increases the variation in training data, which makes it harder for the model to overfit to specific patterns. This is especially effective when training on small datasets.

2. Adaptation to Real-World Conditions

Real-world environments often present diverse lighting conditions, camera angles, and noise. By simulating these factors with data augmentation, the model can maintain stable performance across different conditions.

3. Effective Data Augmentation Settings

To perform effective data augmentation, it is important to consider the following:

  • Appropriate Parameter Settings: Excessive augmentation can generate unrealistic images, negatively affecting learning. Start with small rotation angles and shifts, and widen the ranges only if validation accuracy benefits.
  • Augmentation Suitable for Data Type: Choose the augmentation methods based on the nature of the image data. For example, in cases like road signs where orientation matters, vertical flipping may not be appropriate.

Applications of Data Augmentation

1. Autonomous Vehicles

In autonomous vehicles, images vary greatly due to road conditions, weather, and camera positions. Data augmentation simulates these variations to improve model accuracy.

2. Medical Image Analysis

In medical image analysis, data is often limited. Data augmentation increases the variety of medical images, such as X-rays and MRIs, helping to improve diagnostic models.

3. Smartphone Applications

Data augmentation is used in face recognition and object detection apps. Since images captured by smartphones vary in angle and lighting, data augmentation helps expose models to a wide range of scenarios during training.

Summary

In this episode, we explored data augmentation: the basic techniques for increasing the variety of image data and a practical implementation with Keras. Data augmentation is an effective way to improve generalization, especially when working with limited datasets.

Next Episode Preview

In the next episode, we will explore image classification using transfer learning, discussing how to leverage pre-trained models for new tasks without the need for large datasets, enabling high-accuracy models with minimal effort.


Notes

  • Overfitting: A phenomenon where the model becomes too specialized to the training data, resulting in poor performance on test data.
  • ImageDataGenerator: A tool provided by Keras for easily performing data augmentation and increasing data diversity.

Author of this article

PROMPT Inc. provides a variety of information related to generative AI.
If there is a topic you would like us to write an article about or research, please contact us using the inquiry form.
