[AI from Scratch] Episode 229: Implementing an RNN — Building a Recurrent Neural Network

Recap and Today’s Theme

Hello! In the previous episode, we built a CNN (Convolutional Neural Network) to classify handwritten digits. CNNs are powerful tools for image processing as they learn features hierarchically from image data.

This time, we will implement an RNN (Recurrent Neural Network), which is well suited to time-series data and natural language processing (NLP). RNNs are designed to capture temporal dependencies and sequential relationships in data, making them useful in domains such as speech recognition, text generation, and stock price prediction. Let’s use Keras to build a basic RNN!

What Is an RNN?

A Recurrent Neural Network (RNN) is a network designed to handle sequential data, such as time series or text. RNNs incorporate past information into the current computation by feeding the hidden state from the previous time step back in alongside the current input. This allows RNNs to model the temporal dependencies in data.

Basic Structure of RNNs

An RNN’s hidden layer is connected to itself along the time axis. While a traditional feedforward network maps inputs to outputs in a single pass, an RNN carries a hidden state that it updates at every step as it processes the sequence, which makes it well suited to sequences such as time series or text.
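
To make this recurrence concrete, here is a minimal NumPy sketch of a single vanilla RNN cell. The weight names W_xh, W_hh, and b_h are illustrative, not Keras internals: the point is simply that the hidden state h is updated at every time step from the current input and the previous hidden state.

import numpy as np

# Illustrative dimensions: 8-dimensional inputs, 16-dimensional hidden state, 5 time steps
input_dim, hidden_dim, timesteps = 8, 16, 5

# Randomly initialized weights (in a real network these would be learned)
W_xh = 0.1 * np.random.randn(input_dim, hidden_dim)   # input -> hidden
W_hh = 0.1 * np.random.randn(hidden_dim, hidden_dim)  # hidden -> hidden (the recurrent connection)
b_h = np.zeros(hidden_dim)

x = np.random.randn(timesteps, input_dim)  # one input sequence
h = np.zeros(hidden_dim)                   # initial hidden state

for t in range(timesteps):
    # The same weights are reused at every step; h carries past information forward
    h = np.tanh(x[t] @ W_xh + h @ W_hh + b_h)

print(h.shape)  # (16,) -- the final hidden state summarizes the whole sequence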

Implementing an RNN with Keras

Now, let’s implement a simple RNN model using Keras. We will use the IMDB dataset included in Keras to build a model that performs sentiment classification (positive or negative) based on movie reviews.

1. Importing the Necessary Libraries

First, import TensorFlow and Keras libraries.

import tensorflow as tf
from tensorflow.keras import layers, models

2. Preparing the Dataset

The IMDB dataset contains movie reviews (text) and labels indicating whether they are positive (1) or negative (0). Since Keras includes the IMDB dataset, it is easy to load.

# Loading the dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.imdb.load_data(num_words=10000)

# Padding sequences to standardize their length
x_train = tf.keras.preprocessing.sequence.pad_sequences(x_train, maxlen=100)
x_test = tf.keras.preprocessing.sequence.pad_sequences(x_test, maxlen=100)
  • num_words=10000: Uses only the top 10,000 most frequent words.
  • pad_sequences(): Standardizes the length of each review to 100 by adding padding (0) for shorter reviews.
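
Before building the model, it can help to sanity-check the data by inspecting its shape and decoding one review back into words. The offset of 3 in the sketch below accounts for the reserved indices (padding, start-of-sequence, unknown) that Keras uses for this dataset; this step is purely for inspection and is not required for training.

print(x_train.shape, x_test.shape)  # (25000, 100) (25000, 100) after padding

# Map word indices back to words (indices 0-2 are reserved tokens)
word_index = tf.keras.datasets.imdb.get_word_index()
index_to_word = {index + 3: word for word, index in word_index.items()}
decoded_review = " ".join(index_to_word.get(i, "?") for i in x_train[0])
print(decoded_review[:200])  # beginning of the first (padded) review
print(y_train[0])            # 1 = positive, 0 = negative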

3. Building the RNN Model

Next, we’ll use Keras to construct a simple RNN. For this implementation, we’ll use LSTM (Long Short-Term Memory), an extension of RNN that is better at capturing long-term dependencies, making it ideal for text and time-series data.

# Defining the model
model = models.Sequential()
model.add(layers.Embedding(input_dim=10000, output_dim=32, input_length=100))
model.add(layers.LSTM(32))
model.add(layers.Dense(1, activation='sigmoid'))
  • Embedding: Converts text data into vector representations. input_dim is the vocabulary size, output_dim is the dimensionality of the vectors, and input_length specifies the sequence length.
  • LSTM: An RNN variant optimized for learning long-term dependencies. The argument (32) sets the number of units in the LSTM layer.
  • Dense: A fully connected layer using the sigmoid function to convert the output to a range between 0 and 1 for binary classification.
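
Before moving on, calling model.summary() is a quick way to confirm the layer output shapes and parameter counts; for example, the Embedding layer alone holds 10,000 × 32 = 320,000 parameters.

model.summary()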

4. Compiling the Model

Next, compile the model by setting the loss function, optimizer, and evaluation metrics.

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
  • optimizer='adam': Adam is an efficient optimization algorithm well-suited for RNNs like LSTM.
  • loss='binary_crossentropy': Appropriate for binary classification tasks.
  • metrics=['accuracy']: Evaluates the model’s performance based on accuracy.
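
If you want explicit control over the learning rate, you can pass an optimizer object instead of the 'adam' string shortcut; the value 1e-3 below is simply Adam's default, shown for illustration.

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss='binary_crossentropy',
              metrics=['accuracy'])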

5. Training the Model

Train the model using the dataset.

history = model.fit(x_train, y_train, epochs=10, batch_size=64, validation_split=0.2)
  • epochs=10: Iterates through the dataset 10 times during training.
  • batch_size=64: The number of samples per gradient update.
  • validation_split=0.2: Uses 20% of the training data for validation.
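
The History object returned by fit() stores the metrics for each epoch, which makes it easy to check for overfitting. Here is a minimal plotting sketch, assuming matplotlib is installed:

import matplotlib.pyplot as plt

# history.history contains 'loss', 'accuracy', 'val_loss', and 'val_accuracy'
plt.plot(history.history['accuracy'], label='train accuracy')
plt.plot(history.history['val_accuracy'], label='validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()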

6. Evaluating the Model

Evaluate the model’s performance using the test dataset.

test_loss, test_accuracy = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_accuracy:.2f}")

7. Making Predictions with the Model

Make predictions using the trained model and the test data.

predictions = model.predict(x_test)

# Displaying the first 5 predictions
for i in range(5):
    print(f"Actual: {y_test[i]}, Predicted: {1 if predictions[i] > 0.5 else 0}")
  • predict(): Returns a predicted probability between 0 and 1 for each test sample.
  • Comparing predictions[i] with 0.5 converts that probability into a 0/1 class label.
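
To label the entire test set at once instead of looping sample by sample, the whole prediction array can be thresholded with NumPy:

import numpy as np

# predictions has shape (num_samples, 1); flatten it and threshold at 0.5
predicted_labels = (predictions.flatten() > 0.5).astype(int)
print(predicted_labels[:5])
print(y_test[:5])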

Customizing and Extending the RNN

With Keras, it’s easy to build more complex RNN models.

Using GRU (Gated Recurrent Unit)

GRU is a simplified alternative to LSTM with fewer parameters, and it can sometimes match or even improve on LSTM’s performance. In Keras it can be swapped in with a one-line change.

model = models.Sequential()
model.add(layers.Embedding(input_dim=10000, output_dim=32, input_length=100))
model.add(layers.GRU(32))
model.add(layers.Dense(1, activation='sigmoid'))

Bidirectional RNN

For text data, taking context from both earlier and later parts of the sequence into account can be effective. Keras provides the Bidirectional wrapper for constructing bidirectional RNNs.

model = models.Sequential()
model.add(layers.Embedding(input_dim=10000, output_dim=32, input_length=100))
model.add(layers.Bidirectional(layers.LSTM(32)))
model.add(layers.Dense(1, activation='sigmoid'))

Summary

In this episode, we built an RNN (Recurrent Neural Network) using Keras to classify text data based on sentiment. RNNs are powerful tools for capturing dependencies in sequential data, and variants like LSTM and GRU enhance their capabilities. Experiment with these techniques to build models for various sequential data tasks!

Next Episode Preview

Next time, we will discuss saving and loading models. We’ll learn how to save trained models to files and reload them for future use!


Annotations

  • LSTM (Long Short-Term Memory): A type of RNN optimized for learning long-term dependencies, offering more stable learning than traditional RNNs.
  • Embedding Layer: Converts text data into vector representations for input into the network.

Author of this article

PROMPT Inc. provides a variety of information related to generative AI.
If there is a topic you would like us to write an article about or research, please contact us using the inquiry form.
