
[AI from Scratch] Episode 227: Building Models with Keras — Using the High-Level API

Recap and Today’s Theme

Hello! In the previous episode, we explored the basics of the deep learning framework TensorFlow, working our way from tensor operations up to building neural networks.

Today, we will delve into model building using Keras, the high-level API built into TensorFlow. Keras allows for easy definition, training, and evaluation of neural networks with simple and intuitive code. Its flexibility and user-friendliness make it an invaluable tool for learning deep learning. In this episode, we will explain the basics of Keras and demonstrate how to build custom models.

What Is Keras?

Keras is a high-level API designed for defining deep learning models with minimal code. Integrated as part of TensorFlow, Keras has several key features:

  1. Concise and Intuitive Code: A network is declared as a stack or graph of layers, so a working classifier fits in about a dozen lines.
  2. Flexibility: Keras lets you customize model structures and training processes, supporting both simple and complex models.
  3. Tight TensorFlow Integration: Because Keras is built into TensorFlow, the same model code can be trained quickly on GPUs and TPUs.

Basic Model Building with Keras

When building a neural network in Keras, the process generally involves three main steps:

  1. Defining the Model
  2. Compiling the Model
  3. Training and Evaluating the Model
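
As a rough preview, the three steps can be sketched end to end as follows. The data here is synthetic and the sizes (200 samples, 20 features, 3 classes) are assumptions chosen only to keep the example fast; the MNIST version is developed step by step in the sections that follow.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Synthetic stand-in data (an assumption for illustration only)
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 20)).astype("float32")
y = rng.integers(0, 3, size=(200,))

# Step 1: define the model
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    layers.Dense(16, activation="relu"),
    layers.Dense(3, activation="softmax"),
])

# Step 2: compile the model
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Step 3: train and evaluate the model
model.fit(x, y, epochs=2, batch_size=32, verbose=0)
loss, accuracy = model.evaluate(x, y, verbose=0)
print(f"loss={loss:.3f}, accuracy={accuracy:.3f}")
```

Because the labels are random, the accuracy printed here carries no meaning; the point is only the order of the calls.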

1. Defining the Model

There are two primary ways to define a model in Keras: the Sequential API and the Functional API.

(1) Sequential API

The Sequential API is ideal for simple models, stacking layers in sequence. Below is an example of a handwritten digit classification model using the MNIST dataset.

import tensorflow as tf
from tensorflow.keras import layers

# Defining the model
model = tf.keras.Sequential([
    layers.Dense(128, activation='relu', input_shape=(784,)),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])
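
  • Dense: Defines a fully connected layer, specifying the number of neurons and the activation function.
  • input_shape: Specifies the shape of one input sample for the first layer (here 784, a flattened 28×28 image). Recent Keras releases prefer an explicit Input(shape=(784,)) as the first entry of the list instead.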
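
One way to sanity-check a Sequential model is to count its parameters by hand and compare with what Keras reports. The sketch below rebuilds the same three-layer network, using tf.keras.Input for the input shape (an equivalent way of declaring it), and assumes nothing beyond the layer sizes shown above.

```python
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

# Each Dense layer holds (inputs * units) weights plus one bias per unit.
expected = (784 * 128 + 128) + (128 * 64 + 64) + (64 * 10 + 10)

print(model.count_params(), expected)  # both are 109386
model.summary()                        # per-layer breakdown
```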

(2) Functional API

The Functional API provides greater flexibility, enabling the construction of complex structures such as networks with multiple inputs and outputs or networks with skip connections.

from tensorflow.keras import Input, Model

# Input layer
inputs = Input(shape=(784,))
x = layers.Dense(128, activation='relu')(inputs)
x = layers.Dense(64, activation='relu')(x)
outputs = layers.Dense(10, activation='softmax')(x)

# Defining the model
model = Model(inputs=inputs, outputs=outputs)

  • Input: Defines the symbolic input tensor of the model and its shape.
  • Model: Builds the model from the graph of layers that connects the given inputs to the given outputs.

With the Functional API, each layer is called on a tensor and returns a new tensor, so you decide explicitly how data flows through the model, including branches, merges, and skip connections.
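
To illustrate the kind of structure the Sequential API cannot express, here is a minimal sketch of a network with one skip connection. The layer sizes and the name skip_model are assumptions made for this example, not part of the model above.

```python
import tensorflow as tf
from tensorflow.keras import Input, Model, layers

inputs = Input(shape=(784,))
h = layers.Dense(64, activation="relu")(inputs)

# Residual branch: two Dense layers whose output is added back to h
r = layers.Dense(64, activation="relu")(h)
r = layers.Dense(64)(r)
h = layers.Add()([h, r])            # skip connection: h + r
h = layers.Activation("relu")(h)

outputs = layers.Dense(10, activation="softmax")(h)
skip_model = Model(inputs=inputs, outputs=outputs)
skip_model.summary()
```

The Add layer needs both inputs to have the same shape, which is why the branch ends with 64 units.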

2. Compiling the Model

Once the model is built, the next step is compilation, where the loss function, optimization algorithm, and evaluation metrics are set.

# Compiling the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

  • optimizer: Specifies the optimization algorithm ('adam' is a common default).
  • loss: Sets the loss function. For multi-class classification, use sparse_categorical_crossentropy when the labels are integer class IDs (as with MNIST here) and categorical_crossentropy when they are one-hot vectors.
  • metrics: Defines the metrics reported during training and evaluation (e.g., accuracy).
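
The two cross-entropy losses compute the same quantity and differ only in the label format they expect. The sketch below checks this on two hand-written predictions; the probability values are made up for illustration.

```python
import numpy as np
import tensorflow as tf

# Predicted class probabilities for two samples (illustrative values)
y_pred = np.array([[0.1, 0.8, 0.1],
                   [0.2, 0.2, 0.6]], dtype="float32")

y_int = np.array([1, 2])                                   # integer class IDs
y_onehot = tf.keras.utils.to_categorical(y_int, num_classes=3)

sparse_loss = tf.keras.losses.SparseCategoricalCrossentropy()(y_int, y_pred)
onehot_loss = tf.keras.losses.CategoricalCrossentropy()(y_onehot, y_pred)

# Both are the mean of -log(0.8) and -log(0.6)
print(float(sparse_loss), float(onehot_loss))
```

Choosing the sparse variant avoids having to one-hot encode the labels, which is why it is used with the integer MNIST labels here.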

3. Training and Evaluating the Model

To train the model, load MNIST, flatten each 28×28 image into a 784-element vector scaled to the range [0, 1], and pass the training data to the fit method.

# Preparing training and label data
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape((-1, 784)).astype('float32') / 255
x_test = x_test.reshape((-1, 784)).astype('float32') / 255

# Training the model
model.fit(x_train, y_train, epochs=10, batch_size=32, validation_split=0.2)

  • epochs: Specifies the number of full passes over the training data (10 in this example).
  • batch_size: The number of samples used for each gradient update (32 in this example).
  • validation_split: The fraction of the training data held out for validation (20%). Keras takes this slice from the end of the arrays, before any shuffling.
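
The fit method also returns a History object whose history dictionary stores one value per epoch for every loss and metric, which is handy for plotting learning curves. A small sketch on synthetic data (sizes and names are assumptions chosen to keep it fast):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 8)).astype("float32")
y = rng.integers(0, 2, size=(100,))

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    layers.Dense(4, activation="relu"),
    layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# 20% of 100 samples are held out, leaving 80 for training
history = model.fit(x, y, epochs=3, batch_size=10,
                    validation_split=0.2, verbose=0)

print(sorted(history.history.keys()))
print(len(history.history["val_loss"]))  # one entry per epoch
```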

After training, evaluate the model using the test data.

# Evaluating the model
test_loss, test_accuracy = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_accuracy:.2f}")
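
Beyond an aggregate accuracy, you will usually want predictions for individual samples. The sketch below uses an untrained model with the same input and output sizes and a random batch (both assumptions for illustration) to show how predict and argmax fit together.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Untrained stand-in with the same input and output sizes as the MNIST model
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

batch = np.random.default_rng(0).random((5, 784)).astype("float32")

probs = model.predict(batch, verbose=0)   # shape (5, 10), each row sums to 1
predicted = np.argmax(probs, axis=1)      # most likely class per sample
print(probs.shape, predicted)
```

With the trained model from this section, the same two lines applied to x_test would give the predicted digit for each test image.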

Building Custom Models and Advanced Usage of Keras

Keras supports not only simple models but also custom layers and complex networks.

Creating a Custom Layer

To define a new layer in Keras, subclass tf.keras.layers.Layer and implement its build() and call() methods.

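class MyCustomLayer(layers.Layer):
    def __init__(self, units=32, **kwargs):
        # Forward standard arguments such as name and dtype to the base Layer
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        # Called once, on first use, when the input feature size is known
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='zeros',
                                 trainable=True)

    def call(self, inputs):
        # Affine transform: inputs @ w + b
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        # Record the constructor arguments so the layer can be saved and reloaded
        config = super().get_config()
        config.update({'units': self.units})
        return config

# Defining a model with the custom layer
inputs = Input(shape=(784,))
x = MyCustomLayer(64)(inputs)
x = layers.Activation('relu')(x)
outputs = layers.Dense(10, activation='softmax')(x)

model = Model(inputs=inputs, outputs=outputs)

  • build(): Creates the layer's weights the first time the layer is called, once the input shape is known.
  • call(): Performs the forward pass computation.
  • get_config(): Returns the constructor arguments so that Keras can serialize the layer when the model is saved.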

By using custom layers, you can build complex models that go beyond the standard Keras layers.
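
A custom layer can also be exercised on its own, which makes the deferred weight creation in build() visible. The sketch below restates the layer so that it is self-contained (with **kwargs forwarded to the base class) and calls it on a small tensor; the input size of 4 and the 3 units are arbitrary choices for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers

class MyCustomLayer(layers.Layer):
    def __init__(self, units=32, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer="random_normal", trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer="zeros", trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

layer = MyCustomLayer(3)
built_before = layer.built            # False: no weights exist yet

y = layer(tf.ones((2, 4)))            # first call triggers build() with input size 4
weight_shapes = [tuple(w.shape) for w in layer.trainable_weights]

print(built_before, layer.built, y.shape, weight_shapes)
```

Deferring weight creation to build() is what lets the same layer class adapt to whatever input size it is first connected to.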

Saving and Loading Models

Keras allows you to save trained models and reload them for later use.

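# Saving the model (architecture and weights in a single file)
model.save('my_model.keras')

# Loading the model; custom layers must be passed via custom_objects
new_model = tf.keras.models.load_model(
    'my_model.keras', custom_objects={'MyCustomLayer': MyCustomLayer})

  • save(): Saves the model to a file. The .keras extension selects the native Keras format; the older HDF5 format (.h5) is still accepted but is treated as legacy in recent releases.
  • load_model(): Loads and reuses the saved model. If the model contains custom classes such as MyCustomLayer, pass them through custom_objects.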
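
A quick way to confirm that a save and load round trip preserved the model is to compare predictions before and after. This sketch uses a tiny stand-in model and a temporary directory; it assumes a TensorFlow release recent enough to support the native .keras format, and the file name roundtrip.keras is hypothetical.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    layers.Dense(3, activation="softmax"),
])
x = np.ones((2, 4), dtype="float32")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "roundtrip.keras")   # hypothetical file name
    model.save(path)
    restored = tf.keras.models.load_model(path)

# The restored model should reproduce the original predictions
same = np.allclose(model.predict(x, verbose=0),
                   restored.predict(x, verbose=0))
print(same)
```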

Using Callbacks

Keras offers callbacks for monitoring the training process and adjusting it automatically. For example, EarlyStopping halts training when a monitored metric stops improving.

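# Defining an EarlyStopping callback
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)

# The model redefined above with the custom layer has not been compiled yet
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Training the model with the callback
model.fit(x_train, y_train, epochs=50, batch_size=32, validation_split=0.2, callbacks=[early_stopping])

  • EarlyStopping: Stops training if val_loss has not improved for patience consecutive epochs (3 here), so epochs=50 acts as an upper bound rather than a fixed count.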
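
EarlyStopping has a few more options that are often useful: min_delta sets how large a change counts as an improvement, and restore_best_weights rolls the model back to its best epoch when training stops. The sketch below tries them on synthetic data with random labels (an assumption chosen so that val_loss has little room to improve); the exact number of epochs run will vary.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 10)).astype("float32")
y = rng.integers(0, 2, size=(200,))      # random labels, nothing real to learn

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    layers.Dense(8, activation="relu"),
    layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=3,
    min_delta=1e-3,               # smaller changes do not count as improvement
    restore_best_weights=True,    # roll back to the best epoch when stopping
)

max_epochs = 50
history = model.fit(x, y, epochs=max_epochs, batch_size=32,
                    validation_split=0.2, callbacks=[early_stopping], verbose=0)

epochs_run = len(history.history["val_loss"])
print(f"ran {epochs_run} of {max_epochs} epochs")
```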

Summary

In this episode, we explored how to build neural networks using Keras. Keras is a powerful tool that, while simple, is highly flexible, allowing you to create anything from basic models to advanced ones with custom layers. Experiment with various network structures and training methods using Keras!

Next Episode Preview

Next time, we will cover the implementation of CNNs (Convolutional Neural Networks). CNNs are highly effective for image recognition, and we’ll learn the basics of building these models!


Annotations

  • Sequential API: A simple model definition method that stacks layers in sequence.
  • Functional API: An API that allows for flexible and customizable model structures.

Author of this article

PROMPT Inc. provides a variety of information related to generative AI.
If there is a topic you would like us to write an article about or research, please contact us using the inquiry form.
