[AI from Scratch] Episode 225: Implementing a Neural Network — Basic Neural Network Construction

Recap and Today’s Theme

Hello! In the previous episode, we discussed hyperparameter tuning to maximize model performance. By using grid search to find the best hyperparameters, we were able to significantly improve the model’s accuracy.

Today, we will explore the basics of constructing a neural network, a core aspect of AI development. Neural networks form the foundation of deep learning and are utilized in various domains such as image recognition, natural language processing, and predictive analytics. We will use Python’s Keras library to build a simple neural network and understand how it works. Let’s get started!

What Is a Neural Network?

A neural network is a machine learning model loosely inspired by the neurons in the human brain. It learns patterns from data and uses them to make predictions or classifications. The basic structure of a neural network consists of the following three types of layers:

  1. Input Layer: Where the features of the data are input.
  2. Hidden Layer: Processes inputs and extracts features. Usually, multiple hidden layers are stacked.
  3. Output Layer: Produces the final output of the model.

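Each neuron (node) in the hidden and output layers is connected to the neurons in the previous layer, and each connection carries a weight. A neuron computes a weighted sum of its inputs, adds a bias, and passes the result through an activation function. Training adjusts these weights and biases so that the network learns patterns from data and makes better predictions.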
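
To make the idea of weighted connections concrete, here is a minimal NumPy sketch of what a single fully connected layer computes. The function name dense_forward and all the numbers are made up for illustration; this is not how Keras implements it internally, only the same arithmetic on a toy scale.

```python
import numpy as np

def dense_forward(x, weights, bias):
    """Weighted sum of the inputs plus a bias, followed by ReLU."""
    z = x @ weights + bias
    return np.maximum(z, 0.0)

# Hypothetical toy values: 3 input features feeding 2 neurons.
x = np.array([1.0, 2.0, 3.0])
weights = np.array([[0.5, -1.0],
                    [0.25, 0.5],
                    [0.5, 0.0]])   # shape (3 inputs, 2 neurons)
bias = np.array([0.1, -0.2])

output = dense_forward(x, weights, bias)
print(output)  # first neuron: 2.6, second neuron: clipped to 0 by ReLU
```

The second neuron's weighted sum is negative (-0.2), so ReLU sets it to zero. Learning means nudging the entries of weights and bias until outputs like these become useful.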

Constructing a Neural Network with Keras

Keras is a convenient library for building neural networks in Python. It operates on top of TensorFlow and allows you to define, train, and evaluate neural networks with simple code.

1. Installing and Importing Required Libraries

First, install TensorFlow. Keras ships as part of TensorFlow, so no separate installation is needed.

pip install tensorflow

Next, import the necessary libraries.

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

2. Preparing the Dataset

For this example, we use the MNIST dataset included in Keras, which contains images of handwritten digits (0-9). It’s a great dataset for learning the basics of image classification.

# Loading the dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Normalizing the data (scaling pixel values from 0-255 to 0-1)
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255

# Reshaping the image data from 28x28 to a 1D vector (784)
x_train = x_train.reshape((-1, 784))
x_test = x_test.reshape((-1, 784))

# Converting labels to categorical (one-hot encoding)
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
  • Normalization: Scales pixel values (0-255) to a range between 0 and 1.
  • Reshaping: Converts 28×28 pixel images to 1D vectors of 784 dimensions.
  • One-Hot Encoding: Converts labels to one-hot vectors for a 10-class classification output (0-9).
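
If you want to see what these three steps do without downloading MNIST, the following sketch applies them to two randomly generated "images". The fake data and the use of np.eye for one-hot encoding are our own stand-ins for illustration; in the actual code above, keras.utils.to_categorical does the encoding.

```python
import numpy as np

# Two fake 28x28 grayscale images with pixel values 0-255 (stand-ins for MNIST).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(2, 28, 28)).astype(np.uint8)
labels = np.array([3, 7])

# Normalization: 0-255 integers become floats between 0 and 1.
scaled = images.astype("float32") / 255

# Reshaping: each 28x28 image becomes one row of 784 values.
flat = scaled.reshape((-1, 784))

# One-hot encoding: label k selects row k of a 10x10 identity matrix.
one_hot = np.eye(10, dtype="float32")[labels]

print(flat.shape)     # (2, 784)
print(one_hot.shape)  # (2, 10)
print(one_hot[0])     # 1.0 at index 3, zeros elsewhere
```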

3. Building the Neural Network

Next, we use Keras to build a simple neural network by defining the input layer, hidden layers, and output layer sequentially.

# Defining the model
model = keras.Sequential([
    keras.Input(shape=(784,)),               # Input: flattened 28x28 image
    layers.Dense(128, activation='relu'),    # Hidden layer 1
    layers.Dense(64, activation='relu'),     # Hidden layer 2
    layers.Dense(10, activation='softmax')   # Output layer
])
  • Sequential: Defines a model that stacks layers sequentially.
  • Dense: Defines a fully connected layer. The units parameter sets the number of neurons, and activation specifies the activation function.
  • ReLU (Rectified Linear Unit): A common activation function used in hidden layers to introduce non-linearity.
  • Softmax: Used in the output layer to output the probability distribution across classes.
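
It can help to work out how many trainable parameters this network has. A Dense layer holds one weight per input-to-neuron connection plus one bias per neuron. The helper dense_param_count below is a hypothetical name of our own, not a Keras function; calling model.summary() on the model above should report matching numbers.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Layer widths of the model above: 784 inputs -> 128 -> 64 -> 10 outputs.
layer_sizes = [784, 128, 64, 10]
counts = [dense_param_count(a, b)
          for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(counts)

print(counts)  # [100480, 8256, 650]
print(total)   # 109386
```

Most of the parameters sit in the first hidden layer, because every one of the 784 pixels connects to each of its 128 neurons.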

4. Compiling the Model

Compile the model by setting the loss function, optimizer, and evaluation metrics.

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
  • optimizer='adam': Adam is a popular optimizer that adapts the learning rate for each parameter and usually converges quickly.
  • loss='categorical_crossentropy': This loss function is used for multi-class classification problems with one-hot encoded labels.
  • metrics=['accuracy']: Evaluates model performance using accuracy.
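
To get a feel for what categorical cross-entropy measures, here is a small NumPy sketch of the formula for one-hot labels: the negative log of the probability assigned to the correct class. This is our own simplified version for illustration (the eps clipping value is an assumption), not the Keras implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample loss: -sum(y_true * log(y_pred)) over the classes."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# One sample whose true class is index 1 (3 classes for brevity).
y_true = np.array([[0.0, 1.0, 0.0]])
confident = np.array([[0.05, 0.90, 0.05]])  # high probability on the true class
unsure = np.array([[0.40, 0.30, 0.30]])     # low probability on the true class

loss_confident = categorical_crossentropy(y_true, confident)[0]
loss_unsure = categorical_crossentropy(y_true, unsure)[0]
print(f"{loss_confident:.3f} {loss_unsure:.3f}")
```

The confident prediction yields a much smaller loss than the unsure one, which is exactly the signal the optimizer uses to adjust the weights.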

5. Training the Model

Train the model using the training data, holding out part of it for validation. The test data is kept aside for the evaluation step.

history = model.fit(x_train, y_train, epochs=10, batch_size=32, validation_split=0.2)
  • epochs=10: Trains the model for 10 complete passes over the training data.
  • batch_size=32: Uses 32 samples per gradient update.
  • validation_split=0.2: Reserves 20% of the training data for validation. This portion is not used to update the weights.
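
These three settings together determine how many weight updates happen during training. The arithmetic below is a back-of-the-envelope sketch; the only number not taken from the call above is 60,000, the size of the MNIST training set.

```python
import math

n_samples = 60_000        # images in the MNIST training set
validation_split = 0.2
batch_size = 32
epochs = 10

# Samples actually used for weight updates vs. held out for validation.
n_train = int(n_samples * (1 - validation_split))
n_val = n_samples - n_train

# One gradient update per batch; a final partial batch still counts.
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print(n_train, n_val)     # 48000 12000
print(steps_per_epoch)    # 1500
print(total_updates)      # 15000
```

Note that Keras takes the validation samples from the end of the arrays before shuffling, so the split is the same on every run.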

6. Evaluating the Model

After training, evaluate the model’s performance using the test data.

test_loss, test_accuracy = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_accuracy:.2f}")
  • evaluate(): Calculates the loss and accuracy on the test data.
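
The accuracy that evaluate() reports is simply the fraction of samples whose most probable predicted class matches the true class. As a rough sketch of that calculation, with a helper name and toy values of our own choosing:

```python
import numpy as np

def accuracy(y_true_one_hot, y_prob):
    """Fraction of rows where the most probable class is the true class."""
    true_labels = np.argmax(y_true_one_hot, axis=1)
    predicted_labels = np.argmax(y_prob, axis=1)
    return float(np.mean(true_labels == predicted_labels))

# Four hypothetical samples over 3 classes; true labels are 0, 1, 2, 1.
y_true = np.eye(3)[[0, 1, 2, 1]]
y_prob = np.array([
    [0.7, 0.2, 0.1],   # predicts 0: correct
    [0.1, 0.8, 0.1],   # predicts 1: correct
    [0.5, 0.3, 0.2],   # predicts 0: wrong, the true class is 2
    [0.2, 0.6, 0.2],   # predicts 1: correct
])

acc = accuracy(y_true, y_prob)
print(acc)  # 0.75
```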

7. Making Predictions

Finally, use the model to make predictions on the test data.

predictions = model.predict(x_test)

# Displaying the first 5 predictions
for i in range(5):
    print(f"Actual: {np.argmax(y_test[i])}, Predicted: {np.argmax(predictions[i])}")
  • predict(): Makes predictions using the test data.
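  • np.argmax(): Returns the index of the largest value. Applied to a row of predictions (a vector of 10 class probabilities), it gives the most likely digit; applied to a one-hot label, it recovers the original class label.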
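
Instead of looping, you can also process a whole batch of predictions at once by passing axis=1, and read off how confident the model was. The probabilities below are invented for illustration (3 samples, 4 classes) and stand in for the output of model.predict().

```python
import numpy as np

# Hypothetical softmax outputs: one row per sample, one column per class.
probs = np.array([
    [0.10, 0.70, 0.15, 0.05],
    [0.25, 0.25, 0.30, 0.20],
    [0.90, 0.05, 0.03, 0.02],
])

predicted = np.argmax(probs, axis=1)  # most likely class per row
confidence = np.max(probs, axis=1)    # probability of that class

for label, conf in zip(predicted, confidence):
    print(f"Predicted class {label} with probability {conf:.2f}")
```

The second sample is worth a look: the model still picks a class, but with a probability of only 0.30, so the prediction is close to a guess.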

Summary

In this episode, we explained how to build a basic neural network using Keras. From defining the layers to training and evaluating the model, you should now understand the fundamental flow of constructing a neural network. Keras’s simple API makes building complex models easy, so try experimenting with other datasets!

Next Episode Preview

Next time, we will introduce TensorFlow Basics, exploring the deep learning framework TensorFlow to build more advanced neural networks. Stay tuned!


Annotations

  • ReLU: Stands for Rectified Linear Unit, a widely used activation function in neural networks to introduce non-linearity in hidden layers.
  • Softmax: An activation function used in the output layer to provide a probability distribution across classes, commonly used in multi-class classification tasks.
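
Both activation functions are short enough to write out by hand. The NumPy versions below are a sketch for understanding, not the Keras implementations; subtracting the maximum inside softmax is a common trick to keep the exponentials numerically stable.

```python
import numpy as np

def relu(z):
    """Keep positive values, replace negative values with 0."""
    return np.maximum(z, 0.0)

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = z - np.max(z, axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

z = np.array([-2.0, 0.0, 3.0])
relu_out = relu(z)
softmax_out = softmax(z)

print(relu_out)           # [0. 0. 3.]
print(softmax_out.sum())  # approximately 1.0
```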
Author of this article

PROMPT Inc. provides a variety of information related to generative AI.
If there is a topic you would like us to write an article about or research, please contact us using the inquiry form.
