Creating a Perceptron in PyTorch Lightning

Keywords: #ml #PyTorchLightning #pinned

Introduction

In the grand scheme of things, I am relatively new to ML and particularly neural networks. Like all good neural network students that have come before me, I started by learning about a single perceptron. The perceptron is the most basic of neural network components. A perceptron accepts an input of size \(n\) and has \(n + 1\) weights, where the extra weight handles a bias (think of the bias as a permanent input contributing \(1 \times weight_{bias}\), or, if you prefer to think visually, the bias represents \(c\) in the equation \(y = mx + c\)).
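
Putting that into a formula, a perceptron with inputs \(x_1, \dots, x_n\), weights \(w_1, \dots, w_n\) and the extra bias weight computes:

\[ y = \left( \sum_{i=1}^{n} w_i x_i \right) + 1 \times weight_{bias} \]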

PyTorch Lightning

PyTorch Lightning is a lightweight wrapper around PyTorch which aims to make PyTorch code easier to read, more concise and quicker to write. If you are already familiar with PyTorch but not PyTorch Lightning, you will be pleasantly surprised by its simplicity; if you are not familiar with PyTorch, you should find PyTorch Lightning code easier to pick up and understand for the first time.

Code

My aim was to create as simple an example as possible. For this reason, I settled on creating a perceptron with one input-weight pair. Its purpose: to double the input.

To start, here are some of the imports I will be using:

import torch
from torch import nn                                    # layers (Linear) and losses (MSELoss)
from torch.utils.data import DataLoader, TensorDataset  # data handling utilities
import pytorch_lightning as pl
from pytorch_lightning import loggers as pl_loggers     # optional logging helpers

Creating the Dataset

The perceptron needs to learn from data, hence the need to create a dataset. On the first line, I define train_size. This variable describes the amount of data we want to train our model on; since we are building an extremely simple model, this number can be relatively small. The dataset itself is simple: we want the model to double its inputs, so we give it numbers as inputs and tell it to compare its answers against double those inputs. On the second line, I use a Python list comprehension and PyTorch's randn function to generate a list of pairs where the first component is the input and the second is the expected answer (\(2 \times input\) in our case). The final lines simplify the later stages: we convert the pairs into a tensor and split it into an input column and a target column to create a TensorDataset, which will allow us to pass the data easily to a trainer via a DataLoader.

train_size = 20     # Size of training set
train_X = [[i, 2 * i] for i in torch.randn(train_size, dtype=torch.double)]
data = torch.tensor(train_X)                        # shape: (train_size, 2)
dataset = TensorDataset(data[:, :1], data[:, 1:])   # (input, target) pairs
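
To see what the model will be fed, you can index the dataset directly (the exact numbers will vary, since the inputs are random):

x0, y0 = dataset[0]   # each item is an (input, target) pair of 1-element tensors
print(x0, y0)         # e.g. tensor([0.4963], ...) and tensor([0.9926], ...)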

PyTorch Lightning Model

The class Times2Model, outlined in the code snippet below, defines our PyTorch model. PyTorch Lightning allows us to create a PyTorch model by implementing the methods shown in the snippet.

torch.set_default_dtype(torch.double)

class Times2Model(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(1, 1)     # a single perceptron: one input, one output
        self.criterion = nn.MSELoss()  # mean squared error as the loss function

    def forward(self, x):
        return self.fc1(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = self.criterion(y_hat, y)
        return loss

    def configure_optimizers(self):
        # Use self.parameters() so the optimiser is bound to this model instance
        return torch.optim.SGD(self.parameters(), lr=0.01, momentum=0.9)

Init Function

In the initialisation function we declare our model architecture, using PyTorch to provide the fully connected (Linear) layer along with the criterion we will use as our loss function. Since we are creating a lone perceptron, our fully connected layer consists of just one node. Our perceptron is learning to double a single given input, so the layer needs just that: one input along with a single output (hence the (1, 1) pair passed to the Linear layer).
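
As a quick sanity check (a standalone sketch, not part of the original script), you can confirm that such a layer holds exactly two learnable values, one weight and one bias:

layer = nn.Linear(1, 1)
print(layer.weight.shape)  # torch.Size([1, 1]) -- the single weight
print(layer.bias.shape)    # torch.Size([1])    -- the single bias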

Feed Forward Function

In forward(self, x), we need to define what happens when the model receives an input. Since we have such a simple model, we just pass the input, x, through our fully connected perceptron and return the output.

Note that this method defines the behaviour of passing input to the model directly, as in model(x).
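
For example (an illustrative call, assuming a trained instance named model like the one created later), doubling a new value is just a direct call with a 1-element tensor:

x = torch.tensor([3.0], dtype=torch.double)
print(model(x))  # once trained, this should be close to tensor([6.])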

Training Step

In the training step, the goal is to make a prediction with our model, calculate the error between that prediction and the expected value, and then return the error so that the model can adjust its weights accordingly.
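
Returning the loss is all Lightning needs: under the hood it then performs, roughly, the familiar PyTorch update (a simplified sketch, not Lightning's actual source):

optimizer.zero_grad()  # clear gradients from the previous step
loss.backward()        # compute gradients of the loss w.r.t. the weights
optimizer.step()       # adjust the weights using those gradients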

Optimiser Configuration

In this function we define which optimiser the model should use. This optimiser then controls how the weights are adjusted, according to the parameters provided.
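
Any optimiser from torch.optim can be returned here. For example, swapping SGD for Adam (an alternative, not what this post uses) would look like:

    # Inside Times2Model, replacing the SGD version:
    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=0.01)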

Trainer
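
With the model defined, training takes just four lines: create a Trainer (capped here at five epochs), instantiate the model, wrap our dataset in a DataLoader and hand both to fit, which runs the training loop for us.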

trainer = pl.Trainer(max_epochs=5)
model = Times2Model()
train_loader = DataLoader(dataset)  # wrap the TensorDataset created earlier
trainer.fit(model, train_loader)

Manual Evaluation

for i in torch.randn(5, dtype=torch.double):
    print(f"Expected: {2*i}")
    print(f"Actual: {model(torch.tensor([i]))[0]}")

The above snippet generates 5 random samples on which to test our trained model. It then prints the expected value (found by simply multiplying by 2) followed by the value output by the model.
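
Since these predictions do not need gradients, you could also wrap the loop in torch.no_grad() (a small, optional refinement):

with torch.no_grad():
    for i in torch.randn(5, dtype=torch.double):
        print(f"Expected: {2*i}")
        print(f"Actual: {model(torch.tensor([i]))[0]}")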

for weight in model.parameters():
    print(weight)

Finally, in this snippet, I print out the single weight (which should be extremely close to two if we have done things correctly!) and the bias (which should be very close to zero).
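
If you just want the two numbers, you can also read them straight off the layer (a convenience sketch):

print(f"Weight: {model.fc1.weight.item()}")  # should be very close to 2
print(f"Bias: {model.fc1.bias.item()}")      # should be very close to 0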

Complete File

View the complete Python script (hosted on GitHub) below:


I hope you enjoyed reading this blog post!