Neural Networks
Rock Paper Scissors Hand Pose Classification
This tutorial shows how to train and test a custom PyTorch-based neural network model built with DFFML. The dataset we will be using is the rock-paper-scissors-dataset, which contains images of hands in rock, paper, and scissors poses. Each image is a 300x300 RGB image.
The model we’ll be using is PyTorchNeuralNetwork, which is part of dffml-model-pytorch, a DFFML plugin that allows you to use PyTorch via DFFML. We can install it with pip. We will also be using image loading from dffml-config-image and YAML file loading from dffml-config-yaml to create our neural network.
$ pip install -U dffml-model-pytorch dffml-config-image dffml-config-yaml
(.venv) C:\Users\username> python -m pip install -U dffml-model-pytorch dffml-config-image dffml-config-yaml -f https://download.pytorch.org/whl/torch_stable.html
Download the dataset and verify it with sha384sum.
$ curl -LO https://storage.googleapis.com/laurencemoroney-blog.appspot.com/\{rps,rps-test-set,rps-validation\}.zip
$ sha384sum -c - << EOF
c6a9119b0c6a0907b782bd99e04ce09a0924c0895df6a26bc6fb06baca4526f55e51f7156cceb4791cc65632d66085e8 rps.zip
fc45a0ebe58b9aafc3cd5a60020fa042d3a19c26b0f820aee630b9602c8f53dd52fd40f35d44432dd031dea8f30a5f66 rps-test-set.zip
375457bb95771ffeace2beedab877292d232f31e76502618d25e0d92a3e029d386429f52c771b05ae1c7229d2f5ecc29 rps-validation.zip
EOF
rps.zip: OK
rps-test-set.zip: OK
rps-validation.zip: OK
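On systems where sha384sum is not available (a stock Windows shell, for example), the same check can be sketched with Python's standard hashlib module. The helper names sha384_of and verify are ours and are not part of DFFML; the digests are copied from the command above.

```python
import hashlib
from pathlib import Path

# Expected digests, copied from the sha384sum input above.
EXPECTED = {
    "rps.zip": "c6a9119b0c6a0907b782bd99e04ce09a0924c0895df6a26bc6fb06baca4526f55e51f7156cceb4791cc65632d66085e8",
    "rps-test-set.zip": "fc45a0ebe58b9aafc3cd5a60020fa042d3a19c26b0f820aee630b9602c8f53dd52fd40f35d44432dd031dea8f30a5f66",
    "rps-validation.zip": "375457bb95771ffeace2beedab877292d232f31e76502618d25e0d92a3e029d386429f52c771b05ae1c7229d2f5ecc29",
}


def sha384_of(path, chunk_size=1 << 20):
    """Return the hex SHA-384 digest of a file, reading it in chunks."""
    digest = hashlib.sha384()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each archive name to True/False, or None if the file is absent."""
    results = {}
    for name, expected in EXPECTED.items():
        path = Path(directory) / name
        results[name] = sha384_of(path) == expected if path.exists() else None
    return results


if __name__ == "__main__":
    for name, ok in verify().items():
        status = "missing" if ok is None else "OK" if ok else "FAILED"
        print(f"{name}: {status}")
```

Reading in chunks keeps memory use flat regardless of archive size.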
Extract the datasets.
$ unzip rps.zip
$ unzip rps-test-set.zip
$ unzip rps-validation.zip -d rps-predict
The dataset for training the model will be in the rps directory. The dataset for testing the model will be in the rps-test-set directory. The images we will be using for prediction on the neural network will be in the rps-predict directory.
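Before training, it can help to confirm that the archives unpacked as expected. The sketch below counts the images per class; it assumes that rps and rps-test-set each contain one subdirectory per label (rock, paper, scissors) holding .png files, which is our reading of the label-based directory source used later, so adjust it if your layout differs. The function name count_images is hypothetical.

```python
from pathlib import Path

LABELS = ("rock", "paper", "scissors")


def count_images(root, labels=LABELS, suffix=".png"):
    """Count files ending in `suffix` inside each `root/<label>` directory."""
    counts = {}
    for label in labels:
        folder = Path(root) / label
        if folder.is_dir():
            counts[label] = sum(
                1 for path in folder.iterdir() if path.suffix.lower() == suffix
            )
        else:
            # Assumed layout not found; report zero rather than raising.
            counts[label] = 0
    return counts


if __name__ == "__main__":
    for root in ("rps", "rps-test-set"):
        print(root, count_images(root))
```

A class with a count of zero usually means the archive was extracted somewhere else or the directory layout is not the one assumed here.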
Now that we have our dataset ready, we can perform classification of the hand poses to predict whether it is rock, paper or scissors!
We first create the neural network. The neural network can be created in two ways using DFFML:
- By creating a dictionary of layers in YAML or JSON format and passing the file via the CLI (e.g. @model.yaml).
- By using the torch module to create the model and passing an instance of the network to the model config.
Command Line
We first create a YAML file that defines the neural network. It holds all the information about the layers, along with the forward method, which is passed as a list of layers under the model name key:
model.yaml
model:
  conv1:
    layer_type: Conv2d
    in_channels: 3
    out_channels: 32
    kernel_size: 5
    padding: 2
  conv2:
    layer_type: Conv2d
    in_channels: 32
    out_channels: 32
    kernel_size: 3
    padding: 1
  conv3:
    layer_type: Conv2d
    in_channels: 32
    out_channels: 16
    kernel_size: 3
    padding: 1
  relu:
    layer_type: ReLU
  pooling:
    layer_type: MaxPool2d
    kernel_size: 2
  linear:
    layer_type: Linear
    in_features: 1296
    out_features: 3
forward:
  model:
    # block 1
    # Image dimensions at the beginning: torch.Size([batch_size, 3, 150, 150])
    - conv1
    - relu
    - pooling
    # block 2
    # Image dimensions after block 1: torch.Size([batch_size, 32, 75, 75])
    - conv2
    - relu
    - pooling
    # block 3
    # Image dimensions after block 2: torch.Size([batch_size, 32, 37, 37])
    - conv2
    - relu
    - pooling
    # block 4
    # Image dimensions after block 3: torch.Size([batch_size, 32, 18, 18])
    - conv3
    - relu
    - pooling
    # fully connected layer
    # Image dimensions after block 4: torch.Size([batch_size, 16, 9, 9])
    # The `Linear` layer expects each sample as a 1D Tensor (2D once the batch dimension
    # is counted), so the incoming Tensor has to be flattened. torch's `view` method
    # reshapes it to torch.Size([batch_size, 16*9*9]). Passing -1 as the first dimension
    # lets torch infer the batch size from the remaining dimension, so it is not hard-coded
    # in the network. That is good practice, as we might want to change the batch size later.
    # 16*9*9 = 1296 is what the Linear layer is fed as "in_features: 1296".
    - view:
        - -1
        - 1296
    - linear
To learn more about tensor views, see the Tensor Views page in the PyTorch documentation.
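The value in_features: 1296 follows from the layer parameters and can be checked by hand. The sketch below applies the standard output-size formulas for Conv2d and MaxPool2d (stride 1 convolutions, dilation 1, pooling with stride equal to its kernel size, floor rounding) to a 150x150 input with 3 channels. The helper names are ours; no PyTorch import is needed.

```python
def conv2d_out(size, kernel_size, padding=0, stride=1):
    """Spatial output size of a square Conv2d with dilation 1."""
    return (size + 2 * padding - kernel_size) // stride + 1


def maxpool2d_out(size, kernel_size, stride=None):
    """Spatial output size of a square MaxPool2d (no padding, floor mode)."""
    stride = kernel_size if stride is None else stride
    return (size - kernel_size) // stride + 1


# (out_channels, kernel_size, padding) for conv1, conv2, conv2, conv3,
# in the order they appear in the forward method.
BLOCKS = [(32, 5, 2), (32, 3, 1), (32, 3, 1), (16, 3, 1)]


def trace_shapes(image_size=150, in_channels=3, blocks=BLOCKS):
    """Return the (channels, height, width) shape before and after each block."""
    shapes = [(in_channels, image_size, image_size)]
    size = image_size
    for out_channels, kernel_size, padding in blocks:
        size = conv2d_out(size, kernel_size, padding)  # convolution
        size = maxpool2d_out(size, kernel_size=2)      # ReLU keeps the shape
        shapes.append((out_channels, size, size))
    return shapes


shapes = trace_shapes()
channels, height, width = shapes[-1]
in_features = channels * height * width
print(shapes)       # [(3, 150, 150), (32, 75, 75), (32, 37, 37), (32, 18, 18), (16, 9, 9)]
print(in_features)  # 1296
```

If you change imageSize or any kernel, padding, or pooling parameter, rerun the arithmetic and update both in_features and the second argument of view.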
See also
Sequential layers can also be created by indenting the layers under a key! The layers defined inside a Sequential layer can be used again when defining the forward method with the following syntax: - block1.conv1
More information about PyTorch’s Sequential layers and the other layers used here can be found in the official PyTorch documentation for the torch.nn module.
An example of creating Sequential Layers would be:
example_model:
  block1:
    ...
  # One of the many Sequential layers in example_model
  block2:
    conv2:
      layer_type: Conv2d
      in_channels: 32
      out_channels: 16
      kernel_size: 3
      padding: 1
    relu:
      layer_type: ReLU
    maxpooling:
      layer_type: MaxPool2d
      kernel_size: 2
  block3:
    ...
  linear:
    ...
forward:
  model:
    - block1
    - block2
    - block3
    - block1.conv1 # Re-using a single layer inside another `Sequential Layer`
    - block2.maxpooling
    - view:
        - -1
        - 1296
    - linear
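To make the dotted syntax concrete, here is a minimal sketch of how a name such as block2.maxpooling could be looked up in the parsed file. It illustrates the idea only and is not DFFML's implementation; resolve_layer is a hypothetical helper, and the dictionary is a trimmed stand-in for the parsed YAML, written with the layer_type key used in model.yaml.

```python
def resolve_layer(network, dotted_name):
    """Follow a dotted name such as 'block2.conv2' through nested dicts."""
    node = network
    for part in dotted_name.split("."):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(f"unknown layer: {dotted_name!r}")
        node = node[part]
    return node


# Trimmed stand-in for the parsed file (block1 and block3 are elided there).
example_model = {
    "block2": {
        "conv2": {
            "layer_type": "Conv2d",
            "in_channels": 32,
            "out_channels": 16,
            "kernel_size": 3,
            "padding": 1,
        },
        "relu": {"layer_type": "ReLU"},
        "maxpooling": {"layer_type": "MaxPool2d", "kernel_size": 2},
    },
}

print(resolve_layer(example_model, "block2.maxpooling"))
```

A name without a dot resolves to a whole top-level entry, which is how a complete Sequential block or a single layer would be referenced.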
Note
If the forward method is not specified in the YAML file, it is automatically created by appending the top level layers (Sequential or Single) sequentially in the order they were defined in the file.
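The fallback described in the note can be pictured as follows. This is a sketch of the behaviour as stated, not the plugin's code; forward_order is a hypothetical name, and it relies on Python dictionaries preserving insertion order (guaranteed since Python 3.7).

```python
def forward_order(layers, forward=None):
    """Return the explicit forward list if given, else the top-level
    layer names in the order they were defined."""
    if forward is not None:
        return list(forward)
    return list(layers)


# Top-level entries in definition order (parameters omitted for brevity).
layers = {"block1": {}, "block2": {}, "linear": {}}

print(forward_order(layers))
print(forward_order(layers, forward=["block1", "linear"]))
```

Note that a default order like this cannot reuse a layer or insert a view step, so a network such as the one in model.yaml still needs an explicit forward method.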
Train the model.
$ dffml train \
    -model pytorchnet \
    -model-features image:int:$((300*300*3)) \
    -model-clstype str \
    -model-classifications rock paper scissors \
    -model-predict label:int:1 \
    -model-network @model.yaml \
    -model-location rps_model \
    -model-loss crossentropyloss \
    -model-optimizer Adam \
    -model-validation_split 0.2 \
    -model-epochs 10 \
    -model-batch_size 32 \
    -model-imageSize 150 \
    -model-enableGPU \
    -model-patience 2 \
    -sources f=dir \
    -source-foldername rps \
    -source-feature image \
    -source-labels rock paper scissors \
    -log debug
INFO:dffml.PyTorchNeuralNetworkContext:Training complete in 1m 42s
INFO:dffml.PyTorchNeuralNetworkContext:Best Validation Accuracy: 1.000000
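The -model-epochs and -model-patience options together bound how long training runs. The sketch below shows the usual meaning of patience-based early stopping: stop once the validation loss has failed to improve for a given number of consecutive epochs. It is an illustration under that assumption, not the plugin's implementation (its log message also mentions stopping when maximum accuracy is attained, which is not modelled here), and the loss values are made up.

```python
def train_with_patience(validation_losses, patience=2):
    """Return (epochs_run, best_epoch) for per-epoch validation losses.

    Stops after `patience` consecutive epochs without improvement, or
    when the losses run out (the epoch budget).
    """
    best_loss = float("inf")
    best_epoch = 0
    stale_epochs = 0
    epochs_run = 0
    for epoch, loss in enumerate(validation_losses, start=1):
        epochs_run = epoch
        if loss < best_loss:
            best_loss, best_epoch, stale_epochs = loss, epoch, 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                break
    return epochs_run, best_epoch


# Hypothetical validation losses, one per epoch, for illustration only.
losses = [0.70, 0.40, 0.35, 0.36, 0.37, 0.20]
print(train_with_patience(losses, patience=2))  # (5, 3)
```

With patience=2 the loop stops after epoch 5 and never sees the better loss at epoch 6, which is the trade-off a small patience value makes for shorter training time.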
Assess the model’s accuracy.
$ dffml accuracy \
    -model pytorchnet \
    -model-features image:int:$((300*300*3)) \
    -model-clstype str \
    -model-classifications rock paper scissors \
    -model-predict label:int:1 \
    -model-network @model.yaml \
    -model-location rps_model \
    -model-imageSize 150 \
    -model-enableGPU \
    -features label:int:1 \
    -sources f=dir \
    -source-foldername rps-test-set \
    -source-feature image \
    -source-labels rock paper scissors \
    -scorer pytorchscore
The output is:
0.8763440860215054
Predict with the trained model.
$ dffml predict all \
    -model pytorchnet \
    -model-features image:int:$((300*300*3)) \
    -model-clstype str \
    -model-classifications rock paper scissors \
    -model-predict label:int:1 \
    -model-network @model.yaml \
    -model-location rps_model \
    -model-imageSize 150 \
    -model-enableGPU \
    -sources f=dir \
    -source-foldername rps-predict \
    -source-feature image \
    -pretty
Some of the Predictions:
Key: scissors7.png
Record Features
+----------------------------------------------------------------------------------------------------------------------------------------------+
| image | [[253, 253, 253], [254, 254, 254], [254, 254, ... (length:300) |
+----------------------------------------------------------------------------------------------------------------------------------------------+
Prediction
+----------------------------------------------------------------------------------------------------------------------------------------------+
| label |
+----------------------------------------------------------------------------------------------------------------------------------------------+
| Value: scissors | Confidence: 0.9904084205627441 |
+----------------------------------------------------------------------------------------------------------------------------------------------+
Key: rock8.png
Record Features
+----------------------------------------------------------------------------------------------------------------------------------------------+
| image | [[254, 254, 254], [253, 253, 253], [253, 253, ... (length:300) |
+----------------------------------------------------------------------------------------------------------------------------------------------+
Prediction
+----------------------------------------------------------------------------------------------------------------------------------------------+
| label |
+----------------------------------------------------------------------------------------------------------------------------------------------+
| Value: rock | Confidence: 1.0 |
+----------------------------------------------------------------------------------------------------------------------------------------------+
Key: paper4.png
Record Features
+----------------------------------------------------------------------------------------------------------------------------------------------+
| image | [[254, 254, 254], [254, 254, 254], [253, 253, ... (length:300) |
+----------------------------------------------------------------------------------------------------------------------------------------------+
Prediction
+----------------------------------------------------------------------------------------------------------------------------------------------+
| label |
+----------------------------------------------------------------------------------------------------------------------------------------------+
| Value: paper | Confidence: 0.9904376864433289 |
+----------------------------------------------------------------------------------------------------------------------------------------------+
Key: paper-hires1.png
Record Features
+----------------------------------------------------------------------------------------------------------------------------------------------+
| image | [[253, 253, 253], [253, 253, 253], [252, 252, ... (length:900) |
+----------------------------------------------------------------------------------------------------------------------------------------------+
Prediction
+----------------------------------------------------------------------------------------------------------------------------------------------+
| label |
+----------------------------------------------------------------------------------------------------------------------------------------------+
| Value: paper | Confidence: 0.8567885756492615 |
+----------------------------------------------------------------------------------------------------------------------------------------------+
Key: scissors-hires2.png
Record Features
+----------------------------------------------------------------------------------------------------------------------------------------------+
| image | [[253, 253, 253], [253, 253, 253], [253, 253, ... (length:900) |
+----------------------------------------------------------------------------------------------------------------------------------------------+
Prediction
+----------------------------------------------------------------------------------------------------------------------------------------------+
| label |
+----------------------------------------------------------------------------------------------------------------------------------------------+
| Value: scissors | Confidence: 0.9906457662582397 |
+----------------------------------------------------------------------------------------------------------------------------------------------+
Key: rock-hires1.png
Record Features
+----------------------------------------------------------------------------------------------------------------------------------------------+
| image | [[254, 254, 254], [253, 253, 253], [251, 251, ... (length:900) |
+----------------------------------------------------------------------------------------------------------------------------------------------+
Prediction
+----------------------------------------------------------------------------------------------------------------------------------------------+
| label |
+----------------------------------------------------------------------------------------------------------------------------------------------+
| Value: rock | Confidence: 1.0 |
+----------------------------------------------------------------------------------------------------------------------------------------------+
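A common way to turn the three outputs of the final Linear layer into a label and a confidence is to apply softmax and take the largest probability. Whether the plugin computes its confidence exactly this way is not stated here, so treat the sketch below as an assumption; the logits are made up and the function names are ours.

```python
import math

CLASSES = ["rock", "paper", "scissors"]


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1 (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def classify(logits, classes=CLASSES):
    """Return (label, confidence) for the highest-probability class."""
    probabilities = softmax(logits)
    best = max(range(len(classes)), key=probabilities.__getitem__)
    return classes[best], probabilities[best]


# Hypothetical logits for one image, for illustration only.
label, confidence = classify([0.5, 4.0, -1.0])
print(label, round(confidence, 4))
```

When one logit dominates the others by a wide margin, the remaining probabilities underflow and the confidence rounds to exactly 1.0 in floating point, which is one way a value such as Confidence: 1.0 can arise.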
Python API
import asyncio

import torch.nn as nn

from dffml import train, score, predict, DirectorySource, Features, Feature
from dffml_model_pytorch import PyTorchNeuralNetwork, CrossEntropyLossFunction
from dffml_model_pytorch.pytorch_accuracy_scorer import PytorchAccuracy


# Define the Neural Network
class ConvNet(nn.Module):
    """
    Convolutional Neural Network to classify hand gestures in an image as rock, paper or scissors
    """

    def __init__(self):
        super(ConvNet, self).__init__()
        self.conv1 = nn.Conv2d(
            in_channels=3, out_channels=32, kernel_size=5, padding=2
        )
        self.conv2 = nn.Conv2d(
            in_channels=32, out_channels=32, kernel_size=3, padding=1
        )
        self.conv3 = nn.Conv2d(
            in_channels=32, out_channels=16, kernel_size=3, padding=1
        )
        self.relu = nn.ReLU()
        self.pooling = nn.MaxPool2d(kernel_size=2)
        self.linear = nn.Linear(in_features=16 * 9 * 9, out_features=3)

    def forward(self, x):
        # block 1
        x = self.pooling(self.relu(self.conv1(x)))
        # block 2
        x = self.pooling(self.relu(self.conv2(x)))
        # block 3 (reuses conv2, so blocks 2 and 3 share weights)
        x = self.pooling(self.relu(self.conv2(x)))
        # block 4
        x = self.pooling(self.relu(self.conv3(x)))
        # fully connected layer; flatten to [batch_size, 16 * 9 * 9] first
        x = self.linear(x.view(-1, 16 * 9 * 9))
        return x


RockPaperScissorsModel = ConvNet()
Loss = CrossEntropyLossFunction()

# Define the dffml model config
model = PyTorchNeuralNetwork(
    classifications=["rock", "paper", "scissors"],
    # 300x300 RGB images, matching image:int:$((300*300*3)) on the command line
    features=Features(Feature("image", int, 300 * 300 * 3)),
    predict=Feature("label", int, 1),
    location="rps_model",
    network=RockPaperScissorsModel,
    epochs=10,
    batch_size=32,
    imageSize=150,
    validation_split=0.2,
    loss=Loss,
    optimizer="Adam",
    enableGPU=True,
    patience=2,
)

# Define source for training image dataset
train_source = DirectorySource(
    foldername="rps", feature="image", labels=["rock", "paper", "scissors"],
)

# Define source for testing image dataset
test_source = DirectorySource(
    foldername="rps-test-set",
    feature="image",
    labels=["rock", "paper", "scissors"],
)

# Define source for prediction image dataset
predict_source = DirectorySource(foldername="rps-predict", feature="image",)


async def main():
    # Train the model
    await train(model, train_source)

    # Assess the accuracy
    scorer = PytorchAccuracy()
    acc = await score(model, scorer, test_source)
    print("\nTesting Accuracy: ", acc)

    # Make Predictions
    print(
        "\n{:>40} \t {:>10} \t {:>10}\n".format(
            "Image filename", "Prediction", "Confidence"
        )
    )
    async for key, features, prediction in predict(model, predict_source):
        print(
            "{:>40} \t {:>10} \t {:>10}".format(
                "rps-predict/" + key,
                prediction["label"]["value"],
                prediction["label"]["confidence"],
            )
        )


if __name__ == "__main__":
    asyncio.run(main())
The output will be similar to the following (exact losses, accuracies, and timings vary between runs):
DEBUG:dffml.util.AsyncContextManagerList.Sources:Entering: DirectorySource(DirectorySourceConfig(foldername=PosixPath('rps'), feature='image', labels=['rock', 'paper', 'scissors'], save=None))
DEBUG:dffml.PNGConfigLoader:BaseConfig()
DEBUG:dffml.DirectorySource:DirectorySource(DirectorySourceConfig(foldername=PosixPath('rps'), feature='image', labels=['rock', 'paper', 'scissors'], save=None)) loaded 2520 records
DEBUG:dffml.util.AsyncContextManagerListContext.SourcesContext:Entering context: <dffml.source.memory.MemorySourceContext object at 0x7f0fe9170c50>
DEBUG:dffml.PyTorchNeuralNetworkContext:cids(3): {0: 'paper', 1: 'rock', 2: 'scissors'}
DEBUG:dffml.PyTorchNeuralNetworkContext:classifications(3): {'paper': 0, 'rock': 1, 'scissors': 2}
DEBUG:dffml.PyTorchNeuralNetworkContext:Loading model with classifications(3): {'paper': 0, 'rock': 1, 'scissors': 2}
DEBUG:dffml.PyTorchNeuralNetworkContext:Model Summary
ConvNet(
(conv1): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(conv2): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv3): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(relu): ReLU()
(pooling): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(linear): Linear(in_features=1296, out_features=3, bias=True)
)
DEBUG:dffml.PyTorchNeuralNetworkContext:Training on features: ['image']
INFO:dffml.PyTorchNeuralNetworkContext:------ Record Data ------
INFO:dffml.PyTorchNeuralNetworkContext:x_cols: 2520
INFO:dffml.PyTorchNeuralNetworkContext:y_cols: 2520
INFO:dffml.PyTorchNeuralNetworkContext:-----------------------
INFO:dffml.PyTorchNeuralNetworkContext:Data split into Training samples: 2016 and Validation samples: 504
DEBUG:dffml.PyTorchNeuralNetworkContext:Epoch 1/10
DEBUG:dffml.PyTorchNeuralNetworkContext:----------
DEBUG:dffml.PyTorchNeuralNetworkContext:Training Loss: 1.0265 Acc: 0.4484
DEBUG:dffml.PyTorchNeuralNetworkContext:Validation Loss: 0.6922 Acc: 0.7778
DEBUG:dffml.PyTorchNeuralNetworkContext:
DEBUG:dffml.PyTorchNeuralNetworkContext:Epoch 2/10
DEBUG:dffml.PyTorchNeuralNetworkContext:----------
DEBUG:dffml.PyTorchNeuralNetworkContext:Training Loss: 0.3382 Acc: 0.8988
DEBUG:dffml.PyTorchNeuralNetworkContext:Validation Loss: 0.1446 Acc: 0.9722
DEBUG:dffml.PyTorchNeuralNetworkContext:
DEBUG:dffml.PyTorchNeuralNetworkContext:Epoch 3/10
DEBUG:dffml.PyTorchNeuralNetworkContext:----------
DEBUG:dffml.PyTorchNeuralNetworkContext:Training Loss: 0.0663 Acc: 0.9851
DEBUG:dffml.PyTorchNeuralNetworkContext:Validation Loss: 0.0531 Acc: 0.9901
DEBUG:dffml.PyTorchNeuralNetworkContext:
DEBUG:dffml.PyTorchNeuralNetworkContext:Epoch 4/10
DEBUG:dffml.PyTorchNeuralNetworkContext:----------
DEBUG:dffml.PyTorchNeuralNetworkContext:Training Loss: 0.0332 Acc: 0.9940
DEBUG:dffml.PyTorchNeuralNetworkContext:Validation Loss: 0.0261 Acc: 0.9940
DEBUG:dffml.PyTorchNeuralNetworkContext:
DEBUG:dffml.PyTorchNeuralNetworkContext:Epoch 5/10
DEBUG:dffml.PyTorchNeuralNetworkContext:----------
DEBUG:dffml.PyTorchNeuralNetworkContext:Training Loss: 0.0139 Acc: 0.9955
DEBUG:dffml.PyTorchNeuralNetworkContext:Validation Loss: 0.0060 Acc: 1.0000
DEBUG:dffml.PyTorchNeuralNetworkContext:
INFO:dffml.PyTorchNeuralNetworkContext:Early stopping: Validation Loss didn't improve for 2 consecutive epochs OR maximum accuracy attained.
INFO:dffml.PyTorchNeuralNetworkContext:Training complete in 1m 24s
INFO:dffml.PyTorchNeuralNetworkContext:Best Validation Accuracy: 1.000000
Testing Accuracy: 0.9220430107526881
Image filename Prediction Confidence
rps-predict/scissors7.png scissors 0.9990881681442261
rps-predict/rock8.png rock 1.0
rps-predict/paper4.png paper 0.7040390372276306
rps-predict/paper9.png paper 0.5414079427719116
rps-predict/paper7.png paper 0.9999395608901978
rps-predict/paper-hires1.png paper 0.6969484090805054
rps-predict/scissors-hires2.png scissors 0.9760923385620117
rps-predict/scissors4.png scissors 0.9901471734046936
rps-predict/scissors8.png scissors 0.9570472240447998
rps-predict/rock-hires1.png rock 1.0
rps-predict/paper-hires2.png paper 0.9593355059623718
rps-predict/paper8.png paper 0.9998843669891357
rps-predict/scissors2.png scissors 0.9512919783592224
rps-predict/rock4.png rock 0.9999837875366211
rps-predict/paper1.png paper 0.9846717715263367
rps-predict/scissors5.png scissors 0.9992364645004272
rps-predict/scissors-hires1.png scissors 0.9997430443763733
rps-predict/paper2.png paper 0.9999908208847046
rps-predict/rock1.png rock 0.9999294281005859
rps-predict/scissors3.png paper 0.6644117832183838
rps-predict/scissors1.png scissors 0.9635807275772095
rps-predict/scissors6.png scissors 0.9999761581420898
rps-predict/rock9.png rock 0.9018403887748718
rps-predict/scissors9.png scissors 0.9989670515060425
rps-predict/paper6.png paper 0.9996334314346313
rps-predict/paper3.png paper 0.8676062226295471
rps-predict/paper5.png paper 0.9100719094276428
rps-predict/rock7.png rock 0.9990749359130859
rps-predict/rock5.png rock 0.8894972205162048
rps-predict/rock3.png rock 0.9995730519294739
rps-predict/rock2.png rock 0.991766631603241
rps-predict/rock-hires2.png rock 0.9999849796295166
rps-predict/rock6.png rock 0.8994896411895752
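The model predicts most of the hand poses correctly, usually with high confidence. In the run above, scissors3.png was misclassified as paper, and a few correct predictions (for example paper9.png at about 0.54) had low confidence.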
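Since the prediction images carry no labels, a quick sanity check is to compare each prediction with the pose named in the file name. The sketch below does this for a handful of rows copied from the output above; it assumes the leading letters of each file name give the intended pose, and the helper names are ours.

```python
import re

# A few (filename, prediction) rows copied from the output above.
ROWS = [
    ("scissors7.png", "scissors"),
    ("rock8.png", "rock"),
    ("paper-hires1.png", "paper"),
    ("scissors3.png", "paper"),
    ("rock-hires2.png", "rock"),
]


def expected_label(filename):
    """Take the leading lowercase letters of the file name as the pose."""
    match = re.match(r"[a-z]+", filename)
    return match.group(0) if match else None


def mismatches(rows):
    """Return the rows whose prediction differs from the file name's pose."""
    return [(name, pred) for name, pred in rows if expected_label(name) != pred]


print(mismatches(ROWS))  # [('scissors3.png', 'paper')]
```

The same loop could be run over the values yielded by predict in the script above to flag every disagreement in a run.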