Accelerating ML inference on X-Ray detection at edge using Raspberry Pi with PyArmNN

Sandeep Singh
December 9, 2020

The COVID-19 pandemic continues to have a devastating effect on the health and well-being of the global population. While researchers around the world work on a solution, a critical step in the fight against COVID-19 is effective screening of infected patients as early as possible. One effective and easy way to do this is to use AI to classify chest X-ray images as belonging to an infected or a healthy person. The model proposed here performs binary classification (COVID-19 vs. healthy patients), and in the future it could be extended to multi-class classification (COVID-19 vs. no findings vs. other diseases such as pneumonia). Please note that this blog does not claim to be a solution for COVID-19 or a medical solution for COVID-19 detection. It is only a demonstration of how AI could be used to solve such problems in the future, and of how Arm powered embedded devices can be used to implement such AI solutions.

This idea has grown into a global initiative, and teams across the globe have created an open-source database called COVIDx. It is an open-access benchmark dataset comprising 13,975 chest X-ray (CXR) images across 13,870 patient cases, with the largest number of publicly available COVID-19 positive cases.

This blog shows how to develop a simple X-ray classification model using the pre-trained VGG-16 model and then deploy it on Arm powered devices such as Hikey-960 or Raspberry Pi 4. With some tweaks, it can also be deployed on Arm AI NPUs (Neural Processing Units) such as Hactar (Ethos-U55). In the future, Arm powered medical devices that run AI at the edge will be vital in detecting similar respiratory infectious diseases and will help us achieve a robust healthcare system.

What is Arm NN and PyArmNN? 

Arm NN is an inference engine for CPUs, GPUs, and NPUs. It executes ML models on-device to make predictions based on input data. Arm NN enables efficient translation of existing neural network frameworks, such as TensorFlow Lite, TensorFlow, ONNX, and Caffe. It allows them to run efficiently and without modification across Arm Cortex-A CPUs, Arm Mali GPUs, and Arm Ethos NPUs.

PyArmNN is a newly developed Python extension for the Arm NN SDK (Software Development Kit). PyArmNN is available in the Arm NN repository under the armnn/python/pyarmnn folder. Instructions on how to install PyArmNN are available on its README page.

What do we need?

  • A Raspberry Pi. I am testing with Raspberry Pi 4 with Raspbian 10 OS. The Pi device is powered by an Arm Cortex-A72 processor, which can harness the power of Arm NN SDK for accelerated ML performance.
  • Before you proceed with the project setup, you need to check out and build Arm NN for your Raspberry Pi. Instructions are here.
  • PyArmNN package

Training and validation dataset and setup:

Using the COVIDx database, I trained a custom model based on VGG-16 to classify X-ray images, reaching over 95% accuracy in distinguishing patients with COVID-19 symptoms from normal patients.

I then deployed the COVID model on a Raspberry Pi device, which is based on the Arm Cortex-A72 processor (a Cortex-A CPU).

The model used for this X-ray classification is VGG16:

VGG16: A diagram about how this model works.

VGG Neural Networks. While previous derivatives of AlexNet focused on smaller window sizes and strides in the first convolutional layer, VGG addresses another very important aspect of CNNs: depth. Let’s go over the architecture of VGG:

  • VGG takes in a 224x224 pixel RGB image. For the ImageNet competition, the authors cropped out the center 224x224 patch in each image to keep the input image size consistent.
  • Convolutional layers - The convolutional layers in VGG use a very small receptive field (3x3, the smallest possible size that still captures left and right and up and down). There are also 1x1 convolution filters which act as a linear transformation of the input, which is followed by a ReLU unit. The convolution stride is fixed to 1 pixel so that the spatial resolution is preserved after convolution.
  • Fully connected layers - VGG has three fully connected layers: the first two have 4096 channels each and the third has 1000 channels, 1 for each class.
  • Hidden layers - All of VGG’s hidden layers use ReLU (a huge innovation from AlexNet that cut training time). VGG does not generally use Local Response Normalization (LRN), as LRN increases memory consumption and training time with no particular increase in accuracy.
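
To make the architecture above concrete, the following sketch traces the feature-map size through the convolutional part of VGG16. It assumes the standard VGG16 configuration (thirteen 3x3 convolutions with stride 1 and 1-pixel padding, interleaved with five 2x2 max-pooling layers of stride 2). The names VGG16_CONFIG, conv_output_size, and trace_vgg16 are illustrative only and do not come from any library.

```python
# Standard VGG16 convolutional configuration: numbers are the output channels
# of a 3x3 convolution, "M" is a 2x2 max-pooling layer with stride 2.
VGG16_CONFIG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                512, 512, 512, "M", 512, 512, 512, "M"]


def conv_output_size(size, kernel=3, stride=1, padding=1):
    """Spatial size after a convolution (same formula for height and width)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_vgg16(input_size=224, input_channels=3):
    """Return (height, width, channels) after the last pooling layer."""
    size, channels = input_size, input_channels
    for layer in VGG16_CONFIG:
        if layer == "M":
            size //= 2                     # 2x2 pooling with stride 2 halves the size
        else:
            size = conv_output_size(size)  # 3x3, stride 1, padding 1: size unchanged
            channels = layer
    return size, size, channels


print(trace_vgg16())  # (7, 7, 512)
```

When VGG16 is loaded without its fully connected layers, this 7x7x512 tensor is what a custom classification head receives.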

Model doing X-Ray classification:

Stage 1: Importing libraries

__author__      = "Sandeep Singh"
__copyright__   = "GNU GPLv3"

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import AveragePooling2D
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical
from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from imutils import paths
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import argparse
import cv2
import os

Stage 2: We set the dataset, plot, and model paths (a dictionary stands in for parsed command-line arguments) and initialize the hyperparameters

args = {
    "dataset": 'dataset',
    "plot": 'plot',
    "model": 'model'
}

INIT_LR = 1e-3 #learning rate
EPOCHS = 25 #epochs
BS = 8 #batch size
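
As a rough illustration of how these hyperparameters interact, the sketch below computes the number of weight updates and the resulting learning rate when Adam is given decay=INIT_LR / EPOCHS, as it is in Stage 4. It assumes the time-based decay used by the legacy Keras optimizers, which is applied once per weight update rather than once per epoch. NUM_TRAIN_IMAGES and decayed_lr are hypothetical names chosen for this example, and the training-set size is made up.

```python
INIT_LR = 1e-3   # initial learning rate, as in Stage 2
EPOCHS = 25      # number of epochs, as in Stage 2
BS = 8           # batch size, as in Stage 2

NUM_TRAIN_IMAGES = 800  # hypothetical training-set size, for illustration only


def decayed_lr(initial_lr, decay, iterations):
    """Time-based decay: lr / (1 + decay * number_of_weight_updates)."""
    return initial_lr / (1.0 + decay * iterations)


decay = INIT_LR / EPOCHS                  # decay applied per weight update
steps_per_epoch = NUM_TRAIN_IMAGES // BS  # weight updates in one epoch
total_updates = steps_per_epoch * EPOCHS  # weight updates over the whole run

print(steps_per_epoch, total_updates)
print(decayed_lr(INIT_LR, decay, 0))              # learning rate at the start
print(decayed_lr(INIT_LR, decay, total_updates))  # learning rate at the end
```

Under these assumptions the learning rate shrinks by only about 9% over the whole run, so the schedule is a gentle one.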

Stage 3: We loop over all the images, swap the color channels, and resize each image to a fixed 224x224 pixels. We then convert the data and labels to NumPy arrays, scale the pixel values, and perform one-hot encoding on the labels.

# collect the image paths from the dataset directory and
# initialize the lists of images and class labels
imagePaths = list(paths.list_images(args["dataset"]))
data = []
labels = []

# loop over all the images
for imagePath in imagePaths:
    # extract the class label from the parent folder name
    label = imagePath.split(os.path.sep)[-2]

    # load the image, swap color channels, and resize it to be a fixed
    # 224x224 pixels while ignoring aspect ratio
    image = cv2.imread(imagePath)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = cv2.resize(image, (224, 224))

    # update the data and label lists
    data.append(image)
    labels.append(label)

# convert the data and labels to NumPy arrays and scale the pixel
# values from [0, 255] to [0, 1]
data = np.array(data) / 255.0
labels = np.array(labels)

# perform one-hot encoding on the labels
lb = LabelBinarizer()
labels = lb.fit_transform(labels)
labels = to_categorical(labels)
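
The sketch below is a library-free illustration of what the label encoding produces for two classes: LabelBinarizer sorts the class names and maps them to 0 and 1, and to_categorical then expands each value into a two-element one-hot row. The function one_hot_labels and the folder names "covid" and "normal" are assumptions made for this example; the real class names come from the dataset directory.

```python
def one_hot_labels(labels):
    """Map string labels to one-hot rows; classes are sorted alphabetically."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    rows = [[1.0 if index[label] == i else 0.0 for i in range(len(classes))]
            for label in labels]
    return classes, rows


# Hypothetical folder names; the real ones come from the dataset directory.
classes, rows = one_hot_labels(["covid", "normal", "normal", "covid"])
print(classes)  # ['covid', 'normal']
print(rows)     # [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
```

With this ordering, index 0 corresponds to the COVID-19 class and index 1 to the normal class, which is how the predictions are read later in this blog.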

Stage 4: In this stage, we split the data into training and test sets and apply data augmentation to the training set so that we get more samples. After that, we load the VGG16 network, remove its fully connected layers, and add our own head to customize the model. We train the model on the training set and then evaluate it on the held-out test set. You can use your own dataset here to train, test, and validate the model. In this example, the model is trained for the number of epochs set by EPOCHS in Stage 2.

# 80:20 split between training and test data
(trainX, testX, trainY, testY) = train_test_split(data, labels,
    test_size=0.20, stratify=labels, random_state=42)

# data augmentation so that we can get more samples
trainAug = ImageDataGenerator(
    rotation_range=15,
    fill_mode="nearest")

# load the VGG16 network without its fully connected (FC) head
baseModel = VGG16(weights="imagenet", include_top=False,
    input_tensor=Input(shape=(224, 224, 3)))

# construct our own head, which is placed on top of VGG16
headModel = baseModel.output
headModel = AveragePooling2D(pool_size=(4, 4))(headModel)
headModel = Flatten(name="flatten")(headModel)
headModel = Dense(64, activation="relu")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(2, activation="softmax")(headModel)

# place the head FC model on top of the base model
model = Model(inputs=baseModel.input, outputs=headModel)

# loop over all layers in the base model and freeze 
for layer in baseModel.layers:
    layer.trainable = False

# compile our model
print("[INFO] compiling model...")
opt = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)
model.compile(loss="binary_crossentropy", optimizer=opt,
    metrics=["accuracy"])

# train the head of the network
print("[INFO] training head...")
H = model.fit_generator(
    trainAug.flow(trainX, trainY, batch_size=BS),
    steps_per_epoch=len(trainX) // BS,
    validation_data=(testX, testY),
    validation_steps=len(testX) // BS,
    epochs=EPOCHS)

# make predictions on the testing set
print("[INFO] evaluating network...")
predIdxs = model.predict(testX, batch_size=BS)

# for each image in the testing set we need to find the index of the
# label with corresponding largest predicted probability
predIdxs = np.argmax(predIdxs, axis=1)

# show a nicely formatted classification report
print(classification_report(testY.argmax(axis=1), predIdxs,
    target_names=lb.classes_))

# compute the confusion matrix and use it to derive the raw
# accuracy, sensitivity, and specificity
cm = confusion_matrix(testY.argmax(axis=1), predIdxs)
total = sum(sum(cm))
acc = (cm[0, 0] + cm[1, 1]) / total
sensitivity = cm[0, 0] / (cm[0, 0] + cm[0, 1])
specificity = cm[1, 1] / (cm[1, 0] + cm[1, 1])

# show the confusion matrix, accuracy, sensitivity, and specificity
print(cm)
print("acc: {:.4f}".format(acc))
print("sensitivity: {:.4f}".format(sensitivity))
print("specificity: {:.4f}".format(specificity))
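
As a worked example of the three metrics, the following plain-Python sketch applies the same formulas to a small confusion matrix. The counts are made up for illustration and are not results from the model, and binary_metrics is a hypothetical helper name. It assumes the layout that scikit-learn's confusion_matrix returns, with rows as true classes and columns as predicted classes, and class 0 (here the COVID-19 class) as the positive class.

```python
def binary_metrics(cm):
    """Accuracy, sensitivity, and specificity from a 2x2 confusion matrix.

    Rows are true classes and columns are predicted classes, with class 0
    treated as the positive class.
    """
    (tp, fn), (fp, tn) = cm
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # share of actual positives that were found
    specificity = tn / (fp + tn)   # share of actual negatives that were found
    return accuracy, sensitivity, specificity


# Made-up counts for illustration; these are not results from the model.
cm = [[18, 2],
      [1, 19]]
print(binary_metrics(cm))  # (0.925, 0.9, 0.95)
```

For a screening task, sensitivity is usually the figure to watch, because a missed positive case costs more than a false alarm.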

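Stage 5: Convert the saved Keras model "my_model.h5" (written to disk at the end of Stage 6) to a TensorFlow Lite model.

# TensorFlow 1.x API; tf.contrib was removed in TensorFlow 2.x
from tensorflow.contrib import lite
converter = lite.TFLiteConverter.from_keras_model_file('my_model.h5')
tfmodel = converter.convert()
# write the converted model under the name that the PyArmNN script loads
with open("my_model.tflite", "wb") as f:
    f.write(tfmodel)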

Stage 6: Plot the training loss and accuracy, then save the model.

N = EPOCHS
plt.style.use("ggplot")
plt.figure()
plt.plot(np.arange(0, N), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, N), H.history["val_loss"], label="val_loss")
plt.plot(np.arange(0, N), H.history["acc"], label="train_acc")
plt.plot(np.arange(0, N), H.history["val_acc"], label="val_acc")
plt.title("Training Loss and Accuracy on COVID-19 Dataset")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend(loc="lower left")
plt.savefig(args["plot"])

# serialize the model to disk
print("[INFO] saving COVID-19 detector model...")
model.save('my_model.h5') 

The model is now ready, with an accuracy of over 95% on the open-source COVIDx database.

The next stage is to load this model, "my_model.h5", and perform inference on new X-ray images of patients.

__author__      = "Sandeep Singh"
#package needed to load a model 
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.models import load_model
from imutils import build_montages
from imutils import paths
import numpy as np
import argparse
import random
import cv2
import tensorflow as tf

# The trained model is stored as my_model.h5
# Keep your test images in the test_image folder
args = {
    "image": 'test_image',
    "model": 'my_model.h5'
}

print("[INFO] loading my pre-trained model")
# The model was trained with the stock Adam optimizer,
# so no custom objects are needed to load it
model = load_model(args["model"])

# pick up to 16 test images in random order
imagePaths = list(paths.list_images(args["image"]))
random.shuffle(imagePaths)
imagePaths = imagePaths[:16]
# initialize our list of results
results = []

# loop over the test images kept in the test_image directory
for p in imagePaths:
    orig = cv2.imread(p)
    # pre-process the image: convert it from BGR to RGB, resize it to
    # 224x224 pixels (the input size my_model.h5 was trained with),
    # and scale the pixel values to [0, 1]
    image = cv2.cvtColor(orig, cv2.COLOR_BGR2RGB)
    image = cv2.resize(image, (224, 224))
    image = image.astype("float") / 255.0

    # add a batch dimension so the shape is (1, 224, 224, 3)
    image = img_to_array(image)
    image = np.expand_dims(image, axis=0)

    # make predictions on the input image
    pred = model.predict(image)
    print(pred)
    pred = pred.argmax(axis=1)[0]
    print(pred)

    # an index of zero is the 'infected' label while an index of
    # one is the 'uninfected' label
    label = "Covid" if pred == 0 else "Normal"
    if pred == 0:
        print("Patient has COVID-19 Symptoms, Refer medical professional")
    else:
        print("Patient doesn't have any Symptoms, But if have oral symptoms refer medical professional")
    # @todo: we can plot this image as well

Our next goal is to deploy this model on an Arm powered device, such as the Raspberry Pi, using Arm NN and PyArmNN.

This section shows how to develop a Python application that classifies images using a TensorFlow Lite model and Arm NN.
The main steps are:

  • Import pyarmnn module
  • Load an input image    
  • Create a parser and load the network
  • Choose backends, create runtime and optimize the model
  • Perform inference
  • Interpret and report the output

Create a file named "detect_patients_pyarmnn.py".

Stage 1: Import the pyarmnn module.

import argparse
import sys
from os import path

import cv2
import numpy as np
from PIL import Image

import pyarmnn as ann

print("Working with ARMNN {}".format(ann.ARMNN_VERSION))

Stage 2: Read the image path from the command line

# Parse the path of the input image
arg_parser = argparse.ArgumentParser(
       formatter_class=argparse.ArgumentDefaultsHelpFormatter)
arg_parser.add_argument(
       '--image', help='File path of image file', required=True)
args = arg_parser.parse_args()

Stage 3: Convert the image

OpenCV loads the image in BGR order, so we convert it to RGB, resize it to 224x224 pixels, and scale the pixel values to [0, 1], as in training.

image = cv2.imread(args.image)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image = cv2.resize(image, (224, 224))
image = np.array(image, dtype=np.float32) / 255.0
print(image.shape)
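
The preprocessing can be checked without a camera image or OpenCV. The NumPy-only sketch below applies the same channel swap and scaling to a synthetic image and also adds the batch axis that the Keras inference script uses. The function name preprocess and the synthetic image are assumptions made for this example; resizing is left out because the test image is already 224x224.

```python
import numpy as np


def preprocess(bgr_image):
    """Convert a 224x224 BGR uint8 image into a float32 RGB batch of one."""
    rgb = bgr_image[..., ::-1]                # reverse the channel axis: BGR -> RGB
    scaled = rgb.astype(np.float32) / 255.0   # uint8 [0, 255] -> float32 [0, 1]
    return np.expand_dims(scaled, axis=0)     # add batch axis: (1, 224, 224, 3)


# Synthetic stand-in for the output of cv2.imread and cv2.resize:
# a pure-blue image stored in BGR order.
fake_bgr = np.zeros((224, 224, 3), dtype=np.uint8)
fake_bgr[..., 0] = 255

batch = preprocess(fake_bgr)
print(batch.shape, batch.dtype)  # (1, 224, 224, 3) float32
print(batch[0, 0, 0])            # blue is now the last channel
```

Keeping the data in float32 matters here, because the converted model expects 32-bit floating-point input.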

Stage 4: Load the tflite parser

# Create a TensorFlow Lite parser and load the converted model

parser = ann.ITfLiteParser()
network = parser.CreateNetworkFromBinaryFile('./my_model.tflite')
graph_id = 0
input_names = parser.GetSubgraphInputTensorNames(graph_id)
input_binding_info = parser.GetNetworkInputBindingInfo(graph_id, input_names[0])
input_tensor_id = input_binding_info[0]
input_tensor_info = input_binding_info[1]

print(f"Input tensor id: {input_tensor_id}, tensor info: {input_tensor_info}")

# Create a runtime object that will perform inference.
options = ann.CreationOptions()
runtime = ann.IRuntime(options)

Stage 5: Choose the backend 

# Backend choices earlier in the list have higher preference.
preferredBackends = [ann.BackendId('CpuAcc'), ann.BackendId('CpuRef')]
opt_network, messages = ann.Optimize(network, preferredBackends, runtime.GetDeviceSpec(), ann.OptimizerOptions())

# Load the optimized network into the runtime.
net_id, _ = runtime.LoadNetwork(opt_network)
print(f"Loaded network, id={net_id}")
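
The idea of a preference-ordered backend list can be pictured with the toy model below. It is plain Python, not Arm NN code, and it rests on the assumption that each layer is given to the first backend in the list that supports it. The function assign_backend and the support sets are hypothetical and exist only for this illustration.

```python
def assign_backend(preferred_backends, supported_backends):
    """Return the first preferred backend found in supported_backends."""
    for backend in preferred_backends:
        if backend in supported_backends:
            return backend
    raise ValueError("no preferred backend supports this layer")


preferred = ["CpuAcc", "CpuRef"]

# Hypothetical support sets for two layers, for illustration only.
print(assign_backend(preferred, {"CpuAcc", "CpuRef"}))  # CpuAcc
print(assign_backend(preferred, {"CpuRef"}))            # CpuRef
```

Listing the reference backend CpuRef last keeps it as a fallback for anything the accelerated CPU backend cannot run.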

Stage 6: Perform the inference

# Create an inputTensor for inference.
input_tensors = ann.make_input_tensors([input_binding_info], [image])

# Get output binding information for an output layer by using the layer name.
output_names = parser.GetSubgraphOutputTensorNames(graph_id)
output_binding_info = parser.GetNetworkOutputBindingInfo(0, output_names[0])
output_tensors = ann.make_output_tensors([output_binding_info])


# Run inference on the network that was loaded in Stage 5
runtime.EnqueueWorkload(net_id, input_tensors, output_tensors)
output, output_tensor_info = ann.from_output_tensor(output_tensors[0][1])
print(f"Output tensor info: {output_tensor_info}")
print(output)

Stage 7: Final Output to check the results.

pred = np.argmax(output)
if pred == 0:
    print("Patient has COVID-19 Symptoms, Refer medical professional")
else:
    print("Patient doesn't have any Symptoms, But if have oral symptoms refer medical professional")

Run the Python script from the command line:

python3 detect_patients_pyarmnn.py --image ./images/patient.jpg


[0.9967675, 0.00323252]
Patient has COVID-19 Symptoms, Refer medical professional

In our example, the probability of class 0 is 0.9967675, versus 0.00323252 for class 1. COVID-19 symptoms are detected in the input image.
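
Reading the output vector can be wrapped in a small helper. The sketch below uses the two scores printed above and picks the class with the higher score. CLASS_NAMES and interpret are names chosen for this example; the class order (index 0 for "Covid", index 1 for "Normal") follows the scripts in this blog.

```python
CLASS_NAMES = ["Covid", "Normal"]  # index 0 and index 1, as in the scripts above


def interpret(scores):
    """Return (label, score) for the highest-scoring class."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return CLASS_NAMES[best], scores[best]


label, score = interpret([0.9967675, 0.00323252])
print(f"{label}: {score:.2%}")  # Covid: 99.68%
```

Reporting the score next to the label makes borderline predictions easier to spot than a bare class index.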

This tutorial shows how to use the Arm NN Python APIs to classify images as “Covid” versus “Normal”. You can also use it as a starting point to handle other types of neural networks.

To learn more about Arm NN, read the following resource: 

  • Arm Software Developer Kit (SDK)