***Content written in this blog by Elaine Lin from ZKTeco***
The development of computer vision recognition has greatly benefited from recent advancements in data storage, computing power, and algorithm execution. Further contributing to its rapid development is the broad range of hardware and software technologies available, including Arm Neon SIMD instructions on both 32-bit and 64-bit (AArch64) architectures. We leverage these technologies to further enhance the performance of our near-infrared and visible-light hybrid biometric matching platform, which consists of facial, palm, vehicle-feature, and iris recognition devices.
Following the knowledge distillation method, we optimized our facial recognition technology for an Arm-based embedded platform. Knowledge distillation consists of first training a large facial feature extraction network with high accuracy, then designing a smaller network that meets the operational speed requirements. The small network learns from the large one while maintaining accuracy. By significantly reducing the number of parameters, we increase performance by up to 20x. We further shrink parameter storage to a quarter of its original size through quantization-aware training, which stores parameters as int8 data. Many embedded inference frameworks, such as TFLite, NCNN, MNN, TNN, and Tengine, support the int8 data type and deliver highly efficient performance when combined with the Arm processor's Neon SIMD instructions.
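To illustrate why int8 storage cuts model size to a quarter, here is a minimal sketch of symmetric per-tensor quantization in NumPy. It is illustrative only: production frameworks such as TFLite and MNN use per-channel scales and calibration data, and quantization-aware training learns the scales during training rather than computing them afterwards.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor quantization: map float32 weights to int8.

    Returns the int8 tensor and the scale needed to dequantize.
    """
    scale = float(np.abs(weights).max()) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

# A float32 weight tensor stored as int8 takes a quarter of the space.
w = np.random.randn(256, 256).astype(np.float32)
q, s = quantize_int8(w)
print(q.nbytes / w.nbytes)                      # 0.25
# Round-off error per weight is bounded by half a quantization step.
print(float(np.abs(dequantize(q, s) - w).max()) <= s / 2 + 1e-7)  # True
```

On Arm cores, the int8 tensors then feed Neon dot-product instructions in the inference framework, which is where the speedup over float32 comes from.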
By combining these methodologies and embedding Arm-based general-purpose application processors in our devices, we have achieved extremely fast, high-precision facial recognition, which can be used in various business applications, including time & attendance, access control, and entrance control systems.
Facial recognition technology provides people fast, secure, and convenient access to physical barriers and electronic devices, including employee time clocks. However, facial recognition is sometimes not acceptable to customers with privacy concerns. In this case, an excellent alternative is near-infrared and visible-light palm recognition technology. Palm recognition provides equally fast, secure, and convenient access while raising no privacy concerns, since people's palm prints are seldom found in the public domain.
Deployed on the Arm embedded platform, we have developed near-infrared palm and visible-light palm recognition systems that can be used either separately or in combination. The combined “bimodal” configuration is highly competitive with existing face recognition systems in terms of both precision and accuracy while not raising privacy concerns.
Figure 1. Palm vein image
We first trained a palm detection model using an improved RetinaNet algorithm to detect the palm position and to capture 9 key points from the image. Then, following key-point detection, we use an affine transformation to align the palm image to a standard size of 224 x 196. This image is then input to the feature extraction network. To increase speed, we use a half-width MobileNetV2 as the feature extraction network, trained on hundreds of thousands of palm vein images to extract features efficiently. Finally, we fabricated various fake palm prostheses and used images of them as negative samples, with real palm images as positive samples. This method effectively trains an anti-spoofing model, which prevents prosthesis attacks once the palm recognition device is deployed in the commercial market.
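The alignment step above can be sketched as a least-squares affine fit: given the 9 detected key points and their canonical positions in the 224 x 196 template, solve for the 2x3 transform that maps one onto the other. The template coordinates below are hypothetical, not ZKTeco's actual key-point layout, and a real pipeline would then warp the image with this matrix (e.g. via OpenCV's `warpAffine`).

```python
import numpy as np

def estimate_affine(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Least-squares 2x3 affine transform mapping src points to dst.

    src, dst: (N, 2) arrays of corresponding key points (here N = 9).
    """
    n = src.shape[0]
    A = np.hstack([src, np.ones((n, 1))])        # homogeneous coordinates
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)  # solves A @ M ~= dst
    return M.T                                   # (2, 3) affine matrix

def apply_affine(M: np.ndarray, pts: np.ndarray) -> np.ndarray:
    return pts @ M[:, :2].T + M[:, 2]

# Hypothetical canonical positions of 9 key points in a 224 x 196 template.
template = np.array([[x, y] for y in (30.0, 98.0, 166.0)
                             for x in (40.0, 112.0, 184.0)])

# Simulated detections: the template rotated, scaled, and shifted.
theta = 0.2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
detected = template @ (1.3 * R).T + np.array([15.0, -8.0])

M = estimate_affine(detected, template)
aligned = apply_affine(M, detected)
print(np.allclose(aligned, template, atol=1e-6))  # True
```

With 9 correspondences the system is over-determined, so the least-squares fit also smooths out small detection noise on individual key points.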
We have achieved great success in performing neural network inferences on the MNN framework with optimized SIMD instructions on AArch64. The entire process can run within 100ms on a typical Arm chip, such as MT6739 (Arm Cortex-A53).
Visible-light palm recognition utilizes the unique, lifelong texture patterns printed on each palm.
Figure 2. Visible-light palm image
Visible-light palm recognition is similar to near-infrared recognition: the algorithm first detects the palm outline, then aligns the image and extracts its features. We insert an anti-spoofing network into this process. Since a visible-light palm image is more easily disturbed by ambient light when the palm posture changes, the network structure has more design variations.
We optimized the hybrid palm recognition system for various business applications, including time & attendance, access control, and entrance control. The combination of the two algorithms dramatically improves the recognition accuracy and overall performance of the whole system.
Transportation services have become more intelligent. With the rapid development of deep-learning technology in recent years, deep neural networks have become the most important practical tool for complex visual tasks, such as vehicle characterization, vehicle detection, and vehicle tracking. However, today's deep neural networks are highly complex and demand substantial computation and storage. These demanding requirements have limited the performance of deep-learning models on embedded devices.
However, we can solve this problem by optimizing the deep-learning model. Optimization effectively reduces the number of model parameters and the computing workload, making the model fit real-world deployed devices. We implemented this deep-learning-based vehicle feature recognition on the Hi3516AV200 (an Arm-architecture SoC) to demonstrate the effects of the model optimization.
Figure 3. Deployment of a vehicle feature recognition model on Hi3516AV200, Arm Cortex-A7
The key vehicle features include logo, model, and body color. The vehicle feature recognition model is trained with the Darknet framework, an open-source C framework with excellent portability. Deep-learning models trained with Darknet port well to Arm devices for the relevant vision tasks.
The following principles guide the optimization of vehicle recognition tasks:
1) Ensure that the recognition accuracy meets the requirements of the task.
2) Compress the network parameters and reduce the model size by adjusting the network structure and the number of convolutional cores.
3) Ensure that the model works in real-time.
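Principle 2 works because a convolution's parameter count scales with the product of its input and output channel counts, so shrinking the widths compounds. The backbone widths below are a hypothetical example, not the actual vehicle-recognition network:

```python
def conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weight count of a k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

# Hypothetical backbone: channel widths of successive conv layers.
widths = [3, 32, 64, 128, 256, 512]
original = sum(conv_params(a, b) for a, b in zip(widths, widths[1:]))

# Shrinking every internal width by a factor r cuts each layer's
# parameters by roughly r**2, since both c_in and c_out shrink.
r = 4
slim = [3] + [w // r for w in widths[1:]]
reduced = sum(conv_params(a, b) for a, b in zip(slim, slim[1:]))

print(original / reduced)  # roughly r**2 = 16x fewer parameters
```

This quadratic effect is how width and structure adjustments alone can deliver the roughly 100x parameter reduction shown in Table 1 while accuracy is preserved by retraining.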
| Model | Size (MB) | Parameters (million) | Speed (fps) | Accuracy (%) |
| --- | --- | --- | --- | --- |
| Initial | 59.7 | 14.925 | 4 | 99.5 |
| Optimized | 0.58 | 0.145 | 17 | 99.4 |

Table 1. Comparison of models before and after optimization.
We fine-tuned the network structure and its parameters, making large reductions in model size and computation to fit the computing power and memory of Arm devices while maintaining recognition accuracy.
The following shows the effect of the vehicle feature recognition model in a real-world application:
Figure 4. Arm-based vehicle feature recognition demo
As seen in Table 1 and Figure 4, our optimized model is easily deployable to Arm devices and achieves high-performance results.
When someone says, “you have beautiful eyes,” they are probably referring to your iris, which sits between the white sclera and the pupil. The iris carries as much as 65% of the eye's texture information, despite occupying only 55% of the eye's surface area. The iris consists of numerous crypts, wrinkles, and pigmented spots, and is unique to each person. Genetic factors determine the iris's formation: the expression of human DNA decides its biological form, color, and actual appearance. After the first eight months of growth, a person's iris normally reaches sufficient size and enters a relatively stable period, remaining unchanged for decades. This uniqueness and stability make the iris a strong foundation for identity verification.
Traditionally, most iris recognition algorithms run on PC platforms due to their sizable computational workload. We have ported our iris recognition algorithm to an embedded Arm platform with a single-core 1.2 GHz CPU, such as the RK3288 (Arm Cortex-A17): feature extraction runs in 200 ms, and matching against 5,000 enrolled irises takes less than 200 ms.
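Matching thousands of irises in under 200 ms is plausible because the classic comparison is extremely cheap: in Daugman-style systems, each iris is encoded as a binary "iris code," and two codes are compared by the fraction of bits that differ (normalized Hamming distance). The sketch below uses conventional illustrative values (a 2048-bit code and a 0.32 decision threshold), not ZKTeco's actual parameters:

```python
import numpy as np

BITS = 2048
rng = np.random.default_rng(0)

# Simulated gallery of 5000 enrolled iris codes.
gallery = rng.integers(0, 2, size=(5000, BITS), dtype=np.uint8)

# A probe: enrolled code #1234 with ~5% of bits flipped to mimic
# capture noise between two imaging sessions.
probe = gallery[1234].copy()
flip = rng.choice(BITS, size=100, replace=False)
probe[flip] ^= 1

# One vectorized pass: normalized Hamming distance to every template.
distances = np.mean(gallery != probe, axis=1)
best = int(np.argmin(distances))
print(best, distances[best] <= 0.32)  # 1234 True
```

Unrelated iris codes disagree on about half their bits, so genuine matches (here ~0.05) separate cleanly from impostors; on Arm, the XOR-and-popcount inner loop maps directly onto Neon instructions.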
We believe that every computer vision technology should be convenient, especially in biometric recognition use cases. Only by improving the safety and security of the application while providing a satisfactory user experience can computer vision technology be promoted more broadly throughout the world. Multimodal, touchless, and anti-spoofing capabilities are the primary trends in the future development of computer vision recognition technology. Based on these trends, we have released multiple hybrid recognition solutions, including but not limited to the following:
[CTAToken URL = "https://www.zkteco.com" target="_blank" text="Visit ZKTeco" class ="green"]
If you have any questions, please do not hesitate to contact Elaine Lin at elaine.lin@zkteco.com.