Optimized hand pose estimation CrossInfoNet-based architecture for embedded devices

Publisher

Springer Nature

Abstract

We present CrossInfoMobileNet, a convolutional neural network for hand pose estimation based on CrossInfoNet and tuned specifically for mobile phone processors through the optimization, modification, and replacement of computationally critical CrossInfoNet components. By introducing the state-of-the-art MobileNetV3 network as a feature extractor and refiner and replacing the ReLU activation with the better-performing H-Swish activation function, we obtained a network that requires 2.37 times fewer multiply-add operations and 2.22 times fewer parameters than CrossInfoNet while maintaining the same error on state-of-the-art datasets. This reduction in multiply-add operations resulted in 1.56 times faster real-world performance, on average, on both desktop and mobile devices, making the network more suitable for embedded applications. The full source code of CrossInfoMobileNet, including the sample dataset and its evaluation, is available online through Code Ocean.
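
For readers unfamiliar with the H-Swish activation mentioned in the abstract, the following is a minimal sketch of its commonly cited definition from the MobileNetV3 literature, h-swish(x) = x * ReLU6(x + 3) / 6. It is a scalar, pure-Python illustration only and is not taken from the CrossInfoMobileNet source code; the function names `relu6` and `h_swish` are chosen here for illustration.

```python
def relu6(x: float) -> float:
    """ReLU clipped to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def h_swish(x: float) -> float:
    """Hard-swish: a piecewise approximation of swish, x * sigmoid(x).

    It uses only a clip, a multiply, and a divide, which avoids computing
    an exponential on mobile processors.
    """
    return x * relu6(x + 3.0) / 6.0


if __name__ == "__main__":
    # Illustrative inputs only; these are not values from the paper.
    for value in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(value, h_swish(value))
```

For inputs at or below -3 the function returns zero, and for inputs at or above 3 it returns the input unchanged, so it behaves like ReLU away from the origin while staying smooth-like near it.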

Subject(s)

convolutional neural network, feature extractor, hand pose estimation

Citation

Machine Vision and Applications. 2022, vol. 33, issue 5, art. no. 78.