Human Activity Recognition

Abstract

Convolutional neural networks have proven to be a powerful tool for image understanding, and they are therefore also used for video understanding. Compared to a single image, a video, which is represented as a sequence of frames, carries additional temporal information. The main topic of this work is human activity recognition in video. I describe the individual steps of this problem and implement several 3D convolutional neural network models inspired by well-known image recognition architectures. The models are trained on the KTH dataset, with the OpenPose system used to detect the human body in the video frames. Finally, I compare the results of all implemented models.
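
To illustrate why a 3D convolution can capture information spread across frames, the following is a minimal pure-Python sketch of a naive 3D convolution over a single-channel clip. It is not the implementation used in this work: the function name `conv3d_valid`, the toy clip, and the kernel are all hypothetical and chosen only for illustration. Real models would use a deep-learning framework with multi-channel, learned kernels.

```python
# Illustrative sketch only: a naive "valid" 3D convolution (computed as
# cross-correlation, as most deep-learning libraries do) over a
# single-channel video clip stored as nested lists [frame][row][column].
# All names and sizes are hypothetical, not taken from the thesis.

def conv3d_valid(clip, kernel):
    """Slide a kt x kh x kw kernel over a T x H x W clip (stride 1, no padding)."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over the temporal axis as well as the two spatial axes.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Toy clip: 4 frames of 3x3 pixels; every pixel equals its frame index.
clip = [[[float(t)] * 3 for _ in range(3)] for t in range(4)]

# Temporal-difference kernel spanning 2 frames and 1x1 pixels. It responds
# to change between consecutive frames, which a 2D kernel applied to one
# frame at a time cannot observe.
kernel = [[[-1.0]], [[1.0]]]

result = conv3d_valid(clip, kernel)
print(len(result), len(result[0]), len(result[0][0]))  # output shape: 3 3 3
print(result[0][0][0])  # 1.0, the difference between frames 1 and 0
```

Because the kernel extends over two frames, the output has one fewer frame than the input (4 - 2 + 1 = 3), and every output value equals the brightness change between neighbouring frames.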

Subject(s)

convolutional neural network, image understanding, video understanding, human activity recognition, 3D convolutional neural network, KTH dataset, OpenPose

Citation