The Role of Optical Character Recognition Using Deep Learning in the Industry 4.0 Concept

Abstract

This thesis deals with the practical implementation of optical character recognition (OCR) methods based on neural networks. Two architectures for localizing text in image data are selected and combined with models that recognize the text in the localized regions. The thesis opens with a theoretical analysis of the tools and metrics used in the implementation. The practical part describes the data preprocessing and augmentation methods and the implementation of the selected neural network architectures. The DBNet and U-Net architectures are chosen for the text localization task. For the text recognition task, two multi-class classification architectures are chosen: one that uses conventional methods to extract features from the localized text, and one that uses LSTM layers. A dataset of SMD (surface-mount device) components is used. Finally, the thesis compares the trained models of the selected architectures using the selected metrics for text localization and for recognition of text in the localized image regions.

Description

Subject(s)

optical character recognition, OCR, Industry 4.0, deep learning, text localization, TensorFlow

Citation