Text Processing Using Deep Neural Networks

Abstract

The thesis describes the individual stages of a natural language processing pipeline, from the preparation and pre-processing of text data to the design of models that solve a classification problem over a language corpus. The theoretical part describes in detail models ranging from classical machine learning approaches to the widely used Transformer architecture; models based on this architecture, their structure, and their performance are the main focus of the thesis. The practical part carries out experiments with the different approaches and compares their results. Three approaches are evaluated: text vectorization followed by classical models; neural network architectures up to the Transformer architecture; and a derivative of the BERT model combined with a deep feedforward network. All of these models were evaluated on classification accuracy for the authorship attribution problem, in which, given an unknown text, the model predicts a likely author with an associated confidence.
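The thesis text itself is not part of this record; purely as an illustration of the third approach (a BERT derivative paired with a deep feedforward network), the following minimal Python sketch combines DistilBERT with a small feedforward classification head. The encoder checkpoint name, layer sizes, and number of authors are illustrative assumptions, not values taken from the thesis.

    # Illustrative sketch only: a DistilBERT encoder with a feedforward
    # classification head for authorship attribution. All hyperparameters
    # here are assumptions, not the thesis's actual configuration.
    import torch
    from transformers import AutoModel, AutoTokenizer

    class AuthorClassifier(torch.nn.Module):
        def __init__(self, num_authors, encoder_name="distilbert-base-uncased"):
            super().__init__()
            self.encoder = AutoModel.from_pretrained(encoder_name)
            hidden = self.encoder.config.hidden_size  # 768 for DistilBERT
            # Deep feedforward head on top of the first-token representation
            self.head = torch.nn.Sequential(
                torch.nn.Linear(hidden, 256),
                torch.nn.ReLU(),
                torch.nn.Dropout(0.1),
                torch.nn.Linear(256, num_authors),
            )

        def forward(self, input_ids, attention_mask):
            out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
            cls = out.last_hidden_state[:, 0]  # [CLS]-position embedding
            return self.head(cls)

    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    model = AuthorClassifier(num_authors=10)  # 10 candidate authors, assumed
    batch = tokenizer(["An unknown passage of text."], return_tensors="pt",
                      truncation=True, padding=True)
    logits = model(batch["input_ids"], batch["attention_mask"])
    probs = torch.softmax(logits, dim=-1)  # per-author confidence scores

Applying a softmax over the head's outputs yields the per-author confidence with which the model names a likely author, matching the evaluation setting described in the abstract.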

Description

Subject(s)

natural language processing, artificial neural networks, authorship identification, Transformer architecture, BERT model, ELECTRA, DistilBERT

Citation