Zpracování textu pomocí hlubokých neuronových sítí

Prokop, Vojtěch

dc.contributor.advisor	Platoš, Jan
dc.contributor.author	Prokop, Vojtěch
dc.date.accessioned	2022-09-01T07:20:32Z
dc.date.available	2022-09-01T07:20:32Z
dc.date.issued	2022
dc.identifier.other	OSD002
dc.identifier.uri	http://hdl.handle.net/10084/147324
dc.description.abstract	Diplomová práce se zabývá popisem jednotlivých bloků procesu zpracování přirozeného jazyka od přípravy textových dat, předzpracování, až po návrh modelů, které řeší klasifikační problém nad jazykovým korpusem. V teoretické části jsou detailněji popsány modely od klasických přístupů strojového učení, až po hojně využívanou architekturu Transformer. Právě modely, které jsou založeny na této architektuře, jejich struktura a výkonnost jsou hlavní doménou této diplomové práce. V praktické části jsou provedeny experimenty nad různými přístupy a následně srovnány jejich výsledky. Jsou použity 3 přístupy, vektorizace textu a následné využití klasických modelů, využití architektur neuronových sítí až po architekturu Transformer a v poslední řadě využití derivátu BERT modelu ve spojení s hlubokou dopřednou sítí. Nad všemi těmito modely byla zkoumána kvalita přesnosti u problému autorství, kde k neznámému textu model odhadoval s určitou přesností možného autora.	cs
dc.description.abstract	The thesis describes the individual stages of the natural language processing pipeline, from text data preparation and preprocessing to the design of models that solve a classification problem over a language corpus. The theoretical part describes the models in detail, ranging from classical machine learning approaches to the widely used Transformer architecture. Models based on this architecture, their structure, and their performance are the main focus of the thesis. The practical part runs experiments with the different approaches and compares their results. Three approaches are used: text vectorization followed by classical models; neural network architectures up to and including the Transformer; and a derivative of the BERT model combined with a deep feedforward network. All of these models were evaluated for accuracy on the authorship identification problem, in which the model estimates, with some confidence, the likely author of an unknown text.	en
dc.format.extent	4499552 bytes
dc.format.mimetype	application/pdf
dc.language.iso	cs
dc.publisher	Vysoká škola báňská – Technická univerzita Ostrava	cs
dc.subject	zpracování přirozeného jazyka	cs
dc.subject	neuronové sítě	cs
dc.subject	určení autorství	cs
dc.subject	Transformer architektura	cs
dc.subject	BERT model	cs
dc.subject	ELECTRA	cs
dc.subject	DistilBERT	cs
dc.subject	natural language processing	en
dc.subject	artificial neural networks	en
dc.subject	authorship identification	en
dc.subject	Transformer architecture	en
dc.subject	BERT model	en
dc.subject	ELECTRA	en
dc.subject	DistilBERT	en
dc.title	Zpracování textu pomocí hlubokých neuronových sítí	cs
dc.title.alternative	Text Processing using Neural Networks	en
dc.type	Diplomová práce	cs
dc.contributor.referee	Dvorský, Jiří
dc.date.accepted	2022-05-31
dc.thesis.degree-name	Ing.
dc.thesis.degree-level	Magisterský studijní program	cs
dc.thesis.degree-grantor	Vysoká škola báňská – Technická univerzita Ostrava. Fakulta elektrotechniky a informatiky	cs
dc.description.department	460 - Katedra informatiky	cs
dc.thesis.degree-program	Informační a komunikační technologie	cs
dc.thesis.degree-branch	Informatika a výpočetní technika	cs
dc.description.result	výborně	cs
dc.identifier.sender	S2724
dc.identifier.thesis	PRO0255_FEI_N2647_2612T025_2022
dc.rights.access	openAccess

Files in this record

Name: PRO0255_FEI_N2647_2612T025_2022.pdf
Size: 4.291Mb
Format: PDF
Description: Thesis text

Name: PRO0255_FEI_N2647_2612T025_202 ...
Size: 48.04Kb
Format: PDF
Description: Thesis assignment

Name: PRO0255_FEI_N2647_2612T025_202 ...
Size: 488.1Mb
Format: Unknown
Description: Attachment

Name: PRO0255_FEI_N2647_2612T025_202 ...
Size: 54.18Kb
Format: PDF
Description: Supervisor's review – Platoš, Jan

Name: PRO0255_FEI_N2647_2612T025_202 ...
Size: 54.34Kb
Format: PDF
Description: Referee's review – Dvorský, Jiří

This record appears in the following collections

Vysokoškolské kvalifikační práce Fakulty elektrotechniky a informatiky / Theses and dissertations of Faculty of Electrical Engineering and Computer Science (FEI) [13253]
The collection contains theses and dissertations of the Faculty of Electrical Engineering and Computer Science.