Text Processing Using Deep Neural Networks
Publisher
Vysoká škola báňská – Technická univerzita Ostrava
Abstract
This paper explores the use of deep neural networks, particularly transformer models, in the field of natural language processing (NLP). It provides a general overview of how neural networks work, describing their basic principles, applications, and the motivations for their use in NLP. The transformer model is described in more detail in order to explain its innovative architecture, including the attention mechanism, and its advantages.
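To make the attention mechanism concrete, the sketch below implements scaled dot-product attention, the core operation of the transformer, in plain NumPy. It is a minimal illustration rather than the thesis's implementation; the toy input and its shapes are assumptions.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K: (seq_len, d_k); V: (seq_len, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                              # weighted sum of the values

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))                 # 3 toy tokens, dimension 4
out = scaled_dot_product_attention(x, x, x) # self-attention: Q = K = V
print(out.shape)                            # (3, 4)

In a full transformer, Q, K, and V are separate learned projections of the token embeddings, and several such attention heads run in parallel.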
The paper aims to contribute to knowledge about the operation of transformers, to evaluate their impact on advances in NLP, and to serve as a reference for the field of natural language processing using deep neural networks. The output of the work includes freely available reference solutions to the problems, as well as a dataset prepared for the masked language modeling and next-sentence prediction tasks, similar to the dataset used to pre-train current state-of-the-art models such as BERT.
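As an illustration of how such a pre-training dataset can be prepared, the following sketch applies BERT-style masking for masked language modeling: 15% of tokens are selected, and of these, 80% are replaced by [MASK], 10% by a random token, and 10% left unchanged. The toy vocabulary and whitespace tokenization are assumptions made for brevity.

import random

MASK = "[MASK]"
VOCAB = ["cat", "dog", "sat", "ran", "the", "a"]  # toy vocabulary (assumption)

def mask_tokens(tokens, mask_prob=0.15):
    # BERT-style masking: labels keep the original token wherever the
    # model must predict it; unselected positions are ignored by the loss.
    inputs, labels = [], []
    for tok in tokens:
        if random.random() < mask_prob:
            labels.append(tok)                   # model must recover this token
            r = random.random()
            if r < 0.8:
                inputs.append(MASK)              # 80%: replace with [MASK]
            elif r < 0.9:
                inputs.append(random.choice(VOCAB))  # 10%: random token
            else:
                inputs.append(tok)               # 10%: keep unchanged
        else:
            inputs.append(tok)
            labels.append(None)                  # ignored by the loss
    return inputs, labels

print(mask_tokens("the cat sat on the mat".split()))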
Furthermore, an implementation of a sample transformer model, including its pre-training, is provided, and its performance on the extractive question answering task is investigated.
In particular, the individual tasks that demonstrate the performance of language models and the importance of correct data preprocessing are solved by fine-tuning the DistilBERT model.
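As a minimal sketch of the extractive question answering task, the following uses the Hugging Face transformers library with a publicly available DistilBERT checkpoint already fine-tuned on SQuAD; the thesis's own fine-tuning and evaluation pipeline is not reproduced here, and the example question and context are assumptions.

from transformers import pipeline

# DistilBERT checkpoint fine-tuned on SQuAD for extractive QA
qa = pipeline("question-answering",
              model="distilbert-base-uncased-distilled-squad")

result = qa(question="What architecture does BERT use?",
            context="BERT is a language model based on the transformer architecture.")
print(result["answer"], result["score"])

The model predicts start and end positions of the answer span within the context, which the pipeline decodes back into the answer text.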
Subject(s)
machine learning, artificial neural networks, deep learning, natural language processing, text classification, named entity recognition, extractive question answering, transformer model, attention mechanism, large language models, embedding vectors, DistilBERT, model training, fine-tuning, data preprocessing, hyperparameter tuning