CFP Search: Searching in Call for Papers

Abstract

This thesis focuses on natural language processing for the classification of text documents. Data collections obtained from WikiCFP and DBWorld were selected for classifier training. The opening part of the thesis covers the acquisition of these data, their basic analysis, and the preprocessing needed to create a clean dataset. This is followed by a detailed description of methods for creating vector representations of text and an overview of suitable models for automatic classification, ranging from classical approaches through neural networks to current state-of-the-art techniques. Selected models from each category were implemented and tested on the specified data sources. The results are analysed, visualised, and evaluated. The resulting models were integrated into a practical web application.

Subject(s)

text classification, natural language processing, machine learning, neural networks, artificial neural networks, deep learning, web application, Call for Papers, WikiCFP, DBWorld

Citation