Detekce klíčových slov v odborných článcích

Blažek, Ondřej

Files

BLA0045_FEI_B2647_2612R025_2013.pdf (3.69 MB)

BLA0045_FEI_B2647_2612R025_2013_priloha.zip (568.72 KB)

BLA0045_FEI_B2647_2612R025_2013_posudek_vedouci_Kudelka_Milos.pdf (49.41 KB)

BLA0045_FEI_B2647_2612R025_2013_posudek_oponent_Horak_Zdenek.pdf (49.61 KB)

Downloads

59

Date issued

2013

Authors

Blažek, Ondřej

Publisher

Vysoká škola báňská - Technická univerzita Ostrava

Abstract

This thesis addresses a typical task of text mining: detecting keywords in documents, which can be used, for example, to sort documents into categories. The theoretical part has two sections. The first introduces the basic concepts of the field, essentially how to represent documents properly in a vector space. The second surveys existing methods for determining document categories and for detecting keywords on the basis of those categories. An important part of the work is my own implementation, described step by step: for example, building the vector that represents a document, and clustering a set of documents into a given number of categories based on their similarity. The clustering serves as the categorization tool, and the keywords of each category are then detected by frequency analysis.
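
The pipeline outlined in the abstract (vector representation, clustering by similarity, frequency analysis per category) can be illustrated with a minimal sketch. It is not the thesis implementation: the TF-IDF weighting, cosine similarity, k-means variant with farthest-first initialization, the sample documents, and all function names below are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase, whitespace split, letters only.
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_vectors(docs):
    # Represent each document as a TF-IDF vector over a shared vocabulary.
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    n = len(docs)
    df = Counter(w for toks in tokenized for w in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append([tf[w] * math.log(n / df[w]) for w in vocab])
    return vectors, tokenized


def cosine(a, b):
    # Cosine similarity; returns 0.0 if either vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def kmeans(vectors, k, iterations=10):
    # Deterministic farthest-first initialization (assumption): start with
    # the first vector, then repeatedly add the vector least similar to
    # the centroids chosen so far.
    centroids = [vectors[0]]
    while len(centroids) < k:
        farthest = min(
            vectors, key=lambda v: max(cosine(v, c) for c in centroids)
        )
        centroids.append(farthest)
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each document to its most similar centroid.
        labels = [
            max(range(k), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_keywords(tokenized, labels, top_n=3):
    # Frequency analysis: the most frequent terms within each cluster.
    counts = {}
    for toks, label in zip(tokenized, labels):
        counts.setdefault(label, Counter()).update(toks)
    return {lab: [w for w, _ in c.most_common(top_n)] for lab, c in counts.items()}


# Hypothetical sample documents covering two topics.
docs = [
    "clustering groups similar documents",
    "neural networks classify images",
    "clustering documents by similarity",
    "neural networks learn from images",
]
vectors, tokenized = tfidf_vectors(docs)
labels = kmeans(vectors, k=2)
keywords = cluster_keywords(tokenized, labels, top_n=3)
```

A practical version would also need stop-word removal and stemming or lemmatization, which this sketch omits.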

Description

Import 26/06/2013

Subject(s)

categorization, thematization, text mining, key words

Item identifier

http://hdl.handle.net/10084/99015

Collections

Vysokoškolské kvalifikační práce Fakulty elektrotechniky a informatiky / Theses and dissertations of Faculty of Electrical Engineering and Computer Science (FEI)
