Obsah webových stránek a jeho efektivní zpracování (Web Page Content and Its Efficient Processing)

The thesis demonstrates how natural language processing methods can be integrated into the browser environment to analyze web content. Text analysis follows a fixed sequence of steps grounded in knowledge of the language. Keywords are extracted from publicly available documents, and the most frequent terms are then used for vocabulary learning, expanding the user's vocabulary directly in the browser. Although the extension offers translation into several languages, the text analysis targets English only, and all natural language processing methods are adapted to it. Besides building a personal dictionary, the application offers automatic testing. The practical part also evaluates the current state of the application and outlines possible extensions that would improve the quality of its services.
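The sequence of analysis steps named in the abstract and subject terms (tokenization, stop-word removal, normalization, frequency-based keyword selection) can be illustrated with a minimal sketch. This is not the thesis's implementation, which runs inside a browser extension; the function names, the tiny stop-word list, and the plural-stripping rule below are assumptions made for illustration, and the normalization step is a crude stand-in for real stemming or lemmatization.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(token):
    """Strip a trailing plural 's'; a crude stand-in for a real stemmer or lemmatizer."""
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def top_terms(text, n=3):
    """Return the n most frequent normalized terms that are not stop words."""
    tokens = [t for t in tokenize(text) if t not in STOP_WORDS]
    stems = [naive_stem(t) for t in tokens]
    return Counter(stems).most_common(n)


if __name__ == "__main__":
    sample = "Browsers render pages. The browser parses a page and the parser builds trees."
    print(top_terms(sample, 2))
```

In this sketch the most frequent terms ("browser" and "page" for the sample sentence) would be the candidates offered to the user for vocabulary learning; a production pipeline would swap in an established stemmer or lemmatizer and a full stop-word list.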

Subject(s)

browser extension, HTML, lemmatization, NLP, search, stemming, stop word, summarization, text analysis, tokenization, translator, vocabulary, web

Item identifier

http://hdl.handle.net/10084/147504

Collections

Vysokoškolské kvalifikační práce Fakulty elektrotechniky a informatiky / Theses and dissertations of Faculty of Electrical Engineering and Computer Science (FEI)
