Analysis of the Efficiency of Algorithms for Classifying Unsolicited and Verified Email (Analýza efektivity algoritmů pro klasifikaci nevyžádané a ověřené pošty)

Abstract

This bachelor's thesis analyzes email filters built with machine learning. The theoretical part describes key concepts of email communication and filtering methods. The practical part covers preprocessing of textual data, conversion of words into numerical values (vectorization), and the subsequent creation of classifiers. The goal was to find the model with the highest possible success rate for spam detection. The results showed that the most suitable model depends on the size of the input dataset. The thesis can serve as a foundation for developing a more efficient filter for the Czech language.
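
The abstract does not say which preprocessing steps, vectorization method, or classifiers the thesis used. As a rough illustration of the general pipeline it describes (preprocess text, turn words into numbers, train a classifier), the following minimal sketch uses word counts (bag of words) and a multinomial Naive Bayes classifier with add-one smoothing. These techniques, all function names, and the toy messages are assumptions made for this example and are not taken from the thesis.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy data; the thesis's actual dataset is not described here.
MESSAGES = [
    "win a free prize now",
    "free money claim your prize",
    "meeting agenda for monday",
    "please review the attached report",
]
LABELS = ["spam", "spam", "ham", "ham"]


def tokenize(text):
    """Preprocessing: lowercase and keep runs of Unicode letters,
    so accented (e.g. Czech) words are kept whole."""
    return re.findall(r"[^\W\d_]+", text.lower())


def train(messages, labels):
    """Vectorization + training: count word occurrences per class."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for text, label in zip(messages, labels):
        word_counts[label].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocabulary


def predict(model, text):
    """Return the class with the highest log-probability under
    multinomial Naive Bayes with add-one (Laplace) smoothing."""
    word_counts, class_counts, vocabulary = model
    total_messages = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total_messages)
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for word in tokenize(text):
            if word in vocabulary:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


model = train(MESSAGES, LABELS)
print(predict(model, "claim your free prize"))
print(predict(model, "monday meeting report"))
```

With only four training messages this is a toy. The abstract's finding that the most suitable model depends on dataset size suggests that any comparison of this kind should be repeated at several training-set sizes.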

Subject(s)

Machine learning, Email classification, Classifiers, Vectorization, Natural Language Processing

Citation