Tools for Detecting and Preventing Overfitting of Models in Machine Learning Tasks

Abstract

This thesis focuses on the problem of overfitting in machine learning models. It identifies and analyzes methods for detecting and preventing overfitting, including regularization, early stopping, and cross-validation. The thesis describes the basic principles and properties of machine learning models, in particular the strong dependence of a model's fitting ability on the training data of the task. Experiments on simulated and real data illustrate the effectiveness of these techniques. Typical manifestations of overfitting, as well as of well-behaved learning, in complex models with a large number of parameters are also highlighted. Results and conclusions from the experiments are presented, providing useful insights for the effective use of machine learning models in practice.
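As a minimal sketch of the idea behind the detection and prevention techniques named above (not taken from the thesis' own experiments), the following Python snippet compares training and validation error of polynomial models of increasing capacity, with and without an L2 (ridge) penalty. The synthetic data, the chosen degrees, and the regularization strength are illustrative assumptions.

```python
# Illustrative sketch: detecting overfitting via the train/validation gap
# and mitigating it with L2 regularization (ridge regression).
# Data, degrees, and lambda values are assumed for illustration only.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic task: noisy samples of a smooth function.
x = rng.uniform(-1, 1, size=60)
y = np.sin(3 * x) + rng.normal(scale=0.2, size=x.shape)

# Simple train/validation split.
x_train, x_val = x[:40], x[40:]
y_train, y_val = y[:40], y[40:]

def design_matrix(x, degree):
    """Polynomial features up to the given degree (controls model capacity)."""
    return np.vander(x, degree + 1, increasing=True)

def fit_ridge(X, y, lam):
    """Closed-form ridge solution: (X^T X + lam * I)^{-1} X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def mse(X, y, w):
    return np.mean((X @ w - y) ** 2)

for degree in (1, 3, 9, 15):
    for lam in (0.0, 1e-2):
        Xtr = design_matrix(x_train, degree)
        Xva = design_matrix(x_val, degree)
        w = fit_ridge(Xtr, y_train, lam)
        print(f"degree={degree:2d} lambda={lam:<6g} "
              f"train MSE={mse(Xtr, y_train, w):.3f} "
              f"val MSE={mse(Xva, y_val, w):.3f}")

# A growing gap between training and validation MSE at high degree with
# lambda = 0 signals overfitting; the ridge penalty shrinks that gap.
```

The same train-versus-validation comparison underlies early stopping (halt training once validation error stops improving) and cross-validation (average the validation error over several splits).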

Subject(s)

machine learning, overfitting, artificial neural networks, XGBoost, regularization, SGD, learning rate, cross-validation, loss function, training data, test data, validation data, model capacity, hyperparameters

Citation