Statistické prostředky pro popis závislostí v datech (Statistical tools for describing dependencies in data)

Abstract

This thesis focuses on statistical tools for describing dependencies between variables in data. It introduces different types of correlation coefficients in turn, progressing from the best-known and most widely used ones, such as Pearson's and Spearman's coefficients, to more specialised methods suited to specific combinations of variable types, such as biserial and tetrachoric correlations. The thesis also covers contingency and association measures for categorical variables. The final chapter is devoted to spurious correlations and Simpson's paradox, which illustrate the risk of misinterpreting statistical relationships. The aim of the thesis is not only to describe the various statistical tools, but also to demonstrate their correct practical application, to highlight potential pitfalls in interpreting the results, and to emphasise the importance of choosing an appropriate method based on the nature of the data.

Description

Subject(s)

correlation analysis, Pearson correlation coefficient, Spearman correlation coefficient, coefficient of determination, partial correlation, biserial correlation, polyserial correlation, tetrachoric correlation, polychoric correlation, measures of association, spurious correlation, Simpson's paradox

Citation