Solvers and their implementations for machine learning problems and applications

Abstract

Machine learning is a set of statistical techniques and optimization algorithms for building models from training data without explicitly programming the decision rules. The goal is a model that generalizes from the properties (features) of the training data so that it predicts unseen samples as accurately as possible. Machine learning is considered a subfield of artificial intelligence. This work focuses on three topics: the applicability of classical machine learning models to complex, data-intensive applications; the choice of appropriate optimization algorithms and their adaptation for parallel training of models on supercomputer systems; and the design and implementation of a workflow for efficient analysis, fusion and transformation of big data. Attention is also paid to storing the data in the HDF5 file format for efficient parallel I/O operations on supercomputer systems. A significant part of the work is devoted to supervised machine learning, specifically Support Vector Machine (SVM) classification models, their adaptation for semantic segmentation of multispectral-temporal satellite images, and their subsequent use for wildfire localization in Alaska over a time horizon of one year. This application was addressed in collaboration with two world-leading research institutes in the USA, namely Argonne and Oak Ridge National Laboratories. The open-source tool PermonSVM was used to train these segmentation models; the implementation of this software is another integral part of this doctoral thesis. PermonSVM also supports training probabilistic models by combining Platt scaling with SVM-type models. The solvers MPRGP and SMALXE and their variants, implemented in the PermonQP software package, were used and adapted to solve the underlying optimization problem associated with training the models. These solvers have been developed and optimized for quadratic programming problems. They are further developed by Professor Dostal's group at the Department of Applied Mathematics (VSB -- Technical University of Ostrava) and the Institute of Geonics of the Czech Academy of Sciences.

The second part of this thesis focuses on unsupervised machine learning. Specifically, it reviews methods for vector quantization based on Lloyd-type algorithms and for spectral clustering. Furthermore, it presents a parallel implementation of the vector quantization methods in the C++ programming language and a statistical approach, based on Bartlett's test of homogeneity of variances, for estimating the multiplicity of the zero eigenvalue of the graph Laplacian matrix. This multiplicity equals the number of connected components of the similarity graph; these components could represent, for example, objects in an image scene. In the practical part, two applications are introduced. The first focuses on detecting brittle and ductile fractures on a steel sample (API 5L X-70) using vector quantization techniques. The second shows the use of spectral clustering for image segmentation without annotated data.
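The relation between zero eigenvalues of the graph Laplacian and the components of the similarity graph can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function name `count_components`, the toy similarity matrix `W`, and the fixed tolerance `tol` are assumptions made for illustration only. In particular, the fixed numerical tolerance stands in for the statistical estimate based on Bartlett's test described in the abstract, which is not reproduced here.

```python
import numpy as np


def count_components(W, tol=1e-8):
    """Count connected components of a similarity graph.

    W is a symmetric, non-negative similarity (adjacency) matrix.
    The count is the number of eigenvalues of the unnormalized
    Laplacian L = D - W that are numerically zero. The threshold
    `tol` is an illustrative assumption, not the thesis's method.
    """
    W = np.asarray(W, dtype=float)
    degree = W.sum(axis=1)            # weighted degree of each node
    laplacian = np.diag(degree) - W   # unnormalized graph Laplacian
    # eigvalsh is intended for symmetric matrices and returns real eigenvalues
    eigenvalues = np.linalg.eigvalsh(laplacian)
    return int(np.sum(np.abs(eigenvalues) < tol))


# Toy similarity graph: nodes 0, 1, 2 form a triangle; nodes 3, 4 share one edge.
W = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

n_components = count_components(W)
print(n_components)  # prints 2
```

On noisy real data the smallest eigenvalues are rarely exactly zero, so a fixed threshold becomes fragile; that is the situation in which a statistical criterion for the multiplicity, such as the one the thesis proposes, is useful.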

Subject(s)

machine learning, quadratic programming, duality, Support Vector Machines, wildfire localization in Alaska, big data analysis, parallel model training, semantic segmentation, vector quantization, brittle and ductile fracture, spectral clustering

Citation