Monitoring HPC cluster a IT infrastruktury v IT4Innovations (Monitoring of HPC Clusters and IT Infrastructure at IT4Innovations)
Publisher
Vysoká škola báňská - Technická univerzita Ostrava
Abstract
The aim of this work is to implement new monitoring systems and consolidate them with those already deployed at IT4Innovations (the IT4Innovations National Supercomputing Center), in order to deliver a centralized monitoring solution for HPC clusters and infrastructure.
The Icinga2 monitoring tool is used to implement the centralized monitoring. The whole solution is deployed with the configuration management tools Puppet and Ansible; which of the two is used depends on the location of the monitoring servers.
A three-tier, clustered monitoring system has been created. Its components meet the requirements for high availability and load balancing. Monitoring servers are divided into zones according to the clusters or infrastructure they monitor. A centrally accessible web frontend is available to system administrators. The monitoring solution is deployed in a fully automated manner using configuration management tools, so it can be delivered quickly into the production environment with minimal manual work.
Subject(s)
Ansible, availability, cluster, distributed monitoring, GIT, HA, high availability, HPC, Icinga, Icinga2, IT4Innovations, load-balancing, monitoring probes, monitoring, Puppet, supercomputer