dc.contributor.author | Říha, Lubomír | |
dc.contributor.author | Brzobohatý, Tomáš | |
dc.contributor.author | Markopoulos, Alexandros | |
dc.contributor.author | Jarošová, Marta | |
dc.contributor.author | Kozubek, Tomáš | |
dc.contributor.author | Horák, David | |
dc.contributor.author | Hapla, Václav | |
dc.date.accessioned | 2016-10-11T08:34:09Z | |
dc.date.available | 2016-10-11T08:34:09Z | |
dc.date.issued | 2016 | |
dc.identifier.citation | Parallel Computing. 2016, vol. 57, p. 154-166. | cs |
dc.identifier.issn | 0167-8191 | |
dc.identifier.issn | 1872-7336 | |
dc.identifier.uri | http://hdl.handle.net/10084/112147 | |
dc.description.abstract | This paper describes the implementation, performance, and scalability of our communication layer developed for Total FETI (TFETI) and Hybrid Total FETI (HTFETI) solvers. HTFETI is based on our variant of the Finite Element Tearing and Interconnecting (FETI) type domain decomposition method. In this approach a small number of neighboring subdomains is aggregated into clusters, which results in a smaller coarse problem. To solve the original problem, the TFETI method is applied twice: to the clusters and then to the subdomains in each cluster. The current implementation of the solver is focused on the performance optimization of the main CG iteration loop, including: implementation of communication hiding and avoiding techniques for global communications; optimization of the nearest neighbor communication (multiplication with a global gluing matrix); and optimization of the parallel CG algorithm to iterate over local Lagrange multipliers only. The performance is demonstrated on a linear elasticity 3D cube and real-world benchmarks. | cs |
dc.format.extent | 2269088 bytes | |
dc.format.mimetype | application/pdf | |
dc.language.iso | en | cs |
dc.publisher | Elsevier | cs |
dc.relation.ispartofseries | Parallel Computing | cs |
dc.relation.uri | http://dx.doi.org/10.1016/j.parco.2016.05.002 | cs |
dc.rights | ©2016 Elsevier B.V. All rights reserved. | cs |
dc.subject | FETI | cs |
dc.subject | hybrid total FETI | cs |
dc.subject | total FETI | cs |
dc.subject | domain decomposition | cs |
dc.subject | scalability | cs |
dc.subject | HPC | cs |
dc.title | Implementation of the efficient communication layer for the highly parallel total FETI and hybrid total FETI solvers | cs |
dc.type | article | cs |
dc.identifier.doi | 10.1016/j.parco.2016.05.002 | |
dc.relation.projectid | info:eu-repo/grantAgreement/EC/FP7/610741/EU//EXA2CT | cs |
dc.rights.access | closedAccess | |
dc.type.version | publishedVersion | cs |
dc.type.status | Peer-reviewed | cs |
dc.description.source | Web of Science | cs |
dc.description.volume | 57 | cs |
dc.description.lastpage | 166 | cs |
dc.description.firstpage | 154 | cs |
dc.identifier.wos | 000383307100012 | |