dc.contributor.author | Strakoš, Petr | |
dc.contributor.author | Jaroš, Milan | |
dc.contributor.author | Říha, Lubomír | |
dc.contributor.author | Kozubek, Tomáš | |
dc.date.accessioned | 2024-04-24T09:18:25Z | |
dc.date.available | 2024-04-24T09:18:25Z | |
dc.date.issued | 2023 | |
dc.identifier.citation | Journal of Imaging. 2023, vol. 9, issue 11, art. no. 254. | cs |
dc.identifier.issn | 2313-433X | |
dc.identifier.uri | http://hdl.handle.net/10084/152569 | |
dc.description.abstract | This paper presents a parallel implementation of a non-local transform-domain filter (BM4D). The effectiveness of the parallel implementation is demonstrated by denoising image series from computed tomography (CT) and magnetic resonance imaging (MRI). The basic idea of the filter is based on grouping and filtering similar data within the image. Due to the high level of similarity and data redundancy, the filter can provide even better denoising quality than current extensively used approaches based on deep learning (DL). In BM4D, cubes of voxels named patches are the essential image elements for filtering. Using voxels instead of pixels means that the area for searching similar patches is large. Because of this and the application of multi-dimensional transformations, the computation time of the filter is exceptionally long. The original implementation of BM4D is only single-threaded. We provide a parallel version of the filter that supports multi-core and many-core processors and scales on such versatile hardware resources, typical for high-performance computing clusters, even if they are concurrently used for the task. Our algorithm uses hybrid parallelisation that combines open multi-processing (OpenMP) and message passing interface (MPI) technologies and provides up to 283× speedup, which is a 99.65% reduction in processing time compared to the sequential version of the algorithm. In denoising quality, the method performs considerably better than recent DL methods on the data type that these methods have yet to be trained on. | cs |
dc.language.iso | en | cs |
dc.publisher | MDPI | cs |
dc.relation.ispartofseries | Journal of Imaging | cs |
dc.relation.uri | https://doi.org/10.3390/jimaging9110254 | cs |
dc.rights | © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. | cs |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | cs |
dc.subject | volumetric data | cs |
dc.subject | image denoising | cs |
dc.subject | parallel implementation | cs |
dc.subject | medical imaging | cs |
dc.subject | high-performance computing | cs |
dc.title | Speed up of volumetric non-local transform-domain filter utilising HPC architecture | cs |
dc.type | article | cs |
dc.identifier.doi | 10.3390/jimaging9110254 | |
dc.rights.access | openAccess | cs |
dc.type.version | publishedVersion | cs |
dc.type.status | Peer-reviewed | cs |
dc.description.source | Web of Science | cs |
dc.description.volume | 9 | cs |
dc.description.issue | 11 | cs |
dc.description.firstpage | art. no. 254 | cs |
dc.identifier.wos | 001113330800001 | |