Using synthetic data for pretraining partial discharge detection in overhead transmission lines

Klein, Lukáš

doi:10.1038/s41598-025-32642-2

dc.contributor.author	Klein, Lukáš
dc.contributor.author	Fulneček, Jan
dc.contributor.author	Kabot, Ondřej
dc.contributor.author	Dvorský, Jiří
dc.contributor.author	Prokop, Lukáš
dc.date.accessioned	2026-04-29T08:57:58Z
dc.date.available	2026-04-29T08:57:58Z
dc.date.issued	2025
dc.description.abstract	Accurate detection of partial discharges (PDs) in medium-voltage overhead transmission lines is critical for preemptive maintenance and avoiding costly outages, yet it is challenged by scarce labeled data and pervasive electromagnetic interference. This paper investigates a hybrid simulation-and-data-driven framework in which synthetically generated PD signals are used to pretrain deep neural networks, which are subsequently fine-tuned on a limited set of real overhead-line measurements. The synthetic pipeline systematically varies PD repetition rates, amplitude distributions, vegetation-contact scenarios, and noise conditions, producing diverse time-series and spectrogram-like representations that approximate real operating environments. We conduct a comprehensive ablation study across multiple architectures, namely Convolutional Neural Networks (CNNs), a Vision Transformer (ViT), and a Long Short-Term Memory (LSTM) network, and analyze their sensitivity to granular sweeps of synthetic-data parameters. CNN-based models decisively outperform ViT and LSTM counterparts on the spectrogram-based classification task, while ViT and LSTM fail to learn meaningful representations. For the successful CNNs, pretraining on carefully parameterized synthetic datasets, particularly those reflecting higher PD activity such as our Datasets 3 and 4, consistently improves downstream performance on real data, boosting the Matthews Correlation Coefficient (MCC) on imbalanced, cost-sensitive test sets by roughly 10–20% compared with training from scratch. At the same time, we show that poorly aligned synthetic data can degrade generalization, underscoring the need for accurate noise calibration and domain-aligned simulation. Overall, the results confirm that (i) architectural choice is pivotal for PD detection in overhead lines and (ii) well-designed synthetic data is a powerful, practical lever for achieving reliable and cost-effective PD monitoring when real labeled data are limited.
dc.description.firstpage	art. no. 45079
dc.description.issue	1
dc.description.source	Web of Science
dc.description.volume	15
dc.identifier.citation	Scientific Reports. 2025, vol. 15, issue 1, art. no. 45079.
dc.identifier.doi	10.1038/s41598-025-32642-2
dc.identifier.issn	2045-2322
dc.identifier.uri	http://hdl.handle.net/10084/158520
dc.identifier.wos	001651181400021
dc.language.iso	en
dc.publisher	Springer Nature
dc.relation.ispartofseries	Scientific Reports
dc.relation.uri	https://doi.org/10.1038/s41598-025-32642-2
dc.rights	Copyright © 2025, The Author(s)
dc.rights.access	openAccess
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject	partial discharge detection
dc.subject	synthetic data
dc.subject	deep learning
dc.subject	overhead transmission lines
dc.subject	machine learning
dc.title	Using synthetic data for pretraining partial discharge detection in overhead transmission lines
dc.type	article
dc.type.status	Peer-reviewed
dc.type.version	publishedVersion
local.files.count	1
local.files.size	3921844
local.has.files	yes

Files

Original bundle

Name: 2045-2322-2025v15i1an45079.pdf
Size: 3.74 MB
Format: Adobe Portable Document Format

Collections

Publikační činnost VŠB-TUO ve Web of Science / Publications of VŠB-TUO in Web of Science
OpenAIRE
Publikační činnost Centra energetických jednotek pro využití netradičních zdrojů energie / Publications of the Centre of Energy Units for Utilization of Non-Traditional Energy Sources (9370)
Publikační činnost Katedry informatiky / Publications of Department of Computer Science (460)
Články z časopisů s impakt faktorem / Articles from Impact Factor Journals