Using synthetic data for pretraining partial discharge detection in overhead transmission lines

dc.contributor.author: Klein, Lukáš
dc.contributor.author: Fulneček, Jan
dc.contributor.author: Kabot, Ondřej
dc.contributor.author: Dvorský, Jiří
dc.contributor.author: Prokop, Lukáš
dc.date.accessioned: 2026-04-29T08:57:58Z
dc.date.available: 2026-04-29T08:57:58Z
dc.date.issued: 2025
dc.description.abstract: Accurate detection of partial discharges (PDs) in medium-voltage overhead transmission lines is critical for preemptive maintenance and avoiding costly outages, yet it is challenged by scarce labeled data and pervasive electromagnetic interference. This paper investigates a hybrid simulation-and-data-driven framework in which synthetically generated PD signals are used to pretrain deep neural networks and are subsequently fine-tuned on a limited set of real overhead-line measurements. The synthetic pipeline systematically varies PD repetition rates, amplitude distributions, vegetation-contact scenarios, and noise conditions, producing diverse time-series and spectrogram-like representations that approximate real operating environments. We conduct a comprehensive ablation study across multiple architectures, namely Convolutional Neural Networks (CNNs), a Vision Transformer (ViT), and a Long Short-Term Memory (LSTM) network, and analyze their sensitivity to granular sweeps of synthetic-data parameters. CNN-based models decisively outperform ViT and LSTM counterparts on the spectrogram-based classification task, while ViT and LSTM fail to learn meaningful representations. For the successful CNNs, pretraining on carefully parameterized synthetic datasets, particularly those reflecting higher PD activity such as our Datasets 3 and 4, consistently improves downstream performance on real data, boosting the Matthews Correlation Coefficient (MCC) on imbalanced, cost-sensitive test sets by roughly 10–20% compared with training from scratch. At the same time, we show that poorly aligned synthetic data can degrade generalization, underscoring the need for accurate noise calibration and domain-aligned simulation.
Overall, the results confirm that (i) architectural choice is pivotal for PD detection in overhead lines and (ii) well-designed synthetic data is a powerful, practical lever for achieving reliable and cost-effective PD monitoring when real labeled data are limited.
dc.description.firstpage: art. no. 45079
dc.description.issue: 1
dc.description.source: Web of Science
dc.description.volume: 15
dc.identifier.citation: Scientific Reports. 2025, vol. 15, issue 1, art. no. 45079.
dc.identifier.doi: 10.1038/s41598-025-32642-2
dc.identifier.issn: 2045-2322
dc.identifier.uri: http://hdl.handle.net/10084/158520
dc.identifier.wos: 001651181400021
dc.language.iso: en
dc.publisher: Springer Nature
dc.relation.ispartofseries: Scientific Reports
dc.relation.uri: https://doi.org/10.1038/s41598-025-32642-2
dc.rights: Copyright © 2025, The Author(s)
dc.rights.access: openAccess
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: partial discharge detection
dc.subject: synthetic data
dc.subject: deep learning
dc.subject: overhead transmission lines
dc.subject: machine learning
dc.title: Using synthetic data for pretraining partial discharge detection in overhead transmission lines
dc.type: article
dc.type.status: Peer-reviewed
dc.type.version: publishedVersion
local.files.count: 1
local.files.size: 3921844
local.has.files: yes

Files

Original bundle

Name: 2045-2322-2025v15i1an45079.pdf
Size: 3.74 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 718 B
Format: Item-specific license agreed upon to submission