Using synthetic data for pretraining partial discharge detection in overhead transmission lines
Publisher
Springer Nature
Abstract
Accurate detection of partial discharges (PDs) in medium-voltage overhead transmission lines is critical for preemptive maintenance and avoiding costly outages, yet it is challenged by scarce labeled data and pervasive electromagnetic interference. This paper investigates a hybrid simulation-and-data-driven framework in which synthetically generated PD signals are used to pretrain deep neural networks, which are subsequently fine-tuned on a limited set of real overhead-line measurements. The synthetic pipeline systematically varies PD repetition rates, amplitude distributions, vegetation-contact scenarios, and noise conditions, producing diverse time-series and spectrogram-like representations that approximate real operating environments. We conduct a comprehensive ablation study across multiple architectures, namely Convolutional Neural Networks (CNNs), a Vision Transformer (ViT), and a Long Short-Term Memory (LSTM) network, and analyze their sensitivity to granular sweeps of synthetic-data parameters. CNN-based models decisively outperform their ViT and LSTM counterparts on the spectrogram-based classification task, while the ViT and LSTM fail to learn meaningful representations. For the successful CNNs, pretraining on carefully parameterized synthetic datasets, particularly those reflecting higher PD activity such as our Datasets 3 and 4, consistently improves downstream performance on real data, boosting the Matthews Correlation Coefficient (MCC) on imbalanced, cost-sensitive test sets by roughly 10–20% compared with training from scratch. At the same time, we show that poorly aligned synthetic data can degrade generalization, underscoring the need for accurate noise calibration and domain-aligned simulation. Overall, the results confirm that (i) architectural choice is pivotal for PD detection in overhead lines and (ii) well-designed synthetic data is a powerful, practical lever for achieving reliable and cost-effective PD monitoring when real labeled data are limited.
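The abstract reports results in terms of the Matthews Correlation Coefficient (MCC) on imbalanced test sets. As a rough illustration of why MCC may be preferred over plain accuracy in that setting, the sketch below computes both from binary confusion-matrix counts. It is not taken from the paper: the function names and the example counts (a hypothetical test set with 990 fault-free and 10 PD samples) are assumptions made purely for illustration.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: return 0 when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts, NOT results from the paper.
# A trivial classifier that always predicts "no PD":
trivial = dict(tp=0, tn=990, fp=0, fn=10)
# A detector that finds 8 of the 10 PD samples at the cost of 20 false alarms:
detector = dict(tp=8, tn=970, fp=20, fn=2)

for name, counts in (("trivial", trivial), ("detector", detector)):
    print(f"{name}: accuracy={accuracy(**counts):.3f}  MCC={mcc(**counts):.3f}")
```

With these assumed counts, the trivial classifier scores the higher accuracy yet has an MCC of zero, while the detector that actually finds PD events scores a clearly positive MCC. This is the kind of behavior that makes MCC a sensible summary metric when positive samples are rare.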
Subject(s)
partial discharge detection, synthetic data, deep learning, overhead transmission lines, machine learning
Citation
Scientific Reports. 2025, vol. 15, issue 1, art. no. 45079.
Collections
Publikační činnost VŠB-TUO ve Web of Science / Publications of VŠB-TUO in Web of Science
OpenAIRE
Publikační činnost Centra energetických jednotek pro využití netradičních zdrojů energie / Publications of the Centre of Energy Units for Utilization of Non-Traditional Energy Sources (9370)
Publikační činnost Katedry informatiky / Publications of Department of Computer Science (460)
Články z časopisů s impakt faktorem / Articles from Impact Factor Journals