A sanitization approach for hiding sensitive itemsets based on particle swarm optimization

dc.contributor.author: Lin, Jerry Chun-Wei
dc.contributor.author: Liu, Qiankun
dc.contributor.author: Fournier-Viger, Philippe
dc.contributor.author: Hong, Tzung-Pei
dc.contributor.author: Vozňák, Miroslav
dc.contributor.author: Zhan, Justin
dc.date.accessioned: 2016-07-21T11:05:32Z
dc.date.available: 2016-07-21T11:05:32Z
dc.date.issued: 2016
dc.description.abstract: Privacy-preserving data mining (PPDM) has become an important research field in recent years, as approaches for PPDM can discover important information in databases while ensuring that sensitive information is not revealed. Several algorithms have been proposed to hide sensitive information in databases. They apply addition and deletion operations to perturb an original database and hide the sensitive information. Finding an appropriate set of transactions/itemsets to be perturbed for hiding sensitive information while preserving other important information is an NP-hard problem. In the past, genetic algorithm (GA)-based approaches were developed to hide sensitive itemsets in an original database through transaction deletion. In this paper, a particle swarm optimization (PSO)-based algorithm called PSO2DT is developed to hide sensitive itemsets while minimizing the side effects of the sanitization process. Each particle in the designed PSO2DT algorithm represents a set of transactions to be deleted. Particles are evaluated using a fitness function that is designed to minimize the side effects of sanitization. Unlike the state-of-the-art GA-based approaches, the proposed algorithm can also determine the maximum number of transactions to be deleted for efficiently hiding sensitive itemsets. In addition, an important strength of the proposed approach is that few parameters need to be set, and it can still find better solutions to the sanitization problem than GA-based approaches. Furthermore, the pre-large concept is adopted in the designed algorithm to speed up the evolution process. Extensive experiments on both real-world and synthetic datasets show that the proposed PSO2DT algorithm performs better than the Greedy algorithm and GA-based algorithms in terms of runtime, fail to be hidden (F-T-H), not to be hidden (N-T-H), and database similarity (DS).
dc.description.firstpage: 1
dc.description.lastpage: 18
dc.description.source: Web of Science
dc.description.volume: 53
dc.identifier.citation: Engineering Applications of Artificial Intelligence. 2016, vol. 53, p. 1-18.
dc.identifier.doi: 10.1016/j.engappai.2016.03.007
dc.identifier.issn: 0952-1976
dc.identifier.issn: 1873-6769
dc.identifier.uri: http://hdl.handle.net/10084/111902
dc.identifier.wos: 000378180800001
dc.language.iso: en
dc.publisher: Elsevier
dc.relation.ispartofseries: Engineering Applications of Artificial Intelligence
dc.relation.uri: http://dx.doi.org/10.1016/j.engappai.2016.03.007
dc.rights: © 2016 Elsevier Ltd. All rights reserved.
dc.subject: PPDM
dc.subject: sanitization
dc.subject: evolutionary computation
dc.subject: sensitive itemsets
dc.subject: PSO
dc.title: A sanitization approach for hiding sensitive itemsets based on particle swarm optimization
dc.type: article
dc.type.status: Peer-reviewed
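
The abstract describes each particle as a set of transactions to be deleted, scored by a fitness function over the side effects of sanitization. The sketch below illustrates that idea on toy data only. It does not reproduce the paper's fitness function or the PSO search itself: the function names, the weights w_fth and w_nth, the brute-force itemset enumeration, and the reading of N-T-H as "non-sensitive frequent itemsets lost after deletion" are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_ratio, max_size=2):
    """Brute-force frequent itemsets up to max_size (toy data only)."""
    threshold = min_ratio * len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = set()
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            support = sum(1 for t in transactions if candidate <= t)
            if support >= threshold:
                result.add(candidate)
    return result


def side_effects(transactions, sensitive, deleted, min_ratio, max_size=2):
    """Return (F-T-H, N-T-H) for one candidate set of deleted transaction indices.

    F-T-H: sensitive itemsets that are still frequent after deletion.
    N-T-H: non-sensitive itemsets that were frequent before but not after
           (an assumed reading of the term, see the lead-in).
    """
    before = frequent_itemsets(transactions, min_ratio, max_size)
    kept = [t for i, t in enumerate(transactions) if i not in deleted]
    after = frequent_itemsets(kept, min_ratio, max_size)
    fail_to_hide = {s for s in sensitive if s in after}
    not_to_hide = {s for s in before - set(sensitive) if s not in after}
    return fail_to_hide, not_to_hide


def fitness(transactions, sensitive, deleted, min_ratio,
            w_fth=1.0, w_nth=1.0, max_size=2):
    """Hypothetical weighted side-effect count; lower is better."""
    fth, nth = side_effects(transactions, sensitive, deleted, min_ratio, max_size)
    return w_fth * len(fth) + w_nth * len(nth)


# Toy database: five transactions over the items a, b, c.
db = [frozenset(t) for t in ("ab", "abc", "ac", "bc", "ab")]
sensitive = {frozenset("ab")}

for candidate in (set(), {0}, {1}):
    print(sorted(candidate), fitness(db, sensitive, candidate, min_ratio=0.6))
```

In this toy setting, deleting nothing leaves the sensitive itemset frequent, deleting transaction 0 hides it without losing any other frequent itemset, and deleting transaction 1 hides it at the cost of losing item c. A swarm-based search would explore such candidate deletion sets and keep the ones with the lowest score.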
