An efficient algorithm to mine high average-utility itemsets

dc.contributor.author: Lin, Jerry Chun-Wei
dc.contributor.author: Li, Ting
dc.contributor.author: Fournier-Viger, Philippe
dc.contributor.author: Hong, Tzung-Pei
dc.contributor.author: Zhan, Justin
dc.contributor.author: Vozňák, Miroslav
dc.date.accessioned: 2016-07-12T06:28:39Z
dc.date.available: 2016-07-12T06:28:39Z
dc.date.issued: 2016
dc.description.abstract: With the ever-increasing number of applications of data mining, high-utility itemset mining (HUIM) has become a critical issue in recent decades. In traditional HUIM, the utility of an itemset is defined as the sum of the utilities of its items in the transactions where it appears. An important problem with this definition is that it does not take itemset length into account. Because the utility of a larger itemset is generally greater than that of a smaller itemset, traditional HUIM algorithms tend to be biased toward finding large itemsets. Thus, this definition is not a fair measurement of utility. To provide a better assessment of each itemset's utility, the task of high average-utility itemset mining (HAUIM) was proposed. It introduces the average-utility measure, which considers both the length of itemsets and their utilities, and is thus more appropriate in real-world situations. Several algorithms have been designed for this task. They can generally be categorized as either level-wise or pattern-growth approaches. Both, however, require a large amount of computation to find the actual high average-utility itemsets (HAUIs). In this paper, we present an average-utility (AU)-list structure to discover HAUIs more efficiently. A depth-first search algorithm named HAUI-Miner is proposed to explore the search space without candidate generation, and an efficient pruning strategy is developed to reduce the search space and speed up the mining process. Extensive experiments are conducted to compare the performance of HAUI-Miner with state-of-the-art HAUIM algorithms in terms of runtime, number of determining nodes, memory usage, and scalability. [cs]
dc.description.firstpage: 233 [cs]
dc.description.issue: 2 [cs]
dc.description.lastpage: 243 [cs]
dc.description.source: Web of Science [cs]
dc.description.volume: 30 [cs]
dc.identifier.citation: Advanced Engineering Informatics. 2016, vol. 30, issue 2, p. 233-243. [cs]
dc.identifier.doi: 10.1016/j.aei.2016.04.002
dc.identifier.issn: 1474-0346
dc.identifier.issn: 1873-5320
dc.identifier.uri: http://hdl.handle.net/10084/111822
dc.identifier.wos: 000376694600011
dc.language.iso: en [cs]
dc.publisher: Elsevier [cs]
dc.relation.ispartofseries: Advanced Engineering Informatics [cs]
dc.relation.uri: http://dx.doi.org/10.1016/j.aei.2016.04.002 [cs]
dc.rights: © 2016 Elsevier Ltd. All rights reserved. [cs]
dc.subject: high average-utility itemsets [cs]
dc.subject: list structure [cs]
dc.subject: data mining [cs]
dc.subject: HAUIM [cs]
dc.title: An efficient algorithm to mine high average-utility itemsets [cs]
dc.type: article [cs]
dc.type.status: Peer-reviewed [cs]
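
The length bias described in the abstract can be illustrated with a small sketch. It is not the HAUI-Miner algorithm or the AU-list structure from the paper; it only contrasts the traditional utility of an itemset with its average utility. The abstract does not spell out the formula, so dividing the utility by the number of items is assumed here as the usual HAUIM definition. The function names and the toy transaction values are hypothetical.

```python
from typing import Dict, Iterable, List

# A transaction maps each item to its utility in that transaction
# (for example, purchased quantity times unit profit).
Transaction = Dict[str, int]


def utility(itemset: Iterable[str], transactions: List[Transaction]) -> int:
    """Sum the utilities of the itemset's items over the transactions that contain all of them."""
    items = frozenset(itemset)
    total = 0
    for transaction in transactions:
        if items.issubset(transaction):
            total += sum(transaction[item] for item in items)
    return total


def average_utility(itemset: Iterable[str], transactions: List[Transaction]) -> float:
    """Utility divided by the number of items (assumed HAUIM definition)."""
    items = frozenset(itemset)
    return utility(items, transactions) / len(items)


# Made-up toy database, for illustration only.
db: List[Transaction] = [
    {"a": 5, "b": 2, "c": 3},
    {"a": 4, "c": 6},
    {"b": 8, "c": 1},
]

for candidate in (["a"], ["a", "c"], ["a", "b", "c"]):
    print(candidate, utility(candidate, db), average_utility(candidate, db))
```

On this toy data, the itemset {a, c} has twice the utility of {a} (18 versus 9) but the same average utility (9.0), which is the kind of bias toward longer itemsets that the average-utility measure is meant to remove.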
