
dc.contributor.advisor: Snášel, Václav
dc.contributor.author: Oweis, Nour Easa
dc.date.accessioned: 2016-11-01T09:39:12Z
dc.date.available: 2016-11-01T09:39:12Z
dc.date.issued: 2016
dc.identifier.other: OSD002 [cs]
dc.identifier.uri: http://hdl.handle.net/10084/112232
dc.description: Import 02/11/2016 [cs]
dc.description.abstract: Background: Big Data mining is an analytic process used to discover hidden knowledge and patterns in massive, complex, multidimensional datasets. The memory and CPU resources of a single processor are too limited for this task, which makes single-machine algorithms ineffective. Association rule mining (ARM) is traditionally used to uncover hidden knowledge in data sets, but traditional ARM algorithms cannot handle very large data sets. Scalable, parallel strategies for ARM based on Big Data approaches are therefore needed. One example of such an approach is a parallel association rule mining algorithm based on MapReduce that uses the lift interestingness measure (LIM). Methods: This thesis proposes two algorithms for data mining and optimization. The first is a parallel association rule mining algorithm based on MapReduce that uses LIM, called MapReduce Lift Association Rule (MRLAR), which provides high scalability through parallel execution. The second reduces dimensionality using multiple data reduction techniques, including principal component analysis (PCA), singular value decomposition (SVD), and semi-discrete decomposition (SDD), applied as pre-processing to reduce the data to fewer dimensions. Results: MRLAR was found to extract association rules and the type of correlation between the left-hand side (LHS) and right-hand side (RHS) directly from the lift measure, without additional computation of the confidence measure. It also offered the following advantages: high scalability through parallel execution (MapReduce), support for big data, a single scan of the dataset, no need for post-processing, and fault tolerance. The study also proposed an algorithm for data reduction using PCA, SVD, and SDD. SVD was found to be more accurate and faster than SDD. Conclusions: MRLAR performed effectively in data mining. The data reduction techniques enhanced data pre-processing through dimensionality reduction. [en]
dc.format: 93 pp. : ill.
dc.format.extent: 1526228 bytes
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.publisher: Vysoká škola báňská - Technická univerzita Ostrava [cs]
dc.subject: Big Data [en]
dc.subject: Data Mining [en]
dc.subject: Association Rule [en]
dc.subject: MapReduce [en]
dc.subject: Lift Interesting Measurement [en]
dc.subject: Data Reduction [en]
dc.subject: SVD [en]
dc.subject: SDD [en]
dc.subject: PCA [en]
dc.title: Parallel Association Rule Mining Algorithm Based on MapReduce by Using Lift Interestingness Measure for Big Data [en]
dc.title.alternative: Paralelní algoritmy pro dolování pravidel založených na MapReduce a míře významnosti pro Big Data [cs]
dc.type: Doctoral thesis
dc.identifier.signature: 201600190 [cs]
dc.identifier.location: ÚK (Central Library) / thesis storage
dc.contributor.referee: Abraham, Ajith [cs]
dc.contributor.referee: Ouddane, Nabil [cs]
dc.contributor.referee: Krömer, Pavel [cs]
dc.date.accepted: 2016-06-08
dc.thesis.degree-name: Ph.D.
dc.thesis.degree-level: Doctoral study programme
dc.thesis.degree-grantor: Vysoká škola báňská - Technická univerzita Ostrava. Faculty of Electrical Engineering and Computer Science
dc.description.department: 460 - Department of Computer Science
dc.thesis.degree-program: Computer Science, Communication Technology and Applied Mathematics
dc.thesis.degree-branch: Computer Science
dc.description.result: passed
dc.identifier.sender: S2724 [cs]
dc.identifier.thesis: OWE001_FEI_P1807_1801V001_2016
dc.rights.access: openAccess

