Identification of triple negative breast cancer genes using rough set based feature selection algorithm & ensemble classifier

dc.contributor.author: Patil, Sujata
dc.contributor.author: Balmuri, Kavitha Rani
dc.contributor.author: Frnda, Jaroslav
dc.contributor.author: Parameshachari, B. D.
dc.contributor.author: Konda, Srinivas
dc.contributor.author: Nedoma, Jan
dc.date.accessioned: 2023-02-07T06:37:36Z
dc.date.available: 2023-02-07T06:37:36Z
dc.date.issued: 2022
dc.description.abstract: In recent decades, microarray datasets have played an important role in triple negative breast cancer (TNBC) detection. Microarray data classification is challenging because the data contain numerous redundant and irrelevant features. Feature selection, which eliminates unneeded feature vectors, is therefore essential in this research field. Selecting an optimal number of features significantly reduces the complexity of this NP-hard problem, so a rough set-based feature selection algorithm is used in this manuscript to select the optimal features. Initially, the datasets related to TNBC are acquired from the Gene Expression Omnibus (GSE45827, GSE76275, GSE65194, GSE3744, GSE21653, and GSE7904). Then, a robust multi-array average technique is used to eliminate outlier TNBC/non-TNBC samples, which helps enhance classification performance. Next, the pre-processed microarray data are fed to a rough set-based algorithm for optimal gene selection, and the selected genes are given as inputs to the ensemble classification technique for classifying low-risk genes (non-TNBC) and high-risk genes (TNBC). The experimental evaluation showed that the ensemble-based rough set model obtained a mean accuracy of 97.24%, which is superior to that of the other comparative machine learning techniques. [cs]
dc.description.firstpage: art. no. 54 [cs]
dc.description.source: Web of Science [cs]
dc.description.volume: 12 [cs]
dc.identifier.citation: Human-Centric Computing and Information Sciences. 2022, vol. 12, art. no. 54. [cs]
dc.identifier.doi: 10.22967/HCIS.2022.12.054
dc.identifier.issn: 2192-1962
dc.identifier.uri: http://hdl.handle.net/10084/149070
dc.identifier.wos: 000890282100001
dc.language.iso: en [cs]
dc.publisher: Korea Information Processing Society [cs]
dc.relation.ispartofseries: Human-Centric Computing and Information Sciences [cs]
dc.relation.uri: https://doi.org/10.22967/HCIS.2022.12.054 [cs]
dc.rights: This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. [cs]
dc.rights.access: openAccess [cs]
dc.rights.uri: http://creativecommons.org/licenses/by-nc/3.0/ [cs]
dc.subject: ensemble classifier [cs]
dc.subject: machine-learning technique [cs]
dc.subject: microarray data [cs]
dc.subject: robust multi-array average technique [cs]
dc.subject: rough set theory [cs]
dc.subject: triple negative breast cancer [cs]
dc.title: Identification of triple negative breast cancer genes using rough set based feature selection algorithm & ensemble classifier [cs]
dc.type: article [cs]
dc.type.status: Peer-reviewed [cs]
dc.type.version: publishedVersion [cs]
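
Illustrative sketch (not part of the record): the abstract describes rough set-based gene selection but this record does not give the algorithm's details. The Python sketch below shows one common rough set approach, a greedy QuickReduct driven by the dependency degree, under loudly stated assumptions: expression values are already discretized (0 = low, 1 = high), the toy table and the names `dependency` and `quick_reduct` are hypothetical, and nothing here is claimed to match the authors' implementation.

```python
from collections import defaultdict

def dependency(rows, labels, attrs):
    """Rough set dependency degree: the fraction of samples whose
    equivalence class (same values on attrs) carries a single label."""
    seen = defaultdict(set)   # equivalence class -> labels observed in it
    size = defaultdict(int)   # equivalence class -> number of samples
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        seen[key].add(label)
        size[key] += 1
    positive = sum(size[k] for k, labs in seen.items() if len(labs) == 1)
    return positive / len(rows)

def quick_reduct(rows, labels):
    """Greedily add the attribute that most increases the dependency
    degree until it matches the dependency of the full attribute set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    reduct, best = [], 0.0
    while best < target:
        candidates = [a for a in all_attrs if a not in reduct]
        scores = {a: dependency(rows, labels, reduct + [a]) for a in candidates}
        top = max(candidates, key=lambda a: scores[a])  # first index wins ties
        reduct.append(top)
        best = scores[top]
    return reduct

# Hypothetical toy decision table: 6 samples x 4 discretized genes.
rows = [
    (0, 0, 1, 0),
    (0, 1, 1, 0),
    (1, 0, 0, 1),
    (1, 1, 0, 0),
    (0, 1, 0, 1),
    (1, 0, 1, 1),
]
labels = ["non-TNBC", "TNBC", "non-TNBC", "TNBC", "TNBC", "non-TNBC"]

selected = quick_reduct(rows, labels)
print("selected gene indices:", selected)
```

In this toy table the second gene (index 1) alone separates the two classes, so the greedy search stops after one step. Real microarray data would first need a discretization step, which this sketch leaves out.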

Files

Original bundle

Name: 2192-1962-2022v12an54.pdf
Size: 1005.43 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 718 B
Format: Item-specific license agreed upon to submission