Diagnosis of Breast Cancer Subtypes using the Selection of Effective Genes from Microarray Data

Authors

  • Derhami, Vali (Computer Engineering Department, Faculty of Engineering, Yazd University, Yazd, Iran)
  • Sheikhpour, Razieh (Department of Computer Engineering, Faculty of Engineering, Ardakan University, Ardakan, Iran)
Abstract:

Introduction: Early diagnosis of breast cancer and identification of the genes involved are important for the treatment and survival of patients. Gene expression data obtained from DNA microarrays, combined with machine learning algorithms, can provide new, intelligent methods for diagnosing breast cancer.

Methods: Expression data for 9216 genes from 84 patients across 5 different types of cancer were obtained using microarray technology. We proposed a feature selection method for the diagnosis of breast cancer based on the correlation between abnormal gene expression and cancer. We then used K-nearest neighbor (KNN), support vector machine (SVM), and naive Bayes (NB) classifiers to evaluate how well the proposed method selects relevant genes.

Results: Coupled with the KNN classifier, the proposed feature selection method predicted all types of cancer with 100% accuracy using 38 of the 9216 genes. The method could also identify the genes associated with each class. Coupled with the NB and SVM classifiers, it achieved accuracy rates of 90% and 96.67% using 17 and 22 genes, respectively.

Conclusion: The results indicate that the proposed feature selection method performs better than the other methods it was compared with. The method can distinguish the genes involved in each cancer class and detect overexpression or underexpression of the selected genes, information that physicians and health care researchers can use.
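
The abstract does not spell out the proposed feature selection method, so the following is only a minimal sketch of the general pipeline it describes: rank genes by how strongly their expression correlates with each cancer class, keep a few genes per class, and classify with KNN. The scoring rule (absolute Pearson correlation against a one-vs-rest class indicator), the helper names (`pearson`, `select_genes`, `knn_predict`, `loo_accuracy`, `make_toy_data`), and the synthetic data are all assumptions made for illustration, not the authors' implementation or data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_genes(X, y, per_class):
    """Assumed scoring rule: for each class, keep the `per_class` genes with the
    largest |correlation| to a one-vs-rest class indicator. The sign of the
    correlation hints at over- (positive) or underexpression (negative)."""
    n_genes = len(X[0])
    columns = [[row[g] for row in X] for g in range(n_genes)]
    selected = {}
    for c in sorted(set(y)):
        indicator = [1.0 if label == c else 0.0 for label in y]
        scored = [(pearson(col, indicator), g) for g, col in enumerate(columns)]
        scored.sort(key=lambda t: abs(t[0]), reverse=True)
        selected[c] = [(g, r) for r, g in scored[:per_class]]
    return selected


def knn_predict(X_train, y_train, x, genes, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance
    restricted to the selected genes); ties go to the smallest label."""
    point = [x[g] for g in genes]
    neighbours = sorted(
        (math.dist([row[g] for g in genes], point), label)
        for row, label in zip(X_train, y_train)
    )
    votes = {}
    for _, label in neighbours[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(sorted(votes), key=votes.get)


def loo_accuracy(X, y, per_class, k=3):
    """Leave-one-out accuracy; genes are re-selected inside every fold so the
    held-out sample never influences the selection."""
    correct = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        selected = select_genes(X_train, y_train, per_class)
        genes = sorted({g for picks in selected.values() for g, _ in picks})
        correct += knn_predict(X_train, y_train, X[i], genes, k) == y[i]
    return correct / len(X)


def make_toy_data(n_per_class=10, n_genes=50, n_classes=3, seed=0):
    """Synthetic stand-in for microarray data: gene c is overexpressed in class c."""
    rng = random.Random(seed)
    X, y = [], []
    for c in range(n_classes):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            row[c] += 5.0
            X.append(row)
            y.append(c)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print(select_genes(X, y, per_class=1))
    print(loo_accuracy(X, y, per_class=1))
```

Re-selecting genes inside each leave-one-out fold is a deliberate choice in this sketch: selecting on the full data first would let the held-out sample leak into the gene ranking and inflate the reported accuracy. Whether the study evaluated its method this way is not stated in the abstract.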

Similar resources

Gene Identification from Microarray Data for Diagnosis of Acute Myeloid and Lymphoblastic Leukemia Using a Sparse Gene Selection Method

Background: Microarray experiments can simultaneously determine the expression of thousands of genes. Identification of potential genes from microarray data for diagnosis of cancer is important. This study aimed to identify genes for the diagnosis of acute myeloid and lymphoblastic leukemia using a sparse feature selection method. Materials and Methods: In this descriptive study, the expressio...

The Study of AAAG Repeat Polymorphism in Promoter of ERRG Gene and Its Association with the Risk of Breast Cancer in Isfahan Region

Abstract: Breast cancer is the second leading cause of cancer-related death in women. Because breast cancer is a hormone-dependent tumor, it can be regulated by the status of steroid hormones, including estrogen and progesterone. Estrogen plays an important role in the development and progression of breast cancer and exerts its effect on the expression of target genes through estrogen receptors. However, another group of nuclear receptors, known as receptors related to e...

Prediction of Breast Cancer Metastasis Using Fuzzy Models based on Data from Iranian Breast Cancer Patients

Introduction: Metastasis of breast cancer, the spread of cancer to other parts of the body, is considered one of the most important factors responsible for the majority of deaths caused by breast cancer in women. Diagnosing breast cancer metastasis at the earliest stages helps in choosing the best treatment and improving patients' quality of life. Method: In the present fundamental r...

Assessment of the Efficiency of S.P.G.C Refineries Using Network DEA

Data envelopment analysis (DEA) is a powerful tool for measuring the relative efficiency of organizational units, referred to as decision making units (DMUs). In most cases DMUs have network structures with internal linking activities. Traditional DEA models, however, consider DMUs as black boxes with no regard to their linking activities and therefore do not provide decision makers with the reasons...

Journal title

volume 12, issue 1

pages 39-47

publication date 2019-06
