Reconstruction of Large-Scale Gene Regulatory Networks Using Regression-based Models

Faridah Hani Mohamed Salleh, Suhaila Zainudin, Mohd Firdaus Raih

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Gene regulatory network (GRN) reconstruction is the process of identifying gene regulatory interactions from experimental data through computational analysis. Work on GRN reconstruction has contributed to major discoveries of drug targets for the treatment of human diseases, including cancer. However, reconstructing GRNs from gene expression data is challenging: the data are high-dimensional with very few observations, they exhibit severe multicollinearity, and inference tends to generate cascade errors. These problems reduce the performance of GRN inference methods and make them unreliable for scientific use. We propose a method called P-CALS (Principal Component Analysis and Partial Least Squares), which combines PCA (Principal Component Analysis) with PLS (Partial Least Squares). We assess the performance of P-CALS on the genome-scale GRNs of E. coli and S. cerevisiae and on an in-silico dataset. P-CALS achieved satisfactory results: all sub-networks from the diverse datasets achieved AUROC values above 0.5, and gene relationships were discovered even in the most complex network tested in the experiments.
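The abstract treats an AUROC above 0.5 as the bar for a satisfactory result. The sketch below illustrates why 0.5 is the chance baseline: AUROC equals the probability that a randomly chosen true edge is scored higher than a randomly chosen non-edge. It is an illustration only, not the authors' evaluation code; the function name `auroc`, the toy scores and the toy gold standard are all hypothetical.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison: the fraction of (true edge, non-edge)
    pairs in which the true edge gets the higher score; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one true edge and one non-edge")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: confidence scores for six candidate
# regulator -> target edges, and a made-up gold standard (1 = true edge).
edge_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
gold_standard = [1, 1, 1, 0, 0, 0]

toy_auroc = auroc(edge_scores, gold_standard)
print(f"AUROC = {toy_auroc:.3f}")
```

A method that scores edges at random wins about half of these pairwise comparisons, so its AUROC hovers around 0.5; values above 0.5 indicate a ranking better than chance, and 1.0 indicates a perfect ranking.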

Original language: English
Title of host publication: 2018 IEEE Conference on Big Data and Analytics, ICBDA 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 129-134
Number of pages: 6
ISBN (Electronic): 9781538671283
DOI: https://doi.org/10.1109/ICBDAA.2018.8629777
Publication status: Published - 29 Jan 2019
Event: 2018 IEEE Conference on Big Data and Analytics, ICBDA 2018 - Langkawi, Kedah, Malaysia
Duration: 21 Nov 2018 to 22 Nov 2018

Publication series

Name: 2018 IEEE Conference on Big Data and Analytics, ICBDA 2018

Conference

Conference: 2018 IEEE Conference on Big Data and Analytics, ICBDA 2018
Country: Malaysia
City: Langkawi, Kedah
Period: 21/11/18 to 22/11/18

Fingerprint

Genes
Principal component analysis
Complex networks
Gene
Gene expression
Escherichia coli
Partial least squares
Experiments

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Information Systems and Management

Cite this

Mohamed Salleh, F. H., Zainudin, S., & Raih, M. F. (2019). Reconstruction of Large-Scale Gene Regulatory Networks Using Regression-based Models. In 2018 IEEE Conference on Big Data and Analytics, ICBDA 2018 (pp. 129-134). [8629777] (2018 IEEE Conference on Big Data and Analytics, ICBDA 2018). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICBDAA.2018.8629777
@inproceedings{0b0739b204454a1087a1f186baeee318,
title = "Reconstruction of Large-Scale Gene Regulatory Networks Using Regression-based Models",
author = "{Mohamed Salleh}, {Faridah Hani} and Zainudin, Suhaila and Raih, {Mohd Firdaus}",
year = "2019",
month = "1",
day = "29",
doi = "10.1109/ICBDAA.2018.8629777",
language = "English",
series = "2018 IEEE Conference on Big Data and Analytics, ICBDA 2018",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "129--134",
booktitle = "2018 IEEE Conference on Big Data and Analytics, ICBDA 2018",
address = "United States",
}

TY - GEN

T1 - Reconstruction of Large-Scale Gene Regulatory Networks Using Regression-based Models

AU - Mohamed Salleh, Faridah Hani

AU - Zainudin, Suhaila

AU - Raih, Mohd Firdaus

PY - 2019/1/29

Y1 - 2019/1/29

UR - http://www.scopus.com/inward/record.url?scp=85062777322&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85062777322&partnerID=8YFLogxK

U2 - 10.1109/ICBDAA.2018.8629777

DO - 10.1109/ICBDAA.2018.8629777

M3 - Conference contribution

AN - SCOPUS:85062777322

T3 - 2018 IEEE Conference on Big Data and Analytics, ICBDA 2018

SP - 129

EP - 134

BT - 2018 IEEE Conference on Big Data and Analytics, ICBDA 2018

PB - Institute of Electrical and Electronics Engineers Inc.

ER -
