Quality Control of Gene Expression Data Allows Accurate Quantification of Differentially Expressed Biological Pathways

Reed, Ellen, Ferrari, Enrico and Soloviev, Mikhail (2023) Quality Control of Gene Expression Data Allows Accurate Quantification of Differentially Expressed Biological Pathways. Current Bioinformatics, 18 (5). pp. 409-427. ISSN 1574-8936

Full content URL: https://doi.org/10.2174/1574893618666230221141815

Documents
Author's accepted manuscript
PDF: cbio-template - first draft_for self-archiving.pdf - Whole Document (32MB)
Restricted to Repository staff only until 17 April 2024.

Item Type: Article
Item Status: Live Archive

Abstract

Background: Gene expression signatures provide a promising diagnostic tool for many diseases, including cancer. However, there remain multiple issues related to the quality of gene expression data, which may impede the analysis and interpretation of differential gene expression in cancer.

Objective: We aimed to address existing issues related to the quality of gene expression data and to devise improved quality control (QC) and expression data processing procedures.

Methods: Linear regression analysis was applied to gene expression datasets generated from diluted and pre-mixed matched breast cancer and normal breast tissue samples. Datapoint outliers were identified and removed, and accurate expression values corresponding to cancer and normal tissues were recalculated.
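
The regression step described in the Methods can be illustrated with a minimal sketch for a single gene. It assumes that the signal in a pre-mixed sample varies linearly with the cancer fraction of the mixture, so the fitted intercept estimates the normal-tissue value and intercept plus slope estimates the cancer-tissue value. The abstract does not state the mixing ratios, the outlier criterion or any function names; the fractions, the median-absolute-deviation cutoff and the names `fit_line` and `estimate_pure_expression` below are illustrative assumptions, not the authors' procedure.

```python
from statistics import median


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimate_pure_expression(fractions, signals, cutoff=3.0):
    """Fit signal against cancer fraction, drop outlying datapoints, refit.

    fractions: cancer fraction of each mixed sample (0 = normal, 1 = cancer).
    signals:   measured expression of one gene in each sample.
    cutoff:    hypothetical threshold, in robust standard deviations.
    """
    intercept, slope = fit_line(fractions, signals)
    residuals = [y - (intercept + slope * x) for x, y in zip(fractions, signals)]

    # Robust residual scale from the median absolute deviation (assumed rule).
    centre = median(residuals)
    scale = 1.4826 * median([abs(r - centre) for r in residuals])

    if scale > 0:
        kept = [
            (x, y)
            for x, y, r in zip(fractions, signals, residuals)
            if abs(r - centre) <= cutoff * scale
        ]
    else:
        # Zero robust scale gives no usable threshold, so keep every point.
        kept = list(zip(fractions, signals))

    # Refit only if something was removed and at least three points remain.
    if 3 <= len(kept) < len(signals):
        intercept, slope = fit_line([x for x, _ in kept], [y for _, y in kept])

    return {
        "normal": intercept,          # extrapolated value at fraction 0
        "cancer": intercept + slope,  # extrapolated value at fraction 1
        "n_removed": len(signals) - len(kept),
    }


# Made-up example values: one gene, five mixtures, one corrupted datapoint.
example = estimate_pure_expression(
    [0.0, 0.25, 0.5, 0.75, 1.0],
    [102.0, 148.0, 400.0, 251.0, 299.0],
)
```

In this made-up example the corrupted mid-series reading is discarded and the recalculated end-point values fall close to 100 (normal) and 300 (cancer). A real analysis would apply such a fit to every probe and would need a criterion validated against the actual dilution design.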

Results: We achieved a 27% increase in the number of identifiable differentially regulated genes and a similar reduction in the number of false positives identified from microarray DEG data. Our approach reduced technical errors and improved the accuracy and precision of determining the degree of DEG but did not remove biological outliers, such as genes with naturally variable expression. We also determined the linear dynamic range of the microarray assay directly from expression data, which allowed accurate quantification of entire differentially expressed pathways.

Conclusion: The improved QC allowed accurate discrimination of genes by the degree of their upregulation, which helped to reveal an intricate and highly tuned network of biological pathways and their regulation in cancer. We were able, for the first time, to quantify the degree of transcriptional upregulation of entire individual biological pathways in breast cancer. It can be concluded that the vast majority of DEG data that are publicly available today may have been generated using sub-optimal experimental design, lacking the preparations required for genuinely accurate and quantitative analysis.

Keywords: gene expression, microarrays, differential gene expression, biological pathways, breast cancer, quality control
Subjects: C Biological Sciences > C400 Genetics
Divisions: College of Science > School of Life and Environmental Sciences > Department of Life Sciences
ID Code: 54813
Deposited On: 22 May 2023 15:45
