ABSTRACT
Microarrays have an almost ubiquitous presence in modern biological research. The extent and versatility of the techniques available for analyzing and interpreting microarray experiments can be somewhat bewildering to interested biologists. Functional genomics involves the high-throughput analysis of large datasets derived from various biological experiments. Microarray technology makes this possible by simultaneously monitoring the fluorescence emitted at thousands of oligonucleotide probes, each specific for one of the putative gene sequences comprising the total genome of the investigated organism; the signal at each probe reflects the expression level of the corresponding gene under a particular condition. This chapter gives a concise overview of the basic concepts involved in a microarray experiment and of the key issues in the various steps of implementing this promising experimental methodology. In this sense, the chapter gives a feeling for what the data actually represent and provides information on the various computational methods that one can employ to derive meaningful results from such experiments.