Asymptotic analysis of a two-way semilinear model for microarray data. (English) Zbl 1086.62122
Summary: The cDNA microarray technology is a tool for monitoring gene expression levels on a large scale and has been widely used in functional genomics. A basic question in analyzing microarray data is proper normalization to ensure meaningful downstream analyses. We propose a two-way semilinear model for microarray data with two important features. First, it does not require pre-selection of constantly expressed genes or the assumptions that either the percentage of differentially expressed genes is small or there is symmetry in the expression levels of up- and down-regulated genes. Second, when used for detection of differentially expressed genes, it incorporates variations due to normalization in the assessment of uncertainty in the estimated differences in gene expressions.
The proposed model presents novel and challenging theoretical questions in the area of semiparametric statistics due to the presence of infinitely many nonparametric components. We provide theoretical justification that unbiased statistical inference is possible in the two-way semilinear model when self-calibration is needed with a large number of parameters. We also prove that the nonparametric optimal rate of convergence can be achieved in estimating the normalization curves under appropriate conditions.
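The summary names the two-way semilinear model without writing it out. One form consistent with the description is sketched here for orientation only; the notation is an assumption and is not taken from the summary. With \(i = 1, \dots, n\) indexing arrays and \(j = 1, \dots, J\) indexing genes, let \(y_{ij}\) be a log-intensity ratio and \(x_{ij}\) the corresponding average log intensity:
\[
y_{ij} = \beta_j + f_i(x_{ij}) + \varepsilon_{ij}, \qquad i = 1, \dots, n, \quad j = 1, \dots, J,
\]
where \(\beta_j\) is the parametric effect of gene \(j\), \(f_i\) is the unknown normalization curve for array \(i\), and \(\varepsilon_{ij}\) is a mean-zero error. Under this reading there is one nonparametric component \(f_i\) per array, so the number of such components grows without bound as \(n \to \infty\), and the number of parameters \(\beta_j\) grows with \(J\).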

62P10 Applications of statistics to biology and medical sciences; meta analysis
62G20 Asymptotic properties of nonparametric inference
62G08 Nonparametric regression and quantile regression