ABSTRACT
The method of Sorted L-One Penalized Estimation (SLOPE) is a novel sparse regression method for model selection, introduced in a sequence of recent papers by Bogdan, van den Berg, Sabatti, Su and Candes [4, 3, 7]. It estimates the coefficients of a linear model that may have more unknown parameters than observations. In many settings SLOPE has been shown to control the false discovery rate (the expected proportion of irrelevant predictors among all selected predictors) at a user-specified level. In this paper we evaluate its performance on genetic data and demonstrate its advantages over the related and popular LASSO method. In genetic data sets, group structures among the predictor variables are often available as prior knowledge, such as SNPs within a gene or genes within a pathway. Motivated by this, we extend SLOPE in the spirit of Group LASSO to Group SLOPE, a method that can exploit such group structures. Our simulation results show that the proposed Group SLOPE method is capable of controlling the false discovery rate at a specified level. Moreover, compared to Group LASSO, Group SLOPE in general achieves higher power as well as a lower false discovery rate.
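To make the penalty concrete, the following is an illustrative sketch (not code from the paper) of the sorted L1 norm that SLOPE penalizes with: given a non-increasing sequence of regularization weights lam_1 >= lam_2 >= ... >= lam_p >= 0, the penalty pairs the largest weight with the largest absolute coefficient, the second largest with the second largest, and so on. The function name `sorted_l1_norm` is our own choice for illustration.

```python
import numpy as np

def sorted_l1_norm(b, lam):
    """Sorted L1 norm: sum_i lam_i * |b|_(i), where |b|_(1) >= |b|_(2) >= ...
    are the absolute coefficients in decreasing order and lam is assumed
    non-increasing. With all lam_i equal, this reduces to an ordinary
    (scaled) L1 penalty, so LASSO is a special case."""
    b_sorted = np.sort(np.abs(b))[::-1]  # absolute values, largest first
    return float(np.dot(lam, b_sorted))

# Example: b = [3, -1, 2], lam = [3, 2, 1]
# sorted |b| = [3, 2, 1], penalty = 3*3 + 2*2 + 1*1 = 14
print(sorted_l1_norm(np.array([3.0, -1.0, 2.0]), np.array([3.0, 2.0, 1.0])))
```

Because larger coefficients receive larger weights, the penalty adapts its effective threshold to the number of apparent discoveries, which is the mechanism behind SLOPE's false discovery rate control.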
References
- A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1):183--202, 2009.
- Y. Benjamini and Y. Hochberg. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B (Methodological), 57(1):289--300, 1995.
- M. Bogdan, E. van den Berg, C. Sabatti, W. Su, and E. J. Candes. SLOPE -- adaptive variable selection via convex optimization. ArXiv e-prints, July 2014.
- M. Bogdan, E. van den Berg, W. Su, and E. Candes. Statistical estimation and testing via the sorted L1 norm. ArXiv e-prints, Oct. 2013.
- P. Breheny and J. Huang. Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors. Statistics and Computing, 25(2):173--187, 2015.
- D. Brzyski, M. Bogdan, and W. Su. Group SLOPE -- adaptive selection of groups of predictors. Preprint, Aug. 2015.
- E. Candes and W. Su. SLOPE is adaptive to unknown sparsity and asymptotically minimax. ArXiv e-prints, Mar. 2015.
- S. Cao, H. Qin, H.-W. Deng, and Y.-P. Wang. A unified sparse representation for sequence variant identification for complex traits. Genetic Epidemiology, 38(8):671--679, 2014.
- M. C. Cetin and A. Erar. Variable selection with Akaike information criteria: a comparative study. Hacettepe Journal of Mathematics and Statistics, 31:89--97, 2002.
- R.-H. Chung, W.-Y. Tsai, C.-H. Hsieh, K.-Y. Hung, C. A. Hsiung, and E. R. Hauser. SeqSIMLA2: simulating correlated quantitative traits accounting for shared environmental effects in user-specified pedigree structure. Genetic Epidemiology, 39(1):20--24, 2015.
- P. L. Combettes and J.-C. Pesquet. Proximal splitting methods in signal processing. In Fixed-Point Algorithms for Inverse Problems in Science and Engineering, pages 185--212. Springer, New York, 2011.
- J. Friedman, T. Hastie, and R. Tibshirani. A note on the group lasso and a sparse group lasso. ArXiv e-prints, Jan. 2010.
- J. Friedman, T. Hastie, and R. Tibshirani. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1):1--22, 2010.
- T. J. Hastie, R. J. Tibshirani, and J. H. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Series in Statistics. Springer, New York, 2009.
- J. Huang, P. Breheny, and S. Ma. A selective review of group selection in high-dimensional models. Statistical Science, 27(4):481--499, 2012.
- S. F. Schaffner, C. Foo, S. Gabriel, D. Reich, M. J. Daly, and D. Altshuler. Calibrating a coalescent simulation of human genome sequence variation. Genome Research, 15(11):1576--1583, 2005.
- T. Sun and C.-H. Zhang. Scaled sparse linear regression. Biometrika, 99(4):879--898, 2012.
- R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58(1):267--288, 1996.
- M. Yuan and Y. Lin. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(1):49--67, 2006.
Identification of significant genetic variants via SLOPE, and its extension to group SLOPE