Communications in Statistics - Simulation and Computation
ISSN: 0361-0918 (Print), 1532-4141 (Online). Journal homepage: http://www.tandfonline.com/loi/lssp20

To cite this article: B. Yüzbaşı & M. Arashi (2017): Double shrunken selection operator, Communications in Statistics - Simulation and Computation, DOI: 10.1080/03610918.2017.1395040
To link to this article: http://dx.doi.org/10.1080/03610918.2017.1395040

Accepted author version posted online: 23 Oct 2017.
Double shrunken selection operator

B. Yüzbaşı¹* and M. Arashi²

¹ Department of Econometrics, Inonu University, Malatya, Turkey
² Department of Statistics, Shahrood University of Technology, Iran

* Corresponding author. Email: b.yzb@hotmail.com
Abstract: The least absolute shrinkage and selection operator (LASSO) of Tibshirani (1996) is a prominent estimator which selects significant (in some sense) features and kills insignificant ones. Indeed, the LASSO shrinks to zero those features that do not exceed a noise level. In this paper, we force the LASSO to shrink further by proposing a Stein-type shrinkage estimator emanating from the LASSO, namely the Stein-type LASSO. The newly proposed estimator exhibits good numerical performance in the risk sense. Variants of this estimator have smaller relative MSE and prediction error than the LASSO in the analysis of a prostate cancer data set.
Key words and phrases: Double shrinking; Linear regression model; LASSO;
MSE; Prediction error; Stein-type shrinkage estimator
AMS Classification: 62G08, 62J07, 62G20
1 Introduction
It is well known that the least squares estimator (LSE) in the linear regression model is unbiased with minimum variance. However, when dealing with sparse linear models, it is deficient in prediction accuracy and/or interpretability. As a remedy, one may use the least absolute shrinkage and selection operator (LASSO) estimator of Tibshirani (1996). It defines a continuous shrinking operation that can produce coefficients that are exactly zero and is competitive with subset selection and ridge regression, retaining good properties of both estimators. The LASSO simultaneously estimates and selects the coefficients of a given linear regression model. Recently, Saleh and Raheem (2015a) proposed an improved LASSO estimation technique based on the Stein rule, in which uncertain prior information on the parameters of interest is used.
Baranchik (1970) introduced a family of minimax estimators that contains the James-Stein (1961) estimator. Ali and Saleh (1990) considered preliminary test and shrinkage estimation under the more general setting of a $p$-variate normal distribution with unknown covariance matrix. Ahmed et al. (2015) showed how expansions for the coverage probability of a confidence set centered at the James-Stein estimator can be used to construct a confidence region with constant confidence level. Chang (2015) suggested double shrinkage estimators for large sparse covariance matrices. Ahmed (2001) studied the asymptotic properties of Stein-type estimators in various contexts. Ahmed et al. (2007) introduced shrinkage, pretest and absolute penalty estimators in partially linear models. Ahmed and Fallahpour (2012) considered the estimation problem for the quasi-likelihood model in the presence of non-sample information. See Saleh (2006) and Ahmed (2014) for a comprehensive overview of shrinkage estimation with uncertain prior information. Saleh and Raheem (2015a) illustrated the superiority of a set of LASSO-based shrinkage estimators over the classical LASSO estimator. Hansen (2016) numerically compared the $L_2$-risk of shrinkage and LASSO estimators and highlighted some of the limitations of the LASSO. Roozbeh and Arashi (2016) and Yüzbaşı and Ahmed (2016) developed shrinkage ridge estimators in partial linear models. Very recently, Yüzbaşı et al. (2017) proposed pretest and shrinkage estimators in ridge regression linear models and compared their performance with some penalty estimators, including the LASSO. Other related studies include Asar (2017) and Saleh and Raheem (2015b).
In this paper, we present a Steinian LASSO-type estimator by double shrinking the features. Specifically, following James and Stein (1961) and Stein (1981), we propose a set of Stein-type LASSO estimators. For a definition of the James-Stein estimator, see Groß (2012). We will illustrate how the proposed set of estimators performs well compared to the LASSO. In all comparisons, we use the $L_2$-risk measure of closeness: for any estimator $\hat{\theta}_n$ of the vector parameter $\theta$, the $L_2$-loss function is given by $L(\theta; \hat{\theta}) = \|\hat{\theta} - \theta\|^2$ and the associated $L_2$-risk is evaluated by $\lim_{n\to\infty} E\big[ n L(\theta; \hat{\theta}) \big]$.
In what follows, we propose the set of Stein-type LASSO estimators and evaluate their performance, compared to the LASSO, via a Monte Carlo simulation study. We further investigate the superiority of the proposed estimators over the LASSO using the prostate cancer data set.
2 Linear Model and Estimators
Consider the linear regression model
\[
Y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_p x_{pi} + \epsilon_i = \beta_0 + x_i^\top \beta + \epsilon_i, \qquad i = 1, \dots, n, \tag{2.1}
\]
where $\epsilon_1, \dots, \epsilon_n$ are i.i.d. random variables with mean 0 and variance $\sigma^2$.
Without loss of generality, we assume that the covariates are centered to have mean 0, take $\hat{\beta}_0 = n^{-1} \sum_{i=1}^{n} Y_i = \bar{Y}$, and replace $Y_i$ in (2.1) by $Y_i - \bar{Y}$ to eliminate $\beta_0$. We then also assume $\bar{Y} = 0$ to better concentrate on the estimation of $\beta = (\beta_1, \dots, \beta_p)^\top$.
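In practice this centering step is straightforward; a minimal R sketch (our illustration, not the authors' code, assuming a numeric response y and design matrix X):

```r
# Center the response and covariates so that the intercept drops out of (2.1);
# our illustration, assuming a numeric response y and design matrix X.
y <- y - mean(y)                              # beta0_hat = Ybar is removed
X <- scale(X, center = TRUE, scale = FALSE)   # column-center the covariates
```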
Following Knight and Fu (2000), we consider the bridge estimator of $\beta$ obtained by minimizing the penalized least squares criterion
\[
\frac{1}{n} \sum_{i=1}^{n} \left( Y_i - x_i^\top \beta \right)^2 + \frac{\lambda_n}{n} \sum_{j=1}^{p} |\beta_j|^\gamma, \tag{2.2}
\]
for a given $\lambda_n$ and $\gamma > 0$.

In the subsequent study, we focus only on the special case $\gamma = 1$, which results in the LASSO of Tibshirani (1996). We provide some notes about the use of (2.2) in the conclusions.
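The paper provides no code, but the $\gamma = 1$ fit can be obtained with any standard LASSO solver; a minimal R sketch, assuming the glmnet package and a cross-validated choice of $\lambda_n$ on toy data:

```r
# Minimal sketch: fit the LASSO (the gamma = 1 case of (2.2)) in R.
# Assumes the glmnet package; lambda_n is chosen by 10-fold cross-validation.
library(glmnet)

set.seed(1)
n <- 100; p <- 10
X <- matrix(rnorm(n * p), n, p)                    # toy centered covariates
y <- drop(X %*% c(3, 1.5, rep(0, p - 2)) + rnorm(n))

cv     <- cv.glmnet(X, y, alpha = 1, nfolds = 10)  # alpha = 1: the L1 penalty
fit    <- glmnet(X, y, alpha = 1, lambda = cv$lambda.min)
beta_L <- as.numeric(coef(fit))[-1]                # LASSO coefficients, intercept dropped
```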
2.1 Stein-type LASSO
Following Stein (1981), we define the following set of general shrinkage estimators emanating from the LASSO estimator:
\[
\hat{\beta}_n^{S} = \hat{\beta}_n^{L} + g(\hat{\beta}_n^{L}), \tag{2.3}
\]
for some function $g: \mathbb{R}^p \to \mathbb{R}^p$. Assume the following regularity conditions hold:

(A1) $C_n = \frac{1}{n} \sum_{i=1}^{n} x_i x_i^\top \to C$, where $C$ is a non-negative definite matrix.

(A2) $\frac{1}{n} \max_{1 \le i \le n} x_i^\top x_i \to 0$.

(A3) The function $g(s) = g(s_1, \dots, s_p)$ is such that $\partial g / \partial s_i$ is continuous almost everywhere and $E\left| (\partial/\partial S_i) g(S) \right| < \infty$, $i = 1, \dots, p$, with $S = (S_1, \dots, S_p)$.

(A4) The function $g(\cdot)$ is homogeneous of order $-1$.
Proposition 1 Assume (A1)-(A4) and $\lambda_n = O(\sqrt{n})$. Then the shrinkage estimator $\hat{\beta}_n^{S}$ has smaller $L_2$-risk than the LASSO for all $g(\cdot)$ satisfying the inequality
\[
n \| g(\hat{\beta}_n^{L}) \|^2 + 2 \sigma^2 \, \mathrm{tr}\, C E[\nabla g(Z)] < 0, \quad \text{almost everywhere in } g, \tag{2.4}
\]
where $\nabla g(S) = \big( \partial g(S)/\partial S_1, \dots, \partial g(S)/\partial S_p \big)^\top$ and $Z \sim N_p(0, \sigma^2 C^{-1})$.
Proof. Consider the difference in $L_2$-risk given by
\[
D = \lim_{n\to\infty} E\big[ n \| \hat{\beta}_n^{L} - \beta \|^2 \big] - \lim_{n\to\infty} E\big[ n \| \hat{\beta}_n^{S} - \beta \|^2 \big]
  = -\lim_{n\to\infty} \Big\{ E\big[ n \| g(\hat{\beta}_n^{L}) \|^2 \big] + 2 E\big[ n (\hat{\beta}_n^{L} - \beta)^\top g(\hat{\beta}_n^{L}) \big] \Big\}. \tag{2.5}
\]
Since $\lambda_n$ is $\sqrt{n}$-consistent, i.e., $\lambda_n = O(\sqrt{n})$, using Theorem 1 of Knight and Fu (2000), $\sqrt{n}(\hat{\beta}_n^{L} - \beta) \xrightarrow{D} N_p(0, \sigma^2 C^{-1})$. Let $Z = \sqrt{n}(\hat{\beta}_n^{L} - \beta)$. Using the homogeneity of $g(\cdot)$ and Lemma 1 of Liu (1994), we obtain
\[
\lim_{n\to\infty} E\big[ n (\hat{\beta}_n^{L} - \beta)^\top g(\hat{\beta}_n^{L}) \big]
  = \lim_{n\to\infty} \sqrt{n}\, E\big[ Z^\top g(\sqrt{n} Z) \big]
  = \lim_{n\to\infty} E\big[ Z^\top g(Z) \big]
  = \sigma^2 \lim_{n\to\infty} \mathrm{tr}\, C E[\nabla g(Z)]. \tag{2.6}
\]
Substituting (2.6) in (2.5) gives
\[
D = -\lim_{n\to\infty} \Big\{ E\big[ n \| g(\hat{\beta}_n^{L}) \|^2 \big] + 2 \sigma^2 \, \mathrm{tr}\, C E[\nabla g(Z)] \Big\}.
\]
Since the expected value of a negative random variable is negative, the result follows.
Remark 1 In Proposition 1, condition (2.4) can be rewritten as
\[
\lim_{n\to\infty} n \| g(\hat{\beta}_n^{L}) \|^2 + 2 \sigma^2 \, \mathrm{tr}\, C E[\nabla g(Z)] < 0, \quad \text{almost everywhere in } g,
\]
since $\sigma^2 \lim_{n\to\infty} \mathrm{tr}\, C E[\nabla g(Z)] = \sigma^2 \, \mathrm{tr}\, C E[\nabla g(Z)]$.
Remark 2 The result of Liu (1994), which was used in the proof of Proposition 1, is based on the identity of Stein (1981). Hence, the statistical properties of the shrinkage estimator $\hat{\beta}_n^{S}$ can be derived under $\sqrt{n}$-consistency of $\lambda_n$ and the results of Stein (1981). We therefore concentrate on Stein-type shrinkage estimators, including the well-known Baranchik (1970) form, as relevant candidates.
Now let $a = (n-p)(p-2)/(n-p+2)$ and $W_n = (\hat{\beta}_n^{L})^\top (X^\top X) \hat{\beta}_n^{L} / \hat{\sigma}^2$, where $\hat{\sigma}^2$ is a consistent estimator of $\sigma^2$ and $X = (x_1, \dots, x_n)^\top$. According to Remark 2, the well-known Stein-type shrinkage estimator is obtained if we take $g(\hat{\beta}_n^{L}) = -a W_n^{-1} \hat{\beta}_n^{L}$, for small enough $a$. However, incorporating such a function in (2.3) gives an estimator with undesirable properties: as soon as $W_n < a$, the resulting estimator changes the sign of the LASSO. On the other hand, the new estimator does not scale the LASSO component-wise. Hence, for $\hat{\beta}_n^{L} = (\hat{\beta}_{1n}^{L}, \dots, \hat{\beta}_{pn}^{L})^\top$, we define the Stein-type LASSO (SL) estimator in the form
\[
\hat{\beta}_n^{SL} = \Big( \big( 1 - a W_n^{-1} \big) \hat{\beta}_{jn}^{L} \,\big|\, j = 1, \dots, p \Big)^\top. \tag{2.7}
\]
To avoid negative values, the positive part of the SL, namely the positive-rule Stein-type LASSO (PRSL), is defined as
\[
\hat{\beta}_n^{PRSL} = \Big( \big( 1 - a W_n^{-1} \big)_+ \hat{\beta}_{jn}^{L} \,\big|\, j = 1, \dots, p \Big)^\top, \tag{2.8}
\]
where $b_+ = \max(0, b)$.
Then the $L_2$-risk difference is given by
\[
\begin{aligned}
D_1 &= R(\beta; \hat{\beta}_n^{SL}) - R(\beta; \hat{\beta}_n^{PRSL}) \\
    &= -\lim_{n\to\infty} n \sum_j E\Big[ \big( 1 - a W_n^{-1} \big)^2 I(W_n < a) \big( \hat{\beta}_{jn}^{L} \big)^2 \Big] \\
    &\quad + 2 \lim_{n\to\infty} n \sum_j E\Big[ \big( 1 - a W_n^{-1} \big) I(W_n < a)\, \hat{\beta}_{jn}^{L} \big( \hat{\beta}_{jn}^{L} - \beta_j \big) \Big] \\
    &< 0,
\end{aligned}
\]
since for $W_n < a$ we have $1 - a W_n^{-1} < 0$, and the expected value of a positive random variable is always positive. Hence the positive part of the SL has uniformly smaller $L_2$-risk compared to the SL.
Following Baranchik (1970), we also investigate the performance of the following alternative candidates:
\[
\hat{\beta}_n^{SL2} = \Big( \Big( 1 - \frac{a}{W_n + 1} \Big) \hat{\beta}_{jn}^{L} \,\big|\, j = 1, \dots, p \Big)^\top \tag{2.9}
\]
and
\[
\hat{\beta}_n^{SL3} = \Big( \Big( 1 - \frac{a\, r(W_n)}{W_n} \Big) \hat{\beta}_{jn}^{L} \,\big|\, j = 1, \dots, p \Big)^\top, \tag{2.10}
\]
where $r(x)$ is a concave function of $x$, e.g., $r(x) = \sqrt{x}$ or $r(x) = \log|x|$. These two latter estimators are considered only in the real example. A small R sketch of these constructions is given below.
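The following R sketch (our illustration under the definitions above; beta_L, X and sigma2_hat are assumed to come from a previous LASSO fit, e.g., the glmnet sketch earlier) computes all four double-shrunken estimators:

```r
# Sketch of the double-shrunken estimators (2.7)-(2.10); our illustration.
# beta_L: LASSO coefficient vector, X: n x p design matrix,
# sigma2_hat: a consistent estimate of sigma^2 (all assumed available).
double_shrink <- function(beta_L, X, sigma2_hat, r = sqrt) {
  n <- nrow(X); p <- ncol(X)
  a  <- (n - p) * (p - 2) / (n - p + 2)
  Wn <- drop(t(beta_L) %*% crossprod(X) %*% beta_L) / sigma2_hat  # W_n statistic
  list(
    SL   = (1 - a / Wn) * beta_L,               # (2.7)
    PRSL = max(0, 1 - a / Wn) * beta_L,         # (2.8): positive-rule version
    SL2  = (1 - a / (Wn + 1)) * beta_L,         # (2.9)
    SL3  = (1 - a * r(Wn) / Wn) * beta_L        # (2.10) with r concave, e.g. sqrt
  )
}
```

Note that all four estimators apply one common scalar factor to every LASSO coefficient, so the LASSO's zeros are preserved (up to the sign issue discussed above, which the positive-rule version removes).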
In the forthcoming section, we investigate the performance of the PRSL estimator compared to the LASSO via a Monte Carlo simulation.

3 Simulation
In this section we conduct a Monte Carlo simulation study to evaluate the performance of the PRSL with respect to the LASSO of Tibshirani (1996).

We generate the vector of responses from the following model:
\[
Y_i = \beta_1 x_{1i} + \cdots + \beta_p x_{pi} + \epsilon_i, \qquad i = 1, \dots, n, \tag{3.11}
\]
where $E(\epsilon_i \mid x_i) = 0$ and $E(\epsilon_i^2) = 1$. The predictors $x_{ij}$ and errors $\epsilon_i$ are generated from $N(0, 1)$. We consider sample sizes $n \in \{50, 100\}$ and numbers of predictor variables $p \in \{10, 20, 30\}$. Following a scheme similar to that of Hansen (2007, 2016), the regression coefficients are set to $\beta_j = c \sqrt{2\alpha}\, j^{-\alpha - 1/2}$ for $j = 1, \dots, p$, with $\alpha = 0.1, 0.5, 1$. Larger values of $\alpha$ indicate that the coefficients $\beta_j$ decline more quickly with $j$. The value of $c$ controls the population $R^2 = c^2/(1 + c^2)$ and is selected on a 20-point grid so that $R^2$ ranges over $[0, 0.8]$.

The number of simulation runs was varied initially; in the end, each configuration was replicated 1000 times, which was sufficient to obtain stable results. For each realization, we calculated the MSE of the suggested estimators. All computations were conducted using the software R.
The performance of an estimator $\hat{\beta}_n^{*}$ is evaluated by the MSE criterion, scaled by the MSE of the LASSO, so that the relative MSE (RMSE) is given by
\[
\mathrm{RMSE}\big( \hat{\beta}_n^{*} \big) = \frac{\mathrm{MSE}\big( \hat{\beta}_n^{*} \big)}{\mathrm{MSE}\big( \hat{\beta}_n^{L} \big)}. \tag{3.12}
\]
An RMSE smaller than one indicates performance superior to the LASSO.
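A short R sketch of the coefficient design (our illustration; the $\beta_j$ formula is as stated above, and $c$ is obtained by inverting $R^2 = c^2/(1+c^2)$):

```r
# Sketch of the coefficient design in Section 3 (our illustration):
# beta_j = c * sqrt(2 * alpha) * j^(-alpha - 1/2), with c obtained by
# inverting the population R^2 = c^2 / (1 + c^2).
make_beta <- function(p, alpha, R2) {
  c_val <- sqrt(R2 / (1 - R2))
  c_val * sqrt(2 * alpha) * (1:p)^(-alpha - 1/2)
}

R2_grid <- seq(0, 0.8, length.out = 20)            # 20-point grid for R^2
beta    <- make_beta(p = 10, alpha = 0.5, R2 = R2_grid[10])
```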
The results are reported graphically in Figure 1 for ease of comparison. The figure has six panels corresponding to the combinations of $n = 50, 100$ and $p = 10, 20, 30$; each panel presents the RMSE values of (3.12), for the three values of $\alpha$, as a function of the population $R^2$. Clear trends are visible. For example, in Figure 1(b), as $R^2$ varies from 0 to 0.1, the PRSL with $\alpha = 0.1$ has the smallest RMSE, indicating that it performs better than the LASSO, followed by the PRSL with $\alpha = 0.5$ and $\alpha = 1$. On the other hand, for intermediate values of $R^2$, the PRSL is less efficient than the LASSO. In summary, the PRSL is more efficient than the LASSO for small values of the population $R^2$; it loses this efficiency as $R^2$ increases, and the relative performance of all estimators becomes almost identical as $R^2$ approaches 0.8.

[Figure 1 here: The RMSEs of the suggested estimators for different values of $\alpha$ when $R^2 \in [0, 0.8]$. Panels: (a) $n = 50, p = 10$; (b) $n = 50, p = 20$; (c) $n = 50, p = 30$; (d) $n = 100, p = 10$; (e) $n = 100, p = 20$; (f) $n = 100, p = 30$. Lines: LASSO, PRSL($\alpha = 0.1$), PRSL($\alpha = 0.5$), PRSL($\alpha = 1$).]
4 Prostate Data
The prostate data come from the study of Stamey et al. (1989) on the correlation between the level of prostate-specific antigen (PSA) and a number of clinical measures in men who were about to receive a radical prostatectomy. The data consist of 97 measurements on the following variables: log cancer volume (lcavol), log prostate weight (lweight), age (age), log of benign prostatic hyperplasia amount (lbph), log of capsular penetration (lcp), seminal vesicle invasion (svi), Gleason score (gleason), and percent of Gleason scores 4 or 5 (pgg45). The aim is to predict the log of PSA (lpsa) from these measured variables. A description of the variables in this data set is given in Table 1.
Table 1: Description of the variables of prostate data

Variable   Description                                   Remarks
lpsa       Log of prostate specific antigen (PSA)        Response
lcavol     Log cancer volume
lweight    Log prostate weight
age        Age                                           Age in years
lbph       Log of benign prostatic hyperplasia amount
svi        Seminal vesicle invasion
lcp        Log of capsular penetration
gleason    Gleason score                                 A numeric vector
pgg45      Percent of Gleason scores 4 or 5
Table 2: Estimated coefficients of the variables of prostate data

           LASSO   PRSL    SL2     SL3 (r(x) = √x)   SL3 (r(x) = log|x|)
coef       2.478   2.294   2.303   0.852             1.691
lcavol     0.472   0.437   0.438   0.162             0.322
lweight    0.186   0.173   0.173   0.064             0.127
age        0.000   0.000   0.000   0.000             0.000
lbph       0.000   0.000   0.000   0.000             0.000
svi        0.368   0.340   0.342   0.126             0.251
lcp        0.000   0.000   0.000   0.000             0.000
gleason    0.000   0.000   0.000   0.000             0.000
pgg45      0.000   0.000   0.000   0.000             0.000
PE         3.316   2.533   2.540   2.338             1.112
RPE        1.000   0.764   0.766   0.705             0.335
Our results are based on 1000 case-resampled bootstrap samples; since there was no noticeable variation for larger numbers of replications, we did not consider further values. The performance of an estimator is evaluated by its prediction error (PE) via 10-fold cross-validation (CV) for each bootstrap replicate. For ease of comparison, we also calculated the relative prediction error (RPE) of an estimator with respect to the PE of the LASSO; if the RPE of an estimator is less than one, its performance is superior to the LASSO. In Table 2, we report both the coefficient estimates and the PEs of the five methods. According to these results, all suggested estimators outperform the LASSO. A schematic version of this evaluation loop is sketched below.

Figure 2 shows the coefficient estimates as a function of the standardized bound $s = |\beta| / \max|\beta|$. The vertical line represents the model for $\hat{s} = 0.44$, the optimal value selected by the "one standard error" rule with 10-fold CV, in which we choose the most parsimonious model whose error is no more than one standard error above the error of the best model. All methods give non-zero coefficients to lcavol, lweight and svi. Also, Figure 3 shows box plots of the 1000 bootstrap values of the coefficient estimates of each method with $\hat{s} = 0.44$; the results are consistent with Tibshirani (1996).
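A schematic R version of the bootstrap evaluation (our illustration; `prostate` stands for the data frame of Table 1, and cv_pe() is a hypothetical helper returning the 10-fold CV prediction error of one fitting method on a given data set):

```r
# Schematic case-resampled bootstrap evaluation; our illustration.
# `prostate` is assumed to hold the data of Table 1; cv_pe() is a
# hypothetical helper returning the 10-fold CV prediction error of one
# fitting method on a given data set.
set.seed(2017)
B  <- 1000
pe <- matrix(NA_real_, B, 2, dimnames = list(NULL, c("LASSO", "PRSL")))

for (b in 1:B) {
  idx  <- sample(nrow(prostate), replace = TRUE)   # case resampling
  boot <- prostate[idx, ]
  pe[b, "LASSO"] <- cv_pe(boot, method = "LASSO", nfolds = 10)
  pe[b, "PRSL"]  <- cv_pe(boot, method = "PRSL",  nfolds = 10)
}

rpe <- colMeans(pe) / mean(pe[, "LASSO"])          # RPE relative to the LASSO
```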
[Figure 2 here: The coefficient estimates versus the tuning parameter $s$ for each method (LASSO, PRSL, SL2, SL3 with $r(x) = \sqrt{x}$, SL3 with $r(x) = \log|x|$). Here $s$ is selected via 10-fold CV; the vertical line $\hat{s} = 0.44$ is selected by the "one standard error" rule.]

[Figure 3 here: Box plots of 1000 bootstrap values of the coefficient estimates of each method for the eight predictors in the prostate cancer example.]
5 Conclusions
In this paper, we employed the shrinkage idea of Stein (1981) to shrink the LASSO of Tibshirani (1996) further. Under this concept of double shrinking, we proposed a double-shrinkage estimator, namely the Stein-type LASSO. Several similar double-shrinkage estimators, including the positive part of the Stein-type LASSO, were also proposed as alternatives. The performance of the proposed estimators was investigated through a Monte Carlo simulation as well as a real data analysis. The new set of estimators offers smaller $L_2$-risk compared to the LASSO. Moreover, the prostate cancer data analysis illustrated that the Stein-type LASSO estimators have smaller prediction error than the LASSO.

Regarding the function $g(\cdot)$ in (2.3), the numerical analysis illustrated that concave and differentiable functions behave best. Further, our proposal also works for the minimizer of (2.2) for all values $\gamma > 0$, including the ridge regression estimator and the subset selector; hence, the proposed methodology can be applied to other estimators. Apart from this, there are many competitors to the LASSO in the context of variable selection; we focused on the LASSO only for the purpose of introducing the double-shrinking idea. For further research, one can use this method to define double-shrunken estimators other than the Stein-type LASSO, such as Stein-type SCAD (Fan and Li, 2001), adaptive LASSO (Zou, 2006), MCP (Zhang, 2010) and group LASSO (Yuan and Lin, 2006) estimators.
Acknowledgments

We would like to thank two anonymous referees for their valuable and constructive comments, which significantly improved the presentation of the paper and led us to add many details.
References

Ahmed, S.E. (2014). Penalty, Shrinkage and Pretest Strategies: Variable Selection and Estimation. Springer, New York.

Ahmed, S.E. (2001). Shrinkage estimation of regression coefficients from censored data with multiple observations. In Ahmed, S.E. and Reid, N. (Eds.), Lecture Notes in Statistics, 148, 103-120. Springer-Verlag, New York.

Ahmed, S.E., Kareev, I., Suraphee, S., Volodin, A., and Volodin, I. (2015). Confidence sets based on the positive part James-Stein estimator with the asymptotically constant coverage probability. J. Statist. Comp. Sim., 85(12), 2506-2513.

Ahmed, S.E. and Fallahpour, S. (2012). Shrinkage estimation strategy in quasi-likelihood models. Statist. Probab. Lett., 82, 2170-2179.

Ahmed, S.E., Doksum, K.A., Hossain, S. and You, J. (2007). Shrinkage, pretest and absolute penalty estimators in partially linear models. Aust. New Zealand J. Statist., 49(4), 435-454.

Asar, Y. (2017). Some new methods to solve multicollinearity in logistic regression. Comm. Statist. Sim. Comp., 46, 2576-2586.

Ali, A.M., and Saleh, A.K.Md. Ehsanes (1990). Estimation of the mean vector of a multivariate normal distribution under symmetry. J. Statist. Comp. Sim., 35(3-4), 209-226.

Baranchik, A.J. (1970). A family of minimax estimators of the mean of a multivariate normal distribution. Ann. Math. Statist., 41(2), 642-645.

Chang, S.-M. (2015). Double shrinkage estimators for large sparse covariance matrices. J. Statist. Comp. Sim., 85(8), 1497-1511.

Fan, J., and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc., 96(456), 1348-1360.

Groß, J. (2012). Linear Regression. Springer Science & Business Media.

Hansen, B.E. (2007). Least squares model averaging. Econometrica, 75(4), 1175-1189.

Hansen, B.E. (2016). The risk of James-Stein and Lasso shrinkage. Econometric Rev., 35, 1456-1470.

James, W. and Stein, C. (1961). Estimation with quadratic loss. Proc. Fourth Berkeley Symp. Math. Statist. Prob., 1, 361-379.

Knight, K. and Fu, W. (2000). Asymptotics for LASSO-type estimators. Ann. Statist., 28(5), 1356-1378.

Liu, J.S. (1994). Siegel's formula via Stein's identities. Statist. Prob. Lett., 21, 247-251.

Roozbeh, M. and Arashi, M. (2016). Shrinkage ridge regression in partial linear models. Comm. Statist. Sim. Comp., 45(20), 6022-6044.

Saleh, A.K.Md. Ehsanes (2006). Theory of Preliminary Test and Stein-Type Estimation with Applications. Wiley, United States of America.

Saleh, A.K.Md. Ehsanes and Raheem, E. (2015a). Improved LASSO. arXiv:1503.05160v1, 1-46.

Saleh, A.K.Md. Ehsanes and Raheem, E. (2015b). Penalty, shrinkage, and preliminary test estimators under full model hypothesis. arXiv:1503.06910, 1-28.

Stamey, T.A., Kabalin, J.N., McNeal, J.E., Johnstone, I.M., Freiha, F., Redwine, E.A. and Yang, N. (1989). Prostate specific antigen in the diagnosis and treatment of adenocarcinoma of the prostate: II. Radical prostatectomy treated patients. J. Urology, 141(5), 1076-1083.

Stein, C. (1981). Estimation of the mean of a multivariate normal distribution. Ann. Statist., 9, 1135-1151.

Tibshirani, R. (1996). Regression shrinkage and selection via the LASSO. J. R. Stat. Soc. Ser. B. Stat. Methodol., 58(1), 267-288.

Yüzbaşı, B., Ahmed, S.E. and Gungor, M. (2017). Improved penalty strategies in linear regression models. REVSTAT-Statist. J., 15(2), 251-276.

Yüzbaşı, B., and Ahmed, S.E. (2016). Shrinkage and penalized estimation in semi-parametric models with multicollinear data. J. Stat. Comput. Simul., 86(17), 3543-3561.

Yuan, M., and Lin, Y. (2006). Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. Ser. B. Stat. Methodol., 68(1), 49-67.

Zhang, C.H. (2010). Nearly unbiased variable selection under minimax concave penalty. Ann. Statist., 38(2), 894-942.

Zou, H. (2006). The adaptive lasso and its oracle properties. J. Amer. Statist. Assoc., 101(476), 1418-1429.