Function that performs an 'automatic' search for the optimal penalty parameter for the ridgeP call by employing Brent's method in the calculation of a cross-validated negative log-likelihood score.
optPenalty.kCVauto(
Y,
lambdaMin,
lambdaMax,
lambdaInit = (lambdaMin + lambdaMax)/2,
fold = nrow(Y),
cor = FALSE,
target = default.target(covML(Y)),
type = "Alt"
)
Y           Data matrix. Variables assumed to be represented by columns.
lambdaMin   A numeric giving the minimum value for the penalty parameter.
lambdaMax   A numeric giving the maximum value for the penalty parameter.
lambdaInit  A numeric giving the initial (starting) value for the penalty parameter.
fold        A numeric or integer specifying the number of folds to apply in the cross-validation.
cor         A logical indicating if the evaluation of the LOOCV score should be performed on the correlation scale.
target      A target matrix (in precision terms) for Type I ridge estimators.
type        A character indicating the type of ridge estimator to be used. Must be one of: "Alt", "ArchI", "ArchII".
An object of class list:
optLambda  A numeric giving the optimal value for the penalty parameter.
optPrec    A matrix representing the precision matrix of the chosen type (see ridgeP) under the optimal value of the penalty parameter.
The function determines the optimal value of the penalty parameter by applying the Brent algorithm (1971) to the \(K\)-fold cross-validated negative log-likelihood score (using a regularized ridge estimator for the precision matrix). The search is automatic in the sense that only a minimum value, a maximum value, and a starting value for the penalty parameter need to be specified in order to invoke the root-finding abilities of the Brent method. The value at which the \(K\)-fold cross-validated negative log-likelihood score is minimized is deemed optimal. The function employs the Brent algorithm as implemented in the optim function.
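The mechanics can be illustrated with a minimal, self-contained sketch. This is not the rags2ridges implementation: the simple estimator solve(S + lambda * I) stands in for ridgeP (so the lambda scale differs), the score is the Gaussian negative log-likelihood up to an additive constant, and the helper kcvNLL is hypothetical.

```r
## Minimal sketch: K-fold cross-validated negative log-likelihood for a
## simplified ridge precision estimator, minimized over lambda with
## optim's Brent method. NOT the rags2ridges implementation.
set.seed(1)
n <- 20; p <- 5
X <- matrix(rnorm(n * p), n, p)

kcvNLL <- function(lambda, X, fold = 5) {
  ## Randomly partition the rows into 'fold' groups
  idx <- split(sample(seq_len(nrow(X))), rep_len(seq_len(fold), nrow(X)))
  scores <- vapply(idx, function(test) {
    Xtrain <- X[-test, , drop = FALSE]
    Xtest  <- X[test, , drop = FALSE]
    m      <- colMeans(Xtrain)
    Strain <- crossprod(sweep(Xtrain, 2, m)) / nrow(Xtrain)
    Stest  <- crossprod(sweep(Xtest, 2, m)) / nrow(Xtest)
    P <- solve(Strain + lambda * diag(ncol(X)))  # simplified ridge precision
    ## Negative log-likelihood of the test fold, up to an additive constant
    sum(diag(Stest %*% P)) - as.numeric(determinant(P)$modulus)
  }, numeric(1))
  mean(scores)
}

## Brent's method needs only the lower and upper bounds (and a start value)
opt <- optim(1, kcvNLL, X = X, method = "Brent", lower = 1e-3, upper = 30)
opt$par  # CV-optimal lambda on this simplified score
```

The same pattern underlies the automatic search: a scalar objective in lambda, bracketed by lambdaMin and lambdaMax, handed to a derivative-free one-dimensional optimizer.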
When cor = TRUE correlation matrices are used in the computation of the (cross-validated) negative log-likelihood score, i.e., the \(K\)-fold sample covariance matrix is a matrix on the correlation scale. When performing evaluation on the correlation scale the data are assumed to be standardized. If cor = TRUE and one wishes to use the default target specification one may consider using target = default.target(covML(Y, cor = TRUE)). This gives a default target under the assumption of standardized data.
Under the default setting of the fold-argument, fold = nrow(Y), one performs leave-one-out cross-validation.
Brent, R.P. (1971). An Algorithm with Guaranteed Convergence for Finding a Zero of a Function. Computer Journal 14: 422-425.
## Obtain some (high-dimensional) data
p = 25
n = 10
set.seed(333)
X = matrix(rnorm(n*p), nrow = n, ncol = p)
colnames(X)[1:25] = letters[1:25]
## Obtain regularized precision under optimal penalty using K = n
OPT <- optPenalty.kCVauto(X, lambdaMin = .001, lambdaMax = 30); OPT
#> $optLambda
#> [1] 13.42467
#>
#> $optPrec
#> A 25 x 25 ridge precision matrix estimate with lambda = 13.424670
#> a b c d e f …
#> a 0.755139352 -0.003028252 -0.04933164 -0.002421719 0.013977632 0.001114962 …
#> b -0.003028252 0.806519739 -0.01581207 -0.006536546 -0.024427366 0.005763776 …
#> c -0.049331636 -0.015812067 0.75928308 -0.030454558 0.020626619 0.012856311 …
#> d -0.002421719 -0.006536546 -0.03045456 0.801308975 0.020062895 0.001788722 …
#> e 0.013977632 -0.024427366 0.02062662 0.020062895 0.767347416 -0.005726924 …
#> f 0.001114962 0.005763776 0.01285631 0.001788722 -0.005726924 0.811360616 …
#> … 19 more rows and 19 more columns
#>
OPT$optLambda # Optimal penalty
#> [1] 13.42467
OPT$optPrec # Regularized precision under optimal penalty
#> A 25 x 25 ridge precision matrix estimate with lambda = 13.424670
#> a b c d e f …
#> a 0.755139352 -0.003028252 -0.04933164 -0.002421719 0.013977632 0.001114962 …
#> b -0.003028252 0.806519739 -0.01581207 -0.006536546 -0.024427366 0.005763776 …
#> c -0.049331636 -0.015812067 0.75928308 -0.030454558 0.020626619 0.012856311 …
#> d -0.002421719 -0.006536546 -0.03045456 0.801308975 0.020062895 0.001788722 …
#> e 0.013977632 -0.024427366 0.02062662 0.020062895 0.767347416 -0.005726924 …
#> f 0.001114962 0.005763776 0.01285631 0.001788722 -0.005726924 0.811360616 …
#> … 19 more rows and 19 more columns
## Another example with standardized data
X <- scale(X, center = TRUE, scale = TRUE)
OPT <- optPenalty.kCVauto(X, lambdaMin = .001, lambdaMax = 30, cor = TRUE,
target = default.target(covML(X, cor = TRUE))); OPT
#> $optLambda
#> [1] 1.706836
#>
#> $optPrec
#> A 25 x 25 ridge precision matrix estimate with lambda = 1.706836
#> a b c d e f …
#> a 0.873376735 0.005931253 -0.10948686 0.02092830 0.02758587 -0.03107073 …
#> b 0.005931253 0.861084085 -0.05863234 -0.06126328 -0.11026791 0.01354590 …
#> c -0.109486857 -0.058632335 0.92644719 -0.10928825 0.04836945 0.03566900 …
#> d 0.020928298 -0.061263276 -0.10928825 0.87328836 0.05739944 -0.00832902 …
#> e 0.027585872 -0.110267905 0.04836945 0.05739944 0.90730968 -0.04183658 …
#> f -0.031070733 0.013545902 0.03566900 -0.00832902 -0.04183658 0.86457495 …
#> … 19 more rows and 19 more columns
#>
OPT$optLambda # Optimal penalty
#> [1] 1.706836
OPT$optPrec # Regularized precision under optimal penalty
#> A 25 x 25 ridge precision matrix estimate with lambda = 1.706836
#> a b c d e f …
#> a 0.873376735 0.005931253 -0.10948686 0.02092830 0.02758587 -0.03107073 …
#> b 0.005931253 0.861084085 -0.05863234 -0.06126328 -0.11026791 0.01354590 …
#> c -0.109486857 -0.058632335 0.92644719 -0.10928825 0.04836945 0.03566900 …
#> d 0.020928298 -0.061263276 -0.10928825 0.87328836 0.05739944 -0.00832902 …
#> e 0.027585872 -0.110267905 0.04836945 0.05739944 0.90730968 -0.04183658 …
#> f -0.031070733 0.013545902 0.03566900 -0.00832902 -0.04183658 0.86457495 …
#> … 19 more rows and 19 more columns
## Another example using K = 5
OPT <- optPenalty.kCVauto(X, lambdaMin = .001, lambdaMax = 30, fold = 5); OPT
#> $optLambda
#> [1] 14.40738
#>
#> $optPrec
#> A 25 x 25 ridge precision matrix estimate with lambda = 14.407379
#> a b c d e f …
#> a 0.691478602 -0.003227029 -0.03222970 -0.002217116 0.009474089 0.001236740 …
#> b -0.003227029 0.691403158 -0.01655489 -0.009966873 -0.027470081 0.010617198 …
#> c -0.032229702 -0.016554888 0.69242538 -0.029799763 0.014524643 0.014598928 …
#> d -0.002217116 -0.009966873 -0.02979976 0.691409558 0.020899754 0.002791406 …
#> e 0.009474089 -0.027470081 0.01452464 0.020899754 0.691973904 -0.006678442 …
#> f 0.001236740 0.010617198 0.01459893 0.002791406 -0.006678442 0.691322457 …
#> … 19 more rows and 19 more columns
#>
OPT$optLambda # Optimal penalty
#> [1] 14.40738
OPT$optPrec # Regularized precision under optimal penalty
#> A 25 x 25 ridge precision matrix estimate with lambda = 14.407379
#> a b c d e f …
#> a 0.691478602 -0.003227029 -0.03222970 -0.002217116 0.009474089 0.001236740 …
#> b -0.003227029 0.691403158 -0.01655489 -0.009966873 -0.027470081 0.010617198 …
#> c -0.032229702 -0.016554888 0.69242538 -0.029799763 0.014524643 0.014598928 …
#> d -0.002217116 -0.009966873 -0.02979976 0.691409558 0.020899754 0.002791406 …
#> e 0.009474089 -0.027470081 0.01452464 0.020899754 0.691973904 -0.006678442 …
#> f 0.001236740 0.010617198 0.01459893 0.002791406 -0.006678442 0.691322457 …
#> … 19 more rows and 19 more columns