Functions to find the optimal ridge and fusion penalty parameters via cross-validation. The functions support leave-one-out cross-validation (LOOCV), \(k\)-fold CV, and two forms of approximate LOOCV. Depending on the function used, either general numerical optimization or a grid-based search is employed.
optPenalty.fused.grid(
  Ylist,
  Tlist,
  lambdas = 10^seq(-5, 5, length.out = 15),
  lambdaFs = lambdas,
  cv.method = c("LOOCV", "aLOOCV", "sLOOCV", "kCV"),
  k = 10,
  verbose = TRUE,
  ...
)
optPenalty.fused.auto(
  Ylist,
  Tlist,
  lambda,
  cv.method = c("LOOCV", "aLOOCV", "sLOOCV", "kCV"),
  k = 10,
  verbose = TRUE,
  lambda.init,
  maxit.ridgeP.fused = 1000,
  optimizer = "optim",
  maxit.optimizer = 1000,
  debug = FALSE,
  optim.control = list(trace = verbose, maxit = maxit.optimizer),
  ...
)
optPenalty.fused(
  Ylist,
  Tlist,
  lambda = default.penalty(Ylist),
  cv.method = c("LOOCV", "aLOOCV", "sLOOCV", "kCV"),
  k = 10,
  grid = FALSE,
  ...
)
Ylist: A list of \(G\) data matrices with \(n_g\) samples in the rows and \(p\) variables in the columns, corresponding to \(G\) classes of data.

Tlist: A list of \(G\) positive definite class target matrices of size \(p\) times \(p\).

lambdas: A numeric vector of positive ridge penalties.

lambdaFs: A numeric vector of non-negative fusion penalties.
cv.method: character giving the cross-validation (CV) method to use. The allowed values are "LOOCV", "aLOOCV", "sLOOCV", and "kCV" for leave-one-out cross-validation (LOOCV), approximate LOOCV, special LOOCV, and \(k\)-fold CV, respectively.

k: integer giving the number of approximately equally sized parts each class is partitioned into for \(k\)-fold CV. Only used if cv.method is "kCV".
verbose: logical. If TRUE, progress information is printed to the console.

...: For optPenalty.fused, arguments are passed to optPenalty.fused.grid or optPenalty.fused.auto depending on the value of grid. In optPenalty.fused.grid, arguments are passed to ridgeP.fused. In optPenalty.fused.auto, arguments are passed to the optimizer.
lambda: A symmetric character matrix encoding the class of penalty matrices to cross-validate over. The diagonal elements correspond to the class-specific ridge penalties whereas the off-diagonal elements correspond to the fusion penalties. The unique elements of lambda specify the penalties to be determined by the method given in cv.method. Penalties can be fixed if they are coercible to numeric values, such as "0", "2.71", or "3.14". Fusion between pairs can be left out using any of "", NA, "NA", or "0"; see the sketch following this argument list. See default.penalty for help on the construction hereof and for more details. Unused and can be omitted if grid == TRUE.
lambda.init: A numeric penalty matrix of initial values passed to the optimizer. If omitted, the function selects starting values using a common ridge penalty (determined by 1D optimization) and sets all fusion penalties to zero.
maxit.ridgeP.fused: An integer giving the maximum number of iterations allowed for each fused ridge fit.

optimizer: character. Either "optim" or "nlm", determining which optimizer to use.

maxit.optimizer: An integer giving the maximum number of iterations allowed in the optimization procedure.
debug: logical. If TRUE, additional output from the optimizer is appended to the output as an attribute.

optim.control: A list of control arguments for optim.

grid: logical. Should a grid-based search be used? Default is FALSE.
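As an illustration of the lambda argument, consider the following sketch of a penalty matrix for three classes (hypothetical; the entry names ridgeA, ridgeB, and fusAB are arbitrary placeholders):

# Hypothetical penalty matrix for G = 3 classes; must be symmetric.
lambda <- matrix(c("ridgeA", "fusAB",  "",
                   "fusAB",  "ridgeB", "2.71",
                   "",       "2.71",   "ridgeB"),
                 3, 3, byrow = TRUE)
# ridgeA and ridgeB are CV-determined ridge penalties (classes 2 and 3
# share ridgeB), fusAB is a CV-determined fusion penalty, "2.71" fixes
# the fusion between classes 2 and 3, and "" leaves out the fusion
# between classes 1 and 3.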
optPenalty.fused.auto returns a list:

Plist: A list of the precision estimates for the optimal parameters.

lambda: The estimated optimal fused penalty matrix.

lambda.unique: The unique entries of lambda; a more concise overview of lambda.

value: The value of the loss function in the estimated optimum.
optPenalty.fused.grid returns a list:

lambda: A numeric vector of grid values for the ridge penalty.

lambdaF: A numeric vector of grid values for the fusion penalty.

fcvl: A numeric matrix of evaluations of the loss function.
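For instance, components of the returned lists can be accessed as in the following sketch (assuming fitted objects opt.auto and opt.grid from the respective functions, with the component names listed above):

# From optPenalty.fused.auto:
opt.auto$lambda         # estimated optimal fused penalty matrix
opt.auto$lambda.unique  # only the unique, CV-determined penalties
# From optPenalty.fused.grid: locate the grid point with minimal loss
idx <- which.min(opt.grid$fcvl)
expand.grid(lambda = opt.grid$lambda, lambdaF = opt.grid$lambdaF)[idx, ]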
optPenalty.fused.auto utilizes optim for identifying the optimal fused parameters and works for general classes of penalty graphs.

optPenalty.fused.grid gives a grid-based evaluation of the (approximate) LOOCV loss.
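The wrapper optPenalty.fused dispatches to one of these two work-horses via its grid argument; a minimal sketch of the two invocation styles (assuming Ylist and Tlist as in the examples below):

# grid = FALSE (default): numerical optimization via optPenalty.fused.auto
optPenalty.fused(Ylist, Tlist, cv.method = "LOOCV", grid = FALSE)
# grid = TRUE: grid search via optPenalty.fused.grid; extra arguments
# such as lambdas are forwarded through ...
optPenalty.fused(Ylist, Tlist, cv.method = "LOOCV", grid = TRUE,
                 lambdas = 10^seq(-3, 3, length.out = 7))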
Bilgrau, A.E., Peeters, C.F.W., Eriksen, P.S., Boegsted, M., and van Wieringen, W.N. (2020). Targeted Fused Ridge Estimation of Inverse Covariance Matrices from Multiple High-Dimensional Data Classes. Journal of Machine Learning Research, 21(26): 1-52.
See also default.penalty, optPenalty.LOOCV.
if (FALSE) {
# Generate some (not so) high-dimensional data with (not so) many samples
ns <- c(4, 5, 6)
Ylist <- createS(n = ns, p = 6, dataset = TRUE)
Slist <- lapply(Ylist, covML)
Tlist <- default.target.fused(Slist, ns, type = "DIAES")
# Grid-based
lambdas <- 10^seq(-5, 3, length.out = 7)
a <- optPenalty.fused.grid(Ylist, Tlist,
lambdas = lambdas,
cv.method = "LOOCV", maxit = 1000)
b <- optPenalty.fused.grid(Ylist, Tlist,
lambdas = lambdas,
cv.method = "aLOOCV", maxit = 1000)
c <- optPenalty.fused.grid(Ylist, Tlist,
lambdas = lambdas,
cv.method = "sLOOCV", maxit = 1000)
d <- optPenalty.fused.grid(Ylist, Tlist,
lambdas = lambdas,
cv.method = "kCV", k = 2, maxit = 1000)
# Numerical optimization (uses the default "optim" optimizer with method "BFGS")
aa <- optPenalty.fused.auto(Ylist, Tlist, cv.method = "LOOCV", method = "BFGS")
print(aa)
bb <- optPenalty.fused.auto(Ylist, Tlist, cv.method = "aLOOCV", method = "BFGS")
print(bb)
cc <- optPenalty.fused.auto(Ylist, Tlist, cv.method = "sLOOCV", method = "BFGS")
print(cc)
dd <- optPenalty.fused.auto(Ylist, Tlist, cv.method = "kCV", k = 3, method = "BFGS")
print(dd)
#
# Plot the results
#
# LOOCV
# Get minima and plot
amin <- log(expand.grid(a$lambda, a$lambdaF))[which.min(a$fcvl), ]
aamin <- c(log(aa$lambda[1,1]), log(aa$lambda[1,2]))
# Plot
filled.contour(log(a$lambda), log(a$lambdaF), log(a$fcvl), color = heat.colors,
plot.axes = {points(amin[1], amin[2], pch = 16);
points(aamin[1], aamin[2], pch = 16, col = "purple");
axis(1); axis(2)},
xlab = "lambda", ylab = "lambdaF", main = "LOOCV")
# Approximate LOOCV
# Get minima and plot
bmin <- log(expand.grid(b$lambda, b$lambdaF))[which.min(b$fcvl), ]
bbmin <- c(log(bb$lambda[1,1]), log(unique(bb$lambda[1,2])))
filled.contour(log(b$lambda), log(b$lambdaF), log(b$fcvl), color = heat.colors,
plot.axes = {points(bmin[1], bmin[2], pch = 16);
points(bbmin[1], bbmin[2], pch = 16, col ="purple");
axis(1); axis(2)},
xlab = "lambda", ylab = "lambdaF", main = "Approximate LOOCV")
#
# Arbitrary penalty graphs
#
# Generate some new high-dimensional data and a 2 by 2 factorial design
ns <- c(6, 5, 3, 2)
df <- expand.grid(Factor1 = LETTERS[1:2], Factor2 = letters[3:4])
Ylist <- createS(n = ns, p = 4, dataset = TRUE)
Tlist <- lapply(lapply(Ylist, covML), default.target, type = "Null")
# Construct penalty matrix
lambda <- default.penalty(df, type = "CartesianUnequal")
# Find optimal parameters
# using optim with method "Nelder-Mead" and "special" LOOCV
ans1 <- optPenalty.fused(Ylist, Tlist, lambda = lambda,
cv.method = "sLOOCV", verbose = FALSE)
print(ans1$lambda.unique)
# By approximate LOOCV using optim with method "BFGS"
ans2 <- optPenalty.fused(Ylist, Tlist, lambda = lambda,
cv.method = "aLOOCV", verbose = FALSE,
method = "BFGS")
print(ans2$lambda.unique)
# By LOOCV using nlm
lambda.init <- matrix(1, 4, 4)
lambda.init[cbind(1:4,4:1)] <- 0
ans3 <- optPenalty.fused(Ylist, Tlist, lambda = lambda,
lambda.init = lambda.init,
cv.method = "LOOCV", verbose = FALSE,
optimizer = "nlm")
print(ans3$lambda.unique)
# Quite different results!
#
# Arbitrary penalty graphs with fixed penalties!
#
# Generate some new high-dimensional data and a 2 by 2 factorial design
ns <- c(6, 5, 5, 5)
df <- expand.grid(DS = LETTERS[1:2], ER = letters[3:4])
Ylist <- createS(n = ns, p = 4, dataset = TRUE)
Tlist <- lapply(lapply(Ylist, covML), default.target, type = "Null")
lambda <- default.penalty(df, type = "Tensor")
print(lambda) # Say we want to fix the penalty of the pair (1,2) at strength 2.1:
lambda[2,1] <- lambda[1,2] <- 2.1
print(lambda)
# Specifying starting values is also possible:
init <- diag(length(ns))
init[2,1] <- init[1,2] <- 2.1
res <- optPenalty.fused(Ylist, Tlist, lambda = lambda, lambda.init = init,
cv.method = "aLOOCV", optimizer = "nlm")
print(res)
}