Functions to find the optimal ridge and fusion penalty parameters via cross-validation. The functions support leave-one-out cross-validation (LOOCV), \(k\)-fold CV, and two forms of approximate LOOCV. Depending on the function used, either general numerical optimization or a grid-based search is employed.

optPenalty.fused.grid(
  Ylist,
  Tlist,
  lambdas = 10^seq(-5, 5, length.out = 15),
  lambdaFs = lambdas,
  cv.method = c("LOOCV", "aLOOCV", "sLOOCV", "kCV"),
  k = 10,
  verbose = TRUE,
  ...
)

optPenalty.fused.auto(
  Ylist,
  Tlist,
  lambda,
  cv.method = c("LOOCV", "aLOOCV", "sLOOCV", "kCV"),
  k = 10,
  verbose = TRUE,
  lambda.init,
  maxit.ridgeP.fused = 1000,
  optimizer = "optim",
  maxit.optimizer = 1000,
  debug = FALSE,
  optim.control = list(trace = verbose, maxit = maxit.optimizer),
  ...
)

optPenalty.fused(
  Ylist,
  Tlist,
  lambda = default.penalty(Ylist),
  cv.method = c("LOOCV", "aLOOCV", "sLOOCV", "kCV"),
  k = 10,
  grid = FALSE,
  ...
)

Arguments

Ylist

A list of \(G\) data matrices, each with \(n_g\) samples in the rows and \(p\) variables in the columns, corresponding to the \(G\) classes of data.

Tlist

A list of \(G\) positive definite (p.d.) class target matrices of size \(p\) times \(p\).

lambdas

A numeric vector of positive ridge penalties.

lambdaFs

A numeric vector of non-negative fusion penalties.

cv.method

character giving the cross-validation (CV) method to use. The allowed values are "LOOCV", "aLOOCV", "sLOOCV", and "kCV" for leave-one-out cross-validation (LOOCV), approximate LOOCV, special LOOCV, and \(k\)-fold CV, respectively.

k

integer giving the number of approximately equally sized parts each class is partitioned into for \(k\)-fold CV. Only used if cv.method is "kCV".

verbose

logical. If TRUE, progress information is printed to the console.

...

For optPenalty.fused, arguments are passed to optPenalty.fused.grid or optPenalty.fused.auto depending on the value of grid. In optPenalty.fused.grid, arguments are passed to ridgeP.fused. In optPenalty.fused.auto, arguments are passed to the optimizer.
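
For example, a minimal sketch of the pass-through mechanism, where method is forwarded to optim (as also done in the Examples below):

opt <- optPenalty.fused(Ylist, Tlist, grid = FALSE, method = "BFGS")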

lambda

A symmetric character matrix encoding the class of penalty matrices to cross-validate over. The diagonal elements correspond to the class-specific ridge penalties whereas the off-diagonal elements correspond to the fusion penalties. The unique elements of lambda specify the penalties to be determined by the method given in cv.method. Penalties can be fixed by supplying entries coercible to numeric values, e.g. "0", "2.71", or "3.14". Fusion between pairs of classes can be "left out" using any of "", NA, "NA", or "0". See default.penalty for help on constructing lambda and for more details. Unused and can be omitted if grid == TRUE.
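
As a minimal hand-built illustration of this encoding, assume \(G = 3\) classes; the labels "ridge" and "fusion" below are arbitrary placeholders, not reserved names. The diagonal shares a single ridge penalty, classes 1 and 2 share a fusion penalty to be cross-validated, and fusion involving class 3 is fixed at zero:

lambda <- matrix(c("ridge",  "fusion", "0",
                   "fusion", "ridge",  "0",
                   "0",      "0",      "ridge"),
                 nrow = 3, ncol = 3)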

lambda.init

A numeric penalty matrix of initial values passed to the optimizer. If omitted, the function selects starting values using a common ridge penalty (determined by 1D optimization) and sets all fusion penalties to zero.

maxit.ridgeP.fused

An integer giving the maximum number of iterations allowed for each fused ridge fit.

optimizer

character. Either "optim" or "nlm", determining which optimizer to use.

maxit.optimizer

An integer giving the maximum number of iterations allowed in the optimization procedure.

debug

logical. If TRUE, additional output from the optimizer is appended to the output as an attribute.

optim.control

A list of control arguments for optim.
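
A sketch of overriding the defaults; trace, maxit, and reltol are standard control entries of stats::optim, and lambda is supplied here via default.penalty, mirroring the wrapper's default:

fit <- optPenalty.fused.auto(Ylist, Tlist, lambda = default.penalty(Ylist),
                             optim.control = list(trace = 0, maxit = 500,
                                                  reltol = 1e-10))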

grid

logical. Should a grid-based search be used? Defaults to FALSE.

Value

optPenalty.fused.auto returns a list:

Plist

A list of the precision estimates for the optimal parameters.

lambda

The estimated optimal fused penalty matrix.

lambda.unique

The unique entries of lambda; a more concise overview of lambda.

value

The value of the loss function at the estimated optimum.
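
A short sketch of inspecting such a fit, assuming fit holds the returned list:

fit$lambda.unique  # concise summary of the estimated penalties
fit$value          # cross-validated loss at the optimum
length(fit$Plist)  # one precision matrix estimate per class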

optPenalty.fused.grid returns a list:

lambda

A numeric vector of the grid values for the ridge penalty.

lambdaF

A numeric vector of the grid values for the fusion penalty.

fcvl

A numeric matrix of evaluations of the loss function.
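
The grid optimum can be read off these components as in the plotting code of the Examples, e.g.:

opt <- expand.grid(lambda = fit$lambda, lambdaF = fit$lambdaF)[which.min(fit$fcvl), ]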

Details

optPenalty.fused.auto utilizes optim (or nlm) to identify the optimal fused parameters and works for general classes of penalty graphs.

optPenalty.fused.grid gives a grid-based evaluation of the (approximate) LOOCV loss.
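
A rough sketch of how the wrapper optPenalty.fused dispatches on grid (the grid below is deliberately coarse):

# Numerical optimization via optPenalty.fused.auto
res.auto <- optPenalty.fused(Ylist, Tlist, grid = FALSE)

# Grid-based search via optPenalty.fused.grid
res.grid <- optPenalty.fused(Ylist, Tlist, grid = TRUE,
                             lambdas = 10^seq(-2, 2, length.out = 5))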

References

Bilgrau, A.E., Peeters, C.F.W., Eriksen, P.S., Boegsted, M., and van Wieringen, W.N. (2020). Targeted Fused Ridge Estimation of Inverse Covariance Matrices from Multiple High-Dimensional Data Classes. Journal of Machine Learning Research, 21(26): 1-52.

See also

See also default.penalty, optPenalty.LOOCV.

Author

Anders Ellern Bilgrau, Carel F.W. Peeters <carel.peeters@wur.nl>, Wessel N. van Wieringen

Examples

if (FALSE) {
# Generate some (not so) high-dimensional data with (not so) many samples
ns <- c(4, 5, 6)
Ylist <- createS(n = ns, p = 6, dataset = TRUE)
Slist <- lapply(Ylist, covML)
Tlist <- default.target.fused(Slist, ns, type = "DIAES")


# Grid-based
lambdas <- 10^seq(-5, 3, length.out = 7)
a <- optPenalty.fused.grid(Ylist, Tlist,
                           lambdas = lambdas,
                           cv.method = "LOOCV", maxit = 1000)
b <- optPenalty.fused.grid(Ylist, Tlist,
                           lambdas = lambdas,
                           cv.method = "aLOOCV", maxit = 1000)
c <- optPenalty.fused.grid(Ylist, Tlist,
                           lambdas = lambdas,
                           cv.method = "sLOOCV", maxit = 1000)
d <- optPenalty.fused.grid(Ylist, Tlist,
                           lambdas = lambdas,
                           cv.method = "kCV", k = 2, maxit = 1000)

# Numerical optimization (uses the default "optim" optimizer with method "BFGS")
aa <- optPenalty.fused.auto(Ylist, Tlist, cv.method = "LOOCV", method = "BFGS")
print(aa)
bb <- optPenalty.fused.auto(Ylist, Tlist, cv.method = "aLOOCV", method = "BFGS")
print(bb)
cc <- optPenalty.fused.auto(Ylist, Tlist, cv.method = "sLOOCV", method = "BFGS")
print(cc)
dd <- optPenalty.fused.auto(Ylist, Tlist, cv.method = "kCV", k=3, method="BFGS")
print(dd)


#
# Plot the results
#

# LOOCV
# Get minima
amin  <- log(expand.grid(a$lambda, a$lambdaF))[which.min(a$fcvl), ]
aamin <- c(log(aa$lambda[1,1]), log(aa$lambda[1,2]))

# Plot
filled.contour(log(a$lambda), log(a$lambdaF), log(a$fcvl), color = heat.colors,
               plot.axes = {points(amin[1], amin[2], pch = 16);
                            points(aamin[1], aamin[2], pch = 16, col = "purple");
                            axis(1); axis(2)},
               xlab = "lambda", ylab = "lambdaF", main = "LOOCV")

# Approximate LOOCV
# Get minima and plot
bmin <- log(expand.grid(b$lambda, b$lambdaF))[which.min(b$fcvl), ]
bbmin <- c(log(bb$lambda[1,1]), log(unique(bb$lambda[1,2])))

filled.contour(log(b$lambda), log(b$lambdaF), log(b$fcvl), color = heat.colors,
               plot.axes = {points(bmin[1], bmin[2], pch = 16);
                            points(bbmin[1], bbmin[2], pch = 16, col ="purple");
                            axis(1); axis(2)},
               xlab = "lambda", ylab = "lambdaF", main = "Approximate LOOCV")


#
# Arbitrary penalty graphs
#

# Generate some new high-dimensional data and a 2 by 2 factorial design
ns <- c(6, 5, 3, 2)
df <- expand.grid(Factor1 = LETTERS[1:2], Factor2 = letters[3:4])
Ylist <- createS(n = ns, p = 4, dataset = TRUE)
Tlist <- lapply(lapply(Ylist, covML), default.target, type = "Null")

# Construct penalty matrix
lambda <- default.penalty(df, type = "CartesianUnequal")

# Find the optimal parameters using optim's default "Nelder-Mead" method
# and "special" LOOCV
ans1 <- optPenalty.fused(Ylist, Tlist, lambda = lambda,
                         cv.method = "sLOOCV", verbose = FALSE)
print(ans1$lambda.unique)

# By approximate LOOCV using optim with method "BFGS"
ans2 <- optPenalty.fused(Ylist, Tlist, lambda = lambda,
                         cv.method = "aLOOCV", verbose = FALSE,
                         method = "BFGS")
print(ans2$lambda.unique)

# By LOOCV using nlm
lambda.init <- matrix(1, 4, 4)
lambda.init[cbind(1:4,4:1)] <- 0
ans3 <- optPenalty.fused(Ylist, Tlist, lambda = lambda,
                         lambda.init = lambda.init,
                         cv.method = "LOOCV", verbose = FALSE,
                         optimizer = "nlm")
print(ans3$lambda.unique)

# Quite different results!


#
# Arbitrary penalty graphs with fixed penalties!
#

# Generate some new high-dimensional data and a 2 by 2 factorial design
ns <- c(6, 5, 5, 5)
df <- expand.grid(DS = LETTERS[1:2], ER = letters[3:4])
Ylist <- createS(n = ns, p = 4, dataset = TRUE)
Tlist <- lapply(lapply(Ylist, covML), default.target, type = "Null")

lambda <- default.penalty(df, type = "Tensor")
print(lambda)  # Say we want to penalize the pair (1,2) with strength 2.1:
lambda[2,1] <- lambda[1,2] <- 2.1
print(lambda)

# Specifying starting values is also possible:
init <- diag(length(ns))
init[2,1] <- init[1,2] <- 2.1

res <- optPenalty.fused(Ylist, Tlist, lambda = lambda, lambda.init = init,
                        cv.method = "aLOOCV", optimizer = "nlm")
print(res)
}