| Type: | Package |
| Version: | 0.1.1 |
| Date: | 2017-03-01 |
| Title: | Sampling from Conditional C- and D-Vine Copulas |
| Description: | Provides tools for sampling from a conditional copula density decomposed via Pair-Copula Constructions as a C- or D-vine. The vines that can be used for such sampling are those in which the conditioning variables are sampled first (when following the sampling algorithms shown in Aas et al. (2009) <doi:10.1016/j.insmatheco.2007.02.001>). The sampling algorithm used is presented and discussed in Bevacqua et al. (2017) <doi:10.5194/hess-2016-652>, and is a modified version of that from Aas et al. (2009) <doi:10.1016/j.insmatheco.2007.02.001>. A function is available to select the best vine (based on information criteria) among those which allow for such conditional sampling. The package includes a function to compare scatterplot matrices and pair-dependencies of two multivariate datasets. |
| Author: | Emanuele Bevacqua [aut, cre] |
| Maintainer: | Emanuele Bevacqua <emanuele.bevacqua@uni-graz.at> |
| License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
| Encoding: | UTF-8 |
| LazyData: | true |
| Imports: | combinat, graphics, stats, VineCopula |
| RoxygenNote: | 5.0.1 |
| Suggests: | testthat |
| Depends: | R (≥ 2.10) |
| NeedsCompilation: | no |
| Packaged: | 2017-07-28 14:25:59 UTC; emanuele |
| Repository: | CRAN |
| Date/Publication: | 2017-07-28 17:20:10 UTC |
Selection of a C- or D- vine copula model for conditional sampling
Description
This function fits either a C- or a D-vine model to a d-dimensional dataset of uniform variables.
The pair-copula families are fitted sequentially through the function RVineCopSelect of
the package VineCopula. The vine structure is selected among a group of C- and D-vines which satisfy the requirement
discussed in Bevacqua et al. (2017). This group is composed of all C- and D-vines from which the conditioning variables
would be sampled first when following the algorithms from Aas et al. (2009). Alternatively, if the
vine matrix describing the vine structure is given to the function, the pair-copulas are fitted directly, skipping the vine structure
selection procedure.
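The function expects every column of data to be approximately uniform on (0, 1). A minimal sketch of one common way to obtain such variables from raw observations is the rank transform below. The helper name to_pseudo_obs and the toy data are assumptions made for illustration and are not part of this package; the VineCopula package also provides pobs() for this purpose.

```r
# Hypothetical helper (not part of the package): rank-transform each column
# so that its values lie strictly inside (0, 1).
to_pseudo_obs <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  # Dividing the ranks by n + 1 avoids values exactly equal to 0 or 1
  u <- apply(x, 2, rank, ties.method = "average") / (n + 1)
  as.data.frame(u)
}

set.seed(1)
# Toy raw data with non-uniform margins (illustrative only)
raw <- data.frame(Y1 = rnorm(200), Y2 = rexp(200), X3 = rgamma(200, shape = 2))
u <- to_pseudo_obs(raw)
range(as.matrix(u))  # every value lies strictly between 0 and 1
```

The transform changes only the margins; the ranks, and therefore rank-based dependence measures such as Kendall's tau, are preserved.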
Usage
CDVineCondFit(data, Nx, treecrit = "AIC", type = "CVine-DVine",
selectioncrit = "AIC", familyset = NA, indeptest = FALSE,
level = 0.05, se = FALSE, rotations = TRUE, method = "mle",
Matrix = FALSE)
Arguments
data |
An |
Nx |
Number of conditioning variables. |
treecrit |
Character indicating the criterion used to select the vine. All possible vines are fitted through
the function |
type |
Type of vine to be fitted: |
selectioncrit |
Character indicating the criterion for pair-copula selection.
Possible choices are |
familyset |
"Integer vector of pair-copula families to select from. The vector has to include at least one
pair-copula family that allows for positive and one that allows for negative
dependence. Not listed copula families might be included to better handle
limit cases. If |
indeptest |
Logical; whether a hypothesis test for the independence of
|
level |
numeric; significance level of the independence test (default:
|
se |
Logical; whether standard errors are estimated (default: |
rotations |
logical; if |
method |
indicates the estimation method: either maximum
likelihood estimation ( |
Matrix |
|
Value
An RVineMatrix object describing the selected copula model
(for further details about RVineMatrix objects see the documentation file of the VineCopula package).
The selected families are stored
in $family, and the sequentially estimated parameters in $par and $par2.
The fit of the model is performed via the function RVineCopSelect of the package VineCopula.
"The object RVineMatrix includes the following information about the fit:
se, se2: standard errors for the parameter estimates (if se = TRUE; note that these are only approximate since they do not account for the sequential nature of the estimation),
nobs: number of observations,
logLik, pair.logLik: log likelihood (overall and pairwise),
AIC, pair.AIC: Akaike's Information Criterion (overall and pairwise),
BIC, pair.BIC: Bayesian Information Criterion (overall and pairwise),
emptau: matrix of empirical values of Kendall's tau,
p.value.indeptest: matrix of p-values of the independence test.
Note
For a comprehensive summary of the vine copula model, use
summary(object); to see all its contents, use str(object)". (VineCopula Documentation, version 2.1.1, p. 103)
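As a reminder of how the information criteria relate to the log likelihood, the sketch below computes AIC and BIC from their usual definitions. The numbers are made up for illustration and are not output of the package; it is assumed here that the number of parameters is simply the count of estimated pair-copula parameters.

```r
# Illustrative values only (not produced by CDVineCondFit)
loglik <- 150.3  # overall log likelihood of a fitted model
n_par  <- 12     # assumed number of estimated pair-copula parameters
n_obs  <- 100    # number of observations

# Usual definitions: lower values indicate a better trade-off
# between goodness of fit and model complexity
aic <- -2 * loglik + 2 * n_par
bic <- -2 * loglik + log(n_obs) * n_par
c(AIC = aic, BIC = bic)
```

Because log(n_obs) exceeds 2 once there are more than 7 observations, BIC penalises additional parameters more heavily than AIC for any realistic sample size.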
Author(s)
Emanuele Bevacqua
References
Bevacqua, E., Maraun, D., Hobaek Haff, I., Widmann, M., and Vrac, M.: Multivariate statistical modelling of compound events via pair-copula constructions: analysis of floods in Ravenna (Italy), Hydrol. Earth Syst. Sci., 21, 2701-2723, https://doi.org/10.5194/hess-21-2701-2017, 2017.
Aas, K., Czado, C., Frigessi, A. and Bakken, H.: Pair-copula constructions of multiple dependence, Insurance: Mathematics and Economics, 44(2), 182-198, <doi:10.1016/j.insmatheco.2007.02.001>, 2009.
Ulf Schepsmeier, Jakob Stoeber, Eike Christian Brechmann, Benedikt Graeler, Thomas Nagler and Tobias Erhardt (2017). VineCopula: Statistical Inference of Vine Copulas. R package version 2.1.1.
See Also
Examples
# Example 1
# Read data
data(dataset)
data <- dataset$data[1:100,1:5]
# Define the variables Y and X. X are the conditioning variables,
# which have to be positioned in the last columns of the data.frame
colnames(data) <- c("Y1","Y2","X3","X4","X5")
## Not run:
# Select and fit a C-vine copula model, requiring that the
# X variables (the last Nx = 3 columns) are the conditioning variables
RVM <- CDVineCondFit(data,Nx=3,treecrit="BIC",type="CVine",selectioncrit="AIC")
summary(RVM)
RVM$Matrix
## End(Not run)
# Example 2
# Read data
data(dataset)
data <- dataset$data[1:80,1:5]
# Define the variables Y and X. X are the conditioning variables,
# which have to be positioned in the last columns of the data.frame
colnames(data) <- c("Y1","Y2","X3","X4","X5")
# Define a VineMatrix which can be used for conditional sampling
ListVines <- CDVineCondListMatrices(data,Nx=3)
Matrix=ListVines$DVine[[1]]
Matrix
## Not run:
# Fit copula families for the defined vine:
RVM <- CDVineCondFit(data,Nx=3,Matrix=Matrix)
summary(RVM)
RVM$Matrix
RVM$family
# check
identical(RVM$Matrix,Matrix)
# Fit copula families for the defined vine, given a group of families to select from:
RVM <- CDVineCondFit(data,Nx=3,Matrix=Matrix,familyset=c(1,2,3,14))
summary(RVM)
RVM$Matrix
RVM$family
# Try to fit copula families for a vine which is not among those
# that allow for conditional sampling:
Matrix
# Swap variables 2 and 4 in the vine matrix (40 and 20 are temporary placeholders)
Matrix[which(Matrix==4)]=40
Matrix[which(Matrix==2)]=20
Matrix[which(Matrix==40)]=2
Matrix[which(Matrix==20)]=4
Matrix
RVM <- CDVineCondFit(data,Nx=3,Matrix=Matrix)
RVM
## End(Not run)
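The examples above rely on the conditioning variables being stored in the last columns of the data.frame. A minimal base-R sketch of how a dataset with a different column order could be rearranged is given below; the helper put_conditioning_last and the toy data are hypothetical and not part of the package.

```r
# Hypothetical helper: move the conditioning columns to the end,
# keeping the relative order of the remaining (conditioned) columns.
put_conditioning_last <- function(df, cond_names) {
  stopifnot(all(cond_names %in% names(df)))
  y_names <- setdiff(names(df), cond_names)
  df[, c(y_names, cond_names), drop = FALSE]
}

set.seed(2)
# Toy uniform data with the columns in an arbitrary order
toy <- data.frame(X4 = runif(5), Y1 = runif(5), X3 = runif(5), Y2 = runif(5))
ordered <- put_conditioning_last(toy, c("X3", "X4"))
names(ordered)  # "Y1" "Y2" "X3" "X4"

# Nx is then the number of trailing conditioning columns
Nx <- 2
```

The rearranged data.frame and Nx can then be passed on as in the examples above.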
List of the possible C- and D- vines allowing for conditional simulation
Description
Provides a list of the C- and D-vines which allow for conditional
sampling, under the condition discussed in the description of CDVineCondFit.
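To give a feeling for how many candidate vines such a list may contain, the sketch below enumerates variable orderings in which the conditioning variables are kept together in one block. This is an assumption-laden illustration in base R, not the package's own enumeration: it assumes that each candidate corresponds to one ordering of the conditioned variables combined with one ordering of the conditioning variables. Under this assumption, d = 5 and Nx = 3 give 12 orderings, which agrees with the 12 D-vines appearing in the ranking example of CDVineCondRank; how C-vines are enumerated and de-duplicated by the package may differ.

```r
# Base-R helper returning all permutations of a vector as a list
permutations <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in permutations(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], p)
    }
  }
  out
}

d  <- 5
Nx <- 3
conditioned  <- seq_len(d - Nx)   # column indices 1, 2
conditioning <- (d - Nx + 1):d    # column indices 3, 4, 5

# Combine every ordering of the conditioned block with
# every ordering of the conditioning block
orderings <- list()
for (py in permutations(conditioned)) {
  for (px in permutations(conditioning)) {
    orderings[[length(orderings) + 1]] <- c(py, px)
  }
}
length(orderings)  # 2! * 3! = 12
```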
Usage
CDVineCondListMatrices(data, Nx, type = "CVine-DVine")
Arguments
data |
An |
Nx |
Number of conditioning variables. |
type |
Type of vine to be considered: |
Value
Lists of matrices describing C- ($CVine) and D- ($DVine) vines.
Each matrix corresponds to a vine, according to the same notation used for RVineMatrix
objects (for further details about RVineMatrix objects see the documentation file of the VineCopula package).
The index i in the matrix corresponds to the variable in the i-th column of data.
Author(s)
Emanuele Bevacqua
References
Bevacqua, E., Maraun, D., Hobaek Haff, I., Widmann, M., and Vrac, M.: Multivariate statistical modelling of compound events via pair-copula constructions: analysis of floods in Ravenna (Italy), Hydrol. Earth Syst. Sci., 21, 2701-2723, https://doi.org/10.5194/hess-21-2701-2017, 2017.
Aas, K., Czado, C., Frigessi, A. and Bakken, H.: Pair-copula constructions of multiple dependence, Insurance: Mathematics and Economics, 44(2), 182-198, <doi:10.1016/j.insmatheco.2007.02.001>, 2009.
See Also
Examples
# Read data
data(dataset)
data <- dataset$data[1:100,1:5]
# Define the variables Y and X. X are the conditioning variables,
# which have to be positioned in the last columns of the data.frame
colnames(data) <- c("Y1","Y2","X3","X4","X5")
# List possible D-Vines:
ListVines <- CDVineCondListMatrices(data,Nx=3,"DVine")
ListVines$DVine
# List possible C-Vines:
ListVines <- CDVineCondListMatrices(data,Nx=3,"CVine")
ListVines$CVine
# List possible C- and D-Vines:
ListVines <- CDVineCondListMatrices(data,Nx=3,"CVine-DVine")
ListVines
Ranking of C- and D- vines allowing for conditional simulation
Description
Provides a ranking of the C- and D-vines which allow for conditional
sampling, under the condition discussed in the description of CDVineCondFit.
Usage
CDVineCondRank(data, Nx, treecrit = "AIC", selectioncrit = "AIC",
familyset = NA, type = "CVine-DVine", indeptest = FALSE, level = 0.05,
se = FALSE, rotations = TRUE, method = "mle")
Arguments
data |
An |
Nx |
Number of conditioning variables. |
treecrit |
Character indicating the criterion used to select the vine. All possible vines are fitted through
the function |
selectioncrit |
Character indicating the criterion for pair-copula selection.
Possible choices are |
familyset |
"Integer vector of pair-copula families to select from. The vector has to include at least one
pair-copula family that allows for positive and one that allows for negative
dependence. Not listed copula families might be included to better handle
limit cases. If |
type |
Type of vine to be fitted: |
indeptest |
Logical; whether a hypothesis test for the independence of
|
level |
numeric; significance level of the independence test (default:
|
se |
Logical; whether standard errors are estimated (default: |
rotations |
logical; if |
method |
indicates the estimation method: either maximum
likelihood estimation ( |
Value
table: A table with the ranking of the vines, with vine index i, values of the selected treecrit and vine type (1 for "CVine" and 2 for "DVine").
vines: A list where the element [[i]] is an RVineMatrix object corresponding to the i-th vine in the ranking shown in table. Each RVineMatrix object contains the selected families ($family) as well as the sequentially estimated parameters stored in $par and $par2 (details about RVineMatrix objects are given in the documentation file of the VineCopula package). The fit of each model is performed via the function RVineCopSelect of the package VineCopula. "The object is augmented by the following information about the fit:
se, se2: standard errors for the parameter estimates (if se = TRUE; note that these are only approximate since they do not account for the sequential nature of the estimation),
nobs: number of observations,
logLik, pair.logLik: log likelihood (overall and pairwise),
AIC, pair.AIC: Akaike's Information Criterion (overall and pairwise),
BIC, pair.BIC: Bayesian Information Criterion (overall and pairwise),
emptau: matrix of empirical values of Kendall's tau,
p.value.indeptest: matrix of p-values of the independence test.
Note
For a comprehensive summary of the vine copula model, use
summary(object); to see all its contents, use str(object)." (VineCopula Documentation, version 2.1.1, p. 103)
Author(s)
Emanuele Bevacqua
References
Bevacqua, E., Maraun, D., Hobaek Haff, I., Widmann, M., and Vrac, M.: Multivariate statistical modelling of compound events via pair-copula constructions: analysis of floods in Ravenna (Italy), Hydrol. Earth Syst. Sci., 21, 2701-2723, https://doi.org/10.5194/hess-21-2701-2017, 2017.
Aas, K., Czado, C., Frigessi, A. and Bakken, H.: Pair-copula constructions of multiple dependence, Insurance: Mathematics and Economics, 44(2), 182-198, <doi:10.1016/j.insmatheco.2007.02.001>, 2009.
Ulf Schepsmeier, Jakob Stoeber, Eike Christian Brechmann, Benedikt Graeler, Thomas Nagler and Tobias Erhardt (2017). VineCopula: Statistical Inference of Vine Copulas. R package version 2.1.1.
See Also
Examples
# Read data
data(dataset)
data <- dataset$data[1:100,1:5]
# Define the variables Y and X. X are the conditioning variables,
# which have to be positioned in the last columns of the data.frame
colnames(data) <- c("Y1","Y2","X3","X4","X5")
# Rank the possible D-Vines according to the AIC
## Not run:
Ranking <- CDVineCondRank(data,Nx=3,"AIC",type="DVine")
Ranking$table
# tree AIC type
# 1 1 -292.8720 2
# 2 2 -290.2941 2
# 3 3 -288.5719 2
# 4 4 -288.2496 2
# 5 5 -287.8006 2
# 6 6 -285.8503 2
# 7 7 -282.2867 2
# 8 8 -278.9371 2
# 9 9 -275.8339 2
# 10 10 -272.9459 2
# 11 11 -271.1526 2
# 12 12 -270.5269 2
Ranking$vines[[1]]$AIC
# [1] -292.8720
summary(Ranking$vines[[1]])
## End(Not run)
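The ranking table printed in the example above can be post-processed with base R. The sketch below rebuilds that table from the AIC values shown in the printed output and computes each vine's AIC difference from the best one. The column name dAIC and the threshold of 5 AIC units are arbitrary choices made here for illustration.

```r
# AIC values copied from the Ranking$table output printed in the example
aic <- c(-292.8720, -290.2941, -288.5719, -288.2496, -287.8006, -285.8503,
         -282.2867, -278.9371, -275.8339, -272.9459, -271.1526, -270.5269)
tab <- data.frame(tree = seq_along(aic), AIC = aic, type = 2)

# Difference between each vine's AIC and the lowest (best) AIC
tab$dAIC <- tab$AIC - min(tab$AIC)

# Vines whose AIC lies within 5 units of the best one
close_to_best <- subset(tab, dAIC < 5)
close_to_best
```

Small differences suggest that several vines describe the data about equally well, in which case inspecting more than the top-ranked vine may be worthwhile.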
Simulation from a conditional C- or D-vine
Description
Simulates from a d-dimensional conditional C- or D-vine of the variables (Y,X), given the fixed conditioning variables X. The algorithm works for vines satisfying the requirements discussed in Bevacqua et al. (2017). The algorithm implemented here is a modified version of those from Aas et al. (2009) and is shown in Bevacqua et al. (2017).
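The idea of simulating Y given fixed values of X can be illustrated in the simplest setting, a bivariate Gaussian copula, where the conditional distribution can be inverted in closed form. The base-R sketch below covers only this special case and is not the algorithm implemented in the package; the helper sim_gauss_cond and the value rho = 0.7 are assumptions chosen for illustration.

```r
# Hypothetical helper: draw u1 given u2 from a bivariate Gaussian copula
# with correlation rho by inverting the conditional distribution function.
sim_gauss_cond <- function(u2, rho) {
  w  <- runif(length(u2))  # independent uniform draws
  z1 <- rho * qnorm(u2) + sqrt(1 - rho^2) * qnorm(w)
  # Simulated variable in the first column, fixed condition in the last
  cbind(Y1 = pnorm(z1), X2 = u2)
}

set.seed(42)
condition <- runif(1000)  # fixed values of the conditioning variable
Sim <- sim_gauss_cond(condition, rho = 0.7)

# Empirical Kendall's tau; the theoretical value is 2 / pi * asin(0.7)
cor(Sim[, "Y1"], Sim[, "X2"], method = "kendall")
```

The conditioning values are returned unchanged, while the simulated column inherits the dependence implied by the copula.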
Usage
CDVineCondSim(RVM, Condition, N)
Arguments
RVM |
An |
Condition |
A |
N |
Number of data to be simulated. By default N is taken from |
Value
An N x d matrix of the simulated variables from the given C- or D-vine copula model. In the first columns there are
the simulated conditioned variables, and in the last columns the conditioning variables Condition.
For more details about the exact order of the variables in the columns see the examples. The
function is built to work easily in combination with CDVineCondFit.
Author(s)
Emanuele Bevacqua
References
Bevacqua, E., Maraun, D., Hobaek Haff, I., Widmann, M., and Vrac, M.: Multivariate statistical modelling of compound events via pair-copula constructions: analysis of floods in Ravenna (Italy), Hydrol. Earth Syst. Sci., 21, 2701-2723, https://doi.org/10.5194/hess-21-2701-2017, 2017.
Aas, K., Czado, C., Frigessi, A. and Bakken, H.: Pair-copula constructions of multiple dependence, Insurance: Mathematics and Economics, 44(2), 182-198, <doi:10.1016/j.insmatheco.2007.02.001>, 2009.
Ulf Schepsmeier, Jakob Stoeber, Eike Christian Brechmann, Benedikt Graeler, Thomas Nagler and Tobias Erhardt (2017). VineCopula: Statistical Inference of Vine Copulas. R package version 2.1.1.
See Also
Examples
# Example 1: conditional sampling from a C-Vine
# Read data
data(dataset)
data <- dataset$data[1:400,1:4]
# Define the variables Y and X. X are the conditioning variables,
# which have to be positioned in the last columns of the data.frame
colnames(data) <- c("Y1","Y2","X3","X4")
## Not run:
# Select a vine and fit the copula families, specifying that there are 2 conditioning variables
RVM <- CDVineCondFit(data,Nx=2,type="CVine")
# Set the values of the conditioning variables as those used for the calibration.
# Order them with respect to RVM$Matrix, considering that it is a C-Vine
d=dim(RVM$Matrix)[1]
cond1 <- data[,RVM$Matrix[(d+1)-1,(d+1)-1]]
cond2 <- data[,RVM$Matrix[(d+1)-2,(d+1)-2]]
condition <- cbind(cond1,cond2)
# Simulate the variables
Sim <- CDVineCondSim(RVM,condition)
# Plot the simulated variables over the observed
Sim <- data.frame(Sim)
overplot(Sim,data)
# Example 2: conditional sampling from a D-Vine
# Read data
data(dataset)
data <- dataset$data[1:100,1:4]
# Define the variables Y and X. X are the conditioning variables,
# which have to be positioned in the last columns of the data.frame
colnames(data) <- c("Y1","Y2","X3","X4")
# Select a vine and fit the copula families, specifying that there are 2 conditioning variables
RVM <- CDVineCondFit(data,Nx=2,type="DVine")
summary(RVM) #It is a D-Vine.
# Set the values of the conditioning variables as those used for the calibration.
# Order them with respect to RVM$Matrix, considering that it is a D-Vine.
cond1 <- data[,RVM$Matrix[1,1]]
cond2 <- data[,RVM$Matrix[2,2]]
condition <- cbind(cond1,cond2)
# Simulate the variables
Sim <- CDVineCondSim(RVM,condition)
# Plot the simulated variables over the observed
Sim <- data.frame(Sim)
overplot(Sim,data)
# Example 3
# Read data
data(dataset)
data <- dataset$data[1:100,1:2]
colnames(data) <- c("Y1","X2")
# Fit copula
require(VineCopula)
BiCop <- BiCopSelect(data$Y1,data$X2)
BiCop
# Fix conditioning variable to low values and simulate
condition <- data$X2/10
Sim <- CDVineCondSim(BiCop,condition)
# Plot the simulated variables over the observed
Sim <- data.frame(Sim)
overplot(Sim,data)
## End(Not run)
Random dataset from a given vine copula model
Description
A random dataset simulated from a given 5-dimensional vine copula model.
Usage
dataset
Format
$data: A 1000 x 5 data set (format data.frame) with the uniform variables (U1,U2,U3,U4,U5).
$vine: RVineMatrix object defining the vine copula model from which $data was sampled.
Author(s)
Emanuele Bevacqua
Examples
# Load data
data(dataset)
# Extract data
data <- dataset$data
plot(data)
# Extract the RVineMatrix object from where the dataset was randomly sampled
vine <- dataset$vine
vine$Matrix
vine$family
vine$par
vine$par2
summary(vine)
overplot
Description
This function overlays the scatterplot matrices of two multivariate datasets. Moreover, it shows the dependencies among all the pairs for both datasets.
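The pair dependencies that such a plot reports can also be tabulated numerically. The base-R sketch below computes them with cor() for two toy datasets and lists them side by side. Whether overplot relies on cor() internally is an assumption; the toy data and the object name cmp are made up for illustration.

```r
set.seed(3)
# Toy datasets standing in for data1 and data2
data1 <- data.frame(A = runif(100), B = runif(100), C = runif(100))
data2 <- data.frame(A = runif(100), B = runif(100), C = runif(100))

# Pairwise dependence for each dataset, rounded to 2 significant digits
dep1 <- signif(cor(data1, method = "kendall"), 2)
dep2 <- signif(cor(data2, method = "kendall"), 2)

# One row per pair of variables, with the value from each dataset
pairs_idx <- which(upper.tri(dep1), arr.ind = TRUE)
cmp <- data.frame(var1  = colnames(dep1)[pairs_idx[, "row"]],
                  var2  = colnames(dep1)[pairs_idx[, "col"]],
                  data1 = dep1[upper.tri(dep1)],
                  data2 = dep2[upper.tri(dep2)])
cmp
```

Replacing "kendall" with "spearman" or "pearson" gives the other two dependence types mentioned for the method argument.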
Usage
overplot(data1, data2, col1 = "black", col2 = "grey", xlim = NA,
ylim = NA, labels = NA, method = "pearson", cex.cor = 1,
cex.labels = 1, cor.signif = 2, cex.axis = 1, pch1 = 1, pch2 = 1)
Arguments
data1, data2 |
Two |
col1, col2 |
Colors used for |
xlim, ylim |
Two two-element vectors indicating the limits of the x and y axes for all the scatterplots. If not given, they are automatically computed for each of the scatterplots. |
labels |
A character vector with the variable names to be printed (if not given, the names of |
method |
Character indicating the dependence type to be computed between the pairs. Possibilities: "kendall", "spearman" and "pearson" (default) |
cex.cor |
Number: character dimension of the printed dependencies. Default |
cex.labels |
Number: character dimension of the printed variable names. Default |
cor.signif |
Number: number of significant digits of the printed dependencies. Default |
cex.axis |
Number: dimension of the axis numeric values. Default cex.axis=1. |
pch1, pch2 |
Parameter to specify the symbols to use when plotting points of |
Value
A matrix of overlaying scatterplots of the multivariate datasets data1 and data2, with
the dependencies of the pairs.
Author(s)
Emanuele Bevacqua
Examples
# Example 1
# Read and prepare the data for the plot
data(dataset)
data1 <- dataset$data[1:300,]
data2 <- dataset$data[301:600,]
overplot(data1,data2,xlim=c(0,1),ylim=c(0,1),method="kendall")
## Not run:
# Example 2
# Read and prepare the data for the plot
data(dataset)
data <- dataset$data[1:200,1:5]
colnames(data) <- c("Y1","Y2","X3","X4","X5")
# Fit copula families for the defined vine:
ListVines <- CDVineCondListMatrices(data,Nx=3)
Matrix=ListVines$CVine[[1]]
RVM <- CDVineCondFit(data,Nx=3,Matrix=Matrix)
# Simulate data:
d=dim(RVM$Matrix)[1]
cond1 <- data[,RVM$Matrix[(d+1)-1,(d+1)-1]]
cond2 <- data[,RVM$Matrix[(d+1)-2,(d+1)-2]]
cond3 <- data[,RVM$Matrix[(d+1)-3,(d+1)-3]]
condition <- cbind(cond1,cond2,cond3)
Sim <- CDVineCondSim(RVM,condition)
# Plot the simulated variables Sim over the observed
Sim <- data.frame(Sim)
overplot(data[,1:2],Sim[,1:2],xlim=c(0,1),ylim=c(0,1),method="spearman")
overplot(data,Sim,xlim=c(0,1),ylim=c(0,1),method="spearman")
## End(Not run)