| Type: | Package | 
| Title: | Misspecified Models for Gene-Environment Interaction | 
| Version: | 1.1 | 
| Date: | 2025-09-18 | 
| Description: | The first major functionality is to compute the bias in regression coefficients of misspecified linear gene-environment interaction models. The most generalized function for this objective is GE_bias(). However GE_bias() requires specification of many higher order moments of covariates in the model. If users are unsure about how to calculate/estimate these higher order moments, it may be easier to use GE_bias_normal_squaredmis(). This function places many more assumptions on the covariates (most notably that they are all jointly generated from a multivariate normal distribution) and is thus able to automatically calculate many of the higher order moments automatically, necessitating only that the user specify some covariances. There are also functions to solve for the bias through simulation and non-linear equation solvers; these can be used to check your work. Second major functionality is to implement the Bootstrap Inference with Correct Sandwich (BICS) testing procedure, which we have found to provide better finite-sample performance than other inference procedures for testing GxE interaction. More details on these functions are available in Sun, Carroll, Christiani, and Lin (2018) <doi:10.1111/biom.12813>. | 
| Imports: | mvtnorm, bindata, nleqslv, pracma, speedglm, rje, geepack, stats | 
| License: | GPL-3 | 
| RoxygenNote: | 6.1.1 | 
| Suggests: | knitr, rmarkdown, testthat | 
| VignetteBuilder: | knitr | 
| NeedsCompilation: | no | 
| Packaged: | 2025-09-25 19:37:24 UTC; rsun3 | 
| Author: | Ryan Sun [aut, cre], Richard Barfield [ctb] | 
| Maintainer: | Ryan Sun <ryansun.work@gmail.com> | 
| Repository: | CRAN | 
| Date/Publication: | 2025-10-01 07:50:17 UTC | 
GE_BICS.R
Description
A function to perform inference on the GxE interaction regression coefficient. Shows better small sample performance than comparable methods.
Usage
GE_BICS(outcome, design_mat, num_boots = 1000, desired_coef,
  outcome_type, check_singular = FALSE)
Arguments
| outcome | The outcome vector | 
| design_mat | The design matrix of covariates | 
| num_boots | The number of bootstrap resamples to perform - we suggest 1000 | 
| desired_coef | The column in the design matrix holding the interaction covariate | 
| outcome_type | Either 'D' for dichotomous outcome or 'C' for continuous outcome | 
| check_singular | Make sure the design matrix can be inverted for variance estimation | 
Value
The p-value for the interaction effect
Examples
E <- rnorm(n=500)
G <- rbinom(n=500, size=2, prob=0.3)
design_mat <- cbind(1, G, E, G*E)
outcome <- rnorm(500)
GE_BICS(outcome=outcome, design_mat=design_mat, desired_coef=4, outcome_type='C')
GE_bias.R
Description
A function to calculate the bias in testing for GxE interaction.
Usage
GE_bias(beta_list, cov_list, cov_mat_list, mu_list, HOM_list)
Arguments
| beta_list | A list of the effect sizes in the true model. Use the order beta_0, beta_G, beta_E, beta_I, beta_Z, beta_M. If G or Z or M is a vector, then beta_G/beta_Z/beta_M should be vectors. If Z and/or M/W do not exist in your model, then set beta_Z and/or beta_M = 0. | 
| cov_list | A list of expectations (which happen to be covariances if all covariates are centered at 0) in the order specified by GE_enumerate_inputs(). If Z and/or M/W do not exist in your model, then treat them as constants 0. For example, if Z doesn't exist and W includes 2 covariates, then set cov(EZ) = 0 and cov(ZW) = (0,0). If describing expectations relating two vectors, i.e. Z includes two covariates and W includes three covariates, sort by the first term and then the second. Thus in the example, the first three terms of cov(ZW) are cov(Z_1,W_1),cov(Z_1,W_2), cov(Z_1,W_3), and the last three terms are cov(Z_3,W_1), cov(Z_3,W_2), cov(Z_3,W_3). | 
| cov_mat_list | A list of matrices of expectations as specified by GE_enumerate_inputs(). | 
| mu_list | A list of means as specified by GE_enumerate_inputs(). | 
| HOM_list | A list of higher order moments as specified by GE_enumerate_inputs(). | 
Value
A list of the fitted coefficients alpha
Examples
solutions <- GE_bias_normal_squaredmis( beta_list=as.list(runif(n=6, min=0, max=1)), 
rho_list=as.list(rep(0.3,6)), prob_G=0.3, cov_Z=1, cov_W=1)
GE_bias(beta_list=solutions$beta_list, solutions$cov_list, solutions$cov_mat_list, 
					solutions$mu_list, solutions$HOM_list)
GE_bias_normal_squaredmis.R
Description
A function to calculate the bias in testing for GxE interaction, making many more
assumptions than GE_bias().  The additional assumptions are added to simplify the process
of calculating/estimating many higher order moments which the user may not be familiar with. 
The following assumptions are made: 
(1) All fitted covariates besides G (that is, E, all Z, and all W) have a marginal standard 
normal distribution with mean 0 and variance 1.  This corresponds to the case of the researcher
standardizing all of their fitted covariates. 
(2) All G are generated by means of thresholding two independent normal RVs and are centered to have mean 0. 
(3) The joint distributions of E, Z, W, and the thresholded variables underlying G can be described
by a multivariate normal distribution. 
(4) The misspecification is of the form f(E)=h(E)=E^2, and M_j=W_j^2 for all j. In particular,
W always has the same length as M here. 
Usage
GE_bias_normal_squaredmis(beta_list, rho_list, prob_G, cov_Z = NULL,
  cov_W = NULL, corr_G = NULL)
Arguments
| beta_list | A list of the effect sizes in the true model. Use the order beta_0, beta_G, beta_E, beta_I, beta_Z, beta_M. If G or Z or M is a vector, then beta_G/beta_Z/beta_M should be vectors. If Z and/or M/W do not exist in your model, then set beta_Z and/or beta_M = 0. | 
| rho_list | A list of expectations (which happen to be covariances if all covariates are centered at 0) in the order specified by GE_enumerate_inputs(). If Z and/or M/W do not exist in your model, then treat them as constants 0. For example, if Z doesn't exist and W includes 2 covariates, then set cov(EZ) = 0 and cov(ZW) = (0,0). If describing expectations relating two vectors, i.e. Z includes two covariates and W includes three covariates, sort by the first term and then the second. Thus in the example, the first three terms of cov(ZW) are cov(Z_1,W_1),cov(Z_1,W_2), cov(Z_1,W_3), and the last three terms are cov(Z_3,W_1), cov(Z_3,W_2), cov(Z_3,W_3). | 
| prob_G | Probability that each allele is equal to 1. Since each SNP has two alleles, the expectation of G is 2*prob_G. Should be a d*1 vector. | 
| cov_Z | Should be a matrix equal to cov(Z) or NULL if no Z. | 
| cov_W | Should be a matrix equal to cov(W) or NULL if no W. | 
| corr_G | Should be a matrix giving the *pairwise correlations* between each SNP in the set, or NULL. Must be specified if G is a vector. For example, the [2,3] element of the matrix would be the pairwise correlation between SNP2 and SNP3. | 
Value
A list with the elements:
| alpha_list | The asymptotic values of the fitted coefficients alpha. | 
| beta_list | The same beta_list that was given as input. | 
| cov_list | The list of all covariances (both input and calculated) for use with GE_nleqslv() and GE_bias(). | 
| mu_list | List of calculated means for f(E), h(E), Z, M, and W for use with GE_nleqslv() and GE_bias(). | 
| HOM_list | List of calculated Higher Order Moments for use with GE_nleqslv() and GE_bias(). | 
Examples
GE_bias_normal_squaredmis( beta_list=as.list(runif(n=6, min=0, max=1)),
rho_list=as.list(rep(0.3,6)), cov_Z=1, cov_W=1, prob_G=0.3)
GE_nleqslv.R #' Uses package nleqslv to get a numerical solution to the score equations, which we can use to check our direct solution from GE_bias().
Description
GE_nleqslv.R #' Uses package nleqslv to get a numerical solution to the score equations, which we can use to check our direct solution from GE_bias().
Usage
GE_nleqslv(beta_list, cov_list, cov_mat_list, mu_list, HOM_list)
Arguments
| beta_list | A list of the effect sizes in the true model. Use the order beta_0, beta_G, beta_E, beta_I, beta_Z, beta_M. If G or Z or M is a vector, then beta_G/beta_Z/beta_M should be vectors. If Z and/or M/W do not exist in your model, then set beta_Z and/or beta_M = 0. | 
| cov_list | A list of expectations (which happen to be covariances if all covariates are centered at 0) in the order specified by GE_enumerate_inputs(). If Z and/or M/W do not exist in your model, then treat them as constants 0. For example, if Z doesn't exist and W includes 2 covariates, then set cov(EZ) = 0 and cov(ZW) = (0,0). If describing expectations relating two vectors, i.e. Z includes two covariates and W includes three covariates, sort by the first term and then the second. Thus in the example, the first three terms of cov(ZW) are cov(Z_1,W_1),cov(Z_1,W_2), cov(Z_1,W_3), and the last three terms are cov(Z_3,W_1), cov(Z_3,W_2), cov(Z_3,W_3). | 
| cov_mat_list | A list of matrices of expectations as specified by GE_enumerate_inputs(). | 
| mu_list | A list of means as specified by GE_enumerate_inputs(). | 
| HOM_list | A list of higher order moments as specified by GE_enumerate_inputs(). | 
Value
A list of the fitted coefficients alpha
Examples
solutions <- GE_bias_normal_squaredmis( beta_list=as.list(runif(n=6, min=0, max=1)),
rho_list=as.list(rep(0.3,6)), prob_G=0.3, cov_Z=1, cov_W=1)
GE_nleqslv(beta_list=solutions$beta_list, solutions$cov_list, solutions$cov_mat_list,
solutions$mu_list, solutions$HOM_list)
GE_scoreeq_sim.R
Description
Here we perform simulation to verify that we have solved for the correct alpha values in GE_bias_norm_squaredmis(). Make the same assumptions as in GE_bias_norm_squaredmis().
Usage
GE_scoreeq_sim(num_sims = 5000, num_sub = 2000, beta_list, rho_list,
  prob_G, cov_Z = NULL, cov_W = NULL, corr_G = NULL)
Arguments
| num_sims | The number of simulations to run, we suggest 5000. | 
| num_sub | The number of subjects to generate in every simulation, we suggest 2000. | 
| beta_list | A list of the effect sizes in the true model. Use the order beta_0, beta_G, beta_E, beta_I, beta_Z, beta_M. If G or Z or M is a vector, then beta_G/beta_Z/beta_M should be vectors. If Z and/or M/W do not exist in your model, then set beta_Z and/or beta_M = 0. | 
| rho_list | A list of expectations (which happen to be covariances if all covariates are centered at 0) in the order specified by GE_enumerate_inputs(). If Z and/or M/W do not exist in your model, then treat them as constants 0. For example, if Z doesn't exist and W includes 2 covariates, then set cov(EZ) = 0 and cov(ZW) = (0,0). If describing expectations relating two vectors, i.e. Z includes two covariates and W includes three covariates, sort by the first term and then the second. Thus in the example, the first three terms of cov(ZW) are cov(Z_1,W_1),cov(Z_1,W_2), cov(Z_1,W_3), and the last three terms are cov(Z_3,W_1), cov(Z_3,W_2), cov(Z_3,W_3). | 
| prob_G | Probability that each allele is equal to 1. Since each SNP has two alleles, the expectation of G is 2*prob_G. Should be a d*1 vector. | 
| cov_Z | Should be a matrix equal to cov(Z) or NULL if no Z. | 
| cov_W | Should be a matrix equal to cov(W) or NULL if no W. | 
| corr_G | Should be a matrix giving the *pairwise correlations* between each SNP in the set, or NULL. Must be specified if G is a vector. For example, the [2,3] element of the matrix would be the pairwise correlation between SNP2 and SNP3. | 
Value
A list of the fitted values alpha
Examples
GE_scoreeq_sim( num_sims=10, num_sub=1000, beta_list=as.list(runif(n=6, min=0, max=1)), 
rho_list=as.list(rep(0.3,6)), prob_G=0.3, cov_Z=1, cov_W=1)
GE_test_moment_calcs.R
Description
Test function mostly for internal use to ensure the higher order moments (covariances) calculated in GE_bias_normal_squaredmis() are correct. Will give warning messages if some calculations appear to be incorrect. If receive warning messages, run again, and if still receive the same warning messages, something may be wrong.
Usage
GE_test_moment_calcs(beta_list, rho_list, prob_G, cov_Z = NULL,
  cov_W = NULL, num_sub = 2e+06, test_threshold = 0.003,
  corr_G = NULL)
Arguments
| beta_list | A list of the effect sizes in the true model. Use the order beta_0, beta_G, beta_E, beta_I, beta_Z, beta_M. If G or Z or M is a vector, then beta_G/beta_Z/beta_M should be vectors. If Z and/or M/W do not exist in your model, then set beta_Z and/or beta_M = 0. | 
| rho_list | A list of expectations (which happen to be covariances if all covariates are centered at 0) in the order specified by GE_enumerate_inputs(). If Z and/or M/W do not exist in your model, then treat them as constants 0. For example, if Z doesn't exist and W includes 2 covariates, then set cov(EZ) = 0 and cov(ZW) = (0,0). If describing expectations relating two vectors, i.e. Z includes two covariates and W includes three covariates, sort by the first term and then the second. Thus in the example, the first three terms of cov(ZW) are cov(Z_1,W_1),cov(Z_1,W_2), cov(Z_1,W_3), and the last three terms are cov(Z_3,W_1), cov(Z_3,W_2), cov(Z_3,W_3). | 
| prob_G | Probability that each allele is equal to 1. Since each SNP has two alleles, the expectation of G is 2*prob_G. Should be a d*1 vector. | 
| cov_Z | Should be a matrix equal to cov(Z) or NULL. Must be specified if Z is a vector. | 
| cov_W | Should be a matrix equal to cov(W) or NULL. Must be specified if W is a vector. | 
| num_sub | Number of subjects to simulate/test with. | 
| test_threshold | How close does our simulated value have to be to the analytic value? | 
| corr_G | Should be a matrix giving the *pairwise correlations* between each SNP in the set, or NULL. Must be specified if G is a vector. For example, the [2,3] element of the matrix would be the pairwise correlation between SNP2 and SNP3. | 
Value
Nothing
Examples
GE_test_moment_calcs(beta_list=as.list(runif(n=6, min=0, max=1)), 
rho_list=as.list(rep(0.3,6)), prob_G=0.3, cov_Z=1, cov_W=1)
GE_translate_inputs.R
Description
Mostly for internal use, function called by GE_bias_normal() and GE_scoreeq_sim() to translate the rho_list inputs and return a total covariance matrix for simulation/ checking validity of covariance structure. If invalid covariance structure, will stop and return an error message.
Usage
GE_translate_inputs(beta_list, rho_list, prob_G, cov_Z = NULL,
  cov_W = NULL, corr_G = NULL)
Arguments
| beta_list | A list of the effect sizes in the true model. Use the order beta_0, beta_G, beta_E, beta_I, beta_Z, beta_M. If G or Z or M is a vector, then beta_G/beta_Z/beta_M should be vectors. If Z and/or M/W do not exist in your model, then set beta_Z and/or beta_M = 0. | 
| prob_G | Probability that each allele is equal to 1. Since each SNP has two alleles, the expectation of G is 2*prob_G. | 
| cov_Z | Should be a matrix equal to cov(Z) or NULL if no Z. | 
| cov_W | Should be a matrix equal to cov(W) or NULL if no W. | 
| corr_G | Should be a matrix giving the *pairwise correlations* between each SNP in the set, or NULL. Must be specified if G is a vector. For example, the [2,3] element of the matrix would be the pairwise correlation between SNP2 and SNP3. Diagonal should be 1. | 
Value
A list with the elements:
| sig_mat_total | The sigma parameter for rmvnorm call to generate our data. | 
Examples
GE_translate_inputs( beta_list=as.list(runif(n=6, min=0, max=1)),
rho_list=as.list(rep(0.3,6)), prob_G=0.3, cov_Z=1, cov_W=1)