library(pleioh2g)pleioh2g-tutorial.R ### Installation:
install.packages("devtools")
devtools::install_github("yjzhao1004/pleioh2g")
library(pleioh2g)
PHBC needs LDSC .sumstats format data as input, so you first need to reformat the GWAS summary statistics as LDSC requested. We strongly recommend that you use the script munge_sumstats.py included in LDSC python package (https://github.com/bulik/ldsc) to convert your own GWAS summary statistics into LDSC format. (ref. Bulik-Sullivan et al. 2015b Nat Genet) We also provide all LDSC-format .sumstat.gz data used in our analyses. (See Data preparation) * Examples of three phenotypes (“401.1”, “250.2”, “296.22”) have been implemented in our package. You can use codes below to reload them.
library(pleioh2g)
munged_sumstats = list("401.1" = sumstats_munged_example_input(example = "401.1"), "250.2" = sumstats_munged_example_input(example = "250.2"),"296.22" = sumstats_munged_example_input(example = "296.22"))
The function pruning_pleioh2g_wrapper() (See example as below) is to compute h2pleio / h2 while performing pruning and bias correction with ldsc-format GWAS summary statistics (.sumstat.gz) as input.
# Specify phenotype names
phenotype<-c("401.1","250.2","296.22")
# First to determine which disease in your list is the target disease
G = 1 # Index of target disease in trait list - this example is to compute pleiotropic heritability for "401.1".
# Input ldsc format .sumstat.gz data
munged_sumstats = list("401.1" = sumstats_munged_example_input(example = "401.1"), "250.2" = sumstats_munged_example_input(example = "250.2"),"296.22" = sumstats_munged_example_input(example = "296.22"))
# Specify reference LD data: ld and wld path; and hapmap 3 SNPs list
ld_path<-fs::path(fs::path_package("extdata/eur_w_ld_chr", package = "pleioh2g"))
wld_path<-fs::path(fs::path_package("extdata/eur_w_ld_chr", package = "pleioh2g"))
hmp3<-fs::path(fs::path_package("extdata/w_hm3.snplist", package = "pleioh2g"))
# If you trait is disease phenotype or the other binary trait, specify prevalence to compute the liability-scale heritability; If you don't specify this, it will compute observed-scale heritability.
sample_prev <- c(0.37,0.1,0.17)
population_prev <- c(0.37,0.1,0.17)
# Specify number of genomic-jackknife block; We use n_block = 5 as example for quick computation, but we recommand to use 200 jackknife blocks; If you don't specify this, the default number is 200.
n_block<-5
# Specify number of Monte Carlo sampling iterations in bias correction; If you don't specify this, the default number is 1000.
sample_rep <- 1000
post_correction_results<-pruning_pleioh2g_wrapper(G,phenotype,munged_sumstats,ld_path, wld_path, sample_prev, population_prev,n_block, hmp3,sample_rep)
An output line will provide your post-correction
h2pleio / h2 estimate, along with a
result list post_correction_results, containing the
following elements: - target_disease (character): The
value “401.1”. - target_disease_h2_est (numeric): target
disease h2. - target_disease_h2_se (numeric):
target disease h2 s.e.. - selected_auxD
(character): auxiliary diseases. - h2pleio_uncorr
(numeric): pre-correction h2pleio estimate. -
h2pleio_uncorr_se (numeric): pre-correction
h2pleio jackknife s.e. estimate. -
percentage_h2pleio_uncorr (numeric): pre-correction
h2pleio / h2 estimate. -
percentage_h2pleio_uncorr_se (numeric): pre-correction
h2pleio / h2 jackknife s.e. estimate. -
percentage_h2pleio_jackknife_uncorr (numeric): vector of
all pre-correction h2pleio / h2
jackknife estimates in default 200 blocks. - h2pleio_corr
(numeric): post-correction h2pleio estimate. -
h2pleio_corr_se (numeric): post-correction
h2pleio jackknife s.e. estimate. -
percentage_h2pleio_corr (numeric): post-correction
h2pleio / h2 estimate. -
percentage_h2pleio_corr_se (numeric): post-correction
h2pleio / h2 jackknife s.e. estimate. -
corrected_weight (numeric): corrected weight ξc in bias
correction.
post_correction_results$selected_auxD.phenotype and munged_sumstats
input parameters with the remaining auxiliary diseases.post_correction_results$selected_auxD.phenotype and munged_sumstats
input parameters with the remaining auxiliary diseases.