| Title: | Latent Dirichlet Allocation Coupled with Time Series Analyses | 
| Version: | 0.3.0 | 
| Description: | Combines Latent Dirichlet Allocation (LDA) and Bayesian multinomial time series methods in a two-stage analysis to quantify dynamics in high-dimensional temporal data. LDA decomposes multivariate data into lower-dimension latent groupings, whose relative proportions are modeled using generalized Bayesian time series models that include abrupt changepoints and smooth dynamics. The methods are described in Blei et al. (2003) <doi:10.1162/jmlr.2003.3.4-5.993>, Western and Kleykamp (2004) <doi:10.1093/pan/mph023>, Venables and Ripley (2002, ISBN-13:978-0387954578), and Christensen et al. (2018) <doi:10.1002/ecy.2373>. | 
| URL: | https://weecology.github.io/LDATS/, https://github.com/weecology/LDATS | 
| BugReports: | https://github.com/weecology/LDATS/issues | 
| Depends: | R (≥ 3.5.0) | 
| License: | MIT + file LICENSE | 
| Encoding: | UTF-8 | 
| LazyData: | true | 
| Imports: | coda, digest, extraDistr, graphics, grDevices, lubridate, magrittr, memoise, methods, mvtnorm, nnet, progress, stats, topicmodels, viridis | 
| Suggests: | knitr, pkgdown, rmarkdown, testthat, vdiffr | 
| SystemRequirements: | gsl | 
| VignetteBuilder: | knitr | 
| RoxygenNote: | 7.2.3 | 
| NeedsCompilation: | no | 
| Packaged: | 2023-09-18 16:29:40 UTC; dappe | 
| Author: | Juniper L. Simonis | 
| Maintainer: | Juniper L. Simonis <juniper.simonis@weecology.org> | 
| Repository: | CRAN | 
| Date/Publication: | 2023-09-19 09:10:06 UTC | 
Calculate AICc
Description
Calculate the small sample size correction of
AIC for the input object.
Usage
AICc(object)
Arguments
| object | 
Value
numeric value of AICc.
Examples
  dat <- data.frame(y = rnorm(50), x = rnorm(50))
  mod <- lm(dat)
  AICc(mod)
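The correction can be reproduced by hand as a rough check. The sketch below assumes the standard small-sample formula, AICc = AIC + 2k(k + 1) / (n - k - 1), with k the number of estimated parameters and n the number of observations. The helper name `aicc_by_hand` is hypothetical and is not part of the package.

```r
# Hypothetical helper: AICc from the standard small-sample formula.
# k and n are taken from the logLik object and nobs(), respectively.
aicc_by_hand <- function(object) {
  ll <- logLik(object)
  k <- attr(ll, "df")   # number of estimated parameters
  n <- nobs(object)     # number of observations
  AIC(object) + (2 * k * (k + 1)) / (n - k - 1)
}

set.seed(1)
dat <- data.frame(y = rnorm(50), x = rnorm(50))
mod <- lm(y ~ x, data = dat)
aicc_val <- aicc_by_hand(mod)
aicc_val - AIC(mod)     # size of the correction term
```

The correction shrinks toward zero as n grows relative to k, so AICc and AIC agree for large samples.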
Package to conduct two-stage analyses combining Latent Dirichlet Allocation with Bayesian Time Series models
Description
Performs two-stage analysis of multivariate temporal data using a combination of Latent Dirichlet Allocation (Blei et al. 2003) and Bayesian Time Series models (Western and Kleykamp 2004) that we extend for multinomial data using softmax regression (Venables and Ripley 2002) following Christensen et al. (2018).
Documentation
Technical mathematical manuscript
End-user-focused vignette worked example
Computational pipeline vignette
Comparison to Christensen et al.
References
Blei, D. M., A. Y. Ng, and M. I. Jordan. 2003. Latent Dirichlet Allocation. Journal of Machine Learning Research 3:993-1022.
Christensen, E., D. J. Harris, and S. K. M. Ernest. 2018. Long-term community change through multiple rapid transitions in a desert rodent community. Ecology 99:1523-1529.
Venables, W. N. and B. D. Ripley. 2002. Modern and Applied Statistics with S. Fourth Edition. Springer, New York, NY, USA.
Western, B. and M. Kleykamp. 2004. A Bayesian change point model for historical time series analysis. Political Analysis 12:354-374.
Run a full set of Latent Dirichlet Allocations and Time Series models
Description
Conduct a complete LDATS analysis (Christensen 
et al. 2018), including running a suite of Latent Dirichlet
Allocation (LDA) models (Blei et al. 2003, Grun and Hornik 2011) 
via LDA_set, selecting LDA model(s) via 
select_LDA, running a complete set of Bayesian Time Series
(TS) models (Western and Kleykamp 2004) via TS_on_LDA on
the chosen LDA model(s), and selecting the best TS model via 
select_TS. 
 
conform_LDA_TS_data converts the data input to
match internal and sub-function specifications. 
 
check_LDA_TS_inputs checks that the inputs to 
LDA_TS are of proper classes for a full analysis.
Usage
LDA_TS(
  data,
  topics = 2,
  nseeds = 1,
  formulas = ~1,
  nchangepoints = 0,
  timename = "time",
  weights = TRUE,
  control = list()
)
conform_LDA_TS_data(data, quiet = FALSE)
check_LDA_TS_inputs(
  data = NULL,
  topics = 2,
  nseeds = 1,
  formulas = ~1,
  nchangepoints = 0,
  timename = "time",
  weights = TRUE,
  control = list()
)
Arguments
| data | Either a document term table or a list including at least
a document term table (with the word "term" in the name of the element)
and optionally also a document covariate table  (with the word 
"covariate" in the name of the element). 
 | 
| topics | Vector of the number of topics to evaluate for each model.
Must be conformable to integer values. | 
| nseeds | 
 | 
| formulas | Vector of  | 
| nchangepoints | Vector of  | 
| timename | 
 | 
| weights | Optional input for overriding standard weighting for 
documents in the time series. Defaults to  | 
| control | A  | 
| quiet | 
 | 
Value
LDA_TS: a class LDA_TS list object including all 
fitted LDA and TS models and selected models specifically as elements 
"LDA models" (from LDA_set),
"Selected LDA model" (from select_LDA), 
"TS models" (from TS_on_LDA), and
"Selected TS model" (from select_TS). 
 
conform_LDA_TS_data: a data list that is ready for analyses 
using the stage-specific functions. 
 
check_LDA_TS_inputs: an error message is thrown if any input is 
improper, otherwise NULL.
References
Blei, D. M., A. Y. Ng, and M. I. Jordan. 2003. Latent Dirichlet Allocation. Journal of Machine Learning Research 3:993-1022.
Christensen, E., D. J. Harris, and S. K. M. Ernest. 2018. Long-term community change through multiple rapid transitions in a desert rodent community. Ecology 99:1523-1529.
Grun, B. and K. Hornik. 2011. topicmodels: An R Package for Fitting Topic Models. Journal of Statistical Software 40:13.
Western, B. and M. Kleykamp. 2004. A Bayesian change point model for historical time series analysis. Political Analysis 12:354-374.
Examples
  data(rodents)
  mod <- LDA_TS(data = rodents, topics = 2, nseeds = 1, formulas = ~1,
                nchangepoints = 1, timename = "newmoon")
  conform_LDA_TS_data(rodents)
  check_LDA_TS_inputs(rodents, timename = "newmoon")
Create the controls list for the LDATS model
Description
Create and define a list of control options used to run the
LDATS model, as implemented by LDA_TS.
Usage
LDA_TS_control(
  quiet = FALSE,
  measurer_LDA = AIC,
  selector_LDA = min,
  iseed = 2,
  memoise = TRUE,
  response = "gamma",
  lambda = 0,
  measurer_TS = AIC,
  selector_TS = min,
  ntemps = 6,
  penultimate_temp = 2^6,
  ultimate_temp = 1e+10,
  q = 0,
  nit = 10000,
  magnitude = 12,
  burnin = 0,
  thin_frac = 1,
  summary_prob = 0.95,
  seed = NULL,
  ...
)
Arguments
| quiet | 
 | 
| measurer_LDA,selector_LDA | Function names for use in evaluation of 
the LDA models.  | 
| iseed | 
 | 
| memoise | 
 | 
| response | 
 | 
| lambda | 
 | 
| measurer_TS,selector_TS | Function names for use in evaluation of the 
TS models.  | 
| ntemps | 
 | 
| penultimate_temp | Penultimate temperature in the ptMCMC sequence. | 
| ultimate_temp | Ultimate temperature in the ptMCMC sequence. | 
| q | Exponent controlling the ptMCMC temperature sequence from the focal chain (reference with temperature = 1) to the penultimate chain. 0 (default) implies a geometric sequence. 1 implies squaring before exponentiating. | 
| nit | 
 | 
| magnitude | Average magnitude (defining a geometric distribution) for the proposed step size in the ptMCMC algorithm. | 
| burnin | 
 | 
| thin_frac | Fraction of iterations to retain, from the ptMCMC. Must be
 | 
| summary_prob | Probability used for summarizing the posterior 
distributions (via the highest posterior density interval, see
 | 
| seed | Input to  | 
| ... | Additional arguments to be passed to 
 | 
Value
list of control lists, with named elements 
LDAcontrol, TScontrol, and quiet.
Examples
  LDA_TS_control()
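To make the roles of ntemps, penultimate_temp, ultimate_temp, and q concrete, the sketch below builds one temperature ladder that is consistent with the argument descriptions above. The exact construction used internally is an assumption here, and the helper name `temp_ladder` is hypothetical. With q = 0 the first ntemps - 1 temperatures form a geometric sequence from 1 to penultimate_temp, and ultimate_temp is appended as the hottest chain.

```r
# Hypothetical sketch of a ptMCMC temperature ladder (not the package's code).
# Assumption: the exponents are evenly spaced on a log2 scale and raised to
# the power (1 + q) before exponentiating.
temp_ladder <- function(ntemps = 6, penultimate_temp = 2^6,
                        ultimate_temp = 1e10, q = 0) {
  top <- log2(penultimate_temp)
  s <- seq(0, top, length.out = ntemps - 1)
  log_temps <- s^(1 + q) / top^q      # q = 0 leaves the spacing even
  c(2^log_temps, ultimate_temp)
}

temps <- temp_ladder()           # default settings, geometric spacing
temps_q1 <- temp_ladder(q = 1)   # cooler intermediate chains
```

Under this construction, larger values of q keep the intermediate chains closer to the focal chain (temperature 1) while leaving the end points unchanged.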
Create the model-running-message for an LDA
Description
Produce and print the message for a given LDA model.
Usage
LDA_msg(mod_topics, mod_seeds, control = list())
Arguments
| mod_topics | 
 | 
| mod_seeds | 
 | 
| control | Class  | 
Examples
  LDA_msg(mod_topics = 4, mod_seeds = 2)
Run a set of Latent Dirichlet Allocation models
Description
For a given dataset consisting of counts of words across 
multiple documents in a corpus, fit multiple Latent Dirichlet 
Allocation (LDA) models (using the Variational Expectation 
Maximization (VEM) algorithm; Blei et al. 2003) to account for [1] 
uncertainty in the number of latent topics and [2] the impact of initial
values in the estimation procedure. 
 
LDA_set is a list wrapper of LDA
in the topicmodels package (Grun and Hornik 2011). 
 
check_LDA_set_inputs checks that all of the inputs 
are proper for LDA_set (that the table of observations is 
conformable to a matrix of integers, the number of topics is an integer, 
the number of seeds is an integer and the controls list is proper).
Usage
LDA_set(document_term_table, topics = 2, nseeds = 1, control = list())
check_LDA_set_inputs(document_term_table, topics, nseeds, control)
Arguments
| document_term_table | Table of observation count data (rows: 
documents, columns: terms. May be a class  | 
| topics | Vector of the number of topics to evaluate for each model.
Must be conformable to integer values. | 
| nseeds | Number of seeds (replicate starts) to use for each 
value of  | 
| control | A  | 
Value
LDA_set: list (class: LDA_set) of LDA models 
(class: LDA_VEM).
check_LDA_set_inputs: an error message is thrown if any input is 
improper, otherwise NULL.
References
Blei, D. M., A. Y. Ng, and M. I. Jordan. 2003. Latent Dirichlet Allocation. Journal of Machine Learning Research 3:993-1022.
Grun, B. and K. Hornik. 2011. topicmodels: An R Package for Fitting Topic Models. Journal of Statistical Software 40:13.
Examples
  data(rodents)
  lda_data <- rodents$document_term_table
  r_LDA <- LDA_set(lda_data, topics = 2, nseeds = 2)                         
Create control list for set of LDA models
Description
Create and define the list used to control the set of LDA 
models. The list is designed to work with the existing control capacity of 
LDA.
Usage
LDA_set_control(quiet = FALSE, measurer = AIC, selector = min, iseed = 2, ...)
Arguments
| quiet | 
 | 
| measurer,selector | Function names for use in evaluation of the LDA
models.  | 
| iseed | 
 | 
| ... | Additional arguments to be passed to 
 | 
Value
list for controlling the LDA model fit.
Examples
  LDA_set_control()
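The measurer and selector arguments work as a pair: the measurer scores each fitted model and the selector picks the score to keep. The sketch below illustrates that idea on ordinary linear models as stand-ins for LDA fits. The helper `select_by` is hypothetical and only approximates how a selection step could use the two functions.

```r
# Hypothetical illustration of the measurer/selector pattern using lm fits
# in place of LDA models.
set.seed(1)
dat <- data.frame(y = rnorm(50), x = rnorm(50))
mods <- list(null = lm(y ~ 1, data = dat),
             slope = lm(y ~ x, data = dat))

select_by <- function(models, measurer = AIC, selector = min) {
  vals <- vapply(models, measurer, numeric(1))  # one score per model
  models[vals == selector(vals)]                # keep the selected model(s)
}

chosen <- select_by(mods)                     # lowest AIC
rejected <- select_by(mods, selector = max)   # highest AIC
names(chosen)
```

Swapping in a different measurer (for example BIC) or selector changes the selection rule without changing the surrounding code.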
Conduct a single multinomial Bayesian Time Series analysis
Description
This is the main interface function for the LDATS application
of Bayesian change point Time Series analyses (Christensen et al.
2018), which extends the model of Western and Kleykamp (2004;
see also Ruggieri 2013) to multinomial (proportional) response data via
softmax regression (Ripley 1996, Venables and Ripley 2002, Bishop 2006) 
within a generalized linear modeling approach (McCullagh and Nelder 1989).
The models are fit using parallel tempering Markov Chain Monte Carlo
(ptMCMC) methods (Earl and Deem 2005) to locate change points and 
neural networks (Ripley 1996, Venables and Ripley 2002, Bishop 2006) to
estimate regressors. 
 
check_TS_inputs checks that the inputs to 
TS are of proper classes for a full analysis.
Usage
TS(
  data,
  formula = gamma ~ 1,
  nchangepoints = 0,
  timename = "time",
  weights = NULL,
  control = list()
)
check_TS_inputs(
  data,
  formula = gamma ~ 1,
  nchangepoints = 0,
  timename = "time",
  weights = NULL,
  control = list()
)
Arguments
| data | 
 | 
| formula | 
 | 
| nchangepoints | 
 | 
| timename | 
 | 
| weights | Optional class  | 
| control | A  | 
Value
TS: TS_fit-class list containing the following
elements, many of
which are hidden for printing, but are accessible:
- data: data input to the function.
- formula: formula input to the function.
- nchangepoints: nchangepoints input to the function.
- weights: weights input to the function.
- control: control input to the function.
- lls: Iteration-by-iteration logLik values for the full time series fit by multinom_TS.
- rhos: Iteration-by-iteration change point estimates from est_changepoints.
- etas: Iteration-by-iteration marginal regressor estimates from est_regressors, which have been unconditioned with respect to the change point locations.
- ptMCMC_diagnostics: ptMCMC diagnostics, see diagnose_ptMCMC.
- rho_summary: Summary table describing rhos (the change point locations), see summarize_rhos.
- rho_vcov: Variance-covariance matrix for the estimates of rhos (the change point locations), see measure_rho_vcov.
- eta_summary: Summary table describing etas (the regressors), see summarize_etas.
- eta_vcov: Variance-covariance matrix for the estimates of etas (the regressors), see measure_eta_vcov.
- logLik: Across-iteration average of log-likelihoods (lls).
- nparams: Total number of parameters in the full model, including the change point locations and regressors.
- deviance: Penalized negative log-likelihood, based on logLik and nparams.
check_TS_inputs: An error message is thrown if any input
is not proper, else NULL.
References
Bishop, C. M. 2006. Pattern Recognition and Machine Learning. Springer, New York, NY, USA.
Christensen, E., D. J. Harris, and S. K. M. Ernest. 2018. Long-term community change through multiple rapid transitions in a desert rodent community. Ecology 99:1523-1529.
Earl, D. J. and M. W. Deem. 2005. Parallel tempering: theory, applications, and new perspectives. Physical Chemistry Chemical Physics 7:3910-3916.
McCullagh, P. and J. A. Nelder. 1989. Generalized Linear Models. 2nd Edition. Chapman and Hall, New York, NY, USA.
Ripley, B. D. 1996. Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge, UK.
Ruggieri, E. 2013. A Bayesian approach to detecting change points in climatic records. International Journal of Climatology 33:520-528.
Venables, W. N. and B. D. Ripley. 2002. Modern and Applied Statistics with S. Fourth Edition. Springer, New York, NY, USA.
Western, B. and M. Kleykamp. 2004. A Bayesian change point model for historical time series analysis. Political Analysis 12:354-374.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  TSmod <- TS(data, gamma ~ 1, nchangepoints = 1, "newmoon", weights)
  check_TS_inputs(data, timename = "newmoon")
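The softmax link mentioned in the description maps a vector of linear predictors to topic proportions that are positive and sum to 1. The sketch below shows that mapping in base R. The function name `softmax` and the example values are illustrative assumptions, not package code.

```r
# Illustrative softmax: converts linear predictors into proportions.
softmax <- function(eta) {
  z <- exp(eta - max(eta))   # subtract the max for numerical stability
  z / sum(z)
}

# Linear predictors for three topics, with the first as the reference (0).
eta <- c(0, 0.5, -1)
p <- softmax(eta)
p        # proportions for the three topics
sum(p)   # sums to 1
```

Because only differences among the predictors matter, one topic can be fixed at 0 as a reference without loss of generality.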
Create the controls list for the Time Series model
Description
Create and define the list used to control the time series 
model fit occurring within 
TS.
Usage
TS_control(
  memoise = TRUE,
  response = "gamma",
  lambda = 0,
  measurer = AIC,
  selector = min,
  ntemps = 6,
  penultimate_temp = 2^6,
  ultimate_temp = 1e+10,
  q = 0,
  nit = 10000,
  magnitude = 12,
  quiet = FALSE,
  burnin = 0,
  thin_frac = 1,
  summary_prob = 0.95,
  seed = NULL
)
Arguments
| memoise | 
 | 
| response | 
 | 
| lambda | 
 | 
| measurer,selector | Function names for use in evaluation of the TS
models.  | 
| ntemps | 
 | 
| penultimate_temp | Penultimate temperature in the ptMCMC sequence. | 
| ultimate_temp | Ultimate temperature in the ptMCMC sequence. | 
| q | Exponent controlling the ptMCMC temperature sequence from the focal chain (reference with temperature = 1) to the penultimate chain. 0 (default) implies a geometric sequence. 1 implies squaring before exponentiating. | 
| nit | 
 | 
| magnitude | Average magnitude (defining a geometric distribution) for the proposed step size in the ptMCMC algorithm. | 
| quiet | 
 | 
| burnin | 
 | 
| thin_frac | Fraction of iterations to retain, must be  | 
| summary_prob | Probability used for summarizing the posterior 
distributions (via the highest posterior density interval, see
 | 
| seed | Input to  | 
Value
list, with named elements corresponding to the arguments.
Examples
  TS_control()
Plot the diagnostics of the parameters fit in a TS model
Description
Plot 4-panel figures (showing trace plots, posterior ECDF, 
posterior density, and iteration autocorrelation) for each of the 
parameters (change point locations and regressors) fitted within a 
multinomial time series model (fit by TS). 
 
eta_diagnostics_plots creates the diagnostic plots
for the regressors (etas) of a time series model. 
 
rho_diagnostics_plots creates the diagnostic plots
for the change point locations (rho) of a time series model.
Usage
TS_diagnostics_plot(x, interactive = TRUE)
eta_diagnostics_plots(x, interactive)
rho_diagnostics_plots(x, interactive)
Arguments
| x | Object of class  | 
| interactive | 
 | 
Value
NULL.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  TSmod <- TS(data, gamma ~ 1, nchangepoints = 1, "newmoon", weights)
  TS_diagnostics_plot(TSmod)
Conduct a set of Time Series analyses on a set of LDA models
Description
This is a wrapper function that expands the main Time Series
analyses function (TS) across the LDA models (estimated
using LDA or LDA_set) and the 
Time Series models, with respect to both continuous time formulas and the 
number of discrete changepoints. This function allows direct passage of
the control parameters for the parallel tempering MCMC through to the 
main Time Series function, TS, via the 
control argument. 
 
check_TS_on_LDA_inputs checks that the inputs to 
TS_on_LDA are of proper classes for a full analysis.
Usage
TS_on_LDA(
  LDA_models,
  document_covariate_table,
  formulas = ~1,
  nchangepoints = 0,
  timename = "time",
  weights = NULL,
  control = list()
)
check_TS_on_LDA_inputs(
  LDA_models,
  document_covariate_table,
  formulas = ~1,
  nchangepoints = 0,
  timename = "time",
  weights = NULL,
  control = list()
)
Arguments
| LDA_models | List of LDA models (class  | 
| document_covariate_table | Document covariate table (rows: documents,
columns: time index and covariate options). Every model needs a
covariate to describe the time value for each document (in whatever 
units and whose name in the table is input in  | 
| formulas | Vector of  | 
| nchangepoints | Vector of  | 
| timename | 
 | 
| weights | Optional class  | 
| control | A  | 
Value
TS_on_LDA: TS_on_LDA-class list of results 
from TS applied for each model on each LDA model input.
 
check_TS_on_LDA_inputs: An error message is thrown if any input
is not proper, else NULL.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDAs <- LDA_set(document_term_table, topics = 2:3, nseeds = 2)
  LDA_models <- select_LDA(LDAs)
  weights <- document_weights(document_term_table)
  formulas <- c(~ 1, ~ newmoon)
  mods <- TS_on_LDA(LDA_models, document_covariate_table, formulas,
                    nchangepoints = 0:1, timename = "newmoon", weights)
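The expansion described above is a full crossing of LDA models, formulas, and numbers of changepoints. As a rough illustration, the sketch below enumerates the combinations for the example call (one selected LDA model, two formulas, and 0 or 1 changepoints). The ordering of rows and the object names are assumptions made for illustration only.

```r
# Illustrative enumeration of the model combinations (not the package's code).
formulas <- c(~ 1, ~ newmoon)
nchangepoints <- 0:1
n_LDA <- 1   # assume a single selected LDA model

grid <- expand.grid(LDA = seq_len(n_LDA),
                    formula = seq_along(formulas),
                    nchangepoints = nchangepoints)

# Replace the formula index with a readable label.
grid$formula <- vapply(formulas[grid$formula],
                       function(f) paste(deparse(f), collapse = ""),
                       character(1))
grid
nrow(grid)   # number of TS models to fit
```

The number of TS models grows multiplicatively, so long vectors of formulas or changepoint counts can make a run considerably more expensive.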
Create the summary plot for a TS fit to an LDA model
Description
Produces a two-panel figure of [1] the change point 
distributions as histograms over time and [2] the time series of the 
fitted topic proportions over time, based on a selected set of 
change point locations. 
 
pred_gamma_TS_plot produces a time series of the 
fitted topic proportions over time, based on a selected set of change 
point locations. 
 
rho_hist produces a plot of the change point 
distributions as histograms over time.
Usage
TS_summary_plot(
  x,
  cols = set_TS_summary_plot_cols(),
  bin_width = 1,
  xname = NULL,
  border = NA,
  selection = "median",
  LDATS = FALSE
)
pred_gamma_TS_plot(
  x,
  selection = "median",
  cols = set_gamma_colors(x),
  xname = NULL,
  together = FALSE,
  LDATS = FALSE
)
rho_hist(
  x,
  cols = set_rho_hist_colors(x$rhos),
  bin_width = 1,
  xname = NULL,
  border = NA,
  together = FALSE,
  LDATS = FALSE
)
Arguments
| x | Object of class  | 
| cols | 
 | 
| bin_width | Width of the bins used in the histograms, in units of the x-axis (the time variable used to fit the model). | 
| xname | Label for the x-axis in the summary time series plot. Defaults
to  | 
| border | Border for the histogram, default is  | 
| selection | Indicator of the change points to use. Currently only defined for "median" and "mode". | 
| LDATS | 
 | 
| together | 
 | 
Value
NULL.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  TSmod <- TS(data, gamma ~ 1, nchangepoints = 1, "newmoon", weights)
  TS_summary_plot(TSmod)
  pred_gamma_TS_plot(TSmod)
  rho_hist(TSmod)
Produce the autocorrelation panel for the TS diagnostic plot of a parameter
Description
Produce a vanilla ACF plot using acf for
the parameter of interest (rho or eta) as part of 
TS_diagnostics_plot.
Usage
autocorr_plot(x)
Arguments
| x | Vector of parameter values drawn from the posterior distribution, indexed to the iteration by the order of the vector. | 
Value
NULL.
Examples
 autocorr_plot(rnorm(100, 0, 1))
Check that LDA model input is proper
Description
Check that the LDA_models input is either a set of 
LDA models (class LDA_set, produced by
LDA_set) or a singular LDA model (class LDA,
produced by LDA).
Usage
check_LDA_models(LDA_models)
Arguments
| LDA_models | List of LDA models or singular LDA model to evaluate. | 
Value
An error message is thrown if LDA_models is not proper,
else NULL.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDAs <- LDA_set(document_term_table, topics = 2, nseeds = 1)
  LDA_models <- select_LDA(LDAs)
  check_LDA_models(LDA_models)
Check that a set of change point locations is proper
Description
Check that the change point locations are numeric
and conformable to integer values.
Usage
check_changepoints(changepoints = NULL)
Arguments
| changepoints | Change point locations to evaluate. | 
Value
An error message is thrown if changepoints are not proper,
else NULL.
Examples
  check_changepoints(100)
Check that a control list is proper
Description
Check that a list of controls is of the right class.
Usage
check_control(control, eclass = "list")
Arguments
| control | Control list to evaluate. | 
| eclass | Expected class of the list to be evaluated. | 
Value
an error message is thrown if the input is improper, otherwise 
NULL.
Examples
 check_control(list())
Check that the document covariate table is proper
Description
Check that the table of document-level covariates is conformable to a data frame and of the right size (correct number of documents) for the document-topic output from the LDA models.
Usage
check_document_covariate_table(
  document_covariate_table,
  LDA_models = NULL,
  document_term_table = NULL
)
Arguments
| document_covariate_table | Document covariate table to evaluate. | 
| LDA_models | Reference LDA model list (class  | 
| document_term_table | Optional input for checking when
 | 
Value
An error message is thrown if document_covariate_table is 
not proper, else NULL.
Examples
  data(rodents)
  check_document_covariate_table(rodents$document_covariate_table)
Check that document term table is proper
Description
Check that the table of observations is conformable to a matrix of integers.
Usage
check_document_term_table(document_term_table)
Arguments
| document_term_table | Table of observation count data (rows: 
documents, columns: terms. May be a class  | 
Value
an error message is thrown if the input is improper, otherwise 
NULL.
Examples
 data(rodents)
 check_document_term_table(rodents$document_term_table)
Check that a formula is proper
Description
Check that formula is actually a 
formula and that the
response and predictor variables are all included in data.
Usage
check_formula(data, formula)
Arguments
| data | 
 | 
| formula | 
 | 
Value
An error message is thrown if formula is not proper,
else NULL.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  check_formula(data, gamma ~ 1)
Check that formulas vector is proper and append the response variable
Description
Check that the vector of formulas is actually formatted
as a vector of formula objects and that the 
predictor variables are all included in the document covariate table.
Usage
check_formulas(formulas, document_covariate_table, control = list())
Arguments
| formulas | Vector of the formulas to evaluate. | 
| document_covariate_table | Document covariate table used to evaluate the availability of the data required by the formula inputs. | 
| control | A  | 
Value
An error message is thrown if formulas is 
not proper, else NULL.
Examples
  data(rodents)
  check_formulas(~ 1, rodents$document_covariate_table)
Check that nchangepoints vector is proper
Description
Check that the vector of numbers of changepoints is conformable to non-negative integers (0 or greater).
Usage
check_nchangepoints(nchangepoints)
Arguments
| nchangepoints | Vector of the number of changepoints to evaluate. | 
Value
An error message is thrown if nchangepoints is not proper,
else NULL.
Examples
  check_nchangepoints(0)
  check_nchangepoints(2)
Check that nseeds value or seeds vector is proper
Description
Check that the vector of numbers of seeds is conformable to integers greater than 0.
Usage
check_seeds(nseeds)
Arguments
| nseeds | 
 | 
Value
an error message is thrown if the input is improper, otherwise 
NULL.
Examples
 check_seeds(1)
 check_seeds(2)
Check that the time vector is proper
Description
Check that the vector of time values is included in the 
document covariate table and that it is either integer-conformable or
a date. If it is a date, the input is converted to an 
integer, resulting in a timestep of 1 day, which is often not the 
desired behavior.
Usage
check_timename(document_covariate_table, timename)
Arguments
| document_covariate_table | Document covariate table used to query for the time column. | 
| timename | Column name for the time variable to evaluate. | 
Value
An error message is thrown if timename is 
not proper, else NULL.
Examples
  data(rodents)
  check_timename(rodents$document_covariate_table, "newmoon")
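The caution about dates is easy to see in base R: converting a Date to an integer gives the number of days since 1970-01-01, so monthly samples end up roughly 30 time steps apart rather than 1. The sketch below uses made-up monthly dates, and the alternative integer index is only one possible workaround.

```r
# Monthly sampling dates (illustrative values).
dates <- as.Date(c("2020-01-01", "2020-02-01", "2020-03-01"))

# Integer conversion counts days, so consecutive samples are ~30 steps apart.
day_index <- as.integer(dates)
diff(day_index)

# One possible alternative: index the samples by their order in time.
sample_index <- seq_along(dates)
diff(sample_index)
```

Supplying an explicit integer time column (such as newmoon in the rodents data) avoids the implicit daily time step.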
Check that topics vector is proper
Description
Check that the vector of numbers of topics is conformable to integers greater than 1.
Usage
check_topics(topics)
Arguments
| topics | Vector of the number of topics to evaluate for each model.
Must be conformable to integer values. | 
Value
an error message is thrown if the input is improper, otherwise 
NULL.
Examples
 check_topics(2)
Check that weights vector is proper
Description
Check that the vector of document weights is numeric and positive and inform the user if the average weight isn't 1.
Usage
check_weights(weights)
Arguments
| weights | Vector of the document weights to evaluate, or  | 
Value
An error message is thrown if weights is not proper,
else NULL.
Examples
  check_weights(1)
  wts <- runif(100, 0.1, 100)
  check_weights(wts)
  wts2 <- wts / mean(wts)
  check_weights(wts2)
  check_weights(TRUE)
Count trips of the ptMCMC particles
Description
Count the full trips (from one extreme temperature chain to
the other and back again; Katzgraber et al. 2006) for each of the
ptMCMC particles, as identified by their id on initialization.
 
This function was designed to work within TS and process
the output of est_changepoints as a component of 
diagnose_ptMCMC, but has been generalized
and would work with any output from a ptMCMC as long as ids
is formatted properly.
Usage
count_trips(ids)
Arguments
| ids | 
 | 
Value
list of [1] vector of within-particle trip counts 
($trip_counts), and [2] vector of within-particle average 
trip rates ($trip_rates).
References
Katzgraber, H. G., S. Trebst, D. A. Huse, and M. Troyer. 2006. Feedback-optimized parallel tempering Monte Carlo. Journal of Statistical Mechanics: Theory and Experiment 3:P03018.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  data <- data[order(data[,"newmoon"]), ]
  rho_dist <- est_changepoints(data, gamma ~ 1, 1, "newmoon", weights,
                               TS_control())
  count_trips(rho_dist$ids)
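To illustrate what a full trip means, the sketch below counts round trips for a single particle, given the sequence of chains it occupies over the iterations. The input format (a plain vector of chain positions) and the helper name `count_round_trips` are assumptions for illustration and do not reflect the structure of the ids input.

```r
# Hypothetical sketch: count round trips between the two extreme chains
# (chain 1 and chain nchains) for one particle's position history.
count_round_trips <- function(chain_pos, nchains) {
  # Keep only visits to either extreme chain.
  extremes <- chain_pos[chain_pos %in% c(1, nchains)]
  if (length(extremes) == 0) return(0L)
  # Collapse repeated visits to the same extreme into one.
  visits <- rle(extremes)$values
  # Each round trip needs two crossings (there and back again).
  as.integer((length(visits) - 1) %/% 2)
}

one_trip <- count_round_trips(c(1, 2, 3, 2, 3, 2, 1), nchains = 3)
no_trip <- count_round_trips(c(1, 2, 1, 2), nchains = 3)
two_trips <- count_round_trips(c(3, 2, 1, 2, 3, 2, 1, 2, 3), nchains = 3)
```

Dividing a trip count by the number of iterations gives a per-iteration trip rate of the kind reported in $trip_rates.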
Calculate ptMCMC summary diagnostics
Description
Summarize the step and swap acceptance rates as well as trip metrics from the saved output of a ptMCMC estimation.
Usage
diagnose_ptMCMC(ptMCMCout)
Arguments
| ptMCMCout | Named  | 
Details
Within-chain step acceptance rates are averaged for each of the
chains from the raw step acceptance histories 
(ptMCMCout$step_accepts) and between-chain swap acceptance rates
are similarly averaged for each of the neighboring pairs of chains from
the raw swap acceptance histories (ptMCMCout$swap_accepts).
Trips are defined as movement from one extreme chain to the other and
back again (Katzgraber et al. 2006). Trips are counted and turned 
to per-iteration rates using count_trips.
 
 
This function was first designed to work within TS and 
process the output of est_changepoints, but has been 
generalized and would work with any output from a ptMCMC as long as 
ptMCMCout is formatted properly.
Value
list of [1] within-chain average step acceptance rates 
($step_acceptance_rate), [2] average between-chain swap acceptance
rates ($swap_acceptance_rate), [3] within-particle trip counts 
($trip_counts), and [4] within-particle average trip rates 
($trip_rates).
References
Katzgraber, H. G., S. Trebst, D. A. Huse, and M. Troyer. 2006. Feedback-optimized parallel tempering Monte Carlo. Journal of Statistical Mechanics: Theory and Experiment 3:P03018.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  data <- data[order(data[,"newmoon"]), ]
  rho_dist <- est_changepoints(data, gamma ~ 1, 1, "newmoon", 
                               weights, TS_control())
  diagnose_ptMCMC(rho_dist)
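The averaging described in Details can be sketched on simulated acceptance histories. The layout assumed here (chains, or neighboring chain pairs, in rows and iterations in columns) and the simulated acceptance probabilities are illustrative assumptions rather than the documented structure of ptMCMCout.

```r
# Simulated acceptance histories (assumed layout: rows = chains or chain
# pairs, columns = iterations; TRUE marks an accepted proposal).
set.seed(1)
ntemps <- 3
nit <- 100
step_accepts <- matrix(runif(ntemps * nit) < 0.4, nrow = ntemps)
swap_accepts <- matrix(runif((ntemps - 1) * nit) < 0.2, nrow = ntemps - 1)

# Average over iterations to obtain one rate per chain or chain pair.
step_rate <- rowMeans(step_accepts)
swap_rate <- rowMeans(swap_accepts)
step_rate
swap_rate
```

There is one swap rate fewer than there are chains because swaps are only proposed between neighboring chains.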
Calculate document weights for a corpus
Description
Simple calculation of document weights based on the average number of words in a document within the corpus (mean value = 1).
Usage
document_weights(document_term_table)
Arguments
| document_term_table | Table of observation count data (rows: 
documents, columns: terms. May be a class  | 
Value
Vector of weights, one for each document, with the average sample receiving a weight of 1.0.
Examples
 data(rodents)
 document_weights(rodents$document_term_table)
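The weighting described above can be checked by hand on a small table: each document's word total is divided by the mean total across documents, so the weights average to 1. The sketch below uses a made-up three-document table, and the helper name `weights_by_hand` is hypothetical.

```r
# Made-up document term table: 3 documents (rows) by 2 terms (columns).
dtt <- matrix(c(10,  5,
                 0, 20,
                10, 15),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("doc", 1:3), c("termA", "termB")))

# Hypothetical helper: document size relative to the average document size.
weights_by_hand <- function(document_term_table) {
  sizes <- rowSums(document_term_table)
  sizes / mean(sizes)
}

w <- weights_by_hand(dtt)
w
mean(w)
```

Documents with more words than average receive weights above 1 and therefore contribute more to the time series fit.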
Produce the posterior distribution ECDF panel for the TS diagnostic plot of a parameter
Description
Produce a vanilla ECDF (empirical cumulative distribution
function) plot using ecdf for the parameter of interest (rho or 
eta) as part of TS_diagnostics_plot. A horizontal line 
is added to show the median of the posterior.
Usage
ecdf_plot(x, xlab = "parameter value")
Arguments
| x | Vector of parameter values drawn from the posterior distribution, indexed to the iteration by the order of the vector. | 
| xlab | 
 | 
Value
NULL.
Examples
 ecdf_plot(rnorm(100, 0, 1))
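A comparable panel can be drawn with base R alone, which may help when reading the diagnostic figure. The sketch below assumes the horizontal reference line sits at a cumulative probability of 0.5, so that its crossing with the ECDF marks the posterior median. The styling choices are arbitrary.

```r
# Simulated posterior draws for one parameter.
set.seed(1)
draws <- rnorm(1000)

# Empirical cumulative distribution function of the draws.
Fn <- ecdf(draws)
plot(Fn, xlab = "parameter value", ylab = "cumulative probability", main = "")

# Assumed reference line at 0.5; the ECDF crosses it at the median.
abline(h = 0.5, lty = 2)

med <- median(draws)
Fn(med)   # cumulative probability at the median
```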
Use ptMCMC to estimate the distribution of change point locations
Description
This function executes ptMCMC-based estimation of the change point location distributions for multinomial Time Series analyses.
Usage
est_changepoints(
  data,
  formula,
  nchangepoints,
  timename,
  weights,
  control = list()
)
Arguments
| data | 
 | 
| formula | 
 | 
| nchangepoints | 
 | 
| timename | 
 | 
| weights | Optional class  | 
| control | A  | 
Value
List of saved data objects from the ptMCMC estimation of
change point locations (unless nchangepoints is 0, then 
NULL is returned).
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  formula <- gamma ~ 1
  nchangepoints <- 1
  control <- TS_control()
  data <- data[order(data[,"newmoon"]), ]
  rho_dist <- est_changepoints(data, formula, nchangepoints, "newmoon", 
                               weights, control)
Estimate the distribution of regressors, unconditional on the change point locations
Description
This function uses the marginal posterior distributions of
the change point locations (estimated by est_changepoints)
in combination with the conditional (on the change point locations) 
posterior distributions of the regressors (estimated by
multinom_TS) to estimate the marginal posterior 
distribution of the regressors, unconditional on the change point 
locations.
Usage
est_regressors(rho_dist, data, formula, timename, weights, control = list())
Arguments
| rho_dist | List of saved data objects from the ptMCMC estimation of
change point locations (unless  | 
| data | 
 | 
| formula | 
 | 
| timename | 
 | 
| weights | Optional class  | 
| control | A  | 
Details
The general approach follows that of Western and Kleykamp
(2004), although we note some important differences. Our regression
models are fit independently for each chunk (segment of time), and 
therefore the variance-covariance matrix for the full model 
has 0 entries for covariances between regressors in different
chunks of the time series. Further, because the regression model here
is a standard (non-hierarchical) softmax (Ripley 1996, Venables and 
Ripley 2002, Bishop 2006), there is no error term in the regression  
(as there is in the normal model used by Western and Kleykamp 2004), 
and so the posterior distribution used here is a multivariate normal,
as opposed to a multivariate t, as used by Western and Kleykamp (2004).
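To make the covariance structure concrete, the following base R sketch assembles a block-diagonal variance-covariance matrix from two chunk-level fits and draws from the resulting multivariate normal. All coefficient and covariance values are hypothetical, and the Cholesky-based sampler is used only to keep the sketch free of dependencies; it is not a statement of how est_regressors draws its samples.

```r
set.seed(1)

# Hypothetical per-chunk estimates and covariance matrices (2 coefficients each)
coef_1 <- c(0.5, -0.2)
vcov_1 <- matrix(c(0.04, 0.01, 0.01, 0.09), nrow = 2)
coef_2 <- c(1.1, 0.3)
vcov_2 <- matrix(c(0.02, 0.00, 0.00, 0.05), nrow = 2)

# Full-model covariance: per-chunk blocks on the diagonal, zeros elsewhere,
# because the chunks are fit independently
full_vcov <- matrix(0, nrow = 4, ncol = 4)
full_vcov[1:2, 1:2] <- vcov_1
full_vcov[3:4, 3:4] <- vcov_2
full_mean <- c(coef_1, coef_2)

# Multivariate normal draws via the Cholesky factor
ndraws <- 1000
z <- matrix(rnorm(ndraws * 4), nrow = ndraws, ncol = 4)
draws <- sweep(z %*% chol(full_vcov), 2, full_mean, "+")
dim(draws)
```

Each row of draws is one joint draw of all regressors, with columns ordered by chunk.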
Value
matrix of draws (rows) from the marginal posteriors of the 
coefficients across the segments (columns).
References
Bishop, C. M. 2006. Pattern Recognition and Machine Learning. Springer, New York, NY, USA.
Ripley, B. D. 1996. Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge, UK.
Venables, W. N. and B. D. Ripley. 2002. Modern Applied Statistics with S. Fourth Edition. Springer, New York, NY, USA.
Western, B. and M. Kleykamp. 2004. A Bayesian change point model for historical time series analysis. Political Analysis 12:354-374.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  formula <- gamma ~ 1
  nchangepoints <- 1
  control <- TS_control()
  data <- data[order(data[,"newmoon"]), ]
  rho_dist <- est_changepoints(data, formula, nchangepoints, "newmoon", 
                               weights, control)
  eta_dist <- est_regressors(rho_dist, data, formula, "newmoon", weights, 
                             control)
Expand the TS models across the factorial combination of LDA models, formulas, and number of change points
Description
Expand the completely crossed combination of model inputs: LDA model results, formulas, and number of change points.
Usage
expand_TS(LDA_models, formulas, nchangepoints)
Arguments
| LDA_models | List of LDA models (class  | 
| formulas | Vector of  | 
| nchangepoints | Vector of  | 
Value
Expanded data.frame table of the three values (columns) for
each unique model run (rows): [1] the LDA model (indicated
as a numeric element reference to the LDA_models object), [2] the 
regressor formula, and [3] the number of changepoints.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDAs <- LDA_set(document_term_table, topics = 2:3, nseeds = 2)
  LDA_models <- select_LDA(LDAs)
  weights <- document_weights(document_term_table)
  formulas <- c(~ 1, ~ newmoon)
  nchangepoints <- 0:1
  expand_TS(LDA_models, formulas, nchangepoints)
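The idea of a completely crossed table can be illustrated with base R's expand.grid. This is a sketch only: the formulas are stored as character strings for simplicity and the object names are hypothetical, so the result will not match the exact structure returned by expand_TS.

```r
# Hypothetical inputs: two LDA models (by index), two formulas,
# and two candidate numbers of change points
LDA_index <- 1:2
formulas <- c("~ 1", "~ newmoon")
nchangepoints <- 0:1

# expand.grid() builds every combination, one row per model run
runs <- expand.grid(LDA = LDA_index, formula = formulas,
                    nchangepoints = nchangepoints,
                    stringsAsFactors = FALSE)
nrow(runs)
```

With 2 LDA models, 2 formulas, and 2 change point counts, the table has 2 x 2 x 2 = 8 rows.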
Replace if TRUE
Description
If the focal input is TRUE, replace it with 
alternative.
Usage
iftrue(x = TRUE, alt = NULL)
Arguments
| x | Focal input. | 
| alt | Alternative value. | 
Value
x if not TRUE, alt otherwise.
Examples
 iftrue()
 iftrue(TRUE, 1)
 iftrue(2, 1)
 iftrue(FALSE, 1)
Jornada rodent data
Description
Counts of 17 rodent species across 24 sampling events, with the count being the total number observed across three trapping webs (146 traps in total) (Lightfoot et al. 2012).
Usage
jornada
Format
A list of two data.frame-class objects with rows 
corresponding to documents (sampling events). One element is the
document term table (called document_term_table), which contains
counts of the species (terms) in each sample (document), and the other is
the document covariate table (called document_covariate_table) 
with columns of covariates (time step, year, season).
Source
https://lter.jornada.nmsu.edu/data-catalog/
References
Lightfoot, D. C., A. D. Davidson, D. G. Parker, L. Hernandez, and J. W. Laundre. 2012. Bottom-up regulation of desert grassland and shrubland rodent communities: implications of species-specific reproductive potentials. Journal of Mammalogy 93:1017-1028.
Calculate the log likelihood of a VEM LDA model fit
Description
Imported but updated calculations from topicmodels package, as
applied to Latent Dirichlet Allocation fit with Variational Expectation 
Maximization via LDA.
Usage
## S3 method for class 'LDA_VEM'
logLik(object, ...)
Arguments
| object | A  | 
| ... | Not used, simply included to maintain method compatibility. | 
Details
The number of degrees of freedom is 1 (for alpha) plus the number of entries in the document-topic matrix. The number of observations is the number of entries in the document-term matrix.
Value
Log likelihood of the model, as class logLik, with attributes df
(degrees of freedom) and nobs (number of observations).
References
Buntine, W. 2002. Variational extensions to EM and multinomial PCA. European Conference on Machine Learning, Lecture Notes in Computer Science 2430:23-34.
Grun, B. and K. Hornik. 2011. topicmodels: An R Package for Fitting Topic Models. Journal of Statistical Software 40:13.
Hoffman, M. D., D. M. Blei, and F. Bach. 2010. Online learning for latent Dirichlet allocation. Advances in Neural Information Processing Systems 23:856-864.
Examples
  data(rodents)
  lda_data <- rodents$document_term_table
  r_LDA <- LDA_set(lda_data, topics = 2)   
  logLik(r_LDA[[1]])
Determine the log likelihood of a Time Series model
Description
Convenience function to extract and format the log likelihood
of a TS_fit-class object fit by multinom_TS.
Usage
## S3 method for class 'TS_fit'
logLik(object, ...)
Arguments
| object | Class  | 
| ... | Not used, simply included to maintain method compatibility. | 
Value
Log likelihood of the model, as class logLik, with attributes df
(degrees of freedom) and nobs (number of observations).
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  TSmod <- TS(data, gamma ~ 1, nchangepoints = 1, "newmoon", weights)
  logLik(TSmod)
Log likelihood of a multinomial TS model
Description
Convenience function to simply extract the logLik
element (and df and nobs) from a multinom_TS_fit
object fit by multinom_TS. Extends 
logLik from multinom to 
multinom_TS_fit objects.
Usage
## S3 method for class 'multinom_TS_fit'
logLik(object, ...)
Arguments
| object | A  | 
| ... | Not used, simply included to maintain method compatibility. | 
Value
Log likelihood of the model, as class logLik, with 
attributes df (degrees of freedom) and nobs (the number of
weighted observations, accounting for size differences among documents).
Examples
  data(rodents)
  dtt <- rodents$document_term_table
  lda <- LDA_set(dtt, 2, 1, list(quiet = TRUE))
  dct <- rodents$document_covariate_table
  dct$gamma <- lda[[1]]@gamma
  weights <- document_weights(dtt)
  mts <- multinom_TS(dct, formula = gamma ~ 1, changepoints = c(20,50),
                     timename = "newmoon", weights = weights)
  logLik(mts)
Calculate the log-sum-exponential (LSE) of a vector
Description
Calculate the exponent of a vector (offset by the max), sum the elements, calculate the log, remove the offset.
Usage
logsumexp(x)
Arguments
| x | 
 | 
Value
The LSE.
Examples
  logsumexp(1:10)
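The four steps in the Description (offset, exponentiate and sum, log, remove the offset) can be sketched as below. The function name lse_sketch is hypothetical and this is not claimed to be the package's source; it shows why the offset matters when the inputs are large.

```r
# Hypothetical re-implementation of the steps in the Description
lse_sketch <- function(x) {
  offset <- max(x)                       # offset by the max
  log(sum(exp(x - offset))) + offset     # exponentiate, sum, log, remove offset
}

naive <- log(sum(exp(c(1000, 1000))))    # exp(1000) overflows to Inf
stable <- lse_sketch(c(1000, 1000))      # stays finite
naive
stable
```

Subtracting the maximum keeps every exponent at or below zero, so the sum cannot overflow.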
Logical control on whether or not to memoise
Description
This function provides a simple, logical toggle control on
whether the function fun should be memoised via
memoise or not.
Usage
memoise_fun(fun, memoise_tf = TRUE)
Arguments
| fun | Function name to (potentially) be memoised. | 
| memoise_tf | 
 | 
Value
fun, memoised if desired.
Examples
  sum_memo <- memoise_fun(sum)
Optionally generate a message based on a logical input
Description
Given the input to quiet, generate the message(s) 
in msg or not.
Usage
messageq(msg = NULL, quiet = FALSE)
Arguments
| msg | 
 | 
| quiet | 
 | 
Examples
  messageq("hello")
  messageq("hello", TRUE)
Create a properly symmetric variance covariance matrix
Description
A wrapper on vcov to produce a symmetric
matrix. If the default matrix returned by vcov is
symmetric, it is returned unchanged. If it is not, in fact, symmetric
(as occurs occasionally with multinom applied to 
proportions), the matrix is made symmetric by averaging the lower and
upper triangles. If the relative difference between the upper and lower 
triangles for any entry is more than 0.1
Usage
mirror_vcov(x)
Arguments
| x | Model object that has a defined method for 
 | 
Value
Properly symmetric variance covariance matrix.
Examples
  dat <- data.frame(y = rnorm(50), x = rnorm(50))
  mod <- lm(dat)
  mirror_vcov(mod)
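Averaging the lower and upper triangles, as described above, amounts to averaging a matrix with its transpose. The sketch below shows that step on a small, deliberately asymmetric matrix; symmetrize_sketch is a hypothetical helper and omits any checks that mirror_vcov itself performs.

```r
# Hypothetical helper: average the upper and lower triangles of a square matrix
symmetrize_sketch <- function(m) {
  (m + t(m)) / 2
}

# Slightly asymmetric 2 x 2 matrix (off-diagonals 0.19 and 0.21)
m <- matrix(c(1.00, 0.21, 0.19, 2.00), nrow = 2)
s <- symmetrize_sketch(m)
isSymmetric(m)
isSymmetric(s)
s
```

The diagonal is unchanged and each off-diagonal pair is replaced by its mean.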
Determine the mode of a distribution
Description
Find the most common entry in a vector. Ties are broken rather than reported: if several values are equally common, the first value encountered within the modal set is deemed the mode.
Usage
modalvalue(x)
Arguments
| x | 
 | 
Value
Numeric value of the mode.
Examples
 d1 <- c(1, 1, 1, 2, 2, 3)
 modalvalue(d1)
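One way to find a mode with a first-encountered tie rule is sketched below. The name mode_sketch is hypothetical, and "first encountered" is taken here to mean first in the order of the input vector, which is an assumption; the package's own ordering of tied values may differ.

```r
# Hypothetical mode finder; ties go to the value seen first in x
mode_sketch <- function(x) {
  values <- unique(x)                    # candidates, in order of first appearance
  counts <- tabulate(match(x, values))   # frequency of each candidate
  values[which.max(counts)]              # which.max() returns the first maximum
}

mode_sketch(c(1, 1, 1, 2, 2, 3))   # 1
mode_sketch(c(3, 3, 1, 1))         # tie between 3 and 1: returns 3
```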
Fit a multinomial change point Time Series model
Description
Fit a set of multinomial regression models (via
multinom, Venables and Ripley 2002) to a time series
of data divided into multiple segments (a.k.a. chunks) based on given 
locations for a set of change points. 
 
check_multinom_TS_inputs checks that the inputs to 
multinom_TS are of proper classes for an analysis.
Usage
multinom_TS(
  data,
  formula,
  changepoints = NULL,
  timename = "time",
  weights = NULL,
  control = list()
)
check_multinom_TS_inputs(
  data,
  formula = gamma ~ 1,
  changepoints = NULL,
  timename = "time",
  weights = NULL,
  control = list()
)
Arguments
| data | 
 | 
| formula | 
 | 
| changepoints | Numeric vector indicating locations of the change 
points. Must be conformable to  | 
| timename | 
 | 
| weights | Optional class  | 
| control | A  | 
Value
multinom_TS: Object of class multinom_TS_fit, 
which is a list of [1]
chunk-level model fits ("chunk models"), [2] the total log 
likelihood combined across all chunks ("logLik"), and [3] a 
data.frame of chunk beginning and ending times ("chunks"
with columns "start" and "end"). 
 
check_multinom_TS_inputs: an error message is thrown if any 
input is improper, otherwise NULL.
References
Venables, W. N. and B. D. Ripley. 2002. Modern Applied Statistics with S. Fourth Edition. Springer, New York, NY, USA.
Examples
  data(rodents)
  dtt <- rodents$document_term_table
  lda <- LDA_set(dtt, 2, 1, list(quiet = TRUE))
  dct <- rodents$document_covariate_table
  dct$gamma <- lda[[1]]@gamma
  weights <- document_weights(dtt)
  check_multinom_TS_inputs(dct, timename = "newmoon")
  mts <- multinom_TS(dct, formula = gamma ~ 1, changepoints = c(20,50),
                     timename = "newmoon", weights = weights) 
Fit a multinomial Time Series model chunk
Description
Fit a multinomial regression model (via
multinom, Ripley 1996, Venables and Ripley 2002)
to a defined chunk of time (a.k.a. segment)
[chunk$start, chunk$end] within a time series.
Usage
multinom_TS_chunk(
  data,
  formula,
  chunk,
  timename = "time",
  weights = NULL,
  control = list()
)
Arguments
| data | Class  | 
| formula | Formula as a  | 
| chunk | Length-2 vector of times: [1]  | 
| timename | 
 | 
| weights | Optional class  | 
| control | A  | 
Value
Fitted model object for the chunk, of classes multinom and
nnet.
References
Ripley, B. D. 1996. Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge, UK.
Venables, W. N. and B. D. Ripley. 2002. Modern Applied Statistics with S. Fourth Edition. Springer, New York, NY, USA.
Examples
  data(rodents)
  dtt <- rodents$document_term_table
  lda <- LDA_set(dtt, 2, 1, list(quiet = TRUE))
  dct <- rodents$document_covariate_table
  dct$gamma <- lda[[1]]@gamma
  weights <- document_weights(dtt)
  chunk <- c(start = 0, end = 100)
  mtsc <- multinom_TS_chunk(dct, formula = gamma ~ 1, chunk = chunk,
                     timename = "newmoon", weights = weights) 
Normalize a vector
Description
Normalize a numeric vector to be on the scale of [0,1].
Usage
normalize(x)
Arguments
| x | 
 | 
Value
Normalized x.
Examples
 normalize(1:10)
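A common way to place a vector on [0,1] is min-max scaling, sketched below. Whether normalize uses exactly this formula is an assumption, and normalize_sketch is a hypothetical name.

```r
# Hypothetical min-max scaling: the minimum maps to 0 and the maximum to 1
normalize_sketch <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

normalize_sketch(c(2, 4, 10))   # 0.00 0.25 1.00
```

Note that under this formula a constant vector has a zero range, so the result is NaN.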
Package the output of LDA_TS
Description
Combine the objects returned by LDA_set,
select_LDA, TS_on_LDA, and
select_TS, name them as elements of the list, and
set the class of the list as LDA_TS, for the return from
LDA_TS.
Usage
package_LDA_TS(LDAs, sel_LDA, TSs, sel_TSs)
Arguments
| LDAs | List (class:  | 
| sel_LDA | A reduced version of  | 
| TSs | Class  | 
| sel_TSs | A reduced version of  | 
Value
LDA_TS-class object including all fitted models and the
selected models specifically, ready to be returned from 
LDA_TS.
Examples
  data(rodents)
  data <- rodents
  control <- LDA_TS_control()              
  dtt <- data$document_term_table
  dct <- data$document_covariate_table
  weights <- document_weights(dtt)
  LDAs <- LDA_set(dtt, 2, 1, control$LDA_set_control)
  sel_LDA <- select_LDA(LDAs, control$LDA_set_control)
  TSs <- TS_on_LDA(sel_LDA, dct, ~1, 1, "newmoon", weights,  
                   control$TS_control)
  sel_TSs <- select_TS(TSs, control$TS_control)
  package_LDA_TS(LDAs, sel_LDA, TSs, sel_TSs)
 
Package the output from LDA_set
Description
Name the elements (LDA models) and set the class 
(LDA_set) of the models returned by LDA_set.
Usage
package_LDA_set(mods, mod_topics, mod_seeds)
Arguments
| mods | Fitted models returned from  | 
| mod_topics | Vector of  | 
| mod_seeds | Vector of  | 
Value
list (class: LDA_set) of LDA models (class: 
LDA_VEM).
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  topics <- 2
  nseeds <- 2
  control <- LDA_set_control()
  mod_topics <- rep(topics, each = length(seq(2, nseeds * 2, 2)))
  iseed <- control$iseed
  mod_seeds <- rep(seq(iseed, iseed + (nseeds - 1)* 2, 2), length(topics))
  nmods <- length(mod_topics)
  mods <- vector("list", length = nmods)
  for (i in 1:nmods){
    LDA_msg(mod_topics[i], mod_seeds[i], control)
    control_i <- prep_LDA_control(seed = mod_seeds[i], control = control)
    mods[[i]] <- topicmodels::LDA(document_term_table, k = mod_topics[i], 
                     control = control_i)
  }
  package_LDA_set(mods, mod_topics, mod_seeds)
Summarize the Time Series model
Description
Calculate relevant summaries for the run of a Time Series
model within TS and package the output as a
TS_fit-class object.
Usage
package_TS(data, formula, timename, weights, control, rho_dist, eta_dist)
Arguments
| data | 
 | 
| formula | 
 | 
| timename | 
 | 
| weights | Optional class  | 
| control | A  | 
| rho_dist | List of saved data objects from the ptMCMC estimation of
change point locations returned by  | 
| eta_dist | Matrix of draws (rows) from the marginal posteriors of the 
coefficients across the segments (columns), as estimated by
 | 
Value
TS_fit-class list containing the following elements, many of
which are hidden for printing, but are accessible:
- data: data input to the function.
- formula: formula input to the function.
- nchangepoints: nchangepoints input to the function.
- weights: weights input to the function.
- timename: timename input to the function.
- control: control input to the function.
- lls: Iteration-by-iteration logLik values for the full time series fit by multinom_TS.
- rhos: Iteration-by-iteration change point estimates from est_changepoints.
- etas: Iteration-by-iteration marginal regressor estimates from est_regressors, which have been unconditioned with respect to the change point locations.
- ptMCMC_diagnostics: ptMCMC diagnostics, see diagnose_ptMCMC.
- rho_summary: Summary table describing rhos (the change point locations), see summarize_rhos.
- rho_vcov: Variance-covariance matrix for the estimates of rhos (the change point locations), see measure_rho_vcov.
- eta_summary: Summary table describing etas (the regressors), see summarize_etas.
- eta_vcov: Variance-covariance matrix for the estimates of etas (the regressors), see measure_eta_vcov.
- logLik: Across-iteration average of log-likelihoods (lls).
- nparams: Total number of parameters in the full model, including the change point locations and regressors.
- AIC: Penalized negative log-likelihood, based on logLik and nparams.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  formula <- gamma ~ 1
  nchangepoints <- 1
  control <- TS_control()
  data <- data[order(data[,"newmoon"]), ]
  rho_dist <- est_changepoints(data, formula, nchangepoints, "newmoon", 
                               weights, control)
  eta_dist <- est_regressors(rho_dist, data, formula, "newmoon", weights, 
                             control)
  package_TS(data, formula, "newmoon", weights, control, rho_dist, 
             eta_dist)
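The AIC element listed in the Value section is described as a penalized negative log-likelihood based on logLik and nparams. A minimal sketch, assuming the standard definition AIC = -2 * logLik + 2 * nparams and using hypothetical numbers, is:

```r
# Hypothetical values, for illustration only
logLik_value <- -250.4   # across-iteration average log likelihood
nparams <- 3             # e.g. 1 change point location + 2 regressors

aic <- -2 * logLik_value + 2 * nparams
aic   # 506.8
```

Lower values indicate a better trade-off between fit and the number of parameters.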
Package the output of TS_on_LDA
Description
Set the class and name the elements of the results list 
returned from applying TS to the combination of TS models
requested for the LDA model(s) input.
Usage
package_TS_on_LDA(TSmods, LDA_models, models)
Arguments
| TSmods | list of results from  | 
| LDA_models | List of LDA models (class  | 
| models | 
 | 
Value
Class TS_on_LDA list of results from TS 
applied for each model on each LDA model input.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDAs <- LDA_set(document_term_table, topics = 2:3, nseeds = 2)
  LDA_models <- select_LDA(LDAs)
  weights <- document_weights(document_term_table)
  mods <- expand_TS(LDA_models, c(~ 1, ~ newmoon), 0:1)
  nmods <- nrow(mods)
  TSmods <- vector("list", nmods)
  for(i in 1:nmods){
    formula_i <- mods$formula[[i]]
    nchangepoints_i <- mods$nchangepoints[i]
    data_i <- prep_TS_data(document_covariate_table, LDA_models, mods, i)
    TSmods[[i]] <- TS(data_i, formula_i, nchangepoints_i, "newmoon", 
                      weights, TS_control())
  }
  package_TS_on_LDA(TSmods, LDA_models, mods)
Package the output of the chunk-level multinomial models into a multinom_TS_fit list
Description
Takes the list of fitted chunk-level models returned from
TS_chunk_memo (the memoised version of 
multinom_TS_chunk) and packages it as a 
multinom_TS_fit object. This involves naming the model fits based 
on the chunk time windows, combining the log likelihood values across the 
chunks, and setting the class of the output object.
Usage
package_chunk_fits(chunks, fits)
Arguments
| chunks | Data frame of  | 
| fits | List of chunk-level fits returned by  | 
Value
Object of class multinom_TS_fit, which is a list of [1]
chunk-level model fits, [2] the total log likelihood combined across 
all chunks, and [3] the chunk time data table.
Examples
  data(rodents)
  dtt <- rodents$document_term_table
  lda <- LDA_set(dtt, 2, 1, list(quiet = TRUE))
  dct <- rodents$document_covariate_table
  dct$gamma <- lda[[1]]@gamma
  weights <- document_weights(dtt)
  formula <- gamma ~ 1
  changepoints <- c(20,50)
  timename <- "newmoon"
  TS_chunk_memo <- memoise_fun(multinom_TS_chunk, TRUE)
  chunks <- prep_chunks(dct, changepoints, timename)
  nchunks <- nrow(chunks)
  fits <- vector("list", length = nchunks)
  for (i in 1:nchunks){
    fits[[i]] <- TS_chunk_memo(dct, formula, chunks[i, ], timename, 
                               weights, TS_control())
  }
  package_chunk_fits(chunks, fits) 
Plot the key results from a full LDATS analysis
Description
Generalization of the plot function to
work on fitted LDA_TS model objects (class LDA_TS) returned by
LDA_TS.
Usage
## S3 method for class 'LDA_TS'
plot(
  x,
  ...,
  cols = set_LDA_TS_plot_cols(),
  bin_width = 1,
  xname = NULL,
  border = NA,
  selection = "median"
)
Arguments
| x | A  | 
| ... | Additional arguments to be passed to subfunctions. Not currently
used, just retained for alignment with  | 
| cols | 
 | 
| bin_width | Width of the bins used in the histograms of the summary time series plot, in units of the time variable used to fit the model (the x-axis). | 
| xname | Label for the x-axis in the summary time series plot. Defaults
to  | 
| border | Border for the histogram, default is  | 
| selection | Indicator of the change points to use in the time series
summary plot. Currently only defined for  | 
Value
NULL.
Examples
  data(rodents)
  mod <- LDA_TS(data = rodents, topics = 2, nseeds = 1, formulas = ~1,
                nchangepoints = 1, timename = "newmoon")
  plot(mod, bin_width = 5, xname = "New moon")
Plot the results of an LDATS LDA model
Description
Create an LDATS LDA summary plot, with a top panel showing
the topic proportions for each word and a bottom panel showing the topic
proportions of each document/over time. The plot function is defined for
class LDA_VEM specifically (see LDA).
 
LDA_plot_top_panel creates an LDATS LDA summary plot 
top panel showing the topic proportions word-by-word. 
 
LDA_plot_bottom_panel creates an LDATS LDA summary plot
bottom panel showing the topic proportions over time/documents.
Usage
## S3 method for class 'LDA_VEM'
plot(
  x,
  ...,
  xtime = NULL,
  xname = NULL,
  cols = NULL,
  option = "C",
  alpha = 0.8,
  LDATS = FALSE
)
LDA_plot_top_panel(
  x,
  cols = NULL,
  option = "C",
  alpha = 0.8,
  together = FALSE,
  LDATS = FALSE
)
LDA_plot_bottom_panel(
  x,
  xtime = NULL,
  xname = NULL,
  cols = NULL,
  option = "C",
  alpha = 0.8,
  together = FALSE,
  LDATS = FALSE
)
Arguments
| x | Object of class  | 
| ... | Not used, retained for alignment with base function. | 
| xtime | Optional x values used to plot the topic proportions according to a specific time value (rather than simply the order of observations). | 
| xname | Optional name for the x values used in plotting the topic proportions (otherwise defaults to "Document"). | 
| cols | Colors to be used to plot the topics.
Any valid color values (e.g., see  | 
| option | A  | 
| alpha | Numeric value [0,1] that indicates the transparency of the 
colors used. Supported only on some devices, see 
 | 
| LDATS | 
 | 
| together | 
 | 
Value
NULL.
Examples
  data(rodents)
  lda_data <- rodents$document_term_table
  r_LDA <- LDA_set(lda_data, topics = 4, nseeds = 10) 
  best_lda <- select_LDA(r_LDA)[[1]]
  plot(best_lda, option = "cividis")
  LDA_plot_top_panel(best_lda, option = "cividis")
  LDA_plot_bottom_panel(best_lda, option = "cividis")
Plot a set of LDATS LDA models
Description
Generalization of the plot function to 
work on a list of LDA topic models (class LDA_set).
Usage
## S3 method for class 'LDA_set'
plot(x, ...)
Arguments
| x | An  | 
| ... | Additional arguments to be passed to subfunctions. | 
Value
NULL.
Examples
  data(rodents)
  lda_data <- rodents$document_term_table
  r_LDA <- LDA_set(lda_data, topics = 2, nseeds = 2) 
  plot(r_LDA)
Plot an LDATS TS model
Description
Generalization of the plot function to 
work on fitted TS model objects (class TS_fit) returned from 
TS.
Usage
## S3 method for class 'TS_fit'
plot(
  x,
  ...,
  plot_type = "summary",
  interactive = FALSE,
  cols = set_TS_summary_plot_cols(),
  bin_width = 1,
  xname = NULL,
  border = NA,
  selection = "median",
  LDATS = FALSE
)
Arguments
| x | A  | 
| ... | Additional arguments to be passed to subfunctions. Not currently
used, just retained for alignment with  | 
| plot_type | "diagnostic" or "summary". | 
| interactive | 
 | 
| cols | 
 | 
| bin_width | Width of the bins used in the histograms of the summary time series plot, in units of the x-axis (the time variable used to fit the model). | 
| xname | Label for the x-axis in the summary time series plot. Defaults
to  | 
| border | Border for the histogram, default is  | 
| selection | Indicator of the change points to use in the time series
summary plot. Currently only defined for  | 
| LDATS | 
 | 
Value
NULL.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  TSmod <- TS(data, gamma ~ 1, nchangepoints = 1, "newmoon", weights)
  plot(TSmod)
Produce the posterior distribution histogram panel for the TS diagnostic plot of a parameter
Description
Produce a vanilla histogram plot using hist for the 
parameter of interest (rho or eta) as part of 
TS_diagnostics_plot. A vertical line is added to show the 
median of the posterior.
Usage
posterior_plot(x, xlab = "parameter value")
Arguments
| x | Vector of parameter values drawn from the posterior distribution, indexed to the iteration by the order of the vector. | 
| xlab | 
 | 
Value
NULL.
Examples
 posterior_plot(rnorm(100, 0, 1))
Set the control inputs to include the seed
Description
Update the control list for the LDA model with the specific seed as indicated, and remove controls not used within the LDA itself.
Usage
prep_LDA_control(seed, control = list())
Arguments
| seed | 
 | 
| control | Named list of control parameters to be used in 
 | 
Value
list of controls to be used in the LDA.
Examples
  prep_LDA_control(seed = 1) 
Prepare the model-specific data to be used in the TS analysis of LDA output
Description
Append the estimated topic proportions from a fitted LDA model 
to the document covariate table to create the data structure needed for 
TS.
Usage
prep_TS_data(document_covariate_table, LDA_models, mods, i = 1)
Arguments
| document_covariate_table | Document covariate table (rows: documents,
columns: time index and covariate options). Every model needs a
covariate to describe the time value for each document (in whatever 
units and whose name in the table is input in  | 
| LDA_models | List of LDA models (class  | 
| mods | The  | 
| i | 
 | 
Value
Class data.frame object including [1] the time variable
(indicated in control), [2] the predictor variables (required by
formula), and [3] the multinomial response variable (indicated
in formula), ready for input into TS.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDAs <- LDA_set(document_term_table, topics = 2:3, nseeds = 2)
  LDA_models <- select_LDA(LDAs)
  weights <- document_weights(document_term_table)
  formulas <- c(~ 1, ~ newmoon)
  mods <- expand_TS(LDA_models, formulas = ~1, nchangepoints = 0)
  data1 <- prep_TS_data(document_covariate_table, LDA_models, mods)
Prepare the time chunk table for a multinomial change point Time Series model
Description
Creates the table containing the start and end times for each
chunk within a time series, based on the change points (used to break up
the time series) and the range of the time series. If there are no 
change points (i.e. changepoints is NULL), there is still a
single chunk defined by the start and end of the time series.
Usage
prep_chunks(data, changepoints = NULL, timename = "time")
Arguments
| data | Class  | 
| changepoints | Numeric vector indicating locations of the change 
points. Must be conformable to  | 
| timename | 
 | 
Value
data.frame of start and end times (columns)
for each chunk (rows).
Examples
  data(rodents)
  dtt <- rodents$document_term_table
  lda <- LDA_set(dtt, 2, 1, list(quiet = TRUE))
  dct <- rodents$document_covariate_table
  dct$gamma <- lda[[1]]@gamma
  chunks <- prep_chunks(dct, changepoints = 100, timename = "newmoon")   
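The chunk table can be illustrated with a small base R sketch. It assumes integer time steps, sorted change points, and that each change point is the last time step of its chunk; these conventions and the name chunks_sketch are assumptions for illustration rather than a description of prep_chunks internals.

```r
# Hypothetical chunk table builder
chunks_sketch <- function(time, changepoints = NULL) {
  start <- c(min(time), changepoints + 1)  # each chunk starts just after a change point
  end <- c(changepoints, max(time))        # and ends at the next change point
  data.frame(start = start, end = end)
}

two_cpts <- chunks_sketch(1:100, changepoints = c(20, 50))
no_cpts <- chunks_sketch(1:100)
two_cpts
no_cpts
```

With no change points the table has a single row spanning the whole series, matching the behavior described above.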
Initialize and update the change point matrix used in the ptMCMC algorithm
Description
Each of the chains is initialized by prep_cpts using a 
draw from the available times (i.e. assuming a uniform prior). The best 
fit (by likelihood) draw is put in the focal chain, with each subsequently 
worse fit placed into the subsequently hotter chain. update_cpts
updates the change points after every iteration in the ptMCMC algorithm.
Usage
prep_cpts(data, formula, nchangepoints, timename, weights, control = list())
update_cpts(cpts, swaps)
Arguments
| data | 
 | 
| formula | 
 | 
| nchangepoints | 
 | 
| timename | 
 | 
| weights | Optional class  | 
| control | A  | 
| cpts | The existing matrix of change points. | 
| swaps | Chain configuration after among-temperature swaps. | 
Value
list of [1] matrix of change points (rows) for 
each temperature (columns) and [2] vector of log-likelihood 
values for each of the chains.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  data <- data[order(data[,"newmoon"]), ]
  saves <- prep_saves(1, TS_control())
  inputs <- prep_ptMCMC_inputs(data, gamma ~ 1, 1, "newmoon", weights,
                               TS_control())
  cpts <- prep_cpts(data, gamma ~ 1, 1, "newmoon", weights, TS_control())
  ids <- prep_ids(TS_control())
  for(i in 1:TS_control()$nit){
    steps <- step_chains(i, cpts, inputs)
    swaps <- swap_chains(steps, inputs, ids)
    saves <- update_saves(i, saves, steps, swaps)
    cpts <- update_cpts(cpts, swaps)
    ids <- update_ids(ids, swaps)
  }
Initialize and update the chain ids throughout the ptMCMC algorithm
Description
prep_ids creates and update_ids updates 
the active vector of identities (ids) for each of the chains in the 
ptMCMC algorithm. These ids are used to track trips of the particles
among chains.
 
These functions were designed to work within TS and 
specifically est_changepoints, but have been generalized
and would work within any general ptMCMC as long as control,
ids, and swaps are formatted properly.
Usage
prep_ids(control = list())
update_ids(ids, swaps)
Arguments
| control | A  | 
| ids | The existing vector of chain ids. | 
| swaps | Chain configuration after among-temperature swaps. | 
Value
The vector of chain ids.
Examples
  prep_ids()
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  data <- data[order(data[,"newmoon"]), ]
  saves <- prep_saves(1, TS_control())
  inputs <- prep_ptMCMC_inputs(data, gamma ~ 1, 1, "newmoon", weights,
                               TS_control())
  cpts <- prep_cpts(data, gamma ~ 1, 1, "newmoon", weights, TS_control())
  ids <- prep_ids(TS_control())
  for(i in 1:TS_control()$nit){
    steps <- step_chains(i, cpts, inputs)
    swaps <- swap_chains(steps, inputs, ids)
    saves <- update_saves(i, saves, steps, swaps)
    cpts <- update_cpts(cpts, swaps)
    ids <- update_ids(ids, swaps)
  }
Initialize and tick through the progress bar
Description
prep_pbar creates and update_pbar steps
through the progress bars (if desired) in TS.
Usage
prep_pbar(control = list(), bar_type = "rho", nr = NULL)
update_pbar(pbar, control = list())
Arguments
| control | A  | 
| bar_type | "rho" (for change point locations) or "eta" (for regressors). | 
| nr | 
 | 
| pbar | The progress bar object returned from  | 
Value
prep_pbar: the initialized progress bar object. 
 
update_pbar: the ticked-forward pbar.
Examples
  pb <- prep_pbar(control = list(nit = 2)); pb
  pb <- update_pbar(pb); pb
  pb <- update_pbar(pb); pb
Pre-calculate the change point proposal distribution for the ptMCMC algorithm
Description
Calculate the proposal distribution in advance of actually running the ptMCMC algorithm in order to decrease computation time. The proposal distribution is a joint of three distributions: [1] a multinomial distribution selecting among the change points within the chain, [2] a binomial distribution selecting the direction of the step of the change point (earlier or later in the time series), and [3] a geometric distribution selecting the magnitude of the step.
Usage
prep_proposal_dist(nchangepoints, control = list())
Arguments
| nchangepoints | Integer corresponding to the number of change points to include in the model. 0 is a valid input (corresponding to no change points, so a singular time series model), and the current implementation can reasonably include up to 6 change points. The number of change points is used to dictate the segmentation of the data for each continuous model and each LDA model. | 
| control | A  | 
Value
list of two matrix elements: [1] the size of the 
proposed step for each iteration of each chain and [2] the identity of 
the change point location to be shifted by the step for each iteration of
each chain.
Examples
  prep_proposal_dist(nchangepoints = 2)
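The three components named in the description can be sketched with base R samplers. This is an illustration only, not the package's internal code: the sizes nit and ntemps, the average step size magnitude, and the equal-probability draw among change points are assumptions made for the example. The element names steps and which_steps follow the names used in the proposed_step_mods example.

```r
# Illustrative sketch of a pre-computed proposal distribution (assumed values).
set.seed(1)
nit <- 5            # assumed number of iterations
ntemps <- 3         # assumed number of chains (temperatures)
nchangepoints <- 2
magnitude <- 12     # assumed average step size

n <- nit * ntemps
# [1] which change point to move (equal-probability draw, an assumption)
which_steps <- matrix(sample.int(nchangepoints, n, replace = TRUE),
                      nit, ntemps)
# [2] direction of the move: earlier (-1) or later (+1)
signs <- 2 * rbinom(n, size = 1, prob = 0.5) - 1
# [3] size of the move: geometric, shifted so every step is at least 1
sizes <- 1 + rgeom(n, prob = 1 / magnitude)
steps <- matrix(signs * sizes, nit, ntemps)

proposal <- list(steps = steps, which_steps = which_steps)
```

Drawing every step before the sampler starts means the main loop only has to index into two matrices, which is where the computational saving comes from.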
Prepare the inputs for the ptMCMC algorithm estimation of change points
Description
Package the static inputs (controls and data structures) used
by the ptMCMC algorithm in the context of estimating change points. 
 
This function was designed to work within TS and 
specifically est_changepoints. It is still hardcoded to do
so, but has the capacity to be generalized to work with any estimation
via ptMCMC with additional coding work.
Usage
prep_ptMCMC_inputs(
  data,
  formula,
  nchangepoints,
  timename,
  weights = NULL,
  control = list()
)
Arguments
| data | Class  | 
| formula | 
 | 
| nchangepoints | Integer corresponding to the number of change points to include in the model. 0 is a valid input (corresponding to no change points, so a singular time series model), and the current implementation can reasonably include up to 6 change points. The number of change points is used to dictate the segmentation of the data for each continuous model and each LDA model. | 
| timename | 
 | 
| weights | Optional class  | 
| control | A  | 
Value
Class ptMCMC_inputs list, containing the static 
inputs for use within the ptMCMC algorithm for estimating change points.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  data <- data[order(data[,"newmoon"]), ]
  saves <- prep_saves(1, TS_control())
  inputs <- prep_ptMCMC_inputs(data, gamma ~ 1, 1, "newmoon", weights, 
                               TS_control())
Prepare and update the data structures to save the ptMCMC output
Description
prep_saves creates the data structure used to save the 
output from each iteration of the ptMCMC algorithm, which is added via
update_saves. Once the ptMCMC is complete, the saved data objects
are then processed (burn-in iterations are dropped and the remaining
iterations are thinned) via process_saves.
 
This set of functions was designed to work within TS and 
specifically est_changepoints. They are still hardcoded to
do so, but have the capacity to be generalized to work with any
estimation via ptMCMC with additional coding work.
Usage
prep_saves(nchangepoints, control = list())
update_saves(i, saves, steps, swaps)
process_saves(saves, control = list())
Arguments
| nchangepoints | 
 | 
| control | A  | 
| i | 
 | 
| saves | The existing list of saved data objects. | 
| steps | Chain configuration after within-temperature steps. | 
| swaps | Chain configuration after among-temperature swaps. | 
Value
list of ptMCMC objects: change points ($cpts), 
log-likelihoods ($lls), chain ids ($ids), step acceptances
($step_accepts), and swap acceptances ($swap_accepts).
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  data <- data[order(data[,"newmoon"]), ]
  saves <- prep_saves(1, TS_control())
  inputs <- prep_ptMCMC_inputs(data, gamma ~ 1, 1, "newmoon", weights, 
                               TS_control())
  cpts <- prep_cpts(data, gamma ~ 1, 1, "newmoon", weights, TS_control())
  ids <- prep_ids(TS_control())
  for(i in 1:TS_control()$nit){
    steps <- step_chains(i, cpts, inputs)
    swaps <- swap_chains(steps, inputs, ids)
    saves <- update_saves(i, saves, steps, swaps)
    cpts <- update_cpts(cpts, swaps)
    ids <- update_ids(ids, swaps)
  }
  process_saves(saves, TS_control())
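The burn-in and thinning step described above amounts to row selection on the saved draws. The following is a minimal sketch under assumed settings; the names burnin and thin_every and the stand-in matrix draws are hypothetical and do not reflect the actual control arguments or saved objects.

```r
# Illustrative sketch: drop burn-in iterations, then thin what remains.
set.seed(2)
nit <- 100                      # assumed total number of iterations
burnin <- 20                    # assumed number of burn-in iterations
thin_every <- 4                 # assumed thinning interval
draws <- matrix(rnorm(nit * 2), nit, 2)  # stand-in for saved draws

# keep the first post-burn-in iteration and every thin_every-th after it
keep <- seq(burnin + 1, nit, by = thin_every)
processed <- draws[keep, , drop = FALSE]
dim(processed)
```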
Prepare the ptMCMC temperature sequence
Description
Create the series of temperatures used in the ptMCMC 
algorithm.
 
This function was designed to work within TS and 
est_changepoints specifically, but has been generalized
and would work with any ptMCMC model as long as control
includes the relevant control parameters (and provided that the 
check_control function and its use here are generalized).
Usage
prep_temp_sequence(control = list())
Arguments
| control | A  | 
Value
vector of temperatures.
Examples
  prep_temp_sequence()
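For orientation, one common way to build a temperature ladder for parallel tempering is geometric spacing from 1 (the untempered chain) up to a maximum temperature. The sketch below shows that generic construction. It is not claimed to be the spacing this function uses, and ntemps and max_temp are hypothetical names.

```r
# Generic geometric temperature ladder (an assumption, not the LDATS rule).
ntemps <- 6        # assumed number of chains
max_temp <- 2^6    # assumed hottest temperature
temps <- max_temp^(seq(0, 1, length.out = ntemps))
temps
```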
Print the selected LDA and TS models of LDA_TS object
Description
Convenience function to print only the selected elements of an 
LDA_TS-class object returned by LDA_TS.
Usage
## S3 method for class 'LDA_TS'
print(x, ...)
Arguments
| x | Class  | 
| ... | Not used, simply included to maintain method compatibility. | 
Value
The selected models in x as a two-element list with
the TS component only returning the non-hidden components.
Examples
  data(rodents)
  mod <- LDA_TS(data = rodents, topics = 2, nseeds = 1, formulas = ~1,
                nchangepoints = 1, timename = "newmoon")
  print(mod)
Print a Time Series model fit
Description
Convenience function to print only the most important 
components of a TS_fit-class object fit by 
TS.
Usage
## S3 method for class 'TS_fit'
print(x, ...)
Arguments
| x | Class  | 
| ... | Not used, simply included to maintain method compatibility. | 
Value
The non-hidden parts of x as a list.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  TSmod <- TS(data, gamma ~ 1, nchangepoints = 1, "newmoon", weights)
  print(TSmod)
Print a set of Time Series models fit to LDAs
Description
Convenience function to print only the names of a 
TS_on_LDA-class object generated by TS_on_LDA.
Usage
## S3 method for class 'TS_on_LDA'
print(x, ...)
Arguments
| x | Class  | 
| ... | Not used, simply included to maintain method compatibility. | 
Value
character vector of the names of x's models.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDAs <- LDA_set(document_term_table, topics = 2:3, nseeds = 2)
  LDA_models <- select_LDA(LDAs)
  weights <- document_weights(document_term_table)
  formulas <- c(~ 1, ~ newmoon)
  mods <- TS_on_LDA(LDA_models, document_covariate_table, formulas,
                    nchangepoints = 0:1, timename = "newmoon", weights)
  print(mods)
Print the message to the console about which combination of the Time Series and LDA models is being run
Description
If desired, print a message at the beginning of every model combination stating the TS model and the LDA model being evaluated.
Usage
print_model_run_message(models, i, LDA_models, control)
Arguments
| models | 
 | 
| i | 
 | 
| LDA_models | List of LDA models (class  | 
| control | A  | 
Value
NULL.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDAs <- LDA_set(document_term_table, topics = 2:3, nseeds = 2)
  LDA_models <- select_LDA(LDAs)
  weights <- document_weights(document_term_table)
  formulas <- c(~ 1, ~ newmoon)
  nchangepoints <- 0:1
  mods <- expand_TS(LDA_models, formulas, nchangepoints)
  print_model_run_message(mods, 1, LDA_models, TS_control())
Fit the chunk-level models to a time series, given a set of proposed change points within the ptMCMC algorithm
Description
This function wraps around TS_memo 
(optionally memoised multinom_TS) to provide a
simpler interface within the ptMCMC algorithm and is implemented within
propose_step.
Usage
proposed_step_mods(prop_changepts, inputs)
Arguments
| prop_changepts | 
 | 
| inputs | Class  | 
Value
List of models associated with the proposed step, with an element for each chain.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  data <- data[order(data[,"newmoon"]), ]
  saves <- prep_saves(1, TS_control())
  inputs <- prep_ptMCMC_inputs(data, gamma ~ 1, 1, "newmoon", weights,
                               TS_control())
  cpts <- prep_cpts(data, gamma ~ 1, 1, "newmoon", weights, TS_control())
  i <- 1
  pdist <- inputs$pdist
  ntemps <- length(inputs$temps)
  selection <- cbind(pdist$which_steps[i, ], 1:ntemps)
  prop_changepts <- cpts$changepts
  curr_changepts_s <- cpts$changepts[selection]
  prop_changepts_s <- curr_changepts_s + pdist$steps[i, ]
  if(all(is.na(prop_changepts_s))){
    prop_changepts_s <- NULL
  }
  prop_changepts[selection] <- prop_changepts_s
  mods <- proposed_step_mods(prop_changepts, inputs)
Add change point location lines to the time series plot
Description
Adds vertical lines to the plot of the time series of fitted proportions associated with the change points of interest.
Usage
rho_lines(spec_rhos)
Arguments
| spec_rhos | 
 | 
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  TSmod <- TS(data, gamma ~ 1, nchangepoints = 1, "newmoon", weights)
  pred_gamma_TS_plot(TSmod)
  rho_lines(200)
Portal rodent data
Description
An example LDATS dataset, functionally that used in Christensen et al. (2018). The data are counts of 21 rodent species across 436 sampling events, with the count being the total number observed across eight 50 m x 50 m plots, each sampled using 49 live traps (Brown 1998, Ernest et al. 2016).
Usage
rodents
Format
A list of two data.frame-class objects with rows 
corresponding to documents (sampling events). One element is the
document term table (called document_term_table), which contains
counts of the species (terms) in each sample (document), and the other is
the document covariate table (called document_covariate_table) 
with columns of covariates (newmoon number, sin and cos of the fraction
of the year).
Source
https://github.com/weecology/PortalData/tree/master/Rodents
References
Brown, J. H. 1998. The desert granivory experiments at Portal. Pages 71-95 in W. J. Resetarits Jr. and J. Bernardo, editors, Experimental Ecology. Oxford University Press, New York, New York, USA.
Christensen, E., D. J. Harris, and S. K. M. Ernest. 2018. Long-term community change through multiple rapid transitions in a desert rodent community. Ecology 99:1523-1529.
Ernest, S. K. M., et al. 2016. Long-term monitoring and experimental manipulation of a Chihuahuan desert ecosystem near Portal, Arizona (1977-2013). Ecology 97:1082.
Select the best LDA model(s) for use in time series
Description
Select the best model(s) of interest from an
LDA_set object, based on a set of user-provided functions. The
functions default to choosing the model with the lowest AIC value.
Usage
select_LDA(LDA_models = NULL, control = list())
Arguments
| LDA_models | An object of class  | 
| control | A  | 
Value
A reduced version of LDA_models that only includes the 
selected LDA model(s). The returned object is still an object of
class LDA_set.
Examples
  data(rodents)
  lda_data <- rodents$document_term_table
  r_LDA <- LDA_set(lda_data, topics = 2, nseeds = 2)  
  select_LDA(r_LDA)                       
Select the best Time Series model
Description
Select the best model of interest from a
TS_on_LDA object generated by TS_on_LDA, based on
a set of user-provided functions. The functions default to choosing the 
model with the lowest AIC value. 
 
Presently, the set of functions should result in a single selected
model. If multiple models are chosen via the selection, only the first
is returned.
Usage
select_TS(TS_models, control = list())
Arguments
| TS_models | An object of class  | 
| control | A  | 
Value
A reduced version of TS_models that only includes the 
selected TS model. The returned object is a single TS model object of
class TS_fit.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDAs <- LDA_set(document_term_table, topics = 2:3, nseeds = 2)
  LDA_models <- select_LDA(LDAs)
  weights <- document_weights(document_term_table)
  formulas <- c(~ 1, ~ newmoon)
  mods <- TS_on_LDA(LDA_models, document_covariate_table, formulas,
                    nchangepoints = 0:1, timename = "newmoon", weights)
  select_TS(mods)
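The default selection rule (lowest AIC, first model on ties) can be illustrated outside the package with ordinary linear models. This is a sketch of the rule only, using stats::lm fits as stand-ins for TS models; the data and model names are made up for the example.

```r
# Illustrative sketch: pick the candidate with the lowest AIC.
set.seed(42)
dat <- data.frame(x = rnorm(50))
dat$y <- 1 + 2 * dat$x + rnorm(50)
candidates <- list(
  intercept_only = lm(y ~ 1, data = dat),
  linear = lm(y ~ x, data = dat)
)
aic_values <- vapply(candidates, AIC, numeric(1))
# which.min() returns the first minimum, so ties resolve to the earliest model
selected <- candidates[[which.min(aic_values)]]
names(which.min(aic_values))
```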
Create the list of colors for the LDATS summary plot
Description
A default list generator function that produces the options
for the colors controlling the panels of the LDATS summary plots, needed
because the change point histogram panel should be in a different color
scheme than the LDA and fitted time series model panels, which should be
in a matching color scheme. See set_LDA_plot_colors,
set_TS_summary_plot_cols, set_gamma_colors,
and set_rho_hist_colors for specific details on usage.
Usage
set_LDA_TS_plot_cols(
  rho_cols = NULL,
  rho_option = "D",
  rho_alpha = 0.4,
  gamma_cols = NULL,
  gamma_option = "C",
  gamma_alpha = 0.8
)
Arguments
| rho_cols | Colors to be used to plot the histograms of change points.
Any valid color values (e.g., see  | 
| rho_option | A  | 
| rho_alpha | Numeric value [0,1] that indicates the transparency of the
colors used. Supported only on some devices, see
 | 
| gamma_cols | Colors to be used to plot the LDA topic proportions,
time series of observed topic proportions, and time series of fitted
topic proportions. Any valid color values (e.g., see
 | 
| gamma_option | A  | 
| gamma_alpha | Numeric value [0,1] that indicates the transparency of
the colors used. Supported only on some devices, see
 | 
Value
list of elements used to define the colors for the two
panels of the summary plot, as generated simply using
set_LDA_TS_plot_cols. cols has two elements:
LDA and TS, each corresponding to the set of plots for
its stage in the full model. LDA contains entries cols
and options (see set_LDA_plot_colors). TS
contains two entries, rho and gamma, each corresponding
to the related panel, and each containing default values for entries
named cols, option, and alpha (see
set_TS_summary_plot_cols, set_gamma_colors,
and set_rho_hist_colors).
Examples
  set_LDA_TS_plot_cols()
Prepare the colors to be used in the LDA plots
Description
Based on the inputs, create the set of colors to be used in
the LDA plots made by plot.LDA_TS.
Usage
set_LDA_plot_colors(x, cols = NULL, option = "C", alpha = 0.8)
Arguments
| x | Object of class  | 
| cols | Colors to be used to plot the topics.
Any valid color values (e.g., see  | 
| option | A  | 
| alpha | Numeric value [0,1] that indicates the transparency of the 
colors used. Supported only on some devices, see 
 | 
Value
vector of character hex codes indicating colors to 
use.
Examples
  data(rodents)
  lda_data <- rodents$document_term_table
  r_LDA <- LDA_set(lda_data, topics = 4, nseeds = 10) 
  set_LDA_plot_colors(r_LDA[[1]])
Create the list of colors for the TS summary plot
Description
A default list generator function that produces the options
for the colors controlling the panels of the TS summary plots, needed
because the panels should be in different color schemes. See 
set_gamma_colors and set_rho_hist_colors for
specific details on usage.
Usage
set_TS_summary_plot_cols(
  rho_cols = NULL,
  rho_option = "D",
  rho_alpha = 0.4,
  gamma_cols = NULL,
  gamma_option = "C",
  gamma_alpha = 0.8
)
Arguments
| rho_cols | Colors to be used to plot the histograms of change points.
Any valid color values (e.g., see  | 
| rho_option | A  | 
| rho_alpha | Numeric value [0,1] that indicates the transparency of the 
colors used. Supported only on some devices, see 
 | 
| gamma_cols | Colors to be used to plot the LDA topic proportions,
time series of observed topic proportions, and time series of fitted 
topic proportions. Any valid color values (e.g., see 
 | 
| gamma_option | A  | 
| gamma_alpha | Numeric value [0,1] that indicates the transparency of 
the colors used. Supported only on some devices, see 
 | 
Value
list of elements used to define the colors for the two
panels. Contains two elements rho and gamma, each 
corresponding to the related panel, and each containing default values 
for entries named cols, option, and alpha.
Examples
  set_TS_summary_plot_cols()
Prepare the colors to be used in the gamma time series
Description
Based on the inputs, create the set of colors to be used in the time series of the fitted gamma (topic proportion) values.
Usage
set_gamma_colors(x, cols = NULL, option = "D", alpha = 1)
Arguments
| x | Object of class  | 
| cols | Colors to be used to plot the time series of fitted topic proportions. | 
| option | A  | 
| alpha | Numeric value [0,1] that indicates the transparency of the 
colors used. Supported only on some devices, see 
 | 
Value
Vector of character hex codes indicating colors to use.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  TSmod <- TS(data, gamma ~ 1, nchangepoints = 1, "newmoon", weights)
  set_gamma_colors(TSmod)
Prepare the colors to be used in the change point histogram
Description
Based on the inputs, create the set of colors to be used in the change point histogram.
Usage
set_rho_hist_colors(x = NULL, cols = NULL, option = "D", alpha = 1)
Arguments
| x | 
 | 
| cols | Colors to be used to plot the histograms of change points.
Any valid color values (e.g., see  | 
| option | A  | 
| alpha | Numeric value [0,1] that indicates the transparency of the 
colors used. Supported only on some devices, see 
 | 
Value
Vector of character hex codes indicating colors to use.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  TSmod <- TS(data, gamma ~ 1, nchangepoints = 1, "newmoon", weights)
  set_rho_hist_colors(TSmod$rhos)
Simulate LDA_TS data from LDA and TS model structures and parameters
Description
For a given set of covariates X; parameters 
Beta, Eta, rho, and err; and 
document-specific time stamps tD and lengths N,
simulate a document-by-term matrix.
Additional structuring variables (the numbers of topics (k), terms (V), 
documents (M), segments (S), and covariates per segment (C))
are inferred from input objects.
Usage
sim_LDA_TS_data(N, Beta, X, Eta, rho, tD, err = 0, seed = NULL)
Arguments
| N | A vector of document sizes (total word counts). Must be integer conformable. Is used to infer the total number of documents. | 
| Beta | 
 | 
| X | 
 | 
| Eta | 
 | 
| rho | Vector of integer-conformable time locations of changepoints or 
 | 
| tD | Vector of integer-conformable times of the documents. Must be
of length M (as determined by  | 
| err | Additive error on the link-scale. Must be a non-negative 
 | 
| seed | Input to  | 
Value
A document-by-term matrix of counts (dim: M x V).
Examples
  N <- c(10, 22, 15, 31)
  tD <- c(1, 3, 4, 6)
  rho <- 3
  X <- cbind(rep(1, 4), 1:4)
  Eta <- cbind(c(0.5, 0.3, 0.9, 0.5), c(1.2, 1.1, 0.1, 0.5))
  Beta <- matrix(c(0.1, 0.1, 0.8, 0.2, 0.6, 0.2), 2, 3, byrow = TRUE)
  err <- 1
  sim_LDA_TS_data(N, Beta, X, Eta, rho, tD, err)
  
Simulate LDA data from an LDA structure given parameters
Description
For a given set of parameters alpha and Beta and
document-specific total word counts, simulate a document-by-term matrix.
Additional structuring variables (the numbers of topics (k),
documents (M), terms (V)) are inferred from input objects.
Usage
sim_LDA_data(N, Beta, alpha = NULL, Theta = NULL, seed = NULL)
Arguments
| N | A vector of document sizes (total word counts). Must be integer conformable. Is used to infer the total number of documents. | 
| Beta | 
 | 
| alpha | Single positive numeric value for the Dirichlet distribution
parameter defining topics within documents. To specifically define
document topic probabilities, use  | 
| Theta | 
 | 
| seed | Input to  | 
Value
A document-by-term matrix of counts (dim: M x V).
Examples
  N <- c(10, 22, 15, 31)
  alpha <- 1.2
  Beta <- matrix(c(0.1, 0.1, 0.8, 0.2, 0.6, 0.2), 2, 3, byrow = TRUE)
  sim_LDA_data(N, Beta, alpha = alpha)
  Theta <- matrix(c(0.2, 0.8, 0.8, 0.2, 0.5, 0.5, 0.9, 0.1), 4, 2, 
               byrow = TRUE)
  sim_LDA_data(N, Beta, Theta = Theta)
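The generative steps behind this kind of simulation can be sketched in base R: draw document-topic proportions from a symmetric Dirichlet, mix the topic-term probabilities, and draw multinomial counts. This is a sketch under the assumption that Beta holds topics in rows and terms in columns (as in the example above); it is not the package's implementation, and the Dirichlet draw via normalized gamma variates is a standard substitute for a dedicated Dirichlet sampler.

```r
# Illustrative sketch of the LDA generative process (assumed layout of Beta).
set.seed(7)
N <- c(10, 22, 15, 31)          # document sizes
alpha <- 1.2                    # symmetric Dirichlet parameter
Beta <- matrix(c(0.1, 0.1, 0.8, 0.2, 0.6, 0.2), 2, 3, byrow = TRUE)
k <- nrow(Beta); V <- ncol(Beta); M <- length(N)

# Dirichlet draws via normalized gamma variates (rows sum to 1)
g <- matrix(rgamma(M * k, shape = alpha), M, k)
Theta <- g / rowSums(g)

# Term probabilities per document, then multinomial counts per document
term_probs <- Theta %*% Beta
counts <- t(vapply(seq_len(M),
                   function(m) rmultinom(1, N[m], term_probs[m, ])[, 1],
                   integer(V)))
counts
```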
Simulate TS data from a TS model structure given parameters
Description
For a given set of covariates X; parameters Eta,
rho, and err; and document-specific time stamps tD,
simulate a document-by-topic matrix. Additional structuring variables 
(numbers of topics (k), documents (M), segments (S), and 
covariates per segment (C)) are inferred from input objects.
Usage
sim_TS_data(X, Eta, rho, tD, err = 0, seed = NULL)
Arguments
| X | 
 | 
| Eta | 
 | 
| rho | Vector of integer-conformable time locations of changepoints or 
 | 
| tD | Vector of integer-conformable times of the documents. Must be
of length M (as determined by  | 
| err | Additive error on the link-scale. Must be a non-negative 
 | 
| seed | Input to  | 
Value
A document-by-topic matrix of probabilities (dim: M x k).
Examples
  tD <- c(1, 3, 4, 6)
  rho <- 3
  X <- cbind(rep(1, 4), 1:4)
  Eta <- cbind(c(0.5, 0.3, 0.9, 0.5), c(1.2, 1.1, 0.1, 0.5))
  sim_TS_data(X, Eta, rho, tD, err = 1)
  
Calculate the softmax of a vector or matrix of values
Description
Calculate the softmax (normalized exponential) of a vector of values or a set of vectors stacked rowwise.
Usage
softmax(x)
Arguments
| x | 
 | 
Value
The softmax of x.
Examples
  dat <- matrix(runif(100, -1, 1), 25, 4)
  softmax(dat)
  softmax(dat[,1])
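For reference, the normalized exponential can be written in a few lines of base R. The function below is a hypothetical stand-alone sketch (named softmax_sketch to keep it distinct from the package function) and subtracts the maximum before exponentiating, a common numerical-stability step that is an assumption here rather than a documented detail of softmax.

```r
# Illustrative sketch of a row-wise softmax.
softmax_sketch <- function(x) {
  one_row <- function(v) {
    e <- exp(v - max(v))   # subtract the max for numerical stability
    e / sum(e)
  }
  # matrices are handled row by row; vectors are treated as a single row
  if (is.matrix(x)) t(apply(x, 1, one_row)) else one_row(x)
}
softmax_sketch(c(0, 0))
m <- softmax_sketch(matrix(c(1, 2, 3, 1, 1, 1), 2, 3, byrow = TRUE))
rowSums(m)
```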
Conduct a within-chain step of the ptMCMC algorithm
Description
This set of functions steps the chains forward one iteration 
of the within-chain component of the ptMCMC algorithm. step_chains
is the main function, comprised of a proposal (made by propose_step),
an evaluation of that proposal (made by eval_step), and then an 
update of the configuration (made by take_step). 
 
This set of functions was designed to work within TS and 
specifically est_changepoints. They are still hardcoded to
do so, but have the capacity to be generalized to work with any 
estimation via ptMCMC with additional coding work.
Usage
step_chains(i, cpts, inputs)
propose_step(i, cpts, inputs)
eval_step(i, cpts, prop_step, inputs)
take_step(cpts, prop_step, accept_step)
Arguments
| i | 
 | 
| cpts | 
 | 
| inputs | Class  | 
| prop_step | Proposed step output from  | 
| accept_step | 
 | 
Details
For each iteration of the ptMCMC algorithm, all of the chains have the potential to take a step. The possible step is proposed under a proposal distribution (here for change points we use a symmetric geometric distribution), the proposed step is then evaluated and either accepted or not (following the Metropolis-Hastings rule; Metropolis et al. 1953, Hastings 1970, Gupta et al. 2018), and then accordingly taken or not (the configurations are updated).
Value
step_chains: list of change points, log-likelihoods, 
and logical indicators of acceptance for each chain. 
 
propose_step: list of change points and 
log-likelihood values for the proposal. 
 
eval_step: logical vector indicating if each 
chain's proposal was accepted. 
 
take_step: list of change points, log-likelihoods, 
and logical indicators of acceptance for each chain.
References
Gupta, S., L. Hainsworth, J. S. Hogg, R. E. C. Lee, and J. R. Faeder. 2018. Evaluation of parallel tempering to accelerate Bayesian parameter estimation in systems biology.
Hastings, W. K. 1970. Monte Carlo sampling methods using Markov Chains and their applications. Biometrika 57:97-109.
Metropolis, N., A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller. 1953. Equations of state calculations by fast computing machines. Journal of Chemical Physics 21: 1087-1092.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  data <- data[order(data[,"newmoon"]), ]
  saves <- prep_saves(1, TS_control())
  inputs <- prep_ptMCMC_inputs(data, gamma ~ 1, 1, "newmoon", weights, 
                               TS_control())
  cpts <- prep_cpts(data, gamma ~ 1, 1, "newmoon", weights, TS_control())
  ids <- prep_ids(TS_control())
  for(i in 1:TS_control()$nit){
    steps <- step_chains(i, cpts, inputs)
    swaps <- swap_chains(steps, inputs, ids)
    saves <- update_saves(i, saves, steps, swaps)
    cpts <- update_cpts(cpts, swaps)
    ids <- update_ids(ids, swaps)
  }
  # within step_chains()
  cpts <- prep_cpts(data, gamma ~ 1, 1, "newmoon", weights, TS_control())
  i <- 1
  prop_step <- propose_step(i, cpts, inputs)
  accept_step <- eval_step(i, cpts, prop_step, inputs)
  take_step(cpts, prop_step, accept_step)
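The accept-or-reject decision in the Details can be sketched for a symmetric proposal as follows. The function name accept_step_sketch and the convention of dividing the log-likelihood difference by each chain's temperature are assumptions for illustration; this is not the body of eval_step.

```r
# Illustrative sketch of a tempered Metropolis acceptance rule.
accept_step_sketch <- function(ll_curr, ll_prop, temps,
                               u = runif(length(temps))) {
  # log acceptance ratio, flattened by each chain's temperature
  log_ratio <- (ll_prop - ll_curr) / temps
  # accept when a uniform draw falls below the acceptance ratio
  log(u) < log_ratio
}
set.seed(3)
temps <- c(1, 2, 4)
ll_curr <- c(-120, -125, -130)
ll_prop <- c(-118, -126, -131)
accept_step_sketch(ll_curr, ll_prop, temps)
```

Proposals that improve the log-likelihood are always accepted, and hotter chains accept downhill moves more readily, which is what lets them explore more of the surface.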
Summarize the regressor (eta) distributions
Description
summarize_etas calculates summary statistics for each
of the chunk-level regressors. 
 
measure_eta_vcov generates the variance-covariance matrix for 
the regressors.
Usage
summarize_etas(etas, control = list())
measure_eta_vcov(etas)
Arguments
| etas | Matrix of regressors (columns) across iterations of the 
ptMCMC (rows), as returned from  | 
| control | A  | 
Value
summarize_etas: table of summary statistics for chunk-level
regressors including mean, median, mode, posterior interval, standard
deviation, MCMC error, autocorrelation, and effective sample size for 
each regressor. 
 
measure_eta_vcov: variance-covariance matrix for chunk-level
regressors.
Examples
 etas <- matrix(rnorm(100), 50, 2)
 summarize_etas(etas)
 measure_eta_vcov(etas)
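Several of the listed statistics can be computed directly with base R, which may help when checking the output by hand. The sketch below covers only a subset (mean, median, a 95% interval, standard deviation, lag-1 autocorrelation, and the covariance matrix); the column names and the choice of a 95% interval are assumptions, and mode, MCMC error, and effective sample size are left out.

```r
# Illustrative sketch: a subset of the summary statistics in base R.
set.seed(11)
etas <- matrix(rnorm(100), 50, 2,
               dimnames = list(NULL, c("eta_1", "eta_2")))
summary_table <- data.frame(
  mean = colMeans(etas),
  median = apply(etas, 2, median),
  lower_95 = apply(etas, 2, quantile, probs = 0.025),
  upper_95 = apply(etas, 2, quantile, probs = 0.975),
  sd = apply(etas, 2, sd),
  # lag-1 autocorrelation of each column of draws
  ac1 = apply(etas, 2, function(v) acf(v, lag.max = 1, plot = FALSE)$acf[2])
)
vcov_sketch <- cov(etas)
summary_table
```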
Summarize the rho distributions
Description
summarize_rhos calculates summary statistics for each
of the change point locations.
 
measure_rho_vcov generates the variance-covariance matrix for the 
change point locations.
Usage
summarize_rhos(rhos, control = list())
measure_rho_vcov(rhos)
Arguments
| rhos | Matrix of change point locations (columns) across iterations of 
the ptMCMC (rows) or  | 
| control | A  | 
Value
summarize_rhos: table of summary statistics for change point
locations including mean, median, mode, posterior interval, standard
deviation, MCMC error, autocorrelation, and effective sample size for 
each change point location. 
 
measure_rho_vcov: variance-covariance matrix for change 
point locations.
Examples
 rhos <- matrix(sample(80:100, 100, TRUE), 50, 2)
 summarize_rhos(rhos)
 measure_rho_vcov(rhos)
Conduct a set of among-chain swaps for the ptMCMC algorithm
Description
This function handles the among-chain swapping based on 
temperatures and likelihood differentials.  
 
This function was designed to work within TS and 
specifically est_changepoints. It is still hardcoded to do
so, but has the capacity to be generalized to work with any estimation
via ptMCMC with additional coding work.
Usage
swap_chains(chainsin, inputs, ids)
Arguments
| chainsin | Chain configuration to be evaluated for swapping. | 
| inputs | Class  | 
| ids | The vector of integer chain ids. | 
Details
The ptMCMC algorithm couples the chains (which are taking their own walks on the distribution surface) through "swaps", where neighboring chains exchange configurations (Geyer 1991, Falcioni and Deem 1999) following the Metropolis criterion (Metropolis et al. 1953). This allows them to share information and search the surface in combination (Earl and Deem 2005).
Value
list of updated change points, log-likelihoods, and chain
ids, as well as a vector of acceptance indicators for each swap.
References
Earl, D. J. and M. W. Deem. 2005. Parallel tempering: theory, applications, and new perspectives. Physical Chemistry Chemical Physics 7: 3910-3916.
Falcioni, M. and M. W. Deem. 1999. A biased Monte Carlo scheme for zeolite structure solution. Journal of Chemical Physics 110: 1754-1766.
Geyer, C. J. 1991. Markov Chain Monte Carlo maximum likelihood. In Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface. pp 156-163. American Statistical Association, New York, USA.
Metropolis, N., A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller. 1953. Equations of state calculations by fast computing machines. Journal of Chemical Physics 21: 1087-1092.
Examples
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  data <- data[order(data[,"newmoon"]), ]
  saves <- prep_saves(1, TS_control())
  inputs <- prep_ptMCMC_inputs(data, gamma ~ 1, 1, "newmoon", weights, 
                               TS_control())
  cpts <- prep_cpts(data, gamma ~ 1, 1, "newmoon", weights, TS_control())
  ids <- prep_ids(TS_control())
  for(i in 1:TS_control()$nit){
    steps <- step_chains(i, cpts, inputs)
    swaps <- swap_chains(steps, inputs, ids)
    saves <- update_saves(i, saves, steps, swaps)
    cpts <- update_cpts(cpts, swaps)
    ids <- update_ids(ids, swaps)
  }
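The neighbor-swap step in the Details can be sketched with the standard parallel tempering exchange criterion. The function swap_sketch, the coldest-to-hottest sweep order, and the plain vectors used for log-likelihoods and ids are all assumptions for illustration; swap_chains itself operates on the package's chain configuration objects.

```r
# Illustrative sketch of among-chain swaps between neighboring temperatures.
swap_sketch <- function(lls, temps, u = runif(length(temps) - 1)) {
  ids <- seq_along(temps)
  accepted <- logical(length(temps) - 1)
  for (j in seq_len(length(temps) - 1)) {
    # log acceptance ratio for exchanging neighbors j and j + 1
    log_ratio <- (1 / temps[j] - 1 / temps[j + 1]) * (lls[j + 1] - lls[j])
    if (log(u[j]) < log_ratio) {
      lls[c(j, j + 1)] <- lls[c(j + 1, j)]
      ids[c(j, j + 1)] <- ids[c(j + 1, j)]
      accepted[j] <- TRUE
    }
  }
  list(lls = lls, ids = ids, accepted = accepted)
}
out <- swap_sketch(lls = c(-130, -120, -125), temps = c(1, 2, 4),
                   u = c(0.5, 0.5))
out$ids
```

A swap is always accepted when the hotter chain holds the better configuration, so good configurations tend to migrate toward the coldest chain while the ids record where each particle has travelled.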
Produce the trace plot panel for the TS diagnostic plot of a parameter
Description
Produce a trace plot for the parameter of interest (rho or 
eta) as part of TS_diagnostics_plot. A horizontal line 
is added to show the median of the posterior.
Usage
trace_plot(x, ylab = "parameter value")
Arguments
| x | Vector of parameter values drawn from the posterior distribution, indexed to the iteration by the order of the vector. | 
| ylab | 
 | 
Value
NULL.
Examples
 trace_plot(rnorm(100, 0, 1))
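A panel of this kind can be approximated with base graphics. The sketch below is a hypothetical stand-in (trace_sketch), not the package function; the axis label "Iteration" and the line width are assumptions, and the null PDF device is used only so the code runs without a display.

```r
# Illustrative sketch of a trace plot with a median reference line.
trace_sketch <- function(x, ylab = "parameter value") {
  plot(seq_along(x), x, type = "l", xlab = "Iteration", ylab = ylab)
  abline(h = median(x), lwd = 2)   # horizontal line at the median
  invisible(NULL)
}
set.seed(5)
pdf(NULL)   # null device: no file is written
res <- trace_sketch(rnorm(100))
dev.off()
```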
Verify the change points of a multinomial time series model
Description
Verify that a time series can be broken into a set of chunks based on input change points.
Usage
verify_changepoint_locations(data, changepoints = NULL, timename = "time")
Arguments
| data | Class  | 
| changepoints | Numeric vector indicating locations of the change 
points. Must be conformable to  | 
| timename | 
 | 
Value
Logical indicator of the check passing (TRUE) or failing
(FALSE).
Examples
  data(rodents)
  dtt <- rodents$document_term_table
  lda <- LDA_set(dtt, 2, 1, list(quiet = TRUE))
  dct <- rodents$document_covariate_table
  dct$gamma <- lda[[1]]@gamma
  verify_changepoint_locations(dct, changepoints = 100, 
                               timename = "newmoon")
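The idea of breaking a time series into chunks at change points can be sketched with findInterval. The helpers chunk_sketch and chunks_ok are hypothetical, and both the convention that an observation at a change point time belongs to the earlier chunk and the rule that every chunk must contain at least one observation are assumptions, not the documented checks of verify_changepoint_locations.

```r
# Illustrative sketch: assign observations to chunks and check none is empty.
chunk_sketch <- function(times, changepoints) {
  cps <- sort(changepoints)
  # a point at a change point time goes to the earlier chunk (assumed)
  findInterval(times, cps, left.open = TRUE) + 1
}
chunks_ok <- function(times, changepoints) {
  chunks <- chunk_sketch(times, changepoints)
  # every one of the length(changepoints) + 1 chunks must be non-empty
  all(tabulate(chunks, nbins = length(changepoints) + 1) > 0)
}
times <- 1:10
chunks <- chunk_sketch(times, changepoints = c(3, 7))
table(chunks)
chunks_ok(times, c(3, 7))
chunks_ok(times, c(3, 20))
```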