| Type: | Package | 
| Title: | Nonparametric Robust Estimation and Inference Methods using Local Polynomial Regression and Kernel Density Estimation | 
| Version: | 0.5.0 | 
| Date: | 2025-04-12 | 
| Description: | Tools for data-driven statistical analysis using local polynomial regression and kernel density estimation methods as described in Calonico, Cattaneo and Farrell (2018, <doi:10.1080/01621459.2017.1285776>): 'lprobust()' for local polynomial point estimation and robust bias-corrected inference, 'lpbwselect()' for local polynomial bandwidth selection, 'kdrobust()' for kernel density point estimation and robust bias-corrected inference, 'kdbwselect()' for kernel density bandwidth selection, and 'nprobust.plot()' for plotting results. The main methodological and numerical features of this package are described in Calonico, Cattaneo and Farrell (2019, <doi:10.18637/jss.v091.i08>). | 
| Depends: | R (≥ 3.1.1) | 
| License: | GPL-2 | 
| Imports: | Rcpp, ggplot2 | 
| LinkingTo: | Rcpp, RcppArmadillo | 
| Packaged: | 2025-04-13 13:03:38 UTC; dell5 | 
| NeedsCompilation: | yes | 
| Author: | Sebastian Calonico [aut, cre], Matias D. Cattaneo [aut], Max H. Farrell [aut] | 
| Maintainer: | Sebastian Calonico <scalonico@ucdavis.edu> | 
| Repository: | CRAN | 
| Date/Publication: | 2025-04-14 07:50:06 UTC | 
Nonparametric Robust Estimation and Inference Methods using Local Polynomial Regression and Kernel Density Estimation
Description
This package provides tools for data-driven statistical analysis using local polynomial regression (LPR) and kernel density estimation (KDE) methods as described in Calonico, Cattaneo and Farrell (2018): lprobust for local polynomial point estimation and robust bias-corrected inference, lpbwselect for local polynomial bandwidth selection, kdrobust for kernel density point estimation and robust bias-corrected inference, kdbwselect for kernel density bandwidth selection, and nprobust.plot for plotting results. The main methodological and numerical features of this package are described in Calonico, Cattaneo and Farrell (2019).
Details
| Package: | nprobust | 
| Type: | Package | 
| Version: | 0.5.0 | 
| Date: | 2025-04-03 | 
| License: | GPL-2 | 
Function for LPR estimation and inference: lprobust
Function for LPR bandwidth selection: lpbwselect
Function for KDE estimation and inference: kdrobust
Function for KDE bandwidth selection: kdbwselect
Function for graphical analysis: nprobust.plot
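As a rough orientation, the sketch below chains the KDE functions listed above: select bandwidths, estimate the density, and plot the result. The simulated data, the seed, and the object names (x, bw, fit, p) are illustrative assumptions; every call relies on default arguments only.

```r
library(nprobust)

set.seed(1)                 # illustrative seed, not part of the package
x <- rnorm(500)             # simulated data (assumption)

bw  <- kdbwselect(x)        # data-driven bandwidths at the default evaluation points
fit <- kdrobust(x)          # density estimates with robust bias-corrected inference
p   <- nprobust.plot(fit)   # plot object built from the kdrobust result

summary(bw)
summary(fit)
print(p)
```

The LPR functions follow the same pattern, with lpbwselect and lprobust taking both y and x.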
Author(s)
Sebastian Calonico, University of California, Davis, CA. scalonico@ucdavis.edu.
Matias D. Cattaneo, Princeton University, Princeton, NJ. cattaneo@princeton.edu.
Max H. Farrell, University of California, Santa Barbara, CA. maxhfarrell@ucsb.edu.
References
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2018. On the Effect of Bias Estimation on Coverage Accuracy in Nonparametric Inference. Journal of the American Statistical Association, 113(522): 767-779. doi:10.1080/01621459.2017.1285776.
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2019. nprobust: Nonparametric Kernel-Based Estimation and Robust Bias-Corrected Inference. Journal of Statistical Software, 91(8): 1-33. doi:10.18637/jss.v091.i08.
Bandwidth Selection Procedures for Kernel Density Estimation and Inference
Description
kdbwselect implements bandwidth selectors for kernel density point estimators and inference procedures developed in Calonico, Cattaneo and Farrell (2018). See also Calonico, Cattaneo and Farrell (2022) for related optimality results.
It also implements other bandwidth selectors available in the literature. See Wand and Jones (1995) for background references.
Companion command: kdrobust for kernel density point estimation and inference procedures.
A detailed introduction to this command is given in Calonico, Cattaneo and Farrell (2019). For more details, and related Stata and R packages useful for empirical analysis, visit https://nppackages.github.io/.
Usage
kdbwselect(x, eval = NULL, neval = NULL, kernel = "epa",
  bwselect = "mse-dpi", bwcheck = 21, imsegrid = 30, subset = NULL)
Arguments
| x | independent variable. | 
| eval | vector of evaluation point(s). By default it uses 30 equally spaced points over the support of  | 
| neval | number of quantile-spaced evaluation points on support of  | 
| kernel | kernel function used to construct the kernel estimators. Options are  | 
| bwselect | bandwidth selection procedure to be used. Options are: 
 
 
 
 
 
 Note: MSE = Mean Square Error; IMSE = Integrated Mean Squared Error; CE = Coverage Error; DPI = Direct Plug-in; ROT = Rule-of-Thumb. For details on implementation see Calonico, Cattaneo and Farrell (2019). | 
| bwcheck | if a positive integer is provided, then the selected bandwidth is enlarged so that at least  | 
| imsegrid | number of evaluation points used to compute the IMSE bandwidth selector. Default is  | 
| subset | optional rule specifying a subset of observations to be used. | 
Value
| Estimate | A matrix containing  | 
| opt | A list containing options passed to the function. | 
Author(s)
Sebastian Calonico, University of California, Davis, CA. scalonico@ucdavis.edu.
Matias D. Cattaneo, Princeton University, Princeton, NJ. cattaneo@princeton.edu.
Max H. Farrell, University of California, Santa Barbara, CA. maxhfarrell@ucsb.edu.
References
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2018. On the Effect of Bias Estimation on Coverage Accuracy in Nonparametric Inference. Journal of the American Statistical Association, 113(522): 767-779. doi:10.1080/01621459.2017.1285776.
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2019. nprobust: Nonparametric Kernel-Based Estimation and Robust Bias-Corrected Inference. Journal of Statistical Software, 91(8): 1-33. doi:10.18637/jss.v091.i08.
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2022. Coverage Error Optimal Confidence Intervals for Local Polynomial Regression. Bernoulli, 28(4): 2998-3022.
Fan, J., and Gijbels, I. 1996. Local polynomial modelling and its applications, London: Chapman and Hall.
Wand, M., and Jones, M. 1995. Kernel Smoothing, Florida: Chapman & Hall/CRC.
Examples
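library(nprobust)
x   <- rnorm(500)      # simulated sample from a standard normal
est <- kdbwselect(x)   # bandwidths at the default evaluation points
summary(est)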
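The eval and neval arguments describe two kinds of evaluation grid: equally spaced points and quantile-spaced points. A minimal base-R sketch of the difference follows. It does not reproduce the grid construction inside kdbwselect; in particular, the probability levels used for the quantile grid are an assumption made for illustration.

```r
set.seed(1)
x <- rexp(500)                      # skewed sample, so the two grids differ visibly

neval <- 30

# Equally spaced grid over the observed range of x
grid_equal <- seq(min(x), max(x), length.out = neval)

# Quantile-spaced grid: points follow the data, denser where x is dense.
# The probability levels below are an illustrative choice (assumption).
probs         <- seq(0.05, 0.95, length.out = neval)
grid_quantile <- unname(quantile(x, probs = probs))

# Share of grid points below the sample median under each scheme
mean(grid_equal    < median(x))
mean(grid_quantile < median(x))
```

Either vector could then be supplied through the eval argument, for example kdbwselect(x, eval = grid_quantile).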
Kernel Density Methods with Robust Bias-Corrected Inference
Description
kdrobust implements kernel density point estimators, with robust bias-corrected confidence intervals and inference procedures developed in Calonico, Cattaneo and Farrell (2018). See also Calonico, Cattaneo and Farrell (2022) for related optimality results.
It also implements other estimation and inference procedures available in the literature. See Wand and Jones (1995) for background references.
Companion commands: kdbwselect for kernel density data-driven bandwidth selection, and nprobust.plot for plotting results.
A detailed introduction to this command is given in Calonico, Cattaneo and Farrell (2019). For more details, and related Stata and R packages useful for empirical analysis, visit https://nppackages.github.io/.
Usage
kdrobust(x, eval = NULL, neval = NULL, h = NULL, b = NULL, rho = 1,
  kernel = "epa", bwselect = NULL, bwcheck = 21, imsegrid = 30, level = 95, subset = NULL)
Arguments
| x | independent variable. | 
| eval | vector of evaluation point(s). By default it uses 30 equally spaced points over the support of  | 
| neval | number of quantile-spaced evaluation points on support of  | 
| h | main bandwidth used to construct the kernel density point estimator. Can be either scalar (same bandwidth for all evaluation points), or vector of same dimension as  | 
| b | bias bandwidth used to construct the bias-correction estimator. Can be either scalar (same bandwidth for all evaluation points), or vector of same dimension as  | 
| rho | Sets  | 
| kernel | kernel function used to construct the kernel estimators. Options are  | 
| bwselect | bandwidth selection procedure to be used via  
 
 
 
 
 
 Note: MSE = Mean Square Error; IMSE = Integrated Mean Squared Error; CE = Coverage Error; DPI = Direct Plug-in; ROT = Rule-of-Thumb. For details on implementation see Calonico, Cattaneo and Farrell (2019). | 
| bwcheck | if a positive integer is provided, then the selected bandwidth is enlarged so that at least  | 
| imsegrid | number of evaluation points used to compute the IMSE bandwidth selector. Default is  | 
| level | confidence level used for confidence intervals; default is  | 
| subset | optional rule specifying a subset of observations to be used. | 
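To make the level argument concrete, the following base-R sketch maps a confidence level in percent (95 in the Usage above) to a two-sided normal critical value and a symmetric interval. The point estimate and standard error are made-up numbers, and the helper ci_from_level is hypothetical; kdrobust computes its own robust bias-corrected intervals internally, and this sketch only illustrates how the level enters.

```r
# Hypothetical helper: symmetric normal-based interval for a level given in percent
ci_from_level <- function(estimate, se, level = 95) {
  alpha <- 1 - level / 100
  z     <- qnorm(1 - alpha / 2)         # two-sided critical value
  c(lower = estimate - z * se, upper = estimate + z * se)
}

# Made-up inputs for illustration only
ci_95 <- ci_from_level(estimate = 0.40, se = 0.03, level = 95)
ci_90 <- ci_from_level(estimate = 0.40, se = 0.03, level = 90)

ci_95
ci_90   # narrower than the 95 percent interval
```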
Value
| Estimate | A matrix containing  | 
| opt | A list containing options passed to the function. | 
Author(s)
Sebastian Calonico, University of California, Davis, CA. scalonico@ucdavis.edu.
Matias D. Cattaneo, Princeton University, Princeton, NJ. cattaneo@princeton.edu.
Max H. Farrell, University of California, Santa Barbara, CA. maxhfarrell@ucsb.edu.
References
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2018. On the Effect of Bias Estimation on Coverage Accuracy in Nonparametric Inference. Journal of the American Statistical Association, 113(522): 767-779. doi:10.1080/01621459.2017.1285776.
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2019. nprobust: Nonparametric Kernel-Based Estimation and Robust Bias-Corrected Inference. Journal of Statistical Software, 91(8): 1-33. doi:10.18637/jss.v091.i08.
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2022. Coverage Error Optimal Confidence Intervals for Local Polynomial Regression. Bernoulli, 28(4): 2998-3022.
Fan, J., and Gijbels, I. 1996. Local polynomial modelling and its applications, London: Chapman and Hall.
Wand, M., and Jones, M. 1995. Kernel Smoothing, Florida: Chapman & Hall/CRC.
Examples
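library(nprobust)
x   <- rnorm(500)    # simulated sample from a standard normal
est <- kdrobust(x)   # density estimates at the default evaluation points
summary(est)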
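For intuition about what the main bandwidth h controls, here is a hand-rolled kernel density estimate with the Epanechnikov kernel in base R. It is a conceptual sketch only: it applies no bias correction and is not the implementation used by kdrobust. The function names epa and kde_epa, the evaluation points, and the bandwidth value are all illustrative assumptions.

```r
set.seed(1)
x <- rnorm(500)

# Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere
epa <- function(u) 0.75 * (1 - u^2) * (abs(u) <= 1)

# Kernel density estimate at each evaluation point with a common bandwidth h
kde_epa <- function(x, eval_pts, h) {
  vapply(eval_pts, function(x0) mean(epa((x - x0) / h)) / h, numeric(1))
}

eval_pts <- c(-1, 0, 1)
f_hat    <- kde_epa(x, eval_pts, h = 0.5)   # h = 0.5 is an arbitrary illustrative value

# Compare with the true standard normal density at the same points
cbind(eval = eval_pts, f_hat = f_hat, f_true = dnorm(eval_pts))
```

A larger h averages over more observations (smoother, more bias); a smaller h uses fewer (noisier, less bias), which is the trade-off the bandwidth selectors target.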
Bandwidth Selection Procedures for Local Polynomial Regression Estimation and Inference
Description
lpbwselect implements bandwidth selectors for local polynomial regression point estimators and inference procedures developed in Calonico, Cattaneo and Farrell (2018). See also Calonico, Cattaneo and Farrell (2022) for related optimality results. 
It also implements other bandwidth selectors available in the literature. See Wand and Jones (1995) and Fan and Gijbels (1996) for background references.
Companion command: lprobust for local polynomial point estimation and inference procedures.
A detailed introduction to this command is given in Calonico, Cattaneo and Farrell (2019). For more details, and related Stata and R packages useful for empirical analysis, visit https://nppackages.github.io/.
Usage
lpbwselect(y, x, eval = NULL, neval = NULL, p = NULL, deriv = NULL,
kernel = "epa", bwselect = "mse-dpi", bwcheck = 21, bwregul = 1, 
imsegrid = 30, vce = "nn", cluster = NULL,
nnmatch = 3, interior = FALSE, subset = NULL)
Arguments
| y | dependent variable. | 
| x | independent variable. | 
| eval | vector of evaluation point(s). By default it uses 30 equally spaced points over the support of  | 
| neval | number of quantile-spaced evaluation points on support of  | 
| p | polynomial order used to construct point estimator; default is  | 
| deriv | derivative order of the regression function to be estimated. Default is  | 
| kernel | kernel function used to construct local polynomial estimators. Options are  | 
| bwselect | bandwidth selection procedure to be used. Options are: 
 
 
 
 
 
 
 Note: MSE = Mean Square Error; IMSE = Integrated Mean Squared Error; CE = Coverage Error; DPI = Direct Plug-in; ROT = Rule-of-Thumb. For details on implementation see Calonico, Cattaneo and Farrell (2019). | 
| bwcheck | if a positive integer is provided, then the selected bandwidth is enlarged so that at least  | 
| bwregul | specifies scaling factor for the regularization term added to the denominator of bandwidth selectors. Setting  | 
| imsegrid | number of evaluation points used to compute the IMSE bandwidth selector. Default is  | 
| vce | procedure used to compute the variance-covariance matrix estimator. Options are: 
 
 
 
 
 | 
| cluster | indicates the cluster ID variable used for cluster-robust variance estimation with degrees-of-freedom weights. By default it is combined with  | 
| nnmatch | to be combined with for  | 
| interior | if TRUE, all evaluation points are assumed to be interior points. This option affects only data-driven bandwidth selection via  | 
| subset | optional rule specifying a subset of observations to be used. | 
Value
| Estimate | A matrix containing  | 
| opt | A list containing options passed to the function. | 
Author(s)
Sebastian Calonico, University of California, Davis, CA. scalonico@ucdavis.edu.
Matias D. Cattaneo, Princeton University, Princeton, NJ. cattaneo@princeton.edu.
Max H. Farrell, University of California, Santa Barbara, CA. maxhfarrell@ucsb.edu.
References
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2018. On the Effect of Bias Estimation on Coverage Accuracy in Nonparametric Inference. Journal of the American Statistical Association, 113(522): 767-779. doi:10.1080/01621459.2017.1285776.
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2019. nprobust: Nonparametric Kernel-Based Estimation and Robust Bias-Corrected Inference. Journal of Statistical Software, 91(8): 1-33. doi:10.18637/jss.v091.i08.
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2022. Coverage Error Optimal Confidence Intervals for Local Polynomial Regression. Bernoulli, 28(4): 2998-3022.
Fan, J., and Gijbels, I. 1996. Local polynomial modelling and its applications, London: Chapman and Hall.
Wand, M., and Jones, M. 1995. Kernel Smoothing, Florida: Chapman & Hall/CRC.
Examples
library(nprobust)
x   <- runif(500)
y   <- sin(4*x) + rnorm(500)   # noisy sine signal
est <- lpbwselect(y,x)         # bandwidths at the default evaluation points
summary(est)
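A slightly more explicit call may help show how the arguments fit together. The sketch below requests bandwidths at three user-chosen evaluation points for a local linear fit and then inspects the two components listed under Value. The evaluation points, the seed, and the choice p = 1 are illustrative assumptions; all argument names come from the Usage section.

```r
library(nprobust)

set.seed(1)
x <- runif(500)
y <- sin(4 * x) + rnorm(500)

# Bandwidths at three user-chosen evaluation points, local linear fit (p = 1)
bw <- lpbwselect(y, x, eval = c(0.25, 0.50, 0.75), p = 1)

bw$Estimate   # matrix described under Value
bw$opt        # list of options passed to the function
```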
Local Polynomial Methods with Robust Bias-Corrected Inference
Description
lprobust implements local polynomial regression point estimators, with robust bias-corrected confidence intervals and inference procedures developed in Calonico, Cattaneo and Farrell (2018). See also Calonico, Cattaneo and Farrell (2022) for related optimality results.
It also implements other estimation and inference procedures available in the literature. See Wand and Jones (1995) and Fan and Gijbels (1996) for background references.
Companion commands: lpbwselect for local polynomial data-driven bandwidth selection, and nprobust.plot for plotting results.
A detailed introduction to this command is given in Calonico, Cattaneo and Farrell (2019). For more details, and related Stata and R packages useful for empirical analysis, visit https://nppackages.github.io/.
Usage
lprobust(y, x, eval = NULL, neval = NULL, p = NULL, deriv = NULL, 
h = NULL, b = NULL, rho = 1, kernel = "epa", bwselect = NULL, 
bwcheck = 21, bwregul = 1, imsegrid = 30, vce = "nn", covgrid = FALSE, 
cluster = NULL, nnmatch = 3, level = 95, interior = FALSE, subset = NULL) 
Arguments
| y | dependent variable. | 
| x | independent variable. | 
| eval | vector of evaluation point(s). By default it uses 30 equally spaced points over the support of  | 
| neval | number of quantile-spaced evaluation points on support of  | 
| p | polynomial order used to construct point estimator; default is  | 
| deriv | derivative order of the regression function to be estimated. Default is  | 
| h | main bandwidth used to construct local polynomial point estimator. Can be either scalar (same bandwidth for all evaluation points), or vector of same dimension as  | 
| b | bias bandwidth used to construct the bias-correction estimator. Can be either scalar (same bandwidth for all evaluation points), or vector of same dimension as  | 
| rho | Sets  | 
| kernel | kernel function used to construct local polynomial estimators. Options are  | 
| bwselect | bandwidth selection procedure to be used via  
 
 
 
 
 
 
 Note: MSE = Mean Square Error; IMSE = Integrated Mean Squared Error; CE = Coverage Error; DPI = Direct Plug-in; ROT = Rule-of-Thumb. For details on implementation see Calonico, Cattaneo and Farrell (2019). | 
| bwcheck | if a positive integer is provided, then the selected bandwidth is enlarged so that at least  | 
| bwregul | specifies scaling factor for the regularization term added to the denominator of bandwidth selectors. Setting  | 
| imsegrid | number of evaluation points used to compute the IMSE bandwidth selector. Default is  | 
| vce | procedure used to compute the variance-covariance matrix estimator. Options are: 
 
 
 
 
 | 
| covgrid | if TRUE, it computes two covariance matrices (cov.us and cov.rb) for classical and robust covariances across point estimators over the grid of evaluation points. | 
| cluster | indicates the cluster ID variable used for cluster-robust variance estimation with degrees-of-freedom weights. By default it is combined with  | 
| nnmatch | to be combined with for  | 
| level | confidence level used for confidence intervals; default is  | 
| interior | if TRUE, all evaluation points are assumed to be interior points. This option affects only data-driven bandwidth selection via  | 
| subset | optional rule specifying a subset of observations to be used. | 
Value
| Estimate | A matrix containing  | 
| opt | A list containing options passed to the function. | 
Author(s)
Sebastian Calonico, University of California, Davis, CA. scalonico@ucdavis.edu.
Matias D. Cattaneo, Princeton University, Princeton, NJ. cattaneo@princeton.edu.
Max H. Farrell, University of California, Santa Barbara, CA. maxhfarrell@ucsb.edu.
References
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2018. On the Effect of Bias Estimation on Coverage Accuracy in Nonparametric Inference. Journal of the American Statistical Association, 113(522): 767-779. doi:10.1080/01621459.2017.1285776.
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2019. nprobust: Nonparametric Kernel-Based Estimation and Robust Bias-Corrected Inference. Journal of Statistical Software, 91(8): 1-33. doi:10.18637/jss.v091.i08.
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2022. Coverage Error Optimal Confidence Intervals for Local Polynomial Regression. Bernoulli, 28(4): 2998-3022.
Fan, J., and Gijbels, I. 1996. Local polynomial modelling and its applications, London: Chapman and Hall.
Wand, M., and Jones, M. 1995. Kernel Smoothing, Florida: Chapman & Hall/CRC.
Examples
library(nprobust)
x   <- runif(500)
y   <- sin(4*x) + rnorm(500)   # noisy sine signal
est <- lprobust(y,x)           # local polynomial fit with default settings
summary(est)
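To illustrate the basic idea behind a local polynomial point estimate, the base-R sketch below fits a kernel-weighted linear regression around a single evaluation point and reads off the intercept. It is a conceptual sketch under stated assumptions, not the lprobust implementation: there is no bias correction or robust standard error, and the helper names epa and local_linear, the evaluation point 0.5, and the bandwidth 0.2 are all hypothetical choices.

```r
set.seed(42)
n <- 500
x <- runif(n)
y <- sin(4 * x) + rnorm(n)

# Epanechnikov kernel weights
epa <- function(u) 0.75 * (1 - u^2) * (abs(u) <= 1)

# Local linear estimate of E[y | x = x0] using bandwidth h
local_linear <- function(x0, x, y, h) {
  w    <- epa((x - x0) / h)
  keep <- w > 0                                   # observations inside the window
  d    <- data.frame(y = y[keep], z = x[keep] - x0, w = w[keep])
  fit  <- lm(y ~ z, data = d, weights = w)        # weighted least squares
  unname(coef(fit)[1])                            # intercept = fitted value at x0
}

m_hat <- local_linear(0.5, x, y, h = 0.2)
c(estimate = m_hat, truth = sin(4 * 0.5))
```

Centering the regressor at x0 is the design choice that makes the intercept the estimate of the regression function at that point; the slope would estimate its first derivative.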
Graphical Presentation of Results from nprobust Package
Description
nprobust.plot plots estimated density and regression functions obtained with the nprobust package. A detailed introduction to this command is given in Calonico, Cattaneo and Farrell (2019).
Companion commands: lprobust for local polynomial point estimation and inference procedures, and kdrobust for kernel density point estimation and inference procedures.
For more details, and related Stata and R packages useful for empirical analysis, visit https://nppackages.github.io/.
Usage
nprobust.plot(..., alpha = NULL, type = NULL, CItype = NULL,
  title = "", xlabel = "", ylabel = "", lty = NULL, lwd = NULL,
  lcol = NULL, pty = NULL, pwd = NULL, pcol = NULL, CIshade = NULL,
  CIcol = NULL, legendTitle = NULL, legendGroups = NULL)
Arguments
| ... | |
| alpha | Numeric scalar between 0 and 1, the significance level for plotting confidence regions. If more than one is provided, they will be applied to data series accordingly. | 
| type | String, one of  | 
| CItype | String, one of  | 
| title,xlabel,ylabel | Strings, title of the plot and labels for x- and y-axis. | 
| lty | Line type for point estimates, only effective if  | 
| lwd | Line width for point estimates, only effective if  | 
| lcol | Line color for point estimates, only effective if  | 
| pty | Scatter plot type for point estimates, only effective if  | 
| pwd | Scatter plot size for point estimates, only effective if  | 
| pcol | Scatter plot color for point estimates, only effective if  | 
| CIshade | Numeric, opacity of the confidence region, between 0 (transparent) and 1 (opaque). Default is 0.2. If more than one is provided, they will be applied to data series accordingly. | 
| CIcol | color for confidence region.  | 
| legendTitle | String, title of legend. | 
| legendGroups | String Vector, group names used in legend. | 
Details
Companion command: lprobust for local polynomial-based regression functions and derivatives estimation.
Value
A standard ggplot2 object is returned, so it can be customized further with ggplot2 functions.
Author(s)
Sebastian Calonico, University of California, Davis, CA. scalonico@ucdavis.edu.
Matias D. Cattaneo, Princeton University, Princeton, NJ. cattaneo@princeton.edu.
Max H. Farrell, University of California, Santa Barbara, CA. maxhfarrell@ucsb.edu.
References
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2019. nprobust: Nonparametric Kernel-Based Estimation and Robust Bias-Corrected Inference. Journal of Statistical Software, 91(8): 1-33. doi:10.18637/jss.v091.i08.
Examples
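library(nprobust)
x   <- runif(500)
y   <- sin(4*x) + rnorm(500)   # noisy sine signal
est <- lprobust(y,x)           # local polynomial fit with default settings
nprobust.plot(est)             # returns a ggplot2 object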
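Because the return value is a ggplot2 object, standard ggplot2 layers can be added to it. The following sketch sets axis labels through the documented xlabel and ylabel arguments and then adds a theme and a caption with ggplot2. The label text, caption, and seed are illustrative assumptions.

```r
library(nprobust)
library(ggplot2)

set.seed(1)
x   <- runif(500)
y   <- sin(4 * x) + rnorm(500)
est <- lprobust(y, x)

# Labels set through nprobust.plot's own arguments
p <- nprobust.plot(est, xlabel = "x", ylabel = "estimated regression function")

# Further customization with ordinary ggplot2 components
p2 <- p +
  theme_minimal() +
  labs(caption = "Simulated data: y = sin(4x) + noise")

print(p2)
```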