Synthesise and correlate rating-scale data with predefined first & second moments (mean and standard deviation)
LikertMakeR synthesises Likert scale and related rating-scale data. Such scales are constrained by upper and lower bounds and discrete increments.
The package is intended for
“reproducing” rating-scale data for further analysis and visualisation when only summary statistics have been reported,
teaching: helping researchers and students to better understand the relationships among scale properties, sample size, number of items, etc.,
checking the feasibility of scale moments with given scale and correlation properties.
Functions in this development version of LikertMakeR are:
lfast() draws repeated random samples from a scaled Beta distribution to approximate predefined first and second moments
lexact() attempts to produce a vector with exact predefined first and second moments
lcor() rearranges the values in the columns of a dataframe so that they are correlated to match a predefined correlation matrix
makeCorrAlpha() constructs a random correlation matrix of given dimensions and predefined Cronbach’s Alpha
makeItems() is a wrapper function for lfast() and lcor() to generate synthetic rating-scale data with predefined first and second moments and a predefined correlation matrix
alpha() calculates Cronbach’s Alpha from a given correlation matrix or a given dataframe
eigenvalues() calculates eigenvalues of a correlation matrix, reports on positive-definite status of the matrix and, optionally, displays a scree plot to visualise the eigenvalues
A Likert scale is the mean, or sum, of several ordinal rating scales. These rating scales are bipolar (usually “agree-disagree”) responses to propositions that are determined to be moderately-to-highly correlated and that capture various facets of a theoretical construct.
Rating scales are not continuous or unbounded.
For example, a 5-point Likert scale that is constructed with, say, five items (questions) will have a summed range of between 5 (all rated ‘1’) and 25 (all rated ‘5’) with all integers in between, and the mean range will be ‘1’ to ‘5’ with intervals of 1/5=0.20. A 7-point Likert scale constructed from eight items will have a summed range between 8 (all rated ‘1’) and 56 (all rated ‘7’) with all integers in between, and the mean range will be ‘1’ to ‘7’ with intervals of 1/8=0.125.
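The arithmetic above can be checked with a few lines of base R. This is only an illustrative sketch: scale_values() is a hypothetical helper and is not part of LikertMakeR.

```r
# List the possible summed and averaged values of a Likert scale.
# scale_values() is a hypothetical helper, not a LikertMakeR function.
scale_values <- function(points, items, lowest = 1) {
  highest <- lowest + points - 1
  sums <- seq(items * lowest, items * highest)  # every integer in the range
  list(sums = sums, means = sums / items)
}

five_by_five <- scale_values(points = 5, items = 5)
range(five_by_five$sums)        # 5 25
diff(five_by_five$means)[1]     # 0.2

seven_by_eight <- scale_values(points = 7, items = 8)
range(seven_by_eight$sums)      # 8 56
diff(seven_by_eight$means)[1]   # 0.125
```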
Typically, a researcher will synthesise rating-scale data by sampling with a predetermined probability distribution. For example, the following code will generate a vector of values for a single Likert-scale item, with approximately the given probabilities.
n <- 128
sample(1:5, n, replace = TRUE,
prob = c(0.1, 0.2, 0.4, 0.2, 0.1)
)
This approach is good for testing Likert items, but it does not help when working on complete Likert scales, or when we want to specify means and standard deviations as they might be reported in published research.
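To see why, note that the probabilities passed to sample() fix the moments of the underlying distribution, while the mean and standard deviation of any one sample vary around them. A minimal base R sketch, using the same probabilities as the code above (the variable names are illustrative only):

```r
values <- 1:5
prob <- c(0.1, 0.2, 0.4, 0.2, 0.1)

# Moments implied by the probability distribution itself
implied_mean <- sum(values * prob)                         # 3
implied_sd <- sqrt(sum(prob * (values - implied_mean)^2))  # sqrt(1.2), about 1.095

# Moments of one random draw: near the implied values, rarely equal to them
set.seed(42)
draw <- sample(values, 128, replace = TRUE, prob = prob)
c(mean = mean(draw), sd = sd(draw))
```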
The functions lfast() and lexact() allow the user to specify exact univariate statistics as they might ordinarily be reported.
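Not every combination of mean and standard deviation is attainable on a bounded scale. As a rough sanity check before synthesising data, the sketch below computes an upper limit on the sample standard deviation for a given mean, using the general result that the variance of a bounded variable cannot exceed (upper - mean) * (mean - lower). max_sd() is a hypothetical helper, not a LikertMakeR function, and passing this check is necessary but not sufficient for a discrete scale.

```r
# Upper limit on the sample standard deviation (n - 1 denominator)
# of n values bounded by lowerbound and upperbound with a given mean.
# max_sd() is a hypothetical helper, not part of LikertMakeR.
max_sd <- function(mean, lowerbound, upperbound, n) {
  stopifnot(mean >= lowerbound, mean <= upperbound, n > 1)
  population_variance <- (upperbound - mean) * (mean - lowerbound)
  sqrt(population_variance * n / (n - 1))
}

# A reported sd larger than this limit cannot be reproduced
max_sd(mean = 4.5, lowerbound = 1, upperbound = 7, n = 256)
max_sd(mean = 1.5, lowerbound = 1, upperbound = 5, n = 64)
```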
To download and install the package, run the following code from your R console.
From CRAN:
install.packages('LikertMakeR')
The latest development version is available from the author’s GitHub repository.
library(devtools)
install_github("WinzarH/LikertMakeR")
To synthesise a rating scale, the user must input the following parameters:
n: sample size
mean: desired mean
sd: desired standard deviation
lowerbound: desired lower bound
upperbound: desired upper bound
items: number of items making the scale. Default = 1
seed: optional seed for reproducibility
x <- lfast(
n = 256,
mean = 4.5, sd = 1.0,
lowerbound = 1,
upperbound = 7,
items = 5
)
x <- lfast(256, 2.5, 2.5, 0, 10)
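For intuition about what "repeated random samples from a scaled Beta distribution" can look like, here is a simplified base R sketch. It is an assumption-laden illustration, not the package's implementation: beta_sketch() and its arguments are hypothetical, the Beta shape parameters come from a plain method-of-moments fit, and the bounds are assumed to be integers.

```r
# Simplified illustration only; not the algorithm used by lfast().
beta_sketch <- function(n, target_mean, target_sd, lowerbound, upperbound,
                        items = 1, tries = 100) {
  width <- upperbound - lowerbound
  m <- (target_mean - lowerbound) / width   # mean rescaled to (0, 1)
  v <- (target_sd / width)^2                # variance on the (0, 1) scale
  stopifnot(m > 0, m < 1, v < m * (1 - m))  # needed for positive shape parameters
  k <- m * (1 - m) / v - 1
  shape1 <- m * k
  shape2 <- (1 - m) * k

  best <- NULL
  best_error <- Inf
  for (i in seq_len(tries)) {
    x <- lowerbound + width * rbeta(n, shape1, shape2)  # rescale to the bounds
    x <- round(x * items) / items                       # snap to steps of 1 / items
    error <- abs(mean(x) - target_mean) + abs(sd(x) - target_sd)
    if (error < best_error) {                           # keep the closest draw so far
      best <- x
      best_error <- error
    }
  }
  best
}

set.seed(42)
x <- beta_sketch(n = 256, target_mean = 4.5, target_sd = 1.0,
                 lowerbound = 1, upperbound = 7, items = 5)
c(mean = mean(x), sd = sd(x))
```

Keeping the best of many draws trades computing time for accuracy: more tries bring the sample moments closer to the targets, but do not guarantee an exact match.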
LikertMakeR offers another function, lcor(), which rearranges the values in the columns of a data set so that they are correlated at a specified level. lcor() does not change the values - it swaps their positions in each column so that univariate statistics do not change, but their correlations with other columns do.
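The principle that swapping positions changes correlations but not univariate statistics can be shown in base R. The sketch below is not the lcor() algorithm, which works towards a specified target matrix; it only reorders one vector to follow the ranks of another, which gives the largest positive correlation available from those values.

```r
set.seed(42)
a <- sample(1:5, 64, replace = TRUE)
b <- sample(1:5, 64, replace = TRUE)

# Reorder b so that its values follow the rank order of a.
# b_rearranged holds exactly the same values as b, in different positions.
b_rearranged <- sort(b)[rank(a, ties.method = "first")]

# Univariate statistics are unchanged
c(mean(b), mean(b_rearranged))
c(sd(b), sd(b_rearranged))

# The correlation with a is not
c(before = cor(a, b), after = cor(a, b_rearranged))
```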
To create the desired correlations, the user must define the following objects:
data: a starter data set of rating-scales
target: the target correlation matrix
set.seed(42) ## for reproducibility
n <- 64
x1 <- lfast(n, 3.5, 1.00, 1, 5, 5)
x2 <- lfast(n, 1.5, 0.75, 1, 5, 5)
x3 <- lfast(n, 3.0, 1.70, 1, 5, 5)
x4 <- lfast(n, 2.5, 1.50, 1, 5, 5)
mydat4 <- data.frame(x1, x2, x3, x4)
head(mydat4)
cor(mydat4) |> round(3)
tgt4 <- matrix(
c(
1.00, 0.55, 0.60, 0.75,
0.55, 1.00, 0.25, 0.65,
0.60, 0.25, 1.00, 0.80,
0.75, 0.65, 0.80, 1.00
),
nrow = 4
)
new4 <- lcor(data = mydat4, target = tgt4)
cor(new4) |> round(3)
mydat3 <- data.frame(x1, x2, x3)
tgt3 <- matrix(
c(
1.00, -0.50, -0.85,
-0.50, 1.00, 0.60,
-0.85, 0.60, 1.00
),
nrow = 3
)
new3 <- lcor(mydat3, tgt3)
cor(new3) |> round(3)
makeCorrAlpha() constructs a random correlation matrix of given dimensions and predefined Cronbach’s Alpha.
Random values generated by makeCorrAlpha() are volatile. makeCorrAlpha() may not generate a feasible (positive-definite) correlation matrix, especially when variance is high relative to the desired Alpha and the desired correlation dimensions.
makeCorrAlpha() will inform the user if the resulting correlation matrix is positive definite, or not.
Because solutions are so volatile, a feasible solution may still be possible, and often is, even when the returned correlation matrix is not positive-definite. The user is encouraged to try again, possibly several times, to find one.
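The "try again" advice can be automated with a small retry loop. The sketch below is one possible approach, not a feature of the package: is_positive_definite() and retry_until_pd() are hypothetical helpers, and random_candidate() is a stand-in generator included only so the example runs without LikertMakeR installed.

```r
# Hypothetical helpers, not part of LikertMakeR.
is_positive_definite <- function(m) {
  all(eigen(m, symmetric = TRUE, only.values = TRUE)$values > 0)
}

# Call generate() until it returns a positive-definite matrix.
retry_until_pd <- function(generate, max_tries = 20) {
  for (attempt in seq_len(max_tries)) {
    candidate <- generate()
    if (is_positive_definite(candidate)) {
      return(list(matrix = candidate, attempts = attempt))
    }
  }
  stop("no positive-definite matrix found in ", max_tries, " attempts")
}

# Stand-in generator: a random symmetric 3 x 3 matrix with a unit diagonal
random_candidate <- function() {
  m <- diag(3)
  m[upper.tri(m)] <- runif(3, -1, 1)
  m[lower.tri(m)] <- t(m)[lower.tri(m)]
  m
}

set.seed(42)
result <- retry_until_pd(random_candidate)
result$attempts

# With LikertMakeR loaded, the generator could wrap the call used in this section:
# result <- retry_until_pd(function() makeCorrAlpha(4, 0.85, 0.5))
```

Do not call set.seed() inside the generator, otherwise every attempt would return the same matrix.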
items <- 4
alpha <- 0.85
variance <- 0.5
set.seed(42)
cor_matrix_4 <- makeCorrAlpha(items, alpha, variance)
alpha(cor_matrix_4)
eigenvalues(cor_matrix_4, 1)
items <- 12
alpha <- 0.90
variance <- 1.0
set.seed(42)
cor_matrix_12 <- makeCorrAlpha(items, alpha, variance)
alpha(cor_matrix_12)
eigenvalues(cor_matrix_12, 1)
makeItems() generates a dataframe of random discrete values from a scaled Beta distribution so the data replicate a rating scale, and are correlated close to a predefined correlation matrix.
makeItems() is a wrapper function for:
lfast(), which takes repeated samples selecting a vector that best fits the desired moments, and
lcor(), which rearranges values in each column of the dataframe so they closely match the desired correlation matrix.
n <- 16
dfMeans <- c(2.5, 3.0, 3.0, 3.5)
dfSds <- c(1.0, 1.0, 1.5, 0.75)
lowerbound <- rep(1, 4)
upperbound <- rep(5, 4)
corMat <- matrix(
c(
1.00, 0.25, 0.35, 0.40,
0.25, 1.00, 0.70, 0.75,
0.35, 0.70, 1.00, 0.80,
0.40, 0.75, 0.80, 1.00
),
nrow = 4, ncol = 4
)
df <- makeItems(
n = n,
means = dfMeans,
sds = dfSds,
lowerbound = lowerbound,
upperbound = upperbound,
cormatrix = corMat
)
print(df)
apply(df, 2, mean) |> round(3)
apply(df, 2, sd) |> round(3)
cor(df) |> round(3)
LikertMakeR includes two additional functions that may be of help when examining parameters and output.
alpha() calculates Cronbach’s Alpha from a given correlation matrix or a given dataframe
eigenvalues() calculates eigenvalues of a correlation matrix, reports on whether the correlation matrix is positive definite and, optionally, produces a scree plot
alpha() accepts, as input, either a correlation matrix or a dataframe. If both are submitted, then the correlation matrix is used by default, with a message to that effect.
df <- data.frame(
V1 = c(4, 2, 4, 3, 2, 2, 2, 1),
V2 = c(4, 1, 3, 4, 4, 3, 2, 3),
V3 = c(4, 1, 3, 5, 4, 1, 4, 2),
V4 = c(4, 3, 4, 5, 3, 3, 3, 3)
)
corMat <- matrix(
c(
1.00, 0.35, 0.45, 0.70,
0.35, 1.00, 0.60, 0.55,
0.45, 0.60, 1.00, 0.65,
0.70, 0.55, 0.65, 1.00
),
nrow = 4, ncol = 4
)
alpha(cormatrix = corMat)
alpha(NULL, df)
alpha(corMat, df)
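As a cross-check on the value obtained from a correlation matrix, the standardised form of Cronbach's Alpha can be computed by hand from the number of items k and the mean inter-item correlation: k * mean_r / (1 + (k - 1) * mean_r). The sketch below uses base R only. standardised_alpha() is a hypothetical helper, and it is an assumption that the package's alpha() uses this same formula; results from a dataframe may differ because they depend on item variances.

```r
# Standardised Cronbach's Alpha from a correlation matrix.
# standardised_alpha() is a hypothetical helper, not the package's alpha().
standardised_alpha <- function(cormatrix) {
  k <- ncol(cormatrix)
  mean_r <- mean(cormatrix[upper.tri(cormatrix)])  # mean off-diagonal correlation
  k * mean_r / (1 + (k - 1) * mean_r)
}

corMat <- matrix(
  c(
    1.00, 0.35, 0.45, 0.70,
    0.35, 1.00, 0.60, 0.55,
    0.45, 0.60, 1.00, 0.65,
    0.70, 0.55, 0.65, 1.00
  ),
  nrow = 4, ncol = 4
)

# Mean correlation is 0.55, so Alpha = 2.2 / 2.65, about 0.830
standardised_alpha(corMat)
```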
eigenvalues() calculates eigenvalues of a correlation matrix, reports on whether the matrix is positive-definite, and optionally produces a scree plot.
correlationMatrix <- matrix(
c(
1.00, 0.25, 0.35, 0.40,
0.25, 1.00, 0.70, 0.75,
0.35, 0.70, 1.00, 0.80,
0.40, 0.75, 0.80, 1.00
),
nrow = 4, ncol = 4
)
evals <- eigenvalues(cormatrix = correlationMatrix)
print(evals)
evals <- eigenvalues(correlationMatrix, 1)
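The same checks can be reproduced with base R's eigen(), which is useful for seeing what "positive definite" means in practice: every eigenvalue must be greater than zero. check_eigen() below is a hypothetical helper written for illustration, not the package's eigenvalues() function, and the second matrix is a made-up example of an infeasible set of correlations.

```r
# Hypothetical helper, not part of LikertMakeR.
check_eigen <- function(cormatrix, scree = FALSE) {
  values <- eigen(cormatrix, symmetric = TRUE, only.values = TRUE)$values
  if (scree) {
    plot(values, type = "b", xlab = "Component", ylab = "Eigenvalue",
         main = "Scree plot")
  }
  list(values = values, positive_definite = all(values > 0))
}

feasible <- matrix(
  c(
    1.00, 0.25, 0.35, 0.40,
    0.25, 1.00, 0.70, 0.75,
    0.35, 0.70, 1.00, 0.80,
    0.40, 0.75, 0.80, 1.00
  ),
  nrow = 4, ncol = 4
)

# Two items cannot both correlate 0.9 with a third and -0.9 with each other
infeasible <- matrix(
  c(
    1.0,  0.9,  0.9,
    0.9,  1.0, -0.9,
    0.9, -0.9,  1.0
  ),
  nrow = 3, ncol = 3
)

check_eigen(feasible)$positive_definite    # TRUE
check_eigen(infeasible)$positive_definite  # FALSE
sum(check_eigen(feasible)$values)          # eigenvalues sum to the number of items
```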
Here’s how to cite this package:
Winzar, H. (2022). LikertMakeR: Synthesise and correlate rating-scale
data with predefined first & second moments,
The Comprehensive R Archive Network (CRAN),
<https://CRAN.R-project.org/package=LikertMakeR>
@software{winzar2022,
title = {LikertMakeR: Synthesise and correlate rating-scale data with predefined first & second moments},
author = {Hume Winzar},
abstract = {LikertMakeR synthesises Likert scale and related rating-scale data with predefined means and standard deviations, and optionally correlates these vectors to fit a predefined correlation matrix.},
journal = {The Comprehensive R Archive Network (CRAN)},
month = {12},
year = {2022},
url = {https://CRAN.R-project.org/package=LikertMakeR},
}