
Welcome to the R package ds4psy — a software companion to the books and courses Data Science for Psychologists and Introduction to Data Science.
This R package provides datasets and functions used in the ds4psy and i2ds books and corresponding course curricula. These books and courses introduce the principles and methods of data science for students of psychology and other biological or social sciences.
The current release of ds4psy is available from CRAN at https://CRAN.R-project.org/package=ds4psy:
install.packages('ds4psy')  # install ds4psy from CRAN client
library('ds4psy')           # load to use the package
The current development version of ds4psy can be installed from its GitHub repository at https://github.com/hneth/ds4psy/:
# install.packages('devtools')  # (if not installed yet)
devtools::install_github('hneth/ds4psy')
library('ds4psy')  # load to use the package
The most recent version of the ds4psy book is freely available at https://bookdown.org/hneth/ds4psy/.
This R package and the corresponding books and courses provide an introduction to data science that is tailored to the needs of students in psychology, but is also suitable for students of the humanities and other biological or social sciences. This audience typically has some knowledge of statistics, but rarely an idea of how data is prepared and shaped to allow for statistical testing. By using various data types and working with many examples, we teach tools for transforming, summarizing, and visualizing data. By keeping our eyes open for the perils of misleading representations, the book fosters fundamental skills of data literacy and cultivates reproducible research practices that enable and precede any practical use of statistics.
Students of psychology and other social sciences are trained to analyze data. But the data they learn to work with (e.g., in courses on statistics and empirical research methods) is typically provided to them and structured in a (rectangular or “tidy”) format that presupposes many steps of data processing regarding the aggregation and spatial layout of variables. When beginning to collect their own data, students inevitably struggle with these pre-processing steps which — even for experienced data scientists — tend to require more time and effort than choosing and conducting statistical tests.
This course develops the foundations of data analysis that allow students to collect data from real-world sources and transform and shape such data to answer scientific and practical questions. Although there are many good introductions to data science (e.g., Grolemund & Wickham, 2017), they typically do not take into account the special needs (and often the anxieties and reservations) of psychology students. As social scientists are not computer scientists, we introduce new concepts and commands without assuming a mathematical or computational background. Adopting a task-oriented perspective, we begin with a specific problem and then solve it with some combination of data collection, manipulation, and visualization.
Our main goal is to develop a set of useful skills in analyzing real-world data and conducting reproducible research. Upon completing this course, you will be able to use R to read, transform, analyze, and visualize data of various types. Many interactive exercises allow students to continuously check their understanding, practice their skills, and monitor their progress.
The courses using this package assume some basic familiarity with statistics and the R programming language, but enthusiastic programming novices are welcome.
This package and the corresponding books are still being developed and are updated as new materials become available.
The current version of the book Introduction to Data Science is available at https://bookdown.org/hneth/i2ds/.
The most recent version of the book Data Science for Psychologists is available at https://bookdown.org/hneth/ds4psy/.
The current R package ds4psy is available at https://CRAN.R-project.org/package=ds4psy.
For ds4psy sources, there are 2 GitHub repositories to be distinguished:
The current textbook Introduction to Data Science is online at https://bookdown.org/hneth/i2ds/.
The most recent version of Data Science for Psychologists is online at https://bookdown.org/hneth/ds4psy/.
These books and courses were originally based on the classic textbook:
but provide more base R and less tidyverse content.
Please install the following open-source programs on your computer:
RStudio is an integrated development environment (IDE) for R.
# Tidyverse packages: 
install.packages('tidyverse')
# Course packages: 
install.packages('ds4psy')  # datasets and functions
install.packages('unikn')   # color palettes and functions
See the books on R and data science available on https://bookdown.org.
Posit.co resources: RStudio IDE, R Markdown, and various cheat sheets
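Before running the install commands listed above, it can help to see which of the packages are already present. The following is a minimal sketch, not part of ds4psy: the helper name `missing_pkgs()` is hypothetical and was made up for this example. It only reports missing packages and leaves the actual installation to you.

```r
# Hypothetical helper (not part of ds4psy):
# return the names of those packages in `pkgs` that are not installed.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages named in the installation instructions above:
course_pkgs <- c("tidyverse", "ds4psy", "unikn")
to_install  <- missing_pkgs(course_pkgs)

if (length(to_install) > 0) {
  message("Not yet installed: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install from CRAN
} else {
  message("All course packages are installed.")
}
```

Note that `requireNamespace()` loads a package's namespace without attaching it, so a `library()` call is still needed before using the package interactively.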
If you find these materials useful, or want to adopt or alter them for your purposes, please let me know.
To cite ds4psy in derivations and publications, please use:
A BibTeX entry for LaTeX users is:
@Manual{,
  title = {ds4psy: Data Science for Psychologists},
  author = {Hansjörg Neth},
  year = {2025},
  organization = {Social Psychology and Decision Sciences, University of Konstanz},
  address = {Konstanz, Germany},
  note = {R package (version 1.1.0, September 12, 2025); Textbook at <https://bookdown.org/hneth/ds4psy/>.},
  url = {https://CRAN.R-project.org/package=ds4psy},
  doi = {10.5281/zenodo.7229812},
}
The stable URL of the ds4psy R package is https://CRAN.R-project.org/package=ds4psy.
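The citation information can also be retrieved from within R, using the standard `citation()` and `toBibtex()` functions from the utils package. The sketch below assumes that ds4psy may or may not be installed and guards for both cases. The output reflects whichever version is installed locally, so it may differ from the entry shown above.

```r
# Retrieve citation information for an installed package.
# The guard lets the snippet run even when ds4psy is not installed yet.
if (requireNamespace("ds4psy", quietly = TRUE)) {
  cit <- citation("ds4psy")  # citation info of the installed package version
  print(cit)                 # formatted reference
  print(toBibtex(cit))       # corresponding BibTeX entry
} else {
  message("ds4psy is not installed; see the installation instructions above.")
}
```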
Data science for psychologists (ds4psy) by Hansjörg Neth is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
[File README.md updated on 2025-09-12.]