rfars

The goal of rfars is to facilitate transportation safety analysis by simplifying the process of extracting data from official crash databases. The National Highway Traffic Safety Administration collects and publishes a census of fatal crashes in the Fatality Analysis Reporting System and a sample of fatal and non-fatal crashes in the Crash Report Sampling System (an evolution of the General Estimates System). The Fatality and Injury Reporting System Tool allows users to query these databases and can produce simple tables and graphs. This suffices for simple analysis, but often leaves researchers wanting more. Digging any deeper, however, involves a time-consuming process of combing through immense data dictionaries to determine the required variables and table names, then downloading annual ZIP files and attempting to stitch them together.

rfars allows users to download the last 10 years of FARS and GES/CRSS data with just one line of code. The result is a full, rich dataset ready for mapping, modeling, and other downstream analysis. Codebooks with variable definitions and value labels support an informed analysis of the data (see vignette("Searchable Codebooks", package = "rfars") for more information). Helper functions are also provided to produce common counts and comparisons.

Installation

You can install the latest version of rfars from GitHub with:

# install.packages("devtools")
devtools::install_github("s87jackson/rfars")

or the CRAN stable release with:

install.packages("rfars")

Then load rfars and some helpful packages:

library(rfars)
library(dplyr)

Getting and Using Data

get_fars() and get_gescrss() are the primary functions of the rfars package. These functions download and process data files directly from NHTSA’s FTP site, pull prepared data stored on your local machine, or (as of Version 2.0) pull prepared data from Zenodo. The data files hosted on Zenodo are stable, have DOIs, and replicate the data that get_fars() and get_gescrss() would produce from the NHTSA files, but in a fraction of the time.

Both functions take the parameter years, plus states (FARS) or regions (GES/CRSS). As the source data files follow an annual structure, years determines how many file sets are downloaded or loaded, and states/regions filters the resulting dataset. Downloading and processing these files can take several minutes. Before downloading, rfars informs you that it’s about to download files and asks your permission to do so. To skip this dialog, set proceed = TRUE. You can use the dir and cache parameters to save an RDS file to your local machine. The dir parameter specifies the directory, and cache names the file (be sure to include the .rds file extension).
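As a rough sketch of how these parameters fit together, the snippet below builds an argument list and passes it to get_fars(). The specific values (the two years, the state written as a full name, the temporary directory, and the file name fars_va.rds) are illustrative assumptions, not recommendations from the package. The call itself is guarded so that nothing is downloaded unless you run it interactively with rfars installed.

```r
# Arguments named in the text above; the values are illustrative assumptions.
fars_args <- list(
  years   = 2021:2022,     # two annual file sets
  states  = "Virginia",    # assumed format for the state filter
  proceed = TRUE,          # skip the permission dialog
  dir     = tempdir(),     # directory in which the RDS file is saved
  cache   = "fars_va.rds"  # file name, including the .rds extension
)

# Only call the package in an interactive session with rfars installed,
# because the call may download and process files.
if (interactive() && requireNamespace("rfars", quietly = TRUE)) {
  myFARS_va <- do.call(rfars::get_fars, fars_args)
}
```

Keeping the arguments in a list makes it easy to reuse the same years, dir, and cache settings across sessions.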

Executing the code below will download the prepared FARS and GES/CRSS databases for 2014-2023.

myFARS <- get_fars(proceed = TRUE)
myCRSS <- get_gescrss(proceed = TRUE)

get_fars() and get_gescrss() return a list with six dataframes: flat, multi_acc, multi_veh, multi_per, events, and codebook.

The tables below show records for randomly selected crashes to illustrate the content and structure of the data. The tables are transposed for readability.

Each row in the flat dataframe corresponds to a person involved in a crash. As there may be multiple people and/or vehicles involved in one crash, some variable values are repeated within a crash or vehicle. Each crash is uniquely identified with id, which is a combination of year and st_case. Note that st_case is not unique across years: for example, st_case 510001 will appear in each year. The id variable attempts to avoid this issue. The GES/CRSS data includes a weight variable that indicates how many crashes each row represents.
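The toy example below illustrates why st_case alone is not a safe key. It assumes id is a simple concatenation of year and st_case, which matches the sample rows in this README (2014 and 270304 give 2014270304). The data frame is made up for illustration and is not package output.

```r
# Toy rows: the same st_case value appears in two different years.
crashes <- data.frame(
  year    = c(2014, 2015, 2014),
  st_case = c(510001, 510001, 270304)
)

# st_case alone repeats across years ...
any(duplicated(crashes$st_case))

# ... while year and st_case pasted together do not.
crashes$id <- paste0(crashes$year, crashes$st_case)
any(duplicated(crashes$id))
```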

The ‘flat’ dataframe (transposed for readability)
year 2014 2014 2014 2014 2014
state Minnesota Minnesota Minnesota South Dakota South Dakota
st_case 270304 270304 270304 460097 460097
id 2014270304 2014270304 2014270304 2014460097 2014460097
veh_no 1 2 2 1 1
per_no 1 1 2 1 2
county 113 113 113 11 11
city 0 0 0 0 0
lon -96.20801 -96.20801 -96.20801 -96.64642 -96.64642
lat 48.15133 48.15133 48.15133 44.23894 44.23894
acc_type Initial Opposite Directions (Left/Right) Initial Opposite Directions (Going Straight) Initial Opposite Directions (Going Straight) Drive Off Road Drive Off Road
age 68 Years 58 Years 83 Years 28 Years 24 Years
air_bag Deployed- Front Deployed- Front Deployed- Front Deployed- Front Deployed- Front
alc_det Not Reported Not Reported Not Reported Not Reported Not Reported
alc_res 0.00 % BAC Test Not Given Not Reported 0.25 % BAC Unknown if tested
alc_status Test Given Test Not Given Not Reported Test Given UnKnown if Tested
arr_hour 6:00pm-6:59pm 6:00pm-6:59pm 6:00pm-6:59pm Unknown EMS Scene Arrival Hour Unknown EMS Scene Arrival Hour
arr_min 21 21 21 Unknown EMS Scene Arrival Minutes Unknown EMS Scene Arrival Minutes
atst_typ Blood Test Not Given Not Reported Unknown Test Type Unknown if Tested
bikecgp NA NA NA NA NA
bikectype NA NA NA NA NA
bikedir NA NA NA NA NA
bikeloc NA NA NA NA NA
bikepos NA NA NA NA NA
body_typ Minivan (Chrysler Town and Country, Caravan, Grand Caravan, Voyager, Voyager, Honda-Odyssey, …) 4-door sedan, hardtop 4-door sedan, hardtop 4-door sedan, hardtop 4-door sedan, hardtop
bus_use Not a Bus Not a Bus Not a Bus Not a Bus Not a Bus
cargo_bt Not Applicable (N/A) Not Applicable (N/A) Not Applicable (N/A) Not Applicable (N/A) Not Applicable (N/A)
cdl_stat No (CDL) No (CDL) No (CDL) Valid Valid
cert_no ************ ************ ************ ************ ************
day 11 11 11 28 28
day_week Thursday Thursday Thursday Sunday Sunday
death_da Not Applicable (Non-Fatal) Not Applicable (Non-Fatal) 11 Not Applicable (Non-Fatal) 28
death_hr Not Applicable (Non-fatal) Not Applicable (Non-fatal) 20:00-20:59 Not Applicable (Non-fatal) 1:00-1:59
death_mn Not Applicable (Non-fatal) Not Applicable (Non-fatal) 5 Not Applicable (Non-fatal) 6
death_mo Not Applicable (Non-Fatal) Not Applicable (Non-Fatal) December Not Applicable (Non-Fatal) September
death_tm 8888 8888 2005 8888 106
death_yr Not Applicable (Non-fatal) Not Applicable (Non-fatal) 2014 Not Applicable (Non-fatal) 2014
deaths 0 1 1 1 1
deformed Disabling Damage Disabling Damage Disabling Damage Disabling Damage Disabling Damage
doa Not Applicable Not Applicable Not Applicable Not Applicable Not Applicable
dr_drink No No No Yes Yes
dr_hgt 69 59 59 999 999
dr_pres Yes Yes Yes Yes Yes
dr_wgt 200 lbs. 250 lbs. 250 lbs. Unknown Unknown
dr_zip NA NA NA NA NA
drinking Unknown (Police Reported) No (Alcohol Not Involved) Not Reported Yes (Alcohol Involved) Not Reported
drug_det Not Reported Not Reported Not Reported Not Reported Not Reported
drugs Unknown No (drugs not involved) Not Reported Not Reported Not Reported
drunk_dr 0 0 0 1 1
dstatus Test Given Test Not Given Not Reported Test Not Given Unknown if Tested
ej_path Not Ejected/Not Applicable Not Ejected/Not Applicable Not Ejected/Not Applicable Not Ejected/Not Applicable Through Side Window
ejection Not Ejected Not Ejected Not Ejected Not Ejected Totally Ejected
emer_use Not Applicable Not Applicable Not Applicable Not Applicable Not Applicable
extricat Not Extricated or Not Applicable Not Extricated or Not Applicable Not Extricated or Not Applicable Unknown Not Extricated or Not Applicable
fatals 1 1 1 1 1
fire_exp No or Not Reported No or Not Reported No or Not Reported No or Not Reported No or Not Reported
first_mo No Record No Record No Record No Record No Record
first_yr No Record No Record No Record No Record No Record
gvwr Not Applicable Not Applicable Not Applicable Not Applicable Not Applicable
harm_ev Motor Vehicle In-Transport Motor Vehicle In-Transport Motor Vehicle In-Transport Other Post, Other Pole or Other Supports Other Post, Other Pole or Other Supports
haz_cno Not Applicable Not Applicable Not Applicable Not Applicable Not Applicable
haz_id Not Applicable Not Applicable Not Applicable Not Applicable Not Applicable
haz_inv No No No No No
haz_plac Not Applicable Not Applicable Not Applicable Not Applicable Not Applicable
haz_rel Not Applicable Not Applicable Not Applicable Not Applicable Not Applicable
hispanic Not A Fatality (not Applicable) Not A Fatality (not Applicable) Non-Hispanic Not A Fatality (not Applicable) Non-Hispanic
hit_run No No No No No
hosp_hr 6:00pm-6:59pm 6:00pm-6:59pm 6:00pm-6:59pm Unknown Unknown
hosp_mn 47 47 47 Unknown EMS Hospital Arrival Time Unknown EMS Hospital Arrival Time
hospital EMS Ground EMS Ground EMS Ground EMS Ground EMS Ground
hour 6:00pm-6:59pm 6:00pm-6:59pm 6:00pm-6:59pm 0:00am-0:59am 0:00am-0:59am
impact1 1 Clock Point 12 Clock Point 12 Clock Point 12 Clock Point 12 Clock Point
inj_sev Suspected Serious Injury(A) Suspected Serious Injury(A) Fatal Injury (K) Suspected Serious Injury(A) Fatal Injury (K)
j_knife Not an Articulated Vehicle Not an Articulated Vehicle Not an Articulated Vehicle Not an Articulated Vehicle Not an Articulated Vehicle
l_compl Valid license for this class vehicle Valid license for this class vehicle Valid license for this class vehicle Valid license for this class vehicle Valid license for this class vehicle
l_endors No Endorsements required for this vehicle No Endorsements required for this vehicle No Endorsements required for this vehicle No Endorsements required for this vehicle No Endorsements required for this vehicle
l_restri Restrictions, Compliance Unknown Restrictions, Compliance Unknown Restrictions, Compliance Unknown Restrictions Complied With Restrictions Complied With
l_state Minnesota Minnesota Minnesota Minnesota Minnesota
l_status Valid Valid Valid Valid Valid
l_type Full Driver License Full Driver License Full Driver License Full Driver License Full Driver License
lag_hrs 999 999 1 999 0
lag_mins 99 99 53 99 51
last_mo No Record No Record No Record No Record No Record
last_yr No Record No Record No Record No Record No Record
lgt_cond Dark - Not Lighted Dark - Not Lighted Dark - Not Lighted Dark - Not Lighted Dark - Not Lighted
location Occupant of a Motor Vehicle Occupant of a Motor Vehicle Occupant of a Motor Vehicle Occupant of a Motor Vehicle Occupant of a Motor Vehicle
m_harm Motor Vehicle In-Transport Motor Vehicle In-Transport Motor Vehicle In-Transport Rollover/Overturn Rollover/Overturn
mak_mod Dodge Caravan/Grand Caravan Ford Taurus/Taurus X Ford Taurus/Taurus X Chevrolet Lumina Chevrolet Lumina
make Dodge Ford Ford Chevrolet Chevrolet
man_coll Angle Angle Angle Not a Collision with Motor Vehicle In-Transport Not a Collision with Motor Vehicle In-Transport
mcarr_i1 Not Applicable Not Applicable Not Applicable Not Applicable Not Applicable
mcarr_i2 Not Applicable Not Applicable Not Applicable Not Applicable Not Applicable
mcarr_id Not Applicable Not Applicable Not Applicable Not Applicable Not Applicable
milept NA NA NA NA NA
minute 12 12 12 15 15
mod_year NA NA NA NA NA
model 442 17 17 20 20
month December December December September September
motdir NA NA NA NA NA
motman NA NA NA NA NA
msafeqmt NA NA NA NA NA
nhs This section IS NOT on the NHS This section IS NOT on the NHS This section IS NOT on the NHS This section IS NOT on the NHS This section IS NOT on the NHS
not_hour 6:00pm-6:59pm 6:00pm-6:59pm 6:00pm-6:59pm Unknown Unknown
not_min 12 12 12 Unknown Unknown
numoccs 01 02 02 02 02
owner Driver (in this crash) Not Registered Owner (Other Private Owner Listed) Driver (in this crash) was Registered Owner Driver (in this crash) was Registered Owner Driver (in this crash) Not Registered Owner (Other Private Owner Listed) Driver (in this crash) Not Registered Owner (Other Private Owner Listed)
p_crash1 Turning Left Going Straight Going Straight Going Straight Going Straight
p_crash2 Turning left at junction From opposite direction over left lane line From opposite direction over left lane line Over the lane line on right side of travel lane Over the lane line on right side of travel lane
p_crash3 No Avoidance Maneuver Steering right Steering right Unknown Unknown
pbcwalk NA NA NA NA NA
pbswalk NA NA NA NA NA
pbszone NA NA NA NA NA
pcrash4 Tracking Tracking Tracking Tracking Tracking
pcrash5 Stayed in original travel lane Stayed in original travel lane Stayed in original travel lane Departed roadway Departed roadway
pedcgp NA NA NA NA NA
pedctype NA NA NA NA NA
peddir NA NA NA NA NA
pedleg NA NA NA NA NA
pedloc NA NA NA NA NA
pedpos NA NA NA NA NA
peds 0 0 0 0 0
pedsnr NA NA NA NA NA
per_typ Driver of a Motor Vehicle In-Transport Driver of a Motor Vehicle In-Transport Passenger of a Motor Vehicle In-Transport Driver of a Motor Vehicle In-Transport Passenger of a Motor Vehicle In-Transport
permvit 3 3 3 2 2
pernotmvit 0 0 0 0 0
persons 3 3 3 2 2
prev_acc None None None None None
prev_dwi None None None None None
prev_oth None None None None None
prev_spd None None None None None
prev_sus None None None None None
pvh_invl 0 0 0 0 0
race Not a Fatality (not Applicable) Not a Fatality (not Applicable) White Not a Fatality (not Applicable) White
rail Not Applicable Not Applicable Not Applicable Not Applicable Not Applicable
reg_stat Minnesota Minnesota Minnesota Minnesota Minnesota
rel_road On Roadway On Roadway On Roadway On Roadside On Roadside
reljct1 No No No No No
reljct2 Intersection Intersection Intersection Non-Junction Non-Junction
rest_mis No No No No No
rest_use Shoulder and Lap Belt Used Shoulder and Lap Belt Used Unknown None Used None Used
road_fnc Rural-Minor Arterial Rural-Minor Arterial Rural-Minor Arterial Rural-Major Collector Rural-Major Collector
rolinloc No Rollover No Rollover No Rollover On Roadside On Roadside
rollover No Rollover No Rollover No Rollover Rollover, Tripped by Object/Vehicle Rollover, Tripped by Object/Vehicle
route U.S. Highway U.S. Highway U.S. Highway State Highway State Highway
rur_urb Rural Rural Rural Rural Rural
sch_bus No No No No No
seat_pos Front Seat, Left Side Front Seat, Left Side Front Seat, Right Side Front Seat, Left Side Front Seat, Right Side
sex Male Female Female Male Male
sp_jur No Special Jurisdiction No Special Jurisdiction No Special Jurisdiction No Special Jurisdiction No Special Jurisdiction
spec_use No Special Use No Special Use No Special Use No Special Use No Special Use
speedrel No No No No No
str_veh 0 0 0 0 0
tow_veh No Trailing Units No Trailing Units No Trailing Units No Trailing Units No Trailing Units
towed Towed Due to Disabling Damage Towed Due to Disabling Damage Towed Due to Disabling Damage Towed Due to Disabling Damage Towed Due to Disabling Damage
trav_sp Unknown Unknown Unknown 031 MPH 031 MPH
tway_id US-59 US-59 US-59 SR-324 SR-324
tway_id2 CR-31 CR-31 CR-31 NA NA
typ_int Four-Way Intersection Four-Way Intersection Four-Way Intersection Not an Intersection Not an Intersection
underide No Underride or Override Noted No Underride or Override Noted No Underride or Override Noted No Underride or Override Noted No Underride or Override Noted
unittype Motor Vehicle In-Transport (Inside or Outside the Trafficway) Motor Vehicle In-Transport (Inside or Outside the Trafficway) Motor Vehicle In-Transport (Inside or Outside the Trafficway) Motor Vehicle In-Transport (Inside or Outside the Trafficway) Motor Vehicle In-Transport (Inside or Outside the Trafficway)
v_config Not Applicable Not Applicable Not Applicable Not Applicable Not Applicable
valign Straight Straight Straight Straight Straight
ve_forms 2 2 2 1 1
ve_total 2 2 2 1 1
vin NA NA NA NA NA
vnum_lan Two lanes Two lanes Two lanes Two lanes Two lanes
vpavetyp Blacktop, Bituminous, or Asphalt Blacktop, Bituminous, or Asphalt Blacktop, Bituminous, or Asphalt Blacktop, Bituminous, or Asphalt Blacktop, Bituminous, or Asphalt
vprofile Level Level Level Level Level
vspd_lim 60 MPH 60 MPH 60 MPH 55 MPH 55 MPH
vsurcond Dry Dry Dry Dry Dry
vtcont_f No Controls No Controls No Controls No Controls No Controls
vtrafcon No Controls No Controls No Controls No Controls No Controls
vtrafway Two-Way, Not Divided Two-Way, Not Divided Two-Way, Not Divided Two-Way, Not Divided Two-Way, Not Divided
work_inj Not Applicable (not a fatality) Not Applicable (not a fatality) No Not Applicable (not a fatality) No
wrk_zone None None None None None
func_sys NA NA NA NA NA
rd_owner NA NA NA NA NA
cityname NA NA NA NA NA
countyname NA NA NA NA NA
statename NA NA NA NA NA
trlr1vin NA NA NA NA NA
trlr2vin NA NA NA NA NA
trlr3vin NA NA NA NA NA
nmhelmet NA NA NA NA NA
nmlight NA NA NA NA NA
nmothpre NA NA NA NA NA
nmothpro NA NA NA NA NA
nmpropad NA NA NA NA NA
nmrefclo NA NA NA NA NA
prev_sus1 NA NA NA NA NA
prev_sus2 NA NA NA NA NA
prev_sus3 NA NA NA NA NA
helm_mis NA NA NA NA NA
helm_use NA NA NA NA NA
gvwr_from NA NA NA NA NA
gvwr_to NA NA NA NA NA
icfinalbody NA NA NA NA NA
trlr1gvwr NA NA NA NA NA
trlr2gvwr NA NA NA NA NA
trlr3gvwr NA NA NA NA NA
vpicbodyclass NA NA NA NA NA
vpicmake NA NA NA NA NA
vpicmodel NA NA NA NA NA
underoverride NA NA NA NA NA
devmotor NA NA NA NA NA
devtype NA NA NA NA NA
acc_config NA NA NA NA NA
a1 0 0 0 25 25
a2 0 0 0 25 25
a3 0 0 0 25 25
a4 0 0 0 25 25
a5 0 0 0 25 25
a6 0 0 0 25 25
a7 0 0 0 25 25
a8 0 0 0 25 25
a9 0 0 0 25 25
a10 0 0 0 25 25
p1 0 0 NA 25 NA
p2 0 0 NA 25 NA
p3 0 0 NA 25 NA
p4 0 0 NA 25 NA
p5 0 0 NA 25 NA
p6 0 0 NA 25 NA
p7 0 0 NA 25 NA
p8 0 0 NA 25 NA
p9 0 0 NA 25 NA
p10 0 0 NA 25 NA
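
Because flat has one row per person, crash-level values such as fatals repeat on every row of a crash. The sketch below rebuilds a handful of columns from the five sample rows above as a toy data frame (not real package output) and shows one way to count people, vehicles, and crashes without double counting.

```r
# Five person-level rows, using a few variables copied from the table above.
flat_toy <- data.frame(
  id      = c("2014270304", "2014270304", "2014270304",
              "2014460097", "2014460097"),
  veh_no  = c(1, 2, 2, 1, 1),
  per_no  = c(1, 1, 2, 1, 2),
  inj_sev = c("Suspected Serious Injury(A)", "Suspected Serious Injury(A)",
              "Fatal Injury (K)", "Suspected Serious Injury(A)",
              "Fatal Injury (K)"),
  fatals  = c(1, 1, 1, 1, 1),
  stringsAsFactors = FALSE
)

n_people   <- nrow(flat_toy)                               # one row per person
n_crashes  <- length(unique(flat_toy$id))                  # distinct crashes
n_vehicles <- nrow(unique(flat_toy[, c("id", "veh_no")]))  # distinct vehicles

# Fatalities counted from the person rows ...
killed_per_crash <- tapply(flat_toy$inj_sev == "Fatal Injury (K)",
                           flat_toy$id, sum)

# ... versus the crash-level 'fatals' value, taken once per crash.
crash_level <- flat_toy[!duplicated(flat_toy$id), c("id", "fatals")]
total_fatals <- sum(crash_level$fatals)
```

Summing fatals over all rows of flat_toy would count each crash once per person, which is why the crash-level subset is taken first.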

The multi_ dataframes contain those variables for which there may be a varying number of values for any entity (e.g., driver impairments, vehicle events, weather conditions at time of crash). Each dataframe has the requisite data elements corresponding to the entity: multi_acc includes st_case and year, multi_veh adds veh_no (vehicle number), and multi_per adds per_no (person number).

The ‘multi_acc’ dataframe
state st_case name value year
Minnesota 270304 weather1 Cloudy 2014
South Dakota 460097 weather1 Cloudy 2014

The ‘multi_veh’ dataframe
state st_case name value year
Minnesota 270304 weather1 Cloudy 2014
South Dakota 460097 weather1 Cloudy 2014

The ‘multi_per’ dataframe
state st_case veh_no per_no name value year
Minnesota 270304 1 1 drugtst1 Blood 2014
Minnesota 270304 1 1 drugtst2 Test Not Given 2014
Minnesota 270304 1 1 drugtst3 Test Not Given 2014
Minnesota 270304 1 1 drugres1 Tested, No Drugs Found/Negative 2014
Minnesota 270304 1 1 drugres2 Test Not Given 2014
Minnesota 270304 1 1 drugres3 Test Not Given 2014
Minnesota 270304 2 1 drugtst1 Test Not Given 2014
Minnesota 270304 2 1 drugtst2 Test Not Given 2014
Minnesota 270304 2 1 drugtst3 Test Not Given 2014
Minnesota 270304 2 1 drugres1 Test Not Given 2014
Minnesota 270304 2 1 drugres2 Test Not Given 2014
Minnesota 270304 2 1 drugres3 Test Not Given 2014
Minnesota 270304 2 2 drugtst2 Test Not Given 2014
Minnesota 270304 2 2 drugtst3 Test Not Given 2014
Minnesota 270304 2 2 drugres2 Test Not Given 2014
Minnesota 270304 2 2 drugres3 Test Not Given 2014
South Dakota 460097 1 1 drugtst1 Test Not Given 2014
South Dakota 460097 1 1 drugtst2 Test Not Given 2014
South Dakota 460097 1 1 drugtst3 Test Not Given 2014
South Dakota 460097 1 1 drugres1 Test Not Given 2014
South Dakota 460097 1 1 drugres2 Test Not Given 2014
South Dakota 460097 1 1 drugres3 Test Not Given 2014
South Dakota 460097 1 2 drugtst1 Unknown if Tested 2014
South Dakota 460097 1 2 drugtst2 Test Not Given 2014
South Dakota 460097 1 2 drugtst3 Test Not Given 2014
South Dakota 460097 1 2 drugres1 Unknown if Tested 2014
South Dakota 460097 1 2 drugres2 Test Not Given 2014
South Dakota 460097 1 2 drugres3 Test Not Given 2014
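
The multi_ dataframes are in long format (one row per name/value pair), so a common step is to pick one variable and join it back onto the person rows. The sketch below does this with base R on toy data copied from the sample rows above. The subset of columns and the choice of drugres1 are illustrative assumptions.

```r
# A few rows from the 'multi_per' table above (long format).
multi_per_toy <- data.frame(
  st_case = c(270304, 270304, 270304, 460097, 460097),
  veh_no  = c(1, 1, 2, 1, 1),
  per_no  = c(1, 1, 1, 1, 2),
  name    = c("drugtst1", "drugres1", "drugres1", "drugres1", "drugres1"),
  value   = c("Blood", "Tested, No Drugs Found/Negative", "Test Not Given",
              "Test Not Given", "Unknown if Tested"),
  year    = 2014,
  stringsAsFactors = FALSE
)

# Matching person rows from 'flat', reduced to the keys and one attribute.
flat_keys <- data.frame(
  year    = 2014,
  st_case = c(270304, 270304, 270304, 460097, 460097),
  veh_no  = c(1, 2, 2, 1, 1),
  per_no  = c(1, 1, 2, 1, 2),
  sex     = c("Male", "Female", "Female", "Male", "Male"),
  stringsAsFactors = FALSE
)

# Keep one variable of interest and rename its value column.
drugres1 <- subset(multi_per_toy, name == "drugres1",
                   select = c(year, st_case, veh_no, per_no, value))
names(drugres1)[names(drugres1) == "value"] <- "drugres1"

# Left join: every person is kept, with NA where no drugres1 row exists.
people <- merge(flat_keys, drugres1,
                by = c("year", "st_case", "veh_no", "per_no"), all.x = TRUE)
```

The second occupant of vehicle 2 in the Minnesota crash has no drugres1 row in the sample, so that person ends up with NA after the join.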

The events dataframe provides a sequence of events for each vehicle in each crash. See vignette("Crash Sequence of Events", package = "rfars") for more information.

The ‘events’ dataframe
state st_case veh_no aoi soe veventnum year
Minnesota 270304 1 1 Clock Point Motor Vehicle In-Transport 1 2014
Minnesota 270304 2 12 Clock Point Motor Vehicle In-Transport 1 2014
Minnesota 270304 2 Non-Harmful Event Ran Off Roadway - Right 2 2014
Minnesota 270304 2 12 Clock Point Ditch 3 2014
South Dakota 460097 1 Non-Harmful Event Ran Off Roadway - Right 1 2014
South Dakota 460097 1 12 Clock Point Other Post, Other Pole or Other Supports 2 2014
South Dakota 460097 1 Non-Collision Rollover/Overturn 3 2014
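
One way to work with these rows is to order them by veventnum within each vehicle and collapse each sequence into a single string. The sketch below uses base R on a toy copy of the rows above (the aoi column is omitted for brevity). It is an illustration only and not the approach taken in the vignette.

```r
# Toy copy of the 'events' rows shown above.
events_toy <- data.frame(
  st_case   = c(270304, 270304, 270304, 270304, 460097, 460097, 460097),
  veh_no    = c(1, 2, 2, 2, 1, 1, 1),
  soe       = c("Motor Vehicle In-Transport", "Motor Vehicle In-Transport",
                "Ran Off Roadway - Right", "Ditch",
                "Ran Off Roadway - Right",
                "Other Post, Other Pole or Other Supports",
                "Rollover/Overturn"),
  veventnum = c(1, 1, 2, 3, 1, 2, 3),
  year      = 2014,
  stringsAsFactors = FALSE
)

# Order events within each vehicle so the collapsed strings read in sequence.
events_toy <- events_toy[order(events_toy$year, events_toy$st_case,
                               events_toy$veh_no, events_toy$veventnum), ]

# One row per vehicle, with its events joined by " > ".
sequences <- aggregate(soe ~ year + st_case + veh_no, data = events_toy,
                       FUN = paste, collapse = " > ")
```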

The codebook dataframe provides a searchable codebook for the data, useful if you know what concept you’re looking for but not the variable that describes it. rfars also includes pre-loaded codebooks for FARS and GES/CRSS (rfars::fars_codebook and rfars::gescrss_codebook). See vignette("Searchable Codebooks", package = "rfars") for more information.

Counts

See vignette("Counts", package = "rfars") for information on the pre-loaded annual_counts dataframe and the counts() and compare_counts() functions. Also see vignette("Alcohol Counts", package = "rfars") for details on how BAC values are imputed and reported in Traffic Safety Facts.