## Introduction: Manipulate codelists

This vignette introduces a set of functions designed to manipulate and explore codelists within an OMOP CDM. Specifically, we will learn how to subset a codelist, stratify it, add or exclude concepts, and compare two codelists.

First of all, we will load the required packages and connect to a mock database.
library(DBI)
library(duckdb)
library(dplyr)
library(CDMConnector)
library(CodelistGenerator)
# Download mock database
requireEunomia(datasetName = "synpuf-1k", cdmVersion = "5.3")
# Connect to the database and create the cdm object
con <- dbConnect(duckdb(), eunomiaDir("synpuf-1k", "5.3"))
cdm <- cdmFromCon(con = con,
                  cdmName = "Eunomia Synpuf",
                  cdmSchema = "main",
                  writeSchema = "main",
                  achillesSchema = "main")

We will start by generating a codelist for acetaminophen using `getDrugIngredientCodes()`:

acetaminophen <- getDrugIngredientCodes(cdm,
                                        name = "acetaminophen",
                                        nameStyle = "{concept_name}",
                                        type = "codelist")

Subsetting a codelist allows us to reduce it to only those concepts that meet certain conditions.
We will now subset to those concepts that have `domain = "Drug"`. Remember that, to see the domains available in your codelist, you can use `associatedDomains()`.
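The call for this step is not shown above. A minimal sketch, assuming `subsetOnDomain()` takes the codelist, the cdm reference, and a `domain` argument (the form used in the piped example later in this vignette), and reusing the `cdm` and `acetaminophen` objects created earlier:

```r
library(CodelistGenerator)

# Assumes `cdm` and `acetaminophen` were created as in the setup above.
# Keep only the concepts whose domain is "Drug".
acetaminophen_drug <- subsetOnDomain(acetaminophen,
                                     cdm,
                                     domain = "Drug")
acetaminophen_drug
```

The result is stored as `acetaminophen_drug`, the name that the ingredient-range example in this vignette expects.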
We can use the `negate` argument to exclude concepts with a certain domain:
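As a hedged sketch of the exclusion case, the same call can be repeated with `negate = TRUE`. The object name `acetaminophen_no_drug` is only illustrative, and `cdm` and `acetaminophen` are assumed to exist from the setup above:

```r
library(CodelistGenerator)

# Assumes `cdm` and `acetaminophen` were created as in the setup above.
# Drop the concepts in the "Drug" domain and keep everything else.
acetaminophen_no_drug <- subsetOnDomain(acetaminophen,
                                        cdm,
                                        domain = "Drug",
                                        negate = TRUE)
acetaminophen_no_drug
```

Because an ingredient-based codelist is expected to contain mostly drug concepts, this result may well be empty.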
We will now subset the codelist to only include concepts from the RxNorm vocabulary. You can also use `associatedVocabularies()` to explore the vocabularies available in your codelist.
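The vocabulary step might look like the sketch below. It assumes the package exposes `subsetOnVocabulary()` with the same argument order as the other `subsetOn*()` functions (codelist, cdm, value). The vocabulary is passed by position because the argument name is not confirmed by this text:

```r
library(CodelistGenerator)

# Assumes `cdm` and `acetaminophen` were created as in the setup above.
# Assumption: subsetOnVocabulary() follows the same calling pattern as
# subsetOnDoseUnit(), with the vocabulary given as the third argument.
acetaminophen_rxnorm <- subsetOnVocabulary(acetaminophen,
                                           cdm,
                                           "RxNorm")
acetaminophen_rxnorm
```

The name `acetaminophen_rxnorm` matches the object used in the dose unit example of this vignette.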
We will now filter to only include concepts with specified dose units. Remember that you can use `associatedDoseUnits()` to explore the dose units available in your codelist.

acetaminophen_mg_unit <- subsetOnDoseUnit(acetaminophen_rxnorm,
                                          cdm,
                                          c("milligram", "unit"))
acetaminophen_mg_unit

As before, we can use the argument `negate = TRUE` to exclude these dose units instead.
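A short sketch of that exclusion, applied directly to the original `acetaminophen` codelist so that it does not depend on intermediate objects. The result name `acetaminophen_other_units` is hypothetical:

```r
library(CodelistGenerator)

# Assumes `cdm` and `acetaminophen` were created as in the setup above.
# Keep the concepts whose dose unit is neither "milligram" nor "unit".
acetaminophen_other_units <- subsetOnDoseUnit(acetaminophen,
                                              cdm,
                                              c("milligram", "unit"),
                                              negate = TRUE)
acetaminophen_other_units
```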
We can now subset to those drugs with 3 to 30 ingredients:

acetaminophen_ingredient <- subsetOnIngredientRange(acetaminophen_drug,
                                                    cdm,
                                                    ingredientRange = c(3, 30))
acetaminophen_ingredient

Notice that `negate = TRUE` would keep all those concepts with fewer than 3 or more than 30 ingredients (that is, without including those with exactly 3 or 30 ingredients).
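The boundary behaviour described here can be illustrated in plain R without touching the database. The sketch below uses made-up ingredient counts (not real OMOP data) and is not the package's implementation. It only mirrors the inclusive-range rule stated in the text:

```r
# Hypothetical ingredient counts for five made-up concepts.
ingredient_count <- c(a = 1L, b = 3L, c = 12L, d = 30L, e = 31L)
ingredient_range <- c(3L, 30L)

# The range is inclusive at both ends.
in_range <- ingredient_count >= ingredient_range[1] &
  ingredient_count <= ingredient_range[2]

kept_default <- names(ingredient_count)[in_range]   # negate = FALSE
kept_negated <- names(ingredient_count)[!in_range]  # negate = TRUE

kept_default  # "b" "c" "d"
kept_negated  # "a" "e"
```

Concepts with exactly 3 or 30 ingredients fall inside the range, so they are kept by default and dropped when the selection is negated.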
We will now subset to those concepts whose route category is neither “unclassified_route” nor “transmucosal_rectal”. See `associatedRouteCategories()` to explore the route categories available in your codelist.
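A minimal sketch of this step, assuming `subsetOnRouteCategory()` follows the same pattern as the other subsetting functions and reusing `cdm` and `acetaminophen` from the setup. The result name `acetaminophen_route_subset` is illustrative only:

```r
library(CodelistGenerator)

# Assumes `cdm` and `acetaminophen` were created as in the setup above.
# negate = TRUE drops the two listed route categories and keeps the rest.
acetaminophen_route_subset <- subsetOnRouteCategory(
  acetaminophen,
  cdm,
  c("unclassified_route", "transmucosal_rectal"),
  negate = TRUE
)
acetaminophen_route_subset
```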
We will now subset to those concepts with specific dose forms. See `associatedDoseForms()` to explore the dose forms available in your codelist.
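The dose form call is not shown in this text, so the following is only a sketch. It assumes a `subsetOnDoseForm()` function with the same argument order as the other `subsetOn*()` functions, and it uses "Oral Tablet" as an illustrative dose form that may or may not be present in the mock database:

```r
library(CodelistGenerator)

# Assumes `cdm` and `acetaminophen` were created as in the setup above.
# Assumption: subsetOnDoseForm() takes the codelist, the cdm reference and
# the dose form names, like subsetOnDoseUnit() does for dose units.
# "Oral Tablet" is an illustrative value; check the dose forms available in
# your own codelist before choosing one.
acetaminophen_tablet <- subsetOnDoseForm(acetaminophen,
                                         cdm,
                                         "Oral Tablet")
acetaminophen_tablet
```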
Instead of filtering, stratification allows us to split a codelist into subgroups based on defined vocabulary properties.
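As a sketch of stratification, the codelist can be split by route category and by dose unit. `stratifyByDoseUnit()` appears in the piped example of this vignette. `stratifyByRouteCategory()` is assumed to follow the same calling pattern. Both calls reuse `cdm` and `acetaminophen` from the setup:

```r
library(CodelistGenerator)

# Assumes `cdm` and `acetaminophen` were created as in the setup above.
# One codelist per route category found among the concepts.
acetaminophen_routes <- stratifyByRouteCategory(acetaminophen, cdm)
names(acetaminophen_routes)

# One codelist per dose unit found among the concepts.
acetaminophen_doses <- stratifyByDoseUnit(acetaminophen, cdm = cdm)
names(acetaminophen_doses)
```

The examples that follow operate on `acetaminophen_routes`, whose elements carry names such as “acetaminophen_inhalable”.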
We can also add specific concepts to our codelist. For example, we will add the ingredient “acetaminophen” to all our codelists:

acetaminophen_routes1 <- addConcepts(acetaminophen_routes,
                                     cdm,
                                     concepts = c(1125315L))
acetaminophen_routes1

Or we can add acetaminophen and its descendants, and only to some of the codelists:
x <- getDescendants(cdm = cdm, conceptId = c(1125315L))
acetaminophen_routes2 <- addConcepts(acetaminophen_routes,
                                     cdm,
                                     concepts = x$concept_id,
                                     codelistName = "acetaminophen_unclassified_route_category")
acetaminophen_routes2

And similarly, we can exclude specific concepts and their descendants from our codelist:
acetaminophen_routes3 <- excludeConcepts(acetaminophen_routes,
                                         cdm,
                                         concepts = x$concept_id,
                                         codelistName = "acetaminophen_inhalable")
acetaminophen_routes3

Notice that in this case, the codelist “acetaminophen_inhalable” is removed, as there are no elements left after excluding acetaminophen (concept 1125315) and its descendants.

Notice that all the functions introduced previously are “pipeable”, allowing for a tidy and clear codelist construction:
acetaminophen <- getDrugIngredientCodes(cdm,
                                        name = "acetaminophen",
                                        nameStyle = "{concept_name}",
                                        type = "codelist")
new_codelist <- acetaminophen |>
  addConcepts(cdm,
              concepts = c(1L, 2L, 3L)) |>
  subsetOnDomain(cdm,
                 domain = "Drug") |>
  stratifyByDoseUnit(cdm = cdm) |>
  excludeConcepts(cdm,
                  concepts = c(1127898))
new_codelist

Now we will compare two codelists to identify overlapping and unique codes.
acetaminophen <- getDrugIngredientCodes(cdm,
                                        name = "acetaminophen",
                                        nameStyle = "{concept_name}",
                                        type = "codelist_with_details")
hydrocodone <- getDrugIngredientCodes(cdm,
                                      name = "hydrocodone",
                                      doseUnit = "milligram",
                                      nameStyle = "{concept_name}",
                                      type = "codelist_with_details")

Compare the two sets:
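The comparison call itself is missing from this text. A hedged sketch using `compareCodelists()` follows. It takes the first element of each object by position to avoid assuming the element names, and it assumes the result has a `codelist` column that flags where each concept was found:

```r
library(CodelistGenerator)
library(dplyr)

# Assumes `acetaminophen` and `hydrocodone` were created as above with
# type = "codelist_with_details".
comparison <- compareCodelists(acetaminophen[[1]],
                               hydrocodone[[1]])
comparison

# Assumption: a `codelist` column records whether each concept appears in
# the first codelist only, the second only, or both.
comparison |>
  count(codelist)
```

Counting by that column gives a quick view of how many concepts are shared and how many are unique to each ingredient.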