px_micro() exists to support a specific use case for Statistics Greenland.
They use it to create small PX-files to showcase and present metadata from a lager data set which cannot be made publicly available. See an example on Statistic Greenland’s microdata for Research and Analysis.
px_micro()Apart from px_save(), px_micro() is the
only other function that can save px objects as PX-files.
px_micro() turns a px object into many smaller PX-files,
each containing a subset of the variables in the original px object.
The basis of micro files are usually a data set which doesn’t have a
count variable (like most PX-files). px_micro() will
instead create a count of each individual variable.
In this example we will use the built-in data data set
greenlanders.
library(pxmake)
greenlanders |> dplyr::sample_n(10) |> dplyr::arrange_all()
#> # A tibble: 10 × 4
#>    cohort gender   age municipality
#>    <chr>  <chr>  <int> <chr>       
#>  1 A      female    21 Sermersooq  
#>  2 A      female    59 Qeqertalik  
#>  3 A      male      24 Kujalleq    
#>  4 A      male      48 Qeqqata     
#>  5 A      male      63 Avannaata   
#>  6 B      female    32 Kujalleq    
#>  7 B      female    73 Avannaata   
#>  8 B      male      44 Qeqertalik  
#>  9 B      male      62 Sermersooq  
#> 10 B      male      98 QeqertalikCreate a px object with px(), and pass it to
px_micro().
# Create px object
x <- px(greenlanders)
# Create folder for micro files
micro_dir <- file.path("micro_files")
dir.create(micro_dir)
# Write micro files to folder
px_micro(x, out_dir = micro_dir)The folder now contains three PX-files, one for each variable except ‘age’.
The reason ‘age’ didn’t get a PX-file is because it is the HEADING
variable in x, and px_micro() creates a file
for each non-HEADING variable. Instead the HEADNING variable is used in
all the created PX-files.
# Print HEADING variables
px_heading(x)
#> [1] "age"
# Print non-HEADING variables
c(px_stub(x), px_figures(x))
#> [1] "cohort"       "gender"       "municipality"In this case, we want ‘cohort’ to be heading, and to create a PX-file for ‘gender’, ‘age’ and ‘municipality’.
The folder now contains the files we wanted.
Each file contains one of the three variables as STUB, ‘cohort’ as HEADING, and a variable ‘n’ which is the count of each combination of the variables.
px(file.path(micro_dir, 'age.px'))$data
#> # A tibble: 120 × 3
#>    age   cohort     n
#>    <chr> <chr>  <dbl>
#>  1 18    A         NA
#>  2 18    B          1
#>  3 19    A         NA
#>  4 19    B          1
#>  5 20    A         NA
#>  6 20    B          1
#>  7 21    A          3
#>  8 21    B         NA
#>  9 22    A          1
#> 10 22    B         NA
#> # ℹ 110 more rows
px(file.path(micro_dir, 'gender.px'))$data
#> # A tibble: 4 × 3
#>   gender cohort     n
#>   <chr>  <chr>  <dbl>
#> 1 female A         31
#> 2 female B         18
#> 3 male   A         23
#> 4 male   B         28
px(file.path(micro_dir, 'municipality.px'))$data
#> # A tibble: 10 × 3
#>    municipality cohort     n
#>    <chr>        <chr>  <dbl>
#>  1 Avannaata    A         11
#>  2 Avannaata    B         11
#>  3 Kujalleq     A         12
#>  4 Kujalleq     B          7
#>  5 Qeqertalik   A         13
#>  6 Qeqertalik   B          5
#>  7 Qeqqata      A          9
#>  8 Qeqqata      B         11
#>  9 Sermersooq   A          9
#> 10 Sermersooq   B         12In general the keyword values from the px object are carried over to the micro files. This is the case for keywords like ‘MATRIX’, ‘SUBJECT-CODE’, ‘CONTACT’, ‘LANGUAGE’, ‘CODEPAGE’, etc.
To change keywords across all the micro files, the easiest is to
change them in the px object before calling px_micro().
# Change CONTACT in all micro files
x2 |>
  px_contact("Johan Ejstrud") |>
  px_micro(out_dir = micro_dir)However, some keywords need to be changed individually for each micro file. To do so, create a data frame with the column ‘variable’ and a column for each px keyword to change.
individual_keywords <- tibble::tribble(~variable     ,      ~px_description,
                                       "age"         ,    "Age count 18-99",
                                       "gender"      ,       "Gender count",
                                       "municipality",  "Municipality 2024"
                                       )Supply this dataframe to the keyword_values argument of
px_micro().
DESCRIPTION is changed in the micro files:
px(file.path(micro_dir, 'age.px')) %>% px_description()
#> [1] "Age count 18-99"
px(file.path(micro_dir, 'gender.px')) %>% px_description()
#> [1] "Gender count"
px(file.path(micro_dir, 'municipality.px')) %>% px_description()
#> [1] "Municipality 2024"For multilingual files add a ‘language’ column to
keyword_values.
x3 <-
  x2 |>
  px_language("en") |>
  px_languages(c("en", "kl"))
individual_keywords_ml <- 
  tibble::tribble(
       ~variable, ~language,     ~px_description, ~px_matrix,
           "age",      "en",   "Age count 18-99",      "AGE",
           "age",      "kl",       "Ukiut 18-99",         NA,
        "gender",      "en",      "Gender count",      "GEN",
        "gender",      "kl",      " Suiaassuseq",         NA,
  "municipality",      "en", "Municipality 2024",      "MUN",
  "municipality",      "kl",      "Kommuni 2024",         NA
  )
px_micro(x3, out_dir = micro_dir, keyword_values = individual_keywords_ml)Here ‘px_description’ varies for each language, and ‘px_matrix’ is only set for one of the languages, since it is not a language dependent keywords. For language independant keywords it doesn’t matter which language the value is set for.
The filenames of the micro files are by default the name of the variable, however these can also be changed by passing a ‘filename’ column to ‘keyword_values’
individual_keywords2 <- 
  individual_keywords |>
  dplyr::mutate(filename = paste0(variable, "_2024", ".px"))
# Clear folder
unlink(file.path(micro_dir, "*.px"))
px_micro(x2, out_dir = micro_dir, keyword_values = individual_keywords2)
list.files(micro_dir)
#> [1] "age_2024.px"          "gender_2024.px"       "municipality_2024.px"