Using PxWebApi v2 with PxWebApiData

Øyvind Langsrud and Jan Bruusgaard

Preface

This vignette describes the functionality in the R package PxWebApiData related to PxWebApi v2. The new functions follow the recommended snake_case naming convention.

For older functionality, see Using PxWebApi v1 with PxWebApiData.

The vignette first describes the function get_api_data() for retrieving data from a pre-made URL. Note that this function is not limited to PxWebApi. At the end of the vignette, it is also shown how data from Eurostat can be retrieved using get_api_data().

The main function for PxWebApi v2 in the package is api_data(), which is closely related to ApiData() for PxWebApi v1. It retrieves data in one step by calling query_url() and get_api_data() internally.

As described later in the vignette, the package also provides dedicated functions for retrieving metadata associated with PxWebApi v2.


get_api_data(): retrieve data from a pre-made URL

When a data URL is already available, the data can be retrieved using get_api_data(), as illustrated in the example below.

url <- paste0(
  "https://data.ssb.no/api/pxwebapi/v2/tables/04861/data?lang=en",
  "&valueCodes[Region]=0301,324*",
  "&valueCodes[ContentsCode]=???????",
  "&valueCodes[Tid]=top(2)"
)

get_api_data(url)
$`04861: Area and population of urban settlements, by region, contents and year`
         region            contents year  value
1 Oslo - Oslove Number of residents 2024 714630
2 Oslo - Oslove Number of residents 2025 720631
3      Eidsvoll Number of residents 2024  23154
4      Eidsvoll Number of residents 2025  23599
5        Hurdal Number of residents 2024   1174
6        Hurdal Number of residents 2025   1217

$dataset
  Region ContentsCode  Tid  value
1   0301      Bosatte 2024 714630
2   0301      Bosatte 2025 720631
3   3240      Bosatte 2024  23154
4   3240      Bosatte 2025  23599
5   3242      Bosatte 2024   1174
6   3242      Bosatte 2025   1217

To return a single data frame with labels only, use the function get_api_data_1(). The function get_api_data_2() returns codes only. To return a data frame containing both labels and codes, use get_api_data_12().

The output from get_api_data() is identical to the output from api_data(), which is described in a later section. As shown there, the functions info() and note() can be used to display additional information.


query_url(): generate a URL from specifications

The function query_url() can be used to generate a data URL. The URL used in the example above can be generated as follows:

query_url("https://data.ssb.no/api/pxwebapi/v2/tables/04861/data?lang=en", 
          Region = c("0301", "324*"), 
          ContentsCode = "???????", 
          Tid = "top(2)")
[1] "https://data.ssb.no/api/pxwebapi/v2/tables/04861/data?lang=en&valueCodes[Region]=0301,324*&valueCodes[ContentsCode]=???????&valueCodes[Tid]=top(2)"

The function query_url() accepts the same input as api_data(), so every form of specification described in the section on api_data() below can also be used to generate a URL.
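The layout of the generated URL can be illustrated with a few lines of base R. This is only a sketch of the string format seen in the output above, not the implementation of query_url(). The helper build_url_sketch() is a hypothetical name introduced here, and it handles character specifications only.

```r
# Hypothetical helper (not part of PxWebApiData): reproduces the URL layout only.
build_url_sketch <- function(base_url, ...) {
  specs <- list(...)
  # One "&valueCodes[Variable]=code1,code2" piece per named argument
  parts <- vapply(names(specs), function(v) {
    paste0("&valueCodes[", v, "]=", paste(specs[[v]], collapse = ","))
  }, character(1))
  paste0(base_url, paste(parts, collapse = ""))
}

url_sketch <- build_url_sketch(
  "https://data.ssb.no/api/pxwebapi/v2/tables/04861/data?lang=en",
  Region = c("0301", "324*"),
  ContentsCode = "???????",
  Tid = "top(2)"
)
print(url_sketch)
```

The resulting string has the same form as the URL printed by query_url() above: multiple codes for one variable are joined by commas, and each variable contributes one valueCodes[...] parameter.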


api_data(): specify and retrieve data in one step

Specification by codes, *, ?, and top(n)

The dataset considered here has three variables: Region, ContentsCode, and Tid. These variables can be used as input parameters.

Each variable can be specified using codes corresponding to the coding used in PxWebApi URL queries.

Codes can be specified directly. It is also possible to truncate codes using an asterisk (*) or to mask individual characters using a question mark (?). In the example below, seven characters are masked.

In this example, top(2) applied to Tid selects the two most recent years, 2024 and 2025, as the output below shows.
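The meaning of * and ? can be tried out locally with base R. The sketch below assumes that the patterns behave like ordinary glob wildcards and uses utils::glob2rx() to mimic them; the actual matching is performed by the API, and the code vectors are small hand-picked samples from this table.

```r
# Sample codes from table 04861 (a hand-picked subset, for illustration only)
region_codes   <- c("0301", "3203", "3240", "3242")
contents_codes <- c("Areal", "Bosatte")

# "324*": any code starting with 324
region_match <- region_codes[grepl(utils::glob2rx("324*"), region_codes)]

# "???????": any code with exactly seven characters
contents_match <- contents_codes[grepl(utils::glob2rx("???????"), contents_codes)]

print(region_match)
print(contents_match)
```

Under this assumption, "324*" picks 3240 and 3242, and "???????" picks Bosatte but not Areal, which agrees with the codes returned in the example output.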

api_data("https://data.ssb.no/api/pxwebapi/v2/tables/04861/data?lang=en", 
         Region = c("0301", "324*"), 
         ContentsCode = "???????", 
         Tid = "top(2)")
$`04861: Area and population of urban settlements, by region, contents and year`
         region            contents year  value
1 Oslo - Oslove Number of residents 2024 714630
2 Oslo - Oslove Number of residents 2025 720631
3      Eidsvoll Number of residents 2024  23154
4      Eidsvoll Number of residents 2025  23599
5        Hurdal Number of residents 2024   1174
6        Hurdal Number of residents 2025   1217

$dataset
  Region ContentsCode  Tid  value
1   0301      Bosatte 2024 714630
2   0301      Bosatte 2025 720631
3   3240      Bosatte 2024  23154
4   3240      Bosatte 2025  23599
5   3242      Bosatte 2024   1174
6   3242      Bosatte 2025   1217

A list of two data frames is returned: one with labels and one with codes.

To return a single data frame with labels only, use the function api_data_1(). The function api_data_2() returns codes only. To return a data frame containing both labels and codes, use api_data_12().
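How the three variants relate can be sketched with two small hand-built data frames. The values are copied from the output above; the cbind() step is an assumption about the layout of the combined result, not the package's internal code.

```r
# Label version and code version of the same two observations
labels <- data.frame(
  region   = c("Oslo - Oslove", "Eidsvoll"),
  contents = "Number of residents",
  year     = c("2025", "2025"),
  value    = c(720631, 23599)
)
codes <- data.frame(
  Region       = c("0301", "3240"),
  ContentsCode = "Bosatte",
  Tid          = c("2025", "2025"),
  value        = c(720631, 23599)
)

# Combined layout: label columns first, then code columns, with a single value column
both <- cbind(labels[names(labels) != "value"], codes)
print(both)
```

The combined data frame has the columns region, contents, year, Region, ContentsCode, Tid, and value, which is the column order shown by api_data_12() later in this vignette.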

Internally, a data URL is first constructed and the data are then retrieved using the function get_api_data().

To obtain the generated URL, replace api_data() with query_url(). The URL for this example has already been generated using query_url() in the example above.

Specification using (default) indexing

Numeric values are interpreted as indexing, either as row numbers in the metadata (as in the example below) or as values of the index variable in the metadata. See the parameter use_index for further details.

As specified by the parameter default_query, unspecified variables are set to c(1, -2, -1). In the example below, Tid is unspecified, which therefore corresponds to the first and the two last years.

api_data_12("https://data.ssb.no/api/pxwebapi/v2/tables/04861/data?lang=en", 
           Region = 14:17, 
           ContentsCode = 2)
      region            contents year Region ContentsCode  Tid value
1      Asker Number of residents 2000   3203      Bosatte 2000     0
2      Asker Number of residents 2024   3203      Bosatte 2024 93769
3      Asker Number of residents 2025   3203      Bosatte 2025 95455
4 Lillestrøm Number of residents 2000   3205      Bosatte 2000     0
5 Lillestrøm Number of residents 2024   3205      Bosatte 2024 87596
6 Lillestrøm Number of residents 2025   3205      Bosatte 2025 89143
 [ reached 'max' / getOption("max.print") -- omitted 6 rows ]
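Negative numbers in the default query count from the end, which differs from ordinary R indexing, where negative numbers exclude elements. A minimal sketch of this interpretation follows. The helper resolve_index_sketch() is hypothetical and the vector of years is a shortened illustration, not the full Tid metadata.

```r
# Hypothetical helper: convert "count from the end" indices to ordinary row numbers
resolve_index_sketch <- function(idx, n) {
  ifelse(idx < 0, n + 1 + idx, idx)
}

years  <- c("2000", "2002", "2003", "2023", "2024", "2025")  # shortened illustration
rows   <- resolve_index_sketch(c(1, -2, -1), length(years))
picked <- years[rows]
print(picked)
```

With c(1, -2, -1), the first and the two last elements are selected, matching the years 2000, 2024, and 2025 in the output above.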

Specification using TRUE, FALSE, imaginary values (e.g. 3i), and labels

TRUE selects all possible values and is equivalent to "*". FALSE eliminates the variable. Imaginary values represent top; for example, 3i is equivalent to "top(3)".

api_data_2("https://data.ssb.no/api/pxwebapi/v2/tables/04861/data?lang=en", 
          Region = FALSE, 
          ContentsCode = TRUE, 
          Tid = 3i)
  ContentsCode  Tid      value
1        Areal 2023    2266.99
2        Areal 2024    2279.97
3        Areal 2025    2285.89
4      Bosatte 2023 4554562.00
5      Bosatte 2024 4619969.00
6      Bosatte 2025 4662945.00
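The equivalences stated above can be written down as a small translation function. This is a sketch under the assumption that only the two documented equivalences matter (TRUE as "*" and an imaginary number as top(n)); to_value_codes_sketch() is a hypothetical name, and FALSE and numeric indexing are deliberately left out.

```r
# Hypothetical helper: translate an R specification into a valueCodes string
to_value_codes_sketch <- function(x) {
  if (isTRUE(x)) return("*")                               # TRUE: all values
  if (is.complex(x)) return(paste0("top(", Im(x), ")"))    # 3i: top(3)
  paste(x, collapse = ",")                                 # codes are passed through
}

all_values <- to_value_codes_sketch(TRUE)
top_three  <- to_value_codes_sketch(3i)
two_codes  <- to_value_codes_sketch(c("0301", "324*"))
print(c(all_values, top_three, two_codes))
```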

Labels can also be used as an alternative to codes.

obj <- api_data("https://data.ssb.no/api/pxwebapi/v2/tables/04861/data?lang=en", 
                Region = c("Asker", "Hurdal"), 
                ContentsCode = TRUE, 
                Tid = 2i)

Either the label version or the code version can then be shown:

obj[[1]]
  region                        contents year    value
1  Asker Area of urban settlements (km²) 2024    51.89
2  Asker Area of urban settlements (km²) 2025    52.00
3  Asker             Number of residents 2024 93769.00
4  Asker             Number of residents 2025 95455.00
5 Hurdal Area of urban settlements (km²) 2024     1.19
6 Hurdal Area of urban settlements (km²) 2025     1.20
7 Hurdal             Number of residents 2024  1174.00
8 Hurdal             Number of residents 2025  1217.00
obj[[2]]
  Region ContentsCode  Tid    value
1   3203        Areal 2024    51.89
2   3203        Areal 2025    52.00
3   3203      Bosatte 2024 93769.00
4   3203      Bosatte 2025 95455.00
5   3242        Areal 2024     1.19
6   3242        Areal 2025     1.20
7   3242      Bosatte 2024  1174.00
8   3242      Bosatte 2025  1217.00

Use default_query = TRUE to retrieve entire tables

out <- api_data_2("https://data.ssb.no/api/pxwebapi/v2/tables/10172/data?lang=en", 
                   default_query = TRUE)
out[14:20, ]  # 7 rows printed
   Vekst ContentsCode  Tid value NAstatus
14    50  VarigKultur 2015    NA        :
15    50  VarigKultur 2020    NA        .
16    51 Veksthusbedr 2012    NA       ..
17    51 Veksthusbedr 2015    94     <NA>
18    51 Veksthusbedr 2020    89     <NA>
19    51  BedrMedBiol 2012    NA       ..
20    51  BedrMedBiol 2015    67     <NA>

In this case, the NAstatus variable is included. See the api_data() parameter make_na_status.
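The status symbols can be translated into readable text with a named vector. The sketch below uses a few rows copied from the output above and shortened versions of the explanations that note() reports for this table; the column name NAreason is an invention for this illustration.

```r
# A few rows copied from the output above
out_sketch <- data.frame(
  Vekst        = c("50", "50", "51", "51"),
  ContentsCode = c("VarigKultur", "VarigKultur", "Veksthusbedr", "Veksthusbedr"),
  Tid          = c("2015", "2020", "2012", "2015"),
  value        = c(NA, NA, NA, 94),
  NAstatus     = c(":", ".", "..", NA)
)

# Shortened explanations of the status symbols
status_text <- c(
  "."  = "Category not applicable",
  ":"  = "Confidential",
  ".." = "Data not available"
)

# Look up each symbol; rows with an actual value keep NA
out_sketch$NAreason <- unname(status_text[out_sketch$NAstatus])
print(out_sketch)
```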

Show additional information

Use info() and note() (or comment()) to list additional dataset information.

info(obj)
                                                                          label
"04861: Area and population of urban settlements, by region, contents and year"
                                                                         source
                                                            "Statistics Norway"
                                                                        updated
                                                         "2025-10-27T07:00:00Z"
                                                                        tableid
                                                                        "04861"
                                                                       contents
                             "04861: Area and population of urban settlements,"
note(obj)
                                                                                                    [1]
                                  "Not included persons lacking information on type of residence area."
                                                                                                    [2]
"As of 1. January 2013 Statistics Norway implemented a new method for defining urban settlements, resulting in a more accurate delimitation. Due to this figures before and after 2013 are not directly comparable."
                                                                                                    [3]
                                                                                "Year 2010 is missing."
                                                                                                    [4]
"[See list over changes in regional classifications](https://www.ssb.no/offentlig-sektor/kommunekatalog/endringer-i-de-regionale-inndelingene) (in Norwegian).
"

Use note() for explanation of NA status codes

note(out)
                                                                                                    [1]
". = Category not applicable. Figures do not exist at this time, because the category was not in use when the figures were collected."
                                                                                                    [2]
         ": = Confidential. Figures are not published so as to avoid identifying persons or companies."
                                                                                                    [3]
".. = Data not available. Figures have not been entered into our databases or are too unreliable to be published."

Specification by list() for advanced queries

Advanced queries can be specified using named lists, where the names correspond to the encoding used in PxWebApi URL queries.

api_data_2("https://data.ssb.no/api/pxwebapi/v2/tables/07459/data?lang=en",
           Region = list(codelist = "agg_KommSummer", 
                         valueCodes = c("K-3101", "K-3103"), 
                         outputValues = "aggregated"),
           Kjonn = TRUE,
           Alder = list(codelist = "agg_TodeltGrupperingB", 
                        valueCodes = c("H17", "H18"),
                        outputValues = "aggregated"),
           ContentsCode = 1,
           Tid = 2i)
  Region Kjonn Alder ContentsCode  Tid value
1 K-3101     2   H17    Personer1 2024  2944
2 K-3101     2   H17    Personer1 2025  2926
3 K-3101     2   H18    Personer1 2024 12843
4 K-3101     2   H18    Personer1 2025 12954
5 K-3101     1   H17    Personer1 2024  3078
6 K-3101     1   H17    Personer1 2025  3021
7 K-3101     1   H18    Personer1 2024 13070
 [ reached 'max' / getOption("max.print") -- omitted 9 rows ]

In this case, the generated URL is:

url <- query_url("https://data.ssb.no/api/pxwebapi/v2/tables/07459/data?lang=en",
                 Region = list(codelist = "agg_KommSummer", 
                               valueCodes = c("K-3101", "K-3103"), 
                               outputValues = "aggregated"),
                 Kjonn = TRUE,
                 Alder = list(codelist = "agg_TodeltGrupperingB", 
                              valueCodes = c("H17", "H18"),
                              outputValues = "aggregated"),
                 ContentsCode = 1,
                 Tid = 2i)
cat(gsub("&", "\n&", url))
https://data.ssb.no/api/pxwebapi/v2/tables/07459/data?lang=en
&codelist[Region]=agg_KommSummer
&valueCodes[Region]=K-3101,K-3103
&outputValues[Region]=aggregated
&valueCodes[Kjonn]=*
&codelist[Alder]=agg_TodeltGrupperingB
&valueCodes[Alder]=H17,H18
&outputValues[Alder]=aggregated
&valueCodes[ContentsCode]=Personer1
&valueCodes[Tid]=top(2)

To improve readability, cat() together with gsub() is used to print the long URL across multiple lines.

This query is constructed using information available in the metadata; see the section below.
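The correspondence between a named list and the URL parameters can be sketched in base R. The helper expand_list_sketch() is hypothetical and only reproduces the string pattern visible in the generated URL above; it is not how the package builds the query.

```r
# Hypothetical helper: each list element becomes "name[Variable]=values"
expand_list_sketch <- function(variable, spec) {
  vapply(names(spec), function(key) {
    paste0(key, "[", variable, "]=", paste(spec[[key]], collapse = ","))
  }, character(1), USE.NAMES = FALSE)
}

region_parts <- expand_list_sketch(
  "Region",
  list(codelist     = "agg_KommSummer",
       valueCodes   = c("K-3101", "K-3103"),
       outputValues = "aggregated")
)
cat(paste0("&", region_parts), sep = "\n")
```

The three pieces produced for Region have the same form as the codelist, valueCodes, and outputValues lines in the URL printed above.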


Obtaining metadata

meta_frames()

Metadata for a dataset can be obtained using meta_frames().

mf <- meta_frames("https://data.ssb.no/api/pxwebapi/v2/tables/04861/data?lang=en")
print(mf)
$Region
   code index           label
1  3101     0          Halden
2  3103     1            Moss
3  3105     2       Sarpsborg
4  3107     3     Fredrikstad
5  3110     4          Hvaler
6  3112     5            Råde
7  3114     6 Våler (Østfold)
8  3116     7        Skiptvet
9  3118     8   Indre Østfold
10 3120     9       Rakkestad
11 3122    10          Marker
12 3124    11         Aremark
13 3201    12           Bærum
14 3203    13           Asker
 [ reached 'max' / getOption("max.print") -- omitted 938 rows ]

$ContentsCode
     code index                           label unit.base unit.decimals
1   Areal     0 Area of urban settlements (km²)       km²             2
2 Bosatte     1             Number of residents   persons             0

$Tid
   code index label
1  2000     0  2000
2  2002     1  2002
3  2003     2  2003
4  2004     3  2004
5  2005     4  2005
6  2006     5  2006
7  2007     6  2007
8  2008     7  2008
9  2009     8  2009
10 2011     9  2011
11 2012    10  2012
12 2013    11  2013
13 2014    12  2014
14 2015    13  2015
 [ reached 'max' / getOption("max.print") -- omitted 10 rows ]

Information about whether variables can be eliminated is stored as an attribute and can be retrieved for all variables at once:

sapply(mf, attr, "elimination") # elimination info for all variables
      Region ContentsCode          Tid 
        TRUE        FALSE        FALSE 

Code list information is stored as a data frame in another attribute:

attr(mf[["Region"]], "code_lists")
                     id                                          label
1        agg_KommFylker          Counties 2024, aggregated time series
2       agg_KommForrige                       Municipalities 2020-2023
3        agg_KommSummer    Municipalities 2024, aggregated time series
4    agg_LandsdelKommun                                   Regions 2025
5  agg_Politidistrikt16                           Police district 2016
6       agg_RegionerBVF Regions (Child welfare and family counselling)
7    agg_SentralIndeksA       Centrality (can not be used before 1977)
8  agg_OkonomRegion2020                          Economic regions 2024
9      agg_Valgdistrikt                            Electoral districts
10            vs_Kommun                             All municipalities
          type
1  Aggregation
2  Aggregation
3  Aggregation
4  Aggregation
5  Aggregation
6  Aggregation
7  Aggregation
8  Aggregation
9  Aggregation
10    Valueset
                                                                        links
1        https://data.ssb.no/api/pxwebapi/v2/codeLists/agg_KommFylker?lang=en
2       https://data.ssb.no/api/pxwebapi/v2/codeLists/agg_KommForrige?lang=en
3        https://data.ssb.no/api/pxwebapi/v2/codeLists/agg_KommSummer?lang=en
4    https://data.ssb.no/api/pxwebapi/v2/codeLists/agg_LandsdelKommun?lang=en
5  https://data.ssb.no/api/pxwebapi/v2/codeLists/agg_Politidistrikt16?lang=en
6       https://data.ssb.no/api/pxwebapi/v2/codeLists/agg_RegionerBVF?lang=en
7    https://data.ssb.no/api/pxwebapi/v2/codeLists/agg_SentralIndeksA?lang=en
8  https://data.ssb.no/api/pxwebapi/v2/codeLists/agg_OkonomRegion2020?lang=en
9      https://data.ssb.no/api/pxwebapi/v2/codeLists/agg_Valgdistrikt?lang=en
10            https://data.ssb.no/api/pxwebapi/v2/codeLists/vs_Kommun?lang=en
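Because the code list information is an ordinary data frame, candidate values for the codelist element of an advanced query can be found with standard subsetting. The sketch below works on a hand-copied subset of the table above rather than on a live attr() call, so it runs without network access.

```r
# Hand-copied subset of the code_lists attribute shown above
code_lists <- data.frame(
  id    = c("agg_KommFylker", "agg_KommSummer", "vs_Kommun"),
  label = c("Counties 2024, aggregated time series",
            "Municipalities 2024, aggregated time series",
            "All municipalities"),
  type  = c("Aggregation", "Aggregation", "Valueset")
)

# Ids of aggregation code lists
agg_ids <- code_lists$id[code_lists$type == "Aggregation"]
print(agg_ids)

# Ids whose label mentions municipalities (case-insensitive)
muni_ids <- code_lists$id[grepl("municipalities", code_lists$label, ignore.case = TRUE)]
print(muni_ids)
```

The id agg_KommSummer found this way is the one used as codelist for Region in the advanced query example earlier in this vignette.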

meta_code_list()

Metadata for code lists referenced in this output can be retrieved using meta_code_list().

meta_data()

To download raw metadata without further processing, use meta_data().

Note that it does not matter whether the input URL refers to data or metadata; this is handled automatically.


Eurostat data

The Eurostat REST API offers JSON-stat version 2. This package can therefore be used to obtain data from Eurostat with get_api_data() or the related functions get_api_data_1(), get_api_data_2(), and get_api_data_12().

The example below retrieves the all-items HICP (moving 12-month average rate of change) for the latest two periods for the EU and Norway. See the Eurostat guidelines for more information.

url_eurostat <- paste0(   # Here the long url is split into several lines using paste0 
  "https://ec.europa.eu/eurostat/api/dissemination/statistics/1.0/data/prc_hicp_mv12r", 
  "?format=JSON&lang=EN&lastTimePeriod=2&coicop=CP00&geo=NO&geo=EU")
url_eurostat
[1] "https://ec.europa.eu/eurostat/api/dissemination/statistics/1.0/data/prc_hicp_mv12r?format=JSON&lang=EN&lastTimePeriod=2&coicop=CP00&geo=NO&geo=EU"
get_api_data_12(url_eurostat)
No encoding supplied: defaulting to UTF-8.
  Time frequency                         Unit of measure
1        Monthly Moving 12 months average rate of change
2        Monthly Moving 12 months average rate of change
3        Monthly Moving 12 months average rate of change
4        Monthly Moving 12 months average rate of change
  Classification of individual consumption by purpose (COICOP)
1                                               All-items HICP
2                                               All-items HICP
3                                               All-items HICP
4                                               All-items HICP
                                                                                   Geopolitical entity (reporting)
1 European Union (EU6-1958, EU9-1973, EU10-1981, EU12-1986, EU15-1995, EU25-2004, EU27-2007, EU28-2013, EU27-2020)
2 European Union (EU6-1958, EU9-1973, EU10-1981, EU12-1986, EU15-1995, EU25-2004, EU27-2007, EU28-2013, EU27-2020)
3                                                                                                           Norway
4                                                                                                           Norway
     Time freq         unit coicop geo    time value
1 2025-11    M RCH_MV12MAVR   CP00  EU 2025-11   2.5
2 2025-12    M RCH_MV12MAVR   CP00  EU 2025-12   2.5
3 2025-11    M RCH_MV12MAVR   CP00  NO 2025-11   2.7
4 2025-12    M RCH_MV12MAVR   CP00  NO 2025-12   2.8
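A Eurostat URL of this form can also be assembled from a named vector of query parameters, which makes it easier to change a filter such as geo. This is a base R sketch of the string construction only. It uses the same parameters as the URL above and makes no claim about other parameters of the Eurostat API.

```r
base_eurostat <- "https://ec.europa.eu/eurostat/api/dissemination/statistics/1.0/data/prc_hicp_mv12r"

# Repeated names are allowed, so geo can be given once per area
params <- c(format = "JSON", lang = "EN", lastTimePeriod = "2",
            coicop = "CP00", geo = "NO", geo = "EU")

# Join as name=value pairs separated by "&"
url_eurostat_sketch <- paste0(
  base_eurostat, "?",
  paste0(names(params), "=", params, collapse = "&")
)
print(url_eurostat_sketch)
```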


Background

PxWeb and its API, PxWebApi, are used as the output database (Statbank) by many statistical agencies in the Nordic countries and elsewhere, e.g. Statistics Norway, Statistics Finland, and Statistics Sweden. See list of installations.

For hints on using PxWebApi v2 in general, see the PxWebApi v2 User Guide.