statcodelists: use standard, language-independent variable codes to help international data interoperability and machine reuse in R
An R data package with all the SDMX standard codelists
The goal of statcodelists is to promote the reuse and exchange of statistical information and related metadata by making the internationally standardized SDMX code lists available to R users. SDMX (Statistical Data and Metadata eXchange) has been published as an ISO International Standard (ISO 17369). The metadata definitions, including the code lists, are updated regularly according to the standard. The authoritative version of the code lists made available in this package is at https://sdmx.org/?page_id=3215/.
Purpose
Cross-domain concepts in the SDMX framework describe concepts relevant to many, if not all, statistical domains. SDMX recommends using these concepts whenever feasible in SDMX structures and messages to promote the reuse and exchange of statistical information and related metadata between organisations.
Code lists are predefined sets of terms from which some statistical coded concepts take their values. SDMX cross-domain code lists are used to support cross-domain concepts. What are these cross-domain coded concepts?
- Geographical codes, like NL: the Netherlands in the CL_AREA code list.
- Standard industry codes, like J631 for Data processing, hosting and related activities (NACE Rev 2 in Europe; beware, it is J592 in Australia and New Zealand, see CL_ACTIVITY_ANZSIC06).
- Occupations, like OC2521 for Database designers and administrators in CL_OCCUPATIONS.
- Time formatting standards, like CCYY for annual data series in CL_TIME_FORMAT.
Check out the available code lists on the package homepage.
The use of common code lists helps users work more efficiently, easing the maintenance of, and reducing the need for, mapping systems and interfaces that deliver data and metadata to them. An obvious advantage of using the code systems is that you can retrieve data from national sources regardless of the natural language used in North Macedonia, Japan, the U.S. or the Netherlands. While the data labels may change to be locally human-readable, computers and geeks can read the codes and understand them immediately, provided that they use the standard codes.
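The language-independence point can be illustrated with a minimal sketch in base R. The two small data frames below are hypothetical stand-ins for releases from two agencies: the values, labels and column names are invented for illustration and do not come from any real source. Only the idea of a shared area code (such as NL) is taken from the text above.

```r
# Hypothetical releases from two agencies. The human-readable labels differ
# by language, but both use the same standard area codes.
release_a <- data.frame(
  ref_area = c("NL", "JP"),
  label    = c("Nederland", "Japan"),
  value    = c(10, 20),
  stringsAsFactors = FALSE
)
release_b <- data.frame(
  ref_area = c("JP", "NL"),
  label    = c("Japan", "Netherlands"),
  value    = c(21, 11),
  stringsAsFactors = FALSE
)

# Join on the code rather than on the label: "Nederland" and "Netherlands"
# would not match as text, but "NL" matches "NL".
merged <- merge(
  release_a[, c("ref_area", "value")],
  release_b[, c("ref_area", "value")],
  by = "ref_area",
  suffixes = c("_a", "_b")
)
print(merged)
```

Joining on the labels instead would silently drop the Netherlands row, which is the kind of mapping problem that shared codes are meant to avoid.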
Our data observatories are rolling out SDMX coding across all datasets to help data ingestion and interoperability, data findability and data reuse. statcodelists can help you use standard SDMX codes in your R workflow, both for downloading data from statistical agencies and for producing publication-ready datasets that the rest of the world (and even APIs) will understand.
Installation
You can install statcodelists from CRAN:
install.packages("statcodelists")
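Once the package is installed, a code list can be inspected like any other R dataset. The sketch below assumes that CL_AREA (named earlier in this README) is exported as a data object. It makes no assumption about the column names and only looks at the structure, so check the output before relying on any particular column.

```r
# Load one code list if the package is available; otherwise print a hint.
if (requireNamespace("statcodelists", quietly = TRUE)) {
  data("CL_AREA", package = "statcodelists")
  str(CL_AREA)          # inspect the columns before relying on any of them
  print(head(CL_AREA))  # first few codes in the list
} else {
  message("statcodelists is not installed; run install.packages(\"statcodelists\") first.")
}
```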
Further recommended code values for expressing general statistical concepts, like not applicable, can be found in the section Generic codes of the Guidelines for the creation and management of SDMX Cross-Domain Code Lists.
For further code lists used by reliable statistical agencies but not harmonized at the SDMX level, please consult the SDMX Global Registry Codelists page.
The creator of this package is not affiliated with SDMX, and this package has not been endorsed by SDMX.
Code of Conduct
Please note that the statcodelists project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.