Register paths and datastores

Confirmed paths and access methods for all registers on project 708421

Published

June 6, 2026

Warning

Check the modification date on cleaned-data before running the pipeline.

file.info("E:/workdata/708421/cleaned-data/parquet-registers/")$mtime

The registers are not necessarily current. Confirm that their coverage matches your study period.
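The date check above can be wrapped in a small guard at the top of a pipeline script. This is a minimal sketch, not part of the project code: `is_fresh()` and the 30-day threshold are hypothetical, and a temporary directory stands in for the cleaned-data folder so the example runs anywhere.

```r
# Hypothetical helper: TRUE if `path` was modified within `max_age_days` of `now`.
is_fresh <- function(path, max_age_days = 30, now = Sys.time()) {
  mtime <- file.info(path)$mtime            # NA if the path does not exist
  if (is.na(mtime)) stop("Path not found: ", path)
  age_days <- as.numeric(difftime(now, mtime, units = "days"))
  age_days <= max_age_days
}

# Demonstration on a temporary directory (stand-in for parquet-registers/)
demo_dir <- tempfile("parquet-registers-")
dir.create(demo_dir)

fresh_now   <- is_fresh(demo_dir, max_age_days = 30)
# Pretend the check runs 60 days from now
fresh_later <- is_fresh(demo_dir, max_age_days = 30,
                        now = Sys.time() + 60 * 24 * 3600)
```

On most file systems a directory's modification time changes only when entries are added or removed, so checking individual files inside the folder may be a more reliable signal.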


Base paths

# All paths used as constants at the top of scripts
path_parquet_reg  <- "E:/workdata/708421/cleaned-data/parquet-registers/"
path_parquet_ext  <- "E:/workdata/708421/cleaned-data/parquet-external/"
path_dm_pop       <- "E:/workdata/708421/cleaned-data/diabetes-register-pop/dm_population_1977_2022.rds"
path_output       <- "E:/workdata/708421/workspaces/[yourName]/BS_demens/datasets/"

Overview: all registers on project 708421

All column names were confirmed via colnames() on the DST server on 2026-05-15. As of 2026, most registers are updated through the end of 2024 (source: Anders Aasted Isaksen/DARTER team). Column names are shown as they appear after rename_with(tolower).

| Register | Access | Join key | Period | Critical column |
|---|---|---|---|---|
| BEF | load_database("bef") | pnr | All years | koen, foed_dag, familie_id |
| DODSAARS | load_database("dodsaars") | pnr | ~1970–2001 | d_dodsdto (death date) |
| DOD | not in cleaned-data | pnr | ~2001–2024 | doddato (see extraction guide) |
| VNDS | load_database("vnds") | pnr | All years | indud_kode, haend_dato |
| LPR2 contacts | load_database("lpr_adm") | recnum | Up to March 2019 | d_inddto, c_pattype |
| LPR2 diagnoses | load_database("lpr_diag") | recnum | Up to March 2019 | c_diag, c_diagtype |
| LPR2 psych contacts | load_database("t_psyk_adm") | k_recnum → recnum | 1995–March 2019 | v_cpr → pnr |
| LPR2 psych diagnoses | load_database("t_psyk_diag") | v_recnum → recnum | 1995–March 2019 | c_diag, c_diagtype |
| LPR3 contacts | load_database("lpr_a_kontakt") | dw_ek_kontakt | March 2019+ | kont_starttidspunkt (datetime) |
| LPR3 diagnoses | load_database("lpr_a_diagnose") | dw_ek_kontakt | March 2019+ | diag_kode, diag_kode_type, senere_afkraeftet |
| LPR3 procedures | load_database("procedurer_kirurgi") | dw_ek_forloeb | 2019+ | procedurekode, dato_start |
| LPR2 procedures | load_database("lpr_sksopr") | recnum | 1996–2018 | c_opr, d_odto |
| LMDB | load_database("lmdb") | pnr | Approx. 1994+ | atc, eksd |
| UDDA | load_database("udda") | pnr | All years | hfaudd, aar |
| FAIK | load_database("faik") | familie_id | All years | famaekvivadisp_13 |
| AKM | load_database("akm") | pnr | All years | socio13, aar |
| DBSO | load_database("dbso") | pnr | 2010+ | datoper_prim, surgery flags |
| OSDC | readRDS(path_dm_pop) | PNR → rename to pnr | 1977–2022 | diabetes_type, do_dm |
| Laboratory results | load_database("laboratorieproevesvar_") | pnr | Approx. 1994+ | npu, samplingdato, samplevalue (character) |
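To illustrate how the join-key column is meant to be read, here is a minimal sketch that attaches LPR2 diagnoses to their contacts via recnum. The data frames are toy stand-ins for load_database("lpr_adm") and load_database("lpr_diag"); all values are made up, and the presence of pnr in lpr_adm is an assumption not confirmed by the overview.

```r
library(dplyr)

# Toy stand-in for the LPR2 contact table (values are illustrative only)
lpr_adm <- data.frame(
  recnum    = c(1, 2),
  pnr       = c("A", "B"),                       # assumed to live on the contact table
  d_inddto  = as.Date(c("2010-05-01", "2015-09-12")),
  c_pattype = c("0", "2")
)

# Toy stand-in for the LPR2 diagnosis table: several diagnoses per contact
lpr_diag <- data.frame(
  recnum     = c(1, 1, 2),
  c_diag     = c("DIAG_X", "DIAG_Y", "DIAG_X"),  # placeholder codes
  c_diagtype = c("A", "B", "A")
)

# Each diagnosis row picks up its contact's pnr and admission date via recnum
diag_with_contact <- lpr_diag %>%
  inner_join(lpr_adm, by = "recnum")
```

The result has one row per diagnosis, so a contact with two diagnoses appears twice; that is expected for a diagnosis-level table.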

Critical notes

DODSAARS and DOD: dodsaars is in cleaned-data but covers only ~1970–2001. For post-2001 deaths, extraction from the raw SAS file is required; contact your data manager. See pitfall 1.
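Once both death sources are available, they can be stacked into one table with a common date column. The following is a minimal sketch with toy data; the name death_date, the source label, and the rule of keeping the earliest date per person are assumptions, and the extraction guide may prescribe something different.

```r
library(dplyr)

# Toy stand-ins: dodsaars from cleaned-data, dod from the raw SAS extraction
dodsaars <- data.frame(
  pnr       = c("A", "B"),
  d_dodsdto = as.Date(c("1995-03-01", "2001-11-20"))
)
dod <- data.frame(
  pnr     = c("B", "C"),                          # "B" appears in both sources
  doddato = as.Date(c("2001-11-20", "2015-07-04"))
)

deaths <- bind_rows(
  dodsaars %>% transmute(pnr, death_date = d_dodsdto, source = "dodsaars"),
  dod      %>% transmute(pnr, death_date = doddato,   source = "dod")
) %>%
  group_by(pnr) %>%
  slice_min(death_date, n = 1, with_ties = FALSE) %>%  # keep one row per person
  ungroup()
```

Because the two periods meet around 2001, the same person may appear in both sources, which is why the sketch collapses to one row per pnr.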

LPR3 (duplicate risk): lpr_a_kontakt and lpr_a_diagnose contain data from two formats (LPR_F and LPR_A). Always filter on lprindberetningssystem == "LPR3". See pitfall 5.
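A minimal sketch of that filter on toy data follows. The value "OTHER" is a hypothetical placeholder for whatever the non-LPR3 rows carry in lprindberetningssystem; only "LPR3" is taken from the note above.

```r
library(dplyr)

# Toy stand-in for load_database("lpr_a_kontakt") %>% rename_with(tolower)
kontakt <- data.frame(
  dw_ek_kontakt          = c(1, 1, 2),
  pnr                    = c("A", "A", "B"),
  lprindberetningssystem = c("LPR3", "OTHER", "LPR3")  # "OTHER" is a placeholder
)

# Keep only rows reported through LPR3, removing the duplicate for contact 1
kontakt_lpr3 <- kontakt %>%
  filter(lprindberetningssystem == "LPR3")
```

On the real lazy tables the same filter() call would go before collect(), so the duplicate rows are never pulled into memory.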

Laboratory results (use only one source): laboratorieproevesvar_ replaces lab_forsker/lab_dm_forsker. The old files still exist; use only one source to avoid duplicates. See pitfall 6.
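Whichever laboratory source is used, the overview lists samplevalue as a character column, so it needs converting before numeric comparisons. The helper below is a hypothetical sketch: `parse_samplevalue()` is not a project function, and the input formats (decimal commas, censored values such as "<5", text results) are assumptions about what the column might contain.

```r
# Hypothetical helper: convert character lab values to numeric.
# Anything that is not a plain number becomes NA instead of raising a warning.
parse_samplevalue <- function(x) {
  x <- trimws(x)                       # drop surrounding whitespace
  x <- sub(",", ".", x, fixed = TRUE)  # assumed: decimal comma may occur
  suppressWarnings(as.numeric(x))
}

vals <- parse_samplevalue(c("48", " 6,5", "<5", "NEG"))
```

Counting the NA values after conversion gives a quick indication of how many results need separate handling.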

procedurer_kirurgi: dw_ek_kontakt is NA for all rows on DST. Join to lpr_a_kontakt via dw_ek_forloeb to fetch pnr.

DBSO: The identifier column is cpr in the raw parquet; it is renamed to pnr by 00_prepare_dbso.R. All code uses pnr after that.

OSDC: PNR is uppercase in the raw file; rename it with rename(pnr = PNR) after loading.


Loading templates

library(dstDataPrep)   # load_database(): access to DST registers
library(dplyr)         # rename_with, rename, left_join, select

# Standard register (parquet via load_database):
bef <- load_database("bef") %>% rename_with(tolower)   # lazy connection; lowercase columns

# Psychiatric LPR2 (requires renaming v_cpr and k_recnum):
psyk_adm <- load_database("t_psyk_adm") %>%
  rename_with(tolower) %>%                          # lowercase columns
  rename(pnr = v_cpr, recnum = k_recnum)            # v_cpr → pnr; k_recnum → recnum

# DBSO (parquet-external, converted from SAS via 00_prepare_dbso.R):
dbso <- load_database("dbso") %>% rename_with(tolower)   # lazy connection

# OSDC (RDS file with pre-computed diabetes classification):
dm_pop <- readRDS(path_dm_pop) %>% rename(pnr = PNR)   # PNR is uppercase in the raw file; rename

# LPR3 procedures: join via dw_ek_forloeb (NOT dw_ek_kontakt, which is NA for all rows):
proc <- load_database("procedurer_kirurgi") %>%
  rename_with(tolower) %>%                          # lowercase columns
  left_join(
    load_database("lpr_a_kontakt") %>%
      rename_with(tolower) %>%
      distinct(dw_ek_forloeb, pnr),                # fetch pnr via the forloeb key; one row per forloeb/pnr
    by = "dw_ek_forloeb"                            # join key; dw_ek_kontakt does not work
  )
# distinct() keeps a forloeb with several contacts from duplicating procedure rows
# proc is still lazy: add filter() and collect() before use
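The last step, filtering and then collecting, can be sketched on toy data frames. This is an illustration under assumptions: the procedure codes are placeholders, the date cutoff is arbitrary, and plain data frames stand in for the lazy tables (collect() returns a data frame unchanged). It also shows why collapsing the contact table to one row per forloeb matters when a forloeb has several contacts.

```r
library(dplyr)

# Toy stand-in for procedurer_kirurgi (placeholder codes)
procedures <- data.frame(
  dw_ek_forloeb = c(10, 10, 20),
  procedurekode = c("CODE_A", "CODE_B", "CODE_A"),
  dato_start    = as.Date(c("2020-01-15", "2021-06-01", "2022-03-10"))
)

# Toy stand-in for lpr_a_kontakt: forloeb 10 has two contacts
kontakt <- data.frame(
  dw_ek_forloeb = c(10, 10, 20),
  pnr           = c("A", "A", "B")
)

# Joining the raw contact table repeats each procedure once per contact
n_naive <- nrow(left_join(procedures, kontakt, by = "dw_ek_forloeb"))

proc <- procedures %>%
  left_join(distinct(kontakt, dw_ek_forloeb, pnr), by = "dw_ek_forloeb") %>%
  filter(dato_start >= as.Date("2021-01-01")) %>%   # restrict before collecting
  collect()                                         # materialise the result
```

Here n_naive counts five rows for three procedures, while the deduplicated join keeps one row per procedure before the date filter is applied.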

