DARTER: Project 708421

Project-specific guide to the BS & Dementia cohort study

Published

June 6, 2026

DARTER: Diabetes And inteRgenerational Transmission of hEalth determinants over the life couRse (project 708421).

This section is only for those working on the DARTER project. The content here builds on the general DST guide and adds the project-specific material.

Tip

Searchable variable and register overview for DARTER

All variables and registers applied for in the project are collected in a searchable table: steno-aarhus.github.io/darter-project

New to the project? Start with the general guide and return here:

→ Phase 1 - Plan your study
→ Phase 2 - R: the bare essentials
→ Phase 3 - Log in to DST

Tip

Practical introduction from the project itself

In Luke's folder on the server you will find a thorough guide to working with data on DST, written specifically for this project:

E:/workdata/708421/workspaces/luke/dstDataPrep/

Find the .qmd file in this folder. It walks through dstDataPrep, load_database() and the practical DARTER workflow step by step, with real examples from the project. It is good supplementary reading to this guide.

In this section

Page | Contents
This page | Setup (dstDataPrep + duckplyr) and a reusable LPR extraction function
Register paths and datastores | Confirmed paths and access methods for all registers on 708421
DARTER-specific pitfalls | Quirks specific to this project

Initial setup steps for DARTER

Two steps must be completed before you write code on DARTER:

Important

Step 1: Build dstDataPrep

dstDataPrep is the package that provides access to load_database() and all register data. It must be built manually, as the DST server resets installed packages.

  1. File → Open Project in New Session
  2. Navigate to E:/workdata/708421/workspaces/luke/dstDataPrep/dstDataPrep.Rproj
  3. Press Ctrl+Shift+B (Build), or on Mac use the menu: Build → Install Package
  4. Wait for "Done" and close the session

Repeat these steps whenever library(dstDataPrep) reports Error: there is no package called 'dstDataPrep'.

Important

Step 2: Reinstall duckplyr at the start of each session

install.packages("duckplyr")   # run before library(); the installation is lost at logout

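Because both packages disappear when the server resets, it can help to check what is missing before loading anything. The sketch below is only an illustration: `missing_packages()` is a hypothetical helper (not part of dstDataPrep or the general guide), and it only reports the problem; the rebuild and reinstall steps themselves are the ones described above.

```r
# Hypothetical helper: return the packages in `pkgs` whose namespace cannot be loaded.
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# Check the two packages from steps 1 and 2 and point to the relevant step.
missing <- missing_packages(c("dstDataPrep", "duckplyr"))
if ("dstDataPrep" %in% missing) {
  message("dstDataPrep is missing: rebuild it as described in step 1.")
}
if ("duckplyr" %in% missing) {
  message("duckplyr is missing: run install.packages(\"duckplyr\") as in step 2.")
}
```

Running this at the top of a script gives a readable message instead of a failing library() call further down.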
Tip

Alternative: use compute() for a DuckDB connection

load_database() returns an Arrow connection that supports only a subset of dplyr functions. If you need DuckDB-specific functionality or experience slow performance, you can pipe to compute():

lmdb <- load_database("lmdb") %>%
  filter(pnr %in% !!kohort$pnr) %>%   # filter in Arrow BEFORE compute() to reduce data
  compute()                             # convert to DuckDB connection

Always reduce data with filter()/select() before compute(). See osdc documentation for DuckDB configuration and memory limits.


Before running a script: verify that path_output at the top of each script points to your workspace folder.
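One way to make this check explicit is a small guard at the top of the script. This is a minimal sketch under assumptions: `is_inside_workspace()` is a hypothetical helper, and the `yourname` folder is a placeholder for your own workspace under E:/workdata/708421/workspaces/.

```r
# Hypothetical helper: TRUE if `path_output` is the workspace folder or lies below it.
is_inside_workspace <- function(path_output, workspace) {
  out <- normalizePath(path_output, winslash = "/", mustWork = FALSE)
  ws  <- normalizePath(workspace,   winslash = "/", mustWork = FALSE)
  ws  <- sub("/+$", "", ws)                      # drop trailing slashes
  # Compare against "ws/" so that ".../luke2" is not accepted for ".../luke"
  identical(out, ws) || startsWith(out, paste0(ws, "/"))
}

# Placeholder paths: replace "yourname" with your own workspace folder.
workspace   <- "E:/workdata/708421/workspaces/yourname"
path_output <- "E:/workdata/708421/workspaces/yourname/output"

if (!is_inside_workspace(path_output, workspace)) {
  warning("path_output is outside your workspace: ", path_output)
}
```

The comparison is a plain string prefix check on normalised paths, so it does not confirm that the folder exists; add dir.exists(path_output) if you also want that.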

Recommendation: create a helper function for LPR extractions

LPR extractions require combining LPR2 somatic, LPR2 psychiatric and LPR3 β€” and doing the same for each new outcome in the project. It pays off to encapsulate this in one reusable function rather than copying the code repeatedly.

Advantages:

  - One place to fix if something changes (e.g. a new register or a new column)
  - The code block for each outcome shrinks from ~40 lines to one function call
  - Errors can only be introduced in one place instead of in each copy

To create the function, define it at the top of your script or in a separate functions.R file:

See the full get_lpr_diagnoses() function
library(dstDataPrep)
library(dplyr)

get_lpr_diagnoses <- function(pnr_vector, diagtypes = c("A", "B"), inpatient_only = FALSE) {
  # Open registers
  lpr_adm   <- load_database("lpr_adm")   %>% rename_with(tolower)   # LPR2 somatic contacts
  lpr_diag  <- load_database("lpr_diag")  %>% rename_with(tolower)   # LPR2 somatic diagnoses
  psyk_adm  <- load_database("t_psyk_adm")  %>% rename_with(tolower) %>%
    rename(pnr = v_cpr, recnum = k_recnum)                            # LPR2 psychiatric contacts
  psyk_diag <- load_database("t_psyk_diag") %>% rename_with(tolower) %>%
    rename(recnum = v_recnum)                                          # LPR2 psychiatric diagnoses
  lpr3_k    <- load_database("lpr_a_kontakt")  %>% rename_with(tolower) %>%
    filter(lprindberetningssystem == "LPR3")                               # CRITICAL: avoid duplicated rows from LPR_F format
  lpr3_d    <- load_database("lpr_a_diagnose") %>% rename_with(tolower)  # LPR3 diagnoses

  # Filter on admission type if desired
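  if (inpatient_only) {
    lpr_adm <- lpr_adm %>% filter(c_pattype == "0")          # "0" = inpatient in LPR2
    lpr3_k  <- lpr3_k  %>% filter(kont_type == "ALCA00")     # "ALCA00" = physical attendance; LPR3 has no inpatient flag
    # NOTE: psyk_adm is not filtered here, so psychiatric contacts of all patient types remain
  }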

  # LPR2 somatic
  lpr2_dx <- lpr_adm %>%
    filter(pnr %in% !!pnr_vector) %>%
    select(pnr, recnum, date_contact = d_inddto) %>%
    inner_join(
      lpr_diag %>% filter(c_diagtype %in% !!diagtypes) %>% select(recnum, c_diag),
      by = "recnum"
    ) %>%
    collect() %>%
    mutate(icd3 = substr(c_diag, 2, 4))                       # strip D-prefix

  # LPR2 psychiatric
  lpr2_psyk_dx <- psyk_adm %>%
    filter(pnr %in% !!pnr_vector) %>%
    select(pnr, recnum, date_contact = d_inddto) %>%
    inner_join(
      psyk_diag %>% filter(c_diagtype %in% !!diagtypes) %>% select(recnum, c_diag),
      by = "recnum"
    ) %>%
    collect() %>%
    mutate(icd3 = substr(c_diag, 2, 4))

  # LPR3
  lpr3_dx <- lpr3_k %>%
    filter(pnr %in% !!pnr_vector) %>%
    select(pnr, dw_ek_kontakt, date_contact = kont_starttidspunkt) %>%
    inner_join(
      lpr3_d %>%
        filter(diag_kode_type %in% !!diagtypes,
               is.na(senere_afkraeftet) | senere_afkraeftet != "Ja") %>%
        select(dw_ek_kontakt, c_diag = diag_kode),
      by = "dw_ek_kontakt"
    ) %>%
    collect() %>%
    mutate(date_contact = as.Date(date_contact),               # datetime -> date
           icd3 = substr(c_diag, 2, 4))

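  # Return the combined table, keeping only the columns shared by all three sources
  bind_rows(
    lpr2_dx      %>% select(pnr, date_contact, c_diag, icd3),
    lpr2_psyk_dx %>% select(pnr, date_contact, c_diag, icd3),
    lpr3_dx      %>% select(pnr, date_contact, c_diag, icd3)
  )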
}
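The icd3 column in the function relies on every diagnosis code carrying the leading D that the comment mentions. The sketch below shows what substr(c_diag, 2, 4) does on a few made-up codes. `codes_without_d_prefix()` is a hypothetical helper for spotting codes that would be cut in the wrong place, not something the project provides.

```r
# Made-up example codes in the D-prefixed form the function expects.
codes <- c("DF001", "DG309", "DE119")

# Same expression as in get_lpr_diagnoses(): characters 2 to 4.
icd3 <- substr(codes, 2, 4)
icd3   # "F00" "G30" "E11"

# Hypothetical helper: list the distinct codes that do not start with "D".
# For such codes, substr(x, 2, 4) would drop the first letter of the real code.
codes_without_d_prefix <- function(x) {
  x <- x[!is.na(x)]
  unique(x[!startsWith(x, "D")])
}

codes_without_d_prefix(c("DF001", "F001", NA))   # "F001"
```

An empty result is necessary but not sufficient: an unprefixed code from the ICD-10 D chapter also starts with "D", so this check cannot detect that case.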
Use the function: one call per extraction, only change CODES
kohort     <- readRDS("datasets/full_cohort.rds")
pnr_list   <- unique(kohort$pnr)

# Fetch all diagnoses for the cohort (Phase 1, see hospital contacts page)
alle_dx <- get_lpr_diagnoses(
  pnr_vector    = pnr_list,
  diagtypes     = c("A", "B"),
  inpatient_only = FALSE
)
# Returns: pnr | date_contact | c_diag | icd3

# Extract one outcome: only change CODES (Phase 2)
CODES <- c("F00", "F01", "F02", "F03", "G30", "G31")   # dementia

dementia <- alle_dx %>%
  filter(icd3 %in% CODES) %>%
  inner_join(kohort %>% select(pnr, index_date), by = "pnr") %>%
  filter(date_contact > index_date) %>%
  group_by(pnr) %>% slice_min(date_contact, n = 1, with_ties = FALSE) %>% ungroup() %>%
  select(pnr, dementia_date = date_contact)

result <- kohort %>% select(pnr) %>% left_join(dementia, by = "pnr")
saveRDS(result, "datasets/extract_dementia.rds")
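To see what the outcome step returns without touching register data, the same logic can be run on a toy table. Everything below is made up for illustration (the pnr values, dates and codes are not DARTER data), and slice_min() is used here as one way to pick the earliest qualifying contact per person.

```r
library(dplyr)

# Toy cohort: three people with the same index date (made-up values).
kohort <- tibble(
  pnr        = c("A", "B", "C"),
  index_date = as.Date(c("2010-01-01", "2010-01-01", "2010-01-01"))
)

# Toy diagnosis table in the shape pnr | date_contact | icd3.
alle_dx <- tibble(
  pnr          = c("A", "A", "A", "B", "C"),
  date_contact = as.Date(c("2009-05-01", "2015-03-01", "2012-06-01",
                           "2011-01-01", "2013-02-01")),
  icd3         = c("F00", "G30", "F03", "E11", "F01")
)

CODES <- c("F00", "F01", "F02", "F03", "G30", "G31")   # dementia

# First qualifying diagnosis strictly after the index date, one row per person.
dementia <- alle_dx %>%
  filter(icd3 %in% CODES) %>%
  inner_join(kohort, by = "pnr") %>%
  filter(date_contact > index_date) %>%
  group_by(pnr) %>%
  slice_min(date_contact, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(pnr, dementia_date = date_contact)

# Left join back so that people without the outcome are kept with NA.
result <- kohort %>% select(pnr) %>% left_join(dementia, by = "pnr")
result
```

Person A has a dementia code before the index date that is ignored, so the first event is 2012-06-01; person B has no dementia code and gets NA; person C gets 2013-02-01.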
Note

This is the DARTER variant (using load_database() and the confirmed register names for 708421, as of June 2026). The general open_dataset() version and the explanation behind the pattern are in Phase 9b - LPR extraction.


See also

get_lpr_diagnoses() above wraps the pattern from the general guide (Phase 9b - LPR extraction).
