Register-based research at DST

Who is this guide for, and what will you learn?

Published

June 6, 2026

This guide is for anyone working with register-based research at Statistics Denmark (DST).


Where do you start?

New to register research?

Start at phase 1 and follow the phases in order; a little R experience makes the start easier.

→ Start at phase 1

Already know R?

Skip the introductory phases and go straight to the server, files and extractions.

→ Jump to phase 3

Working on DARTER?

Project-specific setup steps and guidance.

→ DARTER: overview and pipeline

Tip

Looking for something specific? Use the search box in the top right; it searches across the entire guide.


SDS and DST

As a researcher, you work on Statistics Denmark’s (DST) servers. DST receives and processes data from, among others, the Danish Health Data Authority (SDS), which holds the raw national health registers (LPR, LMDB, the cancer register, etc.). The data are made available to researchers via a secure remote connection.


The phases of the guide

The guide is built as 16 phases. The roadmap shows the natural path through, from planning to export; the table below gives a quick overview with links. You don’t have to read everything in order; use Phase 15: Reference to look things up as you go.

flowchart LR
    S1["①  Get started<br>1 · 2 · 3"] --> S2["②  Get data<br>4 · 5 · 6 · 7 · 8"] --> S3["③  Build the study<br>9 · 10 · 11"] --> S4["④  Code and variables<br>12 · 13"] --> S5["⑤  Analyse and finish<br>14 · 16"]
    S3 -.-> R15["15 · Reference<br>look up as you go"]
    classDef stage fill:#eaf2fb,stroke:#4a78b5,stroke-width:1px,color:#173a5e;
    classDef ref fill:#f6f6f6,stroke:#aaaaaa,color:#555555;
    class S1,S2,S3,S4,S5 stage
    class R15 ref

| Phase | Contents |
| --- | --- |
| 1: Plan your study | Research question, key concepts and data model |
| 2: R: the bare essentials | The minimum of R you need to get going |
| 3: Log in to DST | Access to the server and the first overview |
| 4: File types and loading | Parquet and SAS: formats and conversion |
| 5: Extracting data step by step | The universal extraction pattern: open_dataset → filter → collect |
| 6: First extraction | Your first real extraction with synthetic data |
| 7: Inspect your data | Check structure, types and distributions before analysis |
| 8: Know your registers | Find the right registers for exposure, outcome and covariates |
| 9: Hospital contacts (LPR) | LPR2/LPR3 and ICD codes: 9a understand · 9b extract |
| 10: Build your study population | Cohort, index date, in/exclusion and censoring |
| 11: Assemble your extracts | Joins, pivots and handling missing data |
| 12: Good code practice | Structure, naming and reproducible code |
| 13: Socioeconomic variables | Education, income and employment from registers |
| 14: Algorithms and packages | Ready-made algorithms: 14b OSDC · 14c NMI |
| 15: Reference | Look up as you go: functions, pitfalls and registers |
| 16: Export and repatriation | Get your results safely out of DST |
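
The extraction pattern named under phase 5 (open_dataset → filter → collect) can be sketched as follows. This is a minimal illustration, not the guide's own code: it assumes the pattern refers to the arrow and dplyr packages, and the data, column names (`pnr`, `year`, `diag`) and folder name are hypothetical stand-ins written to a temporary directory so the example runs anywhere.

```r
library(arrow)
library(dplyr)

# Hypothetical synthetic rows standing in for a register extract
synthetic <- data.frame(
  pnr  = c("0001", "0002", "0003", "0004"),
  year = c(2018L, 2019L, 2019L, 2020L),
  diag = c("DI21", "DE11", "DI21", "DJ44"),
  stringsAsFactors = FALSE
)

# Write the rows to a temporary Parquet folder (assumed layout, for illustration)
data_dir <- file.path(tempdir(), "synthetic_register")
dir.create(data_dir, showWarnings = FALSE)
write_parquet(synthetic, file.path(data_dir, "part-0.parquet"))

# 1. open_dataset: point at the files without loading them into memory
ds <- open_dataset(data_dir)

# 2. filter (and select): describe the rows and columns you want
# 3. collect: only now is the reduced result read into an R data frame
result <- ds %>%
  filter(year == 2019L, diag == "DI21") %>%
  select(pnr, year, diag) %>%
  collect()

print(result)
```

The point of the ordering is that filtering happens before `collect()`, so only the matching rows and selected columns are brought into memory.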