flowchart LR
S1["① Get started<br>1 · 2 · 3"] --> S2["② Get data<br>4 · 5 · 6 · 7 · 8"] --> S3["③ Build the study<br>9 · 10 · 11"] --> S4["④ Code and variables<br>12 · 13"] --> S5["⑤ Analyse and finish<br>14 · 16"]
S3 -.-> R15["15 · Reference<br>look up as you go"]
classDef stage fill:#eaf2fb,stroke:#4a78b5,stroke-width:1px,color:#173a5e;
classDef ref fill:#f6f6f6,stroke:#aaaaaa,color:#555555;
class S1,S2,S3,S4,S5 stage
class R15 ref
Register-based research at DST
Who is this guide for, and what will you learn?
This guide is for anyone working with register-based research at Statistics Denmark (DST).
Where do you start?
New to register research?
Start at phase 1 and follow the phases in order; a little R experience makes the start easier.
→ Start at phase 1
Already know R?
Skip the introductory phases and go straight to the server, files and extractions.
→ Jump to phase 3
Working on DARTER?
Project-specific setup steps and guidance.
Looking for something specific? Use the search box in the top right; it searches across the entire guide.
SDS and DST
As a researcher, you work on Statistics Denmark's (DST) servers. DST receives and processes data from, among others, the Danish Health Data Authority (SDS), which holds the raw national health registers (LPR, LMDB, the cancer register, etc.) and makes them available to researchers via a secure remote connection.
The phases of the guide
The guide is organised into 16 phases. The roadmap shows the natural path through them, from planning to export; the table below gives a quick overview with links. You don't have to read everything in order: use Phase 15 (Reference) to look things up as you go.
| Phase | Contents |
|---|---|
| 1 · Plan your study | Research question, key concepts and data model |
| 2 · R: the bare essentials | The minimum of R you need to get going |
| 3 · Log in to DST | Access to the server and the first overview |
| 4 · File types and loading | Parquet and SAS: formats and conversion |
| 5 · Extracting data step by step | The universal extraction pattern: open_dataset → filter → collect |
| 6 · First extraction | Your first real extraction with synthetic data |
| 7 · Inspect your data | Check structure, types and distributions before analysis |
| 8 · Know your registers | Find the right registers for exposure, outcome and covariates |
| 9 · Hospital contacts (LPR) | LPR2/LPR3 and ICD codes: 9a understand · 9b extract |
| 10 · Build your study population | Cohort, index date, in/exclusion and censoring |
| 11 · Assemble your extracts | Joins, pivots and handling missing data |
| 12 · Good code practice | Structure, naming and reproducible code |
| 13 · Socioeconomic variables | Education, income and employment from registers |
| 14 · Algorithms and packages | Ready-made algorithms: 14b OSDC · 14c NMI |
| 15 · Reference | Look up as you go: functions, pitfalls and registers |
| 16 · Export and repatriation | Get your results safely out of DST |