Traditionally, treatment guidelines and health intervention recommendations are developed based on the results of large cohort studies or randomized controlled trials (RCTs). However, the analysis of such studies yields only estimates of average effects. Hence, these results do not allow meaningful predictions of whether an intervention will help a given individual. With the advent of digital solutions, personalized approaches have been on the rise. N-of-1 trials and other modern study designs make it possible to derive individual treatment effects, and also to use the data to obtain population-level effect estimates of health interventions and improve their precision.
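
The contrast between individual and average effects can be illustrated with a minimal sketch. It assumes a simplified setting that is not taken from the course material: three hypothetical participants, each measured repeatedly under a treatment (A) and a control (B) condition, with invented outcome scores. The individual effect is estimated naively as a difference of means, ignoring carryover effects, time trends, and autocorrelation, which a real N-of-1 analysis would need to address.

```python
from statistics import mean

# Hypothetical outcome scores (higher = better) for three participants,
# each observed in several treatment (A) and control (B) periods.
# All values are invented for illustration only.
trials = {
    "participant_1": {"A": [7, 8, 7, 8], "B": [4, 5, 4, 5]},
    "participant_2": {"A": [5, 5, 6, 5], "B": [5, 6, 5, 5]},
    "participant_3": {"A": [3, 4, 3, 4], "B": [6, 6, 5, 6]},
}


def individual_effect(periods):
    """Naive individual effect: mean outcome under A minus mean under B."""
    return mean(periods["A"]) - mean(periods["B"])


# One effect estimate per participant (individual level).
effects = {pid: individual_effect(p) for pid, p in trials.items()}

# Simple unweighted average of the individual effects (population level).
average_effect = mean(list(effects.values()))

for pid, effect in effects.items():
    print(f"{pid}: {effect:+.2f}")
print(f"average: {average_effect:+.2f}")
```

In this toy data the average effect is close to zero, although one participant clearly benefits and another is clearly worse off under the treatment. This is the kind of heterogeneity that an average effect estimate alone cannot reveal.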

This seminar covers N-of-1 trials and other modern study designs such as micro-randomized trials. After an overview of different study types and their characteristics, the main focus of the class will be on methodological approaches for planning and analyzing N-of-1 trials. At the beginning of the class, we will gather data from N-of-1 trials that will be used throughout the course to illustrate the statistical methods. 


  • Overview of classic and modern study designs
  • Introduction to N-of-1 trials
  • Micro-randomized trials & other modern study designs
  • Ethics, data privacy and other requirements of digital studies
  • Standard methods for individual analysis of N-of-1 trials
  • Standard methods for aggregated analysis of N-of-1 trials
  • Bayesian regression models for N-of-1 trials
  • Meta-analysis & network meta-analysis
  • Sample size calculation for N-of-1 trials
  • Adaptive designs
  • Statistical methods for the analysis of micro-randomized trials


  • Introductory lectures with discussion of main concepts of N-of-1 trials and study designs
  • Weekly reading of a paper as homework, followed by discussion in class. One group of students will give a short presentation and lead the discussion
  • Joint statistical analysis of the N-of-1 trial data gathered in class, applying the statistical models discussed
  • Final project with presentation in class

Learning goals:

At the end of the course, the students will be able to

  • understand the main concepts of planning & conducting N-of-1 trials and selected other study designs
  • perform individual-level and aggregated analysis of N-of-1 trials using state-of-the-art methods


In an increasingly interconnected world in which almost every device both generates and collects data, assembling large datasets of complex personal information is becoming ever easier. This is the case even though privacy legislation such as the GDPR provides measures to prohibit the collection and storage of personal data without explicit user consent. A further cause for alarm is the growing number of reports in popular media on de-anonymization incidents that have paved the way for related security breaches such as leaks of personal data. In this seminar, we study several anonymized datasets in an effort to understand why and how de-anonymization occurs. Specifically, we focus on designing reverse-clustering algorithms to discover outlier data points and determine how these can be used, either individually or in combination with auxiliary data, to de-anonymize data points within the original dataset. Finally, we will discuss which properties of the outlier data points enabled the de-anonymization and which countermeasures could be applied.
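
To make the role of outlier data points concrete, the following minimal sketch flags records that are unique with respect to a set of quasi-identifiers. This is not the reverse-clustering approach developed in the seminar. It is a much simpler uniqueness check, and the records, field names, and the helper `unique_records` are all hypothetical and chosen purely for illustration.

```python
from collections import Counter

# Hypothetical "anonymized" records: direct identifiers such as names are
# removed, but quasi-identifiers remain. All values are invented.
records = [
    {"zip": "14482", "birth_year": 1990, "sex": "F"},
    {"zip": "14482", "birth_year": 1990, "sex": "F"},
    {"zip": "14482", "birth_year": 1985, "sex": "M"},
    {"zip": "14467", "birth_year": 1990, "sex": "F"},
    {"zip": "14467", "birth_year": 1990, "sex": "F"},
    {"zip": "14467", "birth_year": 1952, "sex": "M"},
]


def unique_records(rows, keys):
    """Return the rows whose combination of `keys` occurs exactly once."""
    def signature(row):
        return tuple(row[k] for k in keys)

    counts = Counter(signature(r) for r in rows)
    return [r for r in rows if counts[signature(r)] == 1]


# Records that stand alone on all three quasi-identifiers are outliers:
# anyone holding auxiliary data with the same attributes could single
# them out.
outliers = unique_records(records, ["zip", "birth_year", "sex"])
for row in outliers:
    print(row)
```

Here two of the six records are unique on the combination of postal code, birth year, and sex, whereas no record is unique on sex alone. Coarsening or suppressing quasi-identifiers until no record is unique is one commonly discussed countermeasure.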

This is a bridging module aimed specifically at students with a background in health professions/life sciences. The goal is to provide a platform that enables students to learn about the use of Information Technology (IT) systems in the healthcare environment. The module covers topics such as the use of digital patient data, secure messaging, e-health, computerised clinical decision-making tools, and the impact of IT systems on effective patient care. In addition, the students will learn about the impact of integrating IT systems in healthcare from the human resource, economics, and government policy perspectives. After completing this module, you should be able to think critically about IT systems in the digital health context.

HPI - Course: Process Mining

This course teaches (i) basic epidemiological concepts and (ii) biostatistical methods and their application to the analysis of large epidemiological datasets using the statistical software R and the graphical interface RStudio. To this end, the class starts with an introduction to R and RStudio. R Markdown will be used as a tool for documenting and reporting the analysis results. Next, the class covers data processing steps and introduces epidemiological study designs as well as theoretical and practical aspects of basic and more advanced biostatistical methods. In addition to classical biostatistical approaches such as linear and linear mixed models, newer methods for handling missing values, performing meta-analyses, and causal inference will be discussed and applied.

General Information

  • Lecturer: Dr. Stefan Konigorski
  • SWS: 4+2
  • ECTS: 6
  • Graded: Yes
  • Enrolment Deadline: 01.10. - 22.10.2021
  • Enrolment Type: Compulsory Elective Module
  • Course Language: English 


  • Introduction to R, RStudio
  • Documentation and report writing using R Markdown
  • Data setup: create, import, export datasets in R
  • Format datasets in R: transform variables and manipulate datasets
  • Descriptive statistics
  • Tables and graphics to visualize data and results
  • Epidemiological study designs and study planning
  • Introduction to statistical parameter estimation and hypothesis testing
  • Statistical methods for dealing with missing values
  • Linear & logistic regression models
  • Linear mixed models for the analysis of clustered and longitudinal data
  • Meta-analysis
  • Survival analysis
  • Statistical methods for causal inference

Learning goals

At the end of the course, the students will be able to

  • understand the main concepts of basic and more advanced biostatistical methods and select appropriate methods for data analysis of epidemiological studies
  • import and manipulate datasets in R for statistical analysis
  • perform the data analysis in R considering measurement error and missing values
  • document the analysis and report the results using R Markdown.

Teaching form

  • Lectures (via Zoom) with interactive practical exercises in R
  • Video snippets (provided asynchronously) with additional information on the lecture content
  • Tutorials with discussion of homework


Laptop with R and RStudio installation:

Condition for admission to final exam

  • Hand in solutions to 10 of the 12 weekly assignments

Final grade

  • Open book take home final exam (100%)


  • In the first tutorial on October 26, 2021, problems with installing or setting up R or RStudio, as well as other formal or technical questions, can be clarified. This is possible in person from 17:00 - 18:00 in room HS 2 at the HPI, or from 18:00 - 19:00 via Zoom.
  • All other lectures and tutorials will be held via Zoom only.
  • Please enrol in the course here on Moodle to obtain the Zoom link for the classes, or send an email to Stefan Konigorski