Personal Health Train

Distributed Privacy-Preserving Data Analysis


In public health research, data sharing is often challenging because health data are sensitive and heterogeneous. Moreover, the lack of standardization and semantics further exacerbates the problems of data fragmentation and data silos, which makes data analytics difficult. While collaborative analyses across multiple epidemiological studies offer great benefits, data protection concerns and the lack of IT infrastructure currently limit widespread cross-institute research projects. The aim of NFDI4Health is to enable institutes to participate in such research projects without actually ceding control over their data. One approach to overcoming these challenges is federated data analysis with the Personal Health Train (PHT). The PHT complements the Central Search Hub (CSH) and Local Data Hub (LDH) by providing the means of executing analyses in a distributed and privacy-preserving manner. Unlike centralized analysis, the PHT brings the algorithm to the data instead of vice versa, so data owners retain control of their data and the data stay at their place of origin.


The Personal Health Train concept is an innovative approach to fostering data-driven innovation in medicine. It provides a distributed analysis infrastructure that enables research on sensitive data without prior data sharing, while supporting diverse data formats. The PHT is based on a real-world analogy: a railway system with trains, stations, and train depots. In the PHT ecosystem, the train encapsulates an analytical task, which corresponds to the goods in the analogy. The data provider plays the role of a reachable station that the train visits. The station executes the task, processing the data available locally. The Central Service (CS) serves as the depot, managing train orchestration, operational logic, business logic, and data management. This design brings algorithms to the data instead of bringing confidential data to the algorithm, which supports compliance with data protection requirements. In this way, the PHT provides a distributed, flexible approach to using data in a network of participants, incorporating the FAIR principles. Within Germany, different implementation initiatives such as PADME and PHT-meDIC cooperate closely as part of the international PHT GO FAIR implementation network.
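The train, station, and depot roles described above can be illustrated with a small sketch. This is only a toy model under stated assumptions, not the PADME or PHT-meDIC implementation: the names `Station`, `Train`, `CentralService`, `local_summary`, and `pooled_mean` are hypothetical, everything runs in one process, and real deployments add containerization, authentication, and result inspection that are omitted here. The point it shows is that each station returns only an aggregate, never its records.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Station:
    """Hypothetical data provider; its records stay inside this object."""
    name: str
    records: list[float]

    def execute(self, task: Callable[[list[float]], dict]) -> dict:
        # The task runs locally; only its (aggregate) return value leaves.
        return task(self.records)


@dataclass
class Train:
    """Hypothetical train: an analytical task plus the results it collects."""
    task: Callable[[list[float]], dict]
    results: list[dict] = field(default_factory=list)


class CentralService:
    """Hypothetical depot that routes a train through the stations."""

    def __init__(self, stations: list[Station]) -> None:
        self.stations = stations

    def dispatch(self, train: Train) -> Train:
        # Visit every station in turn and collect its local result.
        for station in self.stations:
            train.results.append(station.execute(train.task))
        return train


def local_summary(records: list[float]) -> dict:
    # Example task: report only the count and the sum of the local values.
    return {"n": len(records), "total": sum(records)}


def pooled_mean(train: Train) -> float:
    # Combine the per-station aggregates into one overall mean.
    n = sum(r["n"] for r in train.results)
    total = sum(r["total"] for r in train.results)
    return total / n


if __name__ == "__main__":
    stations = [
        Station("site_a", [1.0, 2.0, 3.0]),
        Station("site_b", [5.0]),
    ]
    train = CentralService(stations).dispatch(Train(task=local_summary))
    print(pooled_mean(train))
```

In this sketch the analyst only ever sees the per-station counts and sums carried by the train, which is enough to compute a pooled mean without moving any individual record.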

Implementation under NFDI4Health

To demonstrate the effectiveness of the NFDI4Health infrastructure, we conducted two use cases in collaboration with the University Hospital Cologne (UHC) Radiology Department and Fraunhofer MEVIS. Here, the PHT is used to generate synthetic data in a distributed manner, which can later aid data harmonization efforts. The use case developed with Fraunhofer MEVIS focuses on the recognition of kidney tumors using computed tomography images from two different locations where known tumor patients have already been treated. With these data, the use case aims to study the overall outcome of different therapy approaches. The use case conducted with the UHC Radiology Department targeted the recognition of lung cancer patients, where data harmonization is a pressing issue because of the many different devices and protocols found in practice.

Our Services

Health Study Hub

The German Central Health Study Hub allows researchers to publish their project characteristics, documents and data related to their research project in a FAIR manner or to find information about past and ongoing studies.

Data Train

The Data Train cross-disciplinary graduate training programme, a core element of the NFDI4Health training approach, aims at building the next generation of data-savvy researchers in the biomedical sciences.

Personal Health Train

To foster data-driven innovation in medicine, we developed a distributed analysis infrastructure that enables research on sensitive data without prior data sharing while supporting diverse data formats.

Local Data Hub

The LDH is the local node in the federated concept of NFDI4Health. We develop and promote the dissemination of a unified data sharing platform based on the FAIR principles in line with the NFDI4Health standards.

Data publication

Health data collected in clinical trials, epidemiological studies, and public health studies cannot be freely published, yet they are valuable datasets whose reuse is of high importance for health research. NFDI4Health has established a metadata standard and a process for the publication of health studies to make health data FAIR.

Data harmonisation

To make health studies and their data FAIR we have developed guidelines and standards for metadata description and data sharing. We have developed data publication guidelines, common metadata description standards and adaptations of health data interoperability standards to harmonize the description of studies and their corresponding metadata.

Data Quality Assessment

It is a paradox: good scientific work depends on high data quality, yet while a lot of effort is put into the design and conduct of studies, little is put into data quality assessments. We help to make such assessments efficient with versatile concepts and tools.


Expansion of decentralised research projects with DataSHIELD: Until now, data protection concerns and the lack of special IT infrastructure have prevented the expansion of cross-institutional research projects. DataSHIELD is intended to solve that problem.