Personal Health Train
Distributed Privacy-Preserving Data Analysis
In public health research, data sharing is often challenging because health data are sensitive and heterogeneous. Moreover, the lack of standardization and semantics further exacerbates the problems of data fragmentation and data silos, which makes data analytics difficult. While collaborative analyses across multiple epidemiological studies offer great benefits, data protection concerns and the lack of IT infrastructure currently limit widespread cross-institute research projects. The aim of NFDI4Health is to enable institutes to participate in such research projects without actually ceding control over their data. One approach to overcome these challenges is federated data analysis with the Personal Health Train (PHT). The PHT complements the Central Search Hub (CSH) and Local Data Hub (LDH) by providing the means to execute analyses in a distributed and privacy-preserving manner. Unlike centralized analysis, the PHT brings the algorithm to the data instead of vice versa, so data owners retain control of their data and the data stay at their origin.
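The idea of bringing the algorithm to the data can be illustrated with a minimal Python sketch. It is not the NFDI4Health or PHT implementation: the names `LocalSite`, `local_sum_and_count`, and `federated_mean` are hypothetical and chosen only for illustration, and the example assumes each institute holds a plain list of numeric measurements. The analysis function is sent to each site, runs there, and only aggregate values (a sum and a count) are returned.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Aggregate = Tuple[float, int]  # (sum of values, number of values)


@dataclass
class LocalSite:
    """Hypothetical data holder; the raw records stay inside this object."""
    name: str
    _records: List[float]  # assumed: one numeric measurement per participant

    def execute(self, task: Callable[[List[float]], Aggregate]) -> Aggregate:
        # The task runs where the data lives; only its return value is released.
        return task(self._records)


def local_sum_and_count(records: List[float]) -> Aggregate:
    """Analysis task shipped to each site: returns aggregates, not rows."""
    return sum(records), len(records)


def federated_mean(sites: List[LocalSite]) -> float:
    """Visit every site and combine the per-site aggregates into a pooled mean."""
    total, count = 0.0, 0
    for site in sites:
        site_sum, site_n = site.execute(local_sum_and_count)
        total += site_sum
        count += site_n
    if count == 0:
        raise ValueError("no records available at any site")
    return total / count


if __name__ == "__main__":
    sites = [
        LocalSite("site_a", [120.0, 130.0, 110.0]),
        LocalSite("site_b", [140.0, 100.0]),
    ]
    print(federated_mean(sites))  # prints 120.0
```

The pooled mean equals the mean that a centralized analysis would compute over all five values, yet no site hands over individual records. The sketch leaves out what a real deployment would need, such as authentication and checks that the released aggregates are not themselves disclosive.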
The Personal Health Train concept is an innovative approach to foster data-driven innovations in medicine. It provides a distributed analysis infrastructure that enables research on sensitive data without prior data sharing, while supporting diverse data formats. The PHT takes its name from a real-world analogy: a railway system with trains, stations, and train depots. In the PHT ecosystem, the train encapsulates an analytical task, which corresponds to the goods in the analogy. The data provider plays the role of a station that the train can reach. The station executes the task, processing the data available there. The Central Service (CS)
Implementation under NFDI4Health
The German Central Health Study Hub allows researchers to publish their project characteristics, documents and data related to their research project in a FAIR manner or to find information about past and ongoing studies.
The Data Train cross-disciplinary graduate training programme, a core element of the NFDI4Health training approach, aims at building the next generation of data-savvy researchers in the biomedical sciences.
Health data collected in clinical trials, epidemiological studies, and public health studies cannot be freely published, but they are valuable datasets whose reuse is of high importance for health research. NFDI4Health has established a metadata standard and process for the publication of health studies to make health data FAIR.