Building the Next Generation of Data-Savvy Biomedical Researchers

The Data Train cross-disciplinary graduate training programme, a core element of the NFDI4Health training approach, aims at building the next generation of data-savvy researchers in the biomedical sciences.


Researchers and allied professionals in the biomedical and health fields need the skills to manage research data in a structured way and to take the steps from contextualising data to gaining knowledge and making informed decisions. This ensures transparency, quality, reproducibility, and compliance with regulatory requirements. To equip researchers with the appropriate understanding and the necessary skills in research data management (RDM) and data science, NFDI4Health offers a modular, reusable training programme that is based on the FAIR principles for data stewardship and tailored to different audiences. A central component of this approach is to provide doctoral researchers with a broad grounding in RDM and data science, coupled with a practical, group-oriented immersion in data stewardship, data science, or both.

The U Bremen Research Alliance, with the support of the State of Bremen, has established the Data Train cross-institutional and cross-disciplinary training programme for doctoral researchers from its member institutions. Data Train's mission is to strengthen basic skills in data literacy, RDM and data science, while providing a platform for doctoral researchers to build an interdisciplinary and inter-institutional network. The nine NFDI consortia represented in Bremen, including NFDI4Health, are involved in the development and delivery of the courses. In return, courses are opened to the NFDI communities whenever possible.

A total of 40 lecturers have delivered the programme in two runs since 2021, with 14 lectures, 17 workshops and data stories in each run, totalling more than 350 hours. More than 222 doctoral students participated in these two runs. Because the programme was offered virtually, it was also opened to technical staff, master's students and other interested researchers, so that more than 2,600 participations have been registered. Data Train will now be offered every year, and several institutions outside Bremen have already expressed interest in collaborating.

NFDI4Health is contributing lecturers (Prof. Dr. Iris Pigeot, Prof. Dr. Benedikt Buchner, Prof. Dr. Dennis-Kenji Kipker). NFDI4Health members also contributed two data stories: "Lessons learned from Covid-19 surveillance: How to avoid data unFAIRness" by Dr. Anatol-Fiete Näher, and "How to make large scale phenotypic data internationally competitive within a population-based cohort study" by Prof. Dr. Carsten Oliver Schmidt.

The highly successful Data Train covers the entire data value chain, fosters an interdisciplinary perspective, and makes an important contribution to data literacy training for science. Data Train is designed to showcase best and discipline-specific practices through data stories. It is also open to external collaboration. NFDI4Health is working with Data Train to provide a reusable and adaptable version of the training programme with health-specific add-ons.

Schematics of the curriculum concept. Light red boxes stand for add-on discipline-specific courses that could be offered in the scientific domains or the NFDI consortia. Hörner et al. 2021, doi: 10.17192/bfdm.2021.3.8343

The data story contributed by Fiete Näher of NFDI4Health reports on building up a surveillance system for disease outbreak detection and on key aspects of the governance of high-quality health data.

Health Study Hub

The German Central Health Study Hub allows researchers to publish their project characteristics, documents and data related to their research project in a FAIR manner or to find information about past and ongoing studies.

Data Train

The Data Train cross-disciplinary graduate training programme, a core element of the NFDI4Health training approach, aims at building the next generation of data-savvy researchers in the biomedical sciences.

Personal Health Train

To foster data-driven innovation in medicine, we developed a distributed analysis infrastructure that enables research on sensitive data without prior data sharing while supporting diverse data formats.

Local Data Hub

The LDH is the local node in the federated concept of NFDI4Health. We develop and promote the dissemination of a unified data sharing platform based on the FAIR principles in line with the NFDI4Health standards.

Data publication

Health data, as collected in clinical trials and in epidemiological and public health studies, cannot be freely published, but they are valuable datasets whose reuse is of high importance for health research. NFDI4Health has established a metadata standard and process for the publication of health studies to make health data FAIR.

Data harmonisation

To make health studies and their data FAIR, we have developed guidelines and standards for metadata description and data sharing: data publication guidelines, common metadata description standards and adaptations of health data interoperability standards that harmonise the description of studies and their corresponding metadata.

Data Quality Assessment

It is a paradox: on the one hand, good scientific work depends on high data quality. On the other hand, a lot of effort is put into the design and conduct of studies, but not into data quality assessments. We help to facilitate the efficient performance of such assessments with versatile concepts and tools.


Expansion of decentralised research projects with DataSHIELD: Until now, data protection concerns and the lack of special IT infrastructure have prevented the expansion of cross-institutional research projects. DataSHIELD is intended to solve that problem.