Data harmonisation and publication

Data harmonisation - bridging across the health domains through standards

To make health studies and their data FAIR (Findable, Accessible, Interoperable and Reusable) we have developed a publication policy and a metadata schema that provide standards for description and sharing of data.

The publication policy describes recommendations and requirements for the publication of research data from health-related studies, with a focus on the services developed by NFDI4Health. It classifies the resource types to be published, such as study descriptions, study documents and data collections; outlines licensing for such resources and the use of universal identifiers; and defines formatting and metadata description requirements for the published resources and their data.
For the metadata that describes the resources, the publication policy refers to the tailored NFDI4Health metadata schema, which is implemented in our services. This metadata schema combines common elements and their controlled vocabularies relevant for all domains and use cases covered by NFDI4Health. It is designed in a modular fashion and also comprises modules that are more specific to certain sub-domains.

Publication policy

One of the purposes of NFDI4Health is to make data collected in health-related studies FAIR. The first step towards this is the publication of comprehensive information about the context and accessibility of the data and metadata. In addition, study documents contain further detailed information needed for the correct interpretation of the collected data and should therefore be published as well. The NFDI4Health publication policy describes the requirements for the publication of research data from health-related studies in the German Central Health Study Hub. Researchers aiming to make their studies discoverable and the corresponding research data FAIR are highly encouraged to adhere to the NFDI4Health standards.

Metadata schema

To make clinical, epidemiological, and public health research data FAIR (Findable, Accessible, Interoperable and Reusable), the NFDI4Health metadata schema enables the standardised publication of health resources’ metadata on the German Central Health Study Hub. Though originally developed by the NFDI4Health Task Force COVID-19 and tailored to COVID-19 studies, the generic nature of the schema enables the registration of further types of resources such as registries, secondary data sources, and various study documents. The schema also extends to other health domains by adopting a modular structure, comprising core and domain-specific metadata items in generic and use-case-dedicated modules. Most items were primarily adapted from established standards and models, including DataCite, ClinicalTrials.gov, DRKS, Maelstrom, and MIABIS.

The schema’s core module captures information commonly collected by any type of health resource, while further resource-type- and/or use-case-specific modules gather descriptions of resources of certain types or belonging to certain health domains. Bibliographic information, such as the resource’s title, description, and acronyms, is included in the core module, along with information about contributors and identifiers of relevant resources registered on the health study hub or elsewhere. Items to trigger other modules as well as provenance details about the publication of the resource are also included.

For design and data access information, the schema provides a design module, comprising characteristics pertaining to certain resource types. For studies and substudies, the module distinguishes between interventional and non-interventional study designs and provides dedicated sections for aspects of each design type. The module also provides descriptive information about the study conditions and population, including recruitment area and sample size information. An administrative information section covers details about the ethics committee approval, status, and dates of the study, along with dedicated sections for eligibility criteria and outcome measures/time points. Information about data sharing is also included, triggering the record linkage module, when applicable. Due to multiple overlapping characteristics, most sections also apply to registries and secondary data sources.
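The branching described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `Resource`, the function `design_sections`, and all section names are hypothetical and are not taken from the schema itself. For simplicity the sketch also applies every generic section to registries and secondary data sources, whereas the schema says only that most sections apply to them.

```python
from dataclasses import dataclass
from typing import List, Optional

# Resource types the design module applies to, per the description above.
DESIGN_RESOURCE_TYPES = {"study", "substudy", "registry", "secondary data source"}


@dataclass
class Resource:
    """Hypothetical, heavily reduced view of a registered resource."""
    resource_type: str
    interventional: Optional[bool] = None  # only meaningful for (sub)studies
    record_linkage_applicable: bool = False


def design_sections(resource: Resource) -> List[str]:
    """Return the (hypothetically named) design sections to fill in."""
    if resource.resource_type not in DESIGN_RESOURCE_TYPES:
        return []  # e.g. study documents do not use the design module

    # Simplification: all generic sections are applied to every covered type.
    sections = [
        "conditions",
        "population",
        "administrative_information",
        "eligibility_criteria",
        "outcomes",
        "data_sharing",
    ]

    # Studies and substudies get a section dedicated to their design type.
    if resource.resource_type in {"study", "substudy"}:
        if resource.interventional is None:
            raise ValueError("a study must be interventional or non-interventional")
        sections.append(
            "interventional" if resource.interventional else "non_interventional"
        )

    # Data sharing information can trigger the record linkage module.
    if resource.record_linkage_applicable:
        sections.append("record_linkage_module")
    return sections
```

Calling `design_sections(Resource("study", interventional=False))` would return the generic sections followed by the non-interventional one, while a study document yields an empty list.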

The nutritional epidemiology module provides domain-specific information, mainly related to dietary assessment instruments applied in relevant studies. The chronic diseases module specifies whether prevalent or incident disease data were collected and indicates the sources from which the data were generated. The third dedicated module provides legal, consent, and budget information required for enabling record linkage. Modules providing clinical trials and imaging/radiomics metadata are yet to be implemented. All modules incorporate mandatory, conditional, and optional items.
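How mandatory, conditional, and optional items might be checked can be sketched as follows. The item names, the `Item` class, and the trigger flag `nutritional_epidemiology` are assumptions made for illustration only; the actual item definitions and conditions are those given in the schema.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Item:
    """Hypothetical schema item with an obligation level."""
    name: str
    obligation: str  # "mandatory", "conditional" or "optional"
    # For conditional items: predicate deciding whether the item is required.
    condition: Optional[Callable[[Dict], bool]] = None


# Illustrative items only; not the real NFDI4Health item list.
SCHEMA_ITEMS = [
    Item("title", "mandatory"),
    Item("acronym", "optional"),
    Item(
        "dietary_assessment_instrument",
        "conditional",
        condition=lambda record: record.get("nutritional_epidemiology", False),
    ),
]


def missing_items(items: List[Item], record: Dict) -> List[str]:
    """Return the names of required items that are absent or empty."""
    missing = []
    for item in items:
        required = item.obligation == "mandatory" or (
            item.obligation == "conditional"
            and item.condition is not None
            and item.condition(record)
        )
        if required and record.get(item.name) in (None, "", []):
            missing.append(item.name)
    return missing
```

In this sketch a record with only a title passes the check, whereas a record that sets the nutritional epidemiology flag must also name a dietary assessment instrument.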

The schema is currently available in human-readable Excel format. As a step towards a machine-readable version, it has been represented in ART-DECOR and mapped to HL7 FHIR. Accordingly, FHIR profiles have been created and published on Simplifier. To facilitate metadata sharing by data-holding organisations, the schema is also being implemented by Local Data Hubs at several NFDI4Health partner locations.
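To give a rough idea of what a machine-readable representation could look like, the sketch below turns a flat record into a FHIR-style JSON structure. It assumes, purely for illustration, that a study is expressed with the base FHIR `ResearchStudy` resource and only a handful of its elements; the published NFDI4Health profiles on Simplifier define the actual mapping, and the input keys and the function `to_research_study` are hypothetical.

```python
import json
from typing import Any, Dict


def to_research_study(record: Dict[str, Any]) -> Dict[str, Any]:
    """Map a flat, hypothetical metadata record to a FHIR-style dict."""
    resource: Dict[str, Any] = {
        "resourceType": "ResearchStudy",
        "title": record["title"],
        # Assumption: default to "active" when no status is supplied.
        "status": record.get("status", "active"),
    }
    # Optional elements are only emitted when a value is present.
    if record.get("description"):
        resource["description"] = record["description"]
    if record.get("identifier"):
        resource["identifier"] = [{"value": record["identifier"]}]
    return resource


# Made-up example values, not a real study.
example = to_research_study({"title": "Example cohort study", "identifier": "EX-001"})
print(json.dumps(example, indent=2))
```

The result is not validated against any profile; a real implementation would check it with a FHIR validator against the published profiles.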

The current version V3_3 of the NFDI4Health metadata schema can be found here.

Our Services

Health Study Hub

The German Central Health Study Hub allows researchers to publish their project characteristics, documents and data related to their research project in a FAIR manner or to find information about past and ongoing studies.

Data Train

The Data Train cross-disciplinary graduate training programme, a core element of the NFDI4Health training approach, aims at building the next generation of data-savvy researchers in the biomedical sciences.

Personal Health Train

To foster data-driven innovation in medicine, we developed a distributed analysis infrastructure that enables research on sensitive data without prior data sharing while supporting diverse data formats.

Local Data Hub

The LDH is the local node in the federated concept of NFDI4Health. We develop and promote the dissemination of a unified data sharing platform based on the FAIR principles in line with the NFDI4Health standards.

Data publication

Health data collected in clinical trials, epidemiological studies and public health studies cannot be freely published, yet they are valuable datasets whose reuse is of high importance for health research. NFDI4Health has established a metadata standard and process for the publication of health studies to make health data FAIR.

Data harmonisation

To make health studies and their data FAIR, we have developed data publication guidelines, common metadata description standards and adaptations of health data interoperability standards that harmonise the description of studies and their corresponding metadata.

Data Quality Assessment

It is a paradox: on the one hand, good scientific work depends on high data quality. On the other hand, a lot of effort is put into the design and conduct of studies, but not into data quality assessments. We help to facilitate the efficient performance of such assessments with versatile concepts and tools.

DataSHIELD

Expansion of decentralised research projects with DataSHIELD: Until now, data protection concerns and the lack of special IT infrastructure have prevented the expansion of cross-institutional research projects. DataSHIELD is intended to solve that problem.