Background
Planning a clinical trial (including design and power calculation) requires information on the disease, the target population and the biometrical properties of the endpoints used. Clinical trial planning should be based on the best available evidence. In many cases, however, published results of prior or similar trials are available but do not report the relevant details. Examples of such information are statistical properties of certain variables in a target patient population, the time course of response variables, statistical properties of before-after differences, effect sizes of treatment differences in trial endpoints, the distribution of prognostic factors in the target population, and the frequency of conditions considered as exclusion criteria in the intended trial, among many others.
Clinical trials prospectively collect data based on a predefined documentation concept derived from the study protocol, which often includes several hundred variables in structured case report forms. Great efforts are made to specify and enforce quality metrics for each data element (e.g., mandatory fields, formatting requirements, units of measurement, reference intervals, query actions). Local investigators know the exact meaning of variables and measurement methods from the study protocol, specific training or standard operating procedures (SOPs); external researchers, by contrast, find it difficult to understand the exact meaning because descriptions are missing or limited. A detailed description, including concepts from medical terminologies, plausibility checks and information on the origin of the data (inquired, measured, calculated, or transferred from a secondary system), would increase the quality of the documentation concepts and support the reuse of the data. In the long run, harmonization and standardization of clinical trial data elements would help both in creating new study concepts (since one could refer to gold-standard data elements) and in meta-analyses comparing studies on the same topic.
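As an illustration only, such a data element description could be captured in a small structured record. The following is a minimal Python sketch under stated assumptions: the names `DataElement` and `Origin`, all field names, the example variable and its plausibility range are hypothetical, and the terminology code is a placeholder rather than a real code from any terminology.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Origin(Enum):
    """How a value entered the trial database (categories named in the text)."""
    INQUIRED = "inquired"
    MEASURED = "measured"
    CALCULATED = "calculated"
    TRANSFERRED = "transferred from secondary system"


@dataclass(frozen=True)
class DataElement:
    """Hypothetical description of one case report form variable."""
    name: str
    description: str
    origin: Origin
    unit: Optional[str] = None
    terminology_code: Optional[str] = None  # placeholder, not a real code
    plausible_min: Optional[float] = None
    plausible_max: Optional[float] = None
    mandatory: bool = False

    def check(self, value: Optional[float]) -> List[str]:
        """Return query messages for a value; an empty list means no issue found."""
        if value is None:
            return [f"{self.name}: value is mandatory"] if self.mandatory else []
        issues = []
        if self.plausible_min is not None and value < self.plausible_min:
            issues.append(f"{self.name}: {value} below plausible minimum {self.plausible_min}")
        if self.plausible_max is not None and value > self.plausible_max:
            issues.append(f"{self.name}: {value} above plausible maximum {self.plausible_max}")
        return issues


# Assumed example element; the range is illustrative, not a clinical recommendation.
sbp = DataElement(
    name="systolic_bp",
    description="Systolic blood pressure, seated, after 5 minutes of rest",
    origin=Origin.MEASURED,
    unit="mmHg",
    terminology_code="EXAMPLE:0001",
    plausible_min=50.0,
    plausible_max=300.0,
    mandatory=True,
)

print(sbp.check(120.0))
print(sbp.check(400.0))
print(sbp.check(None))
```

Keeping the plausibility limits, unit and origin next to the variable name means that an external researcher reading the record would not need the study protocol to interpret a value.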
Clinical trial centers (CTCs) at German medical faculties support a wide range of academic, publicly funded clinical trials. About 400 academic clinical trials are initiated in Germany each year. Once a clinical trial is completed and its results are published, the data should be made available for scientific reuse. This is strongly encouraged by public funders of clinical trials such as DFG or BMBF. Currently, no infrastructure exists that supports the sharing of clinical trial data in terms of findability, accessibility and reusability while complying with the legislative and regulatory requirements for personal medical data.
To overcome these limitations, this use case will first develop a catalogue of typical characteristics of clinical data sets (e.g., disease area, type of clinical trial, unique trial identifier, trial design characteristics, type of intervention, full-text search in the trial synopsis, target population, outcome variable, type of therapy) in collaboration with T2.1 and T2.2. The characteristics will be implemented as searchable and filterable facets in T3.1. Together with TA6, concepts for different use and access mechanisms will be developed and then implemented in T3.4 or as distributed analysis in T3.7. As part of this activity, the implemented services and their orchestration will be evaluated in real-life scenarios in a CTC.
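To make the idea of searchable and filterable facets concrete, the following is a minimal Python sketch, not the planned T3.1 implementation. It assumes a small in-memory catalogue; the record type `TrialRecord`, the functions `filter_trials` and `facet_counts`, the chosen fields and all example entries are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class TrialRecord:
    """Hypothetical catalogue entry using a few of the characteristics listed above."""
    trial_id: str
    disease_area: str
    trial_type: str
    intervention_type: str
    synopsis: str


def filter_trials(trials: Iterable[TrialRecord],
                  text: Optional[str] = None,
                  **facets: str) -> List[TrialRecord]:
    """Keep trials that match every given facet value and, if given,
    contain `text` (case-insensitive) in their synopsis."""
    result = []
    for trial in trials:
        if any(getattr(trial, key) != value for key, value in facets.items()):
            continue
        if text is not None and text.lower() not in trial.synopsis.lower():
            continue
        result.append(trial)
    return result


def facet_counts(trials: Iterable[TrialRecord], facet: str) -> Counter:
    """Count how many trials fall under each value of one facet."""
    return Counter(getattr(trial, facet) for trial in trials)


# Invented example entries, for illustration only.
CATALOGUE = [
    TrialRecord("TRIAL-001", "oncology", "randomized controlled", "drug",
                "Chemotherapy regimen A versus B in adults"),
    TrialRecord("TRIAL-002", "oncology", "single-arm", "surgery",
                "Feasibility of a surgical technique"),
    TrialRecord("TRIAL-003", "cardiology", "randomized controlled", "drug",
                "Drug X versus placebo in adults with heart failure"),
]

oncology = filter_trials(CATALOGUE, disease_area="oncology")
rct_adults = filter_trials(CATALOGUE, text="adults", trial_type="randomized controlled")
print([t.trial_id for t in oncology])
print([t.trial_id for t in rct_adults])
print(facet_counts(CATALOGUE, "disease_area"))
```

A production catalogue would more likely sit on a search index than a Python list, but the same two operations (restricting by facet values and counting entries per facet value) are what a faceted user interface typically needs.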
Deliverables of Task 5.4
Contact

Dr. Oana Brosteanu
