Research focus and projects in the field of Scientific Data Management

Efficient, scalable methods for integrating large volumes of data, together with knowledge representation and discovery, are the central challenges in the research program of the Scientific Data Management Research Group. The applications developed by the group are used in several domains, especially biomedicine and digital libraries, to turn heterogeneous data into usable knowledge.

The research plan includes the development of state-of-the-art infrastructures for managing heterogeneous scientific data, extracting knowledge from these data and discovering new relationships and patterns. These infrastructures support the integration of large and complex data sets into scientific knowledge graphs and their analysis, and they facilitate cooperation among all actors in the value chains around scientific data. The challenges that the research group is working on include:

  • Knowledge graphs that not only encode the meaning and connections of scientific data, but also contain knowledge about provenance, privacy, quality and uncertainty.
  • Domain-specific ontologies and link discovery techniques capable of promoting the interoperability of heterogeneous and large scientific data sets in a scalable manner.
  • Integration methods for large, heterogeneous scientific data sources, e.g. legacy, structured and unstructured data, static data and continuous data streams.
  • Storage and distribution of large-scale scientific data and knowledge graphs.
  • Access control methods to enforce privacy regulations for sensitive data.
  • Federated query engines for scientific knowledge graphs.
  • Data analysis and knowledge discovery methods over scientific knowledge graphs.

The infrastructure components developed by the group are evaluated on a variety of data sets. Scientific data from publications archived in the TIB's databases (e.g. via RADAR or DataCite) are particularly suitable for this purpose. Scientists will be able to use the resulting scientific data management infrastructures to sustainably increase the effectiveness and productivity of their research work.


The research group will work on third-party funded projects, both those transferred from the University of Bonn and newly acquired ones. These include in particular:

  • H2020 iASiS: Integration and analysis of heterogeneous big data for precision medicine and suggested treatments for different types of patients (4/2017 to 3/2020).
  • H2020 BigMedilytics: Big Data for Medical Analytics (2017 to 2020).

The infrastructures developed in these two projects are also used for the management, search and analysis of scientific publications in the TIB databases.

Joint Lab Data Science & Open Knowledge

Some of the research on these topics takes place within the framework of the Joint Lab Data Science & Open Knowledge.

The Joint Lab will be established together with Leibniz Universität Hannover (LUH), namely its Faculty of Electrical Engineering and Computer Science and its L3S Research Center.