Multimodal Information Retrieval is dedicated to researching methods and approaches that combine complementary multimodal information into task-specific knowledge representations. Semantic and multimodal gaps are bridged by combining task-independent fusion methods, high-level knowledge modelling, cross-modal indexing, and advanced visualisation.
Thanks to the combination of advanced methods for knowledge graph modelling, named entity recognition, statistical data analysis, and indexing, our prototypes implement complex multimodal data aggregation, search, and visualisation solutions applied to specific domains (e.g. Digital Cultural Heritage, Digital Humanities). DSAI is a key technology supplier for the European Commission in the context of the Europeana and eArchiving infrastructures. By leveraging multimodal approaches, institutions and researchers can efficiently analyse and create enhanced representations of cultural heritage assets. In turn, DSAI solutions effectively support the documentation, preservation, exploration, and sharing of our rich cultural history.
Our research creates methods for efficient multimodal information retrieval in large, heterogeneous data collections, facilitated by cross-modal search (e.g. text to image, image to text, text and image to image) and user-friendly representations of results (e.g. curated data collections). Our solutions use embedding-based semantic indexing and LLMs to support natural language queries against heterogeneous data sources, including both structured data and unstructured text. The goal is to transform legacy information into rich semantic representations (i.e. knowledge graphs).
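As a minimal sketch of embedding-based cross-modal search (an illustration, not our production pipeline), the example below uses the open-source sentence-transformers library with a publicly available CLIP checkpoint to embed text and images into a shared vector space; the model name is one public option and the file paths are placeholders.

```python
from PIL import Image
from sentence_transformers import SentenceTransformer, util

# A CLIP checkpoint that maps text and images into one shared embedding
# space; 'clip-ViT-B-32' is a publicly available example, not necessarily
# the model used in our prototypes.
model = SentenceTransformer("clip-ViT-B-32")

# Hypothetical collection of heritage images (paths are placeholders).
image_paths = ["scans/manuscript_01.jpg", "scans/fresco_02.jpg"]
image_embeddings = model.encode(
    [Image.open(p) for p in image_paths], convert_to_tensor=True
)

# Text-to-image retrieval: embed a natural language query and rank the
# images by cosine similarity in the shared space.
query_embedding = model.encode(
    "a medieval illuminated manuscript", convert_to_tensor=True
)
scores = util.cos_sim(query_embedding, image_embeddings)[0]
best = scores.argmax().item()
print(f"Best match: {image_paths[best]} (score {scores[best].item():.3f})")
```

The same index supports image-to-image or mixed text-and-image queries: any query that can be encoded into the shared space can be ranked against the collection with the same similarity computation.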
Goals
- Facilitate knowledge organisation and discovery of cultural assets
- Leverage multimodal clustering and classification techniques (see the sketch after this list)
- Streamline workflows for indexing and retrieval of heterogeneous data
- Offer solutions for 2D/3D object annotation
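As one concrete reading of the clustering goal above, the hedged sketch below groups precomputed multimodal embeddings into candidate thematic clusters with scikit-learn's KMeans; the random embedding array is a stand-in for vectors produced by a model such as the CLIP encoder shown earlier, and the cluster count is an assumption to be tuned per collection.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize

# Stand-in for multimodal embeddings (e.g. CLIP vectors for a mixed
# collection of images and text records); shape: (n_items, dim).
rng = np.random.default_rng(42)
embeddings = rng.normal(size=(200, 512)).astype("float32")

# L2-normalise so that Euclidean k-means approximates cosine similarity,
# the usual metric for CLIP-style embedding spaces.
embeddings = normalize(embeddings)

# Partition the collection into candidate thematic groups; n_clusters=8
# is an illustrative assumption, not a recommended setting.
kmeans = KMeans(n_clusters=8, n_init="auto", random_state=0)
labels = kmeans.fit_predict(embeddings)

for cluster_id in range(8):
    print(f"cluster {cluster_id}: {np.sum(labels == cluster_id)} items")
```

In practice the resulting clusters can seed curated data collections, with classification models assigning new items to existing groups.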
Application Domains
- Digital Cultural Heritage
- Digital Humanities