
AIT develops Masketeer: A tool for pseudonymisation of free German medical texts

08.08.2024
The AIT Digital Health team is committed to the secure handling of sensitive health data.
 

The AIT Digital Health team at the Center for Health and Bioresources has developed Masketeer, an algorithm for pseudonymising unstructured medical free text. The tool aims to protect sensitive health data while keeping it usable for scientific purposes.

The challenge of anonymising free medical text

Medical free texts often contain essential information about patients and their health conditions. However, these texts are usually unstructured and contain sensitive data that is subject to data protection. The challenge is to anonymise this information without losing its context. "German texts in particular often lack a comprehensive research basis," explains Martin Baumgartner, who led the development of the algorithm.

Innovative approach of the Masketeer algorithm

AIT's Masketeer algorithm combines several masking logics to remove data from clinical notes that could reveal the identity of individuals, as defined by the HIPAA Safe Harbor guidelines. Labelling what kind of information has been removed (e.g. patient name, phone number) and assigning consistent pseudonyms across all notes help to preserve contextual information in the texts. "Our algorithm offers an exceptionally high sensitivity of 0.943 and a specificity of 0.933," emphasises Karl Kreiner, Senior Research Engineer, who oversees the development.
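
To illustrate the general idea of labelled, consistent pseudonyms, here is a minimal Python sketch. It is not the Masketeer implementation: the two regular expressions, the class name `Pseudonymiser` and the placeholder format `[LABEL_n]` are assumptions made purely for illustration, and a real system would need far more robust detection than these toy patterns.

```python
import re
from collections import defaultdict

# Hypothetical patterns for illustration only; not Masketeer's actual rules.
PATTERNS = {
    # A run of at least eight digits, optionally separated by spaces, "/" or "-".
    "PHONE": re.compile(r"\+?\d[\d /-]{6,}\d"),
    # Two capitalised words directly after the word "Patient ".
    "PATIENT_NAME": re.compile(
        r"(?<=Patient )[A-ZÄÖÜ][a-zäöüß]+ [A-ZÄÖÜ][a-zäöüß]+"
    ),
}


class Pseudonymiser:
    """Replaces matches with labelled pseudonyms that stay stable across notes."""

    def __init__(self):
        self._mapping = {}                 # (label, original) -> pseudonym
        self._counters = defaultdict(int)  # label -> pseudonyms issued so far

    def _pseudonym(self, label, original):
        key = (label, original)
        if key not in self._mapping:
            self._counters[label] += 1
            self._mapping[key] = f"[{label}_{self._counters[label]}]"
        return self._mapping[key]

    def mask(self, text):
        # Apply each pattern in turn; the label records what kind of data was removed.
        for label, pattern in PATTERNS.items():
            text = pattern.sub(
                lambda m, l=label: self._pseudonym(l, m.group(0)), text
            )
        return text


masker = Pseudonymiser()
print(masker.mask("Patient Anna Huber, Tel. 0664 1234567, Schwindel seit gestern."))
# Patient [PATIENT_NAME_1], Tel. [PHONE_1], Schwindel seit gestern.
print(masker.mask("Kontrolle: Patient Anna Huber beschwerdefrei."))
# Kontrolle: Patient [PATIENT_NAME_1] beschwerdefrei.
```

Because the same `Pseudonymiser` instance processes both notes, the fictitious patient receives the same placeholder each time, so a reader can still tell that the two notes concern one person without learning who that person is.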

Future prospects for data protection and research

The Masketeer algorithm is already in use in the Health Data Space Nodes, where it serves as a central element for improving data protection and safeguarding patient privacy. These nodes are also to be used in the current Smart FOX data donation project. Future versions could become even more effective by integrating large language models, improving anonymisation while promoting the use of valuable medical data for research. The AIT Digital Health team will continue to drive the development of data protection solutions in the healthcare sector, supporting progress in medical research and care.

Link to the paper: https://www.mdpi.com/1999-5903/16/8/281