File Format Metadata Aggregator

Description

This service provides an approach for automatically estimating the preservation risks of file formats. It aims to mitigate the risk of digital obsolescence by providing risk management reports to content providers such as libraries, archives, and museums. The service inspects format-related information and performs statistical analysis of the file formats at hand to categorize them by preservation risk. It draws on available preservation community resources, such as technical registries (like PRONOM, DBPEDIA and FREEBASE), for policy extraction.
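As a rough illustration of categorizing a format by preservation risk, the following minimal sketch combines a few risk factors into a single score and maps that score to a category. The factor names, severity weights, and thresholds are all assumptions made for this example; the text does not specify the values the service actually uses.

```python
from dataclasses import dataclass

# Hypothetical severity weights (assumption, not taken from the service).
SEVERITY_WEIGHT = {"low": 1, "medium": 2, "high": 3}


@dataclass
class RiskFactor:
    name: str
    severity: str  # one of "low", "medium", "high"
    present: bool  # True if the factor applies to the format


def risk_score(factors):
    """Return the weight of the present factors divided by the total weight."""
    total = sum(SEVERITY_WEIGHT[f.severity] for f in factors)
    if total == 0:
        return 0.0
    hit = sum(SEVERITY_WEIGHT[f.severity] for f in factors if f.present)
    return hit / total


def risk_category(score):
    """Map a score in [0, 1] to a category using illustrative thresholds."""
    if score < 0.33:
        return "low"
    if score < 0.66:
        return "medium"
    return "high"


# Illustrative factors for an imaginary format.
factors = [
    RiskFactor("no open specification", "high", True),
    RiskFactor("few supporting software tools", "medium", True),
    RiskFactor("no vendor support", "low", False),
]
score = risk_score(factors)
category = risk_category(score)
```

A weighted ratio is only one possible aggregation; an expert model could just as well use rules or per-factor thresholds.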


AIT Contribution

The main contribution of this work is the definition of risk factors with associated severity levels, together with their automatic computation. We have developed a tool for aggregating rich and trusted file format descriptions. It exploits available linked data resources and uses expert models to infer knowledge about the long-term preservation of digital content. An ontology mapping technique is used to collect information from the web of linked data and integrate it into a common representation. A Web service supports programmatic access to format and risk analysis reports and disseminates the results within the digital preservation community.
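The idea of integrating information from several sources into a common representation can be sketched as a simple property mapping. This is an illustration only: the source keys, property names, and common field names below are hypothetical and do not reflect the actual vocabularies of the registries or the tool's internal schema.

```python
# Hypothetical mapping from source-specific property names to common fields.
PROPERTY_MAP = {
    "pronom": {
        "FormatName": "name",
        "Extension": "extension",
        "ReleaseDate": "released",
    },
    "dbpedia": {
        "label": "name",
        "extension": "extension",
        "releaseDate": "released",
    },
}


def to_common(source, record):
    """Rename the known properties of one source record to the common fields."""
    mapping = PROPERTY_MAP[source]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def merge(records):
    """Collect all values seen for each common field across the records."""
    merged = {}
    for record in records:
        for field, value in record.items():
            merged.setdefault(field, set()).add(value)
    return merged


# Illustrative records, not real registry output.
pronom_record = {"FormatName": "Portable Document Format", "Extension": "pdf"}
dbpedia_record = {"label": "PDF", "extension": "pdf", "homepage": "ignored"}

aggregated = merge([
    to_common("pronom", pronom_record),
    to_common("dbpedia", dbpedia_record),
])
```

Keeping every value per field as a set makes disagreements between sources visible, so a later step can decide which value to trust.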

Demonstration

  1. Go to the "File format preservation risks evaluation" section.
  2. In the input field "File format extension", enter a file format extension, e.g. "pdf" or "jpg".
  3. Press the button "Generate risk score report". This generates a risk score report for the file format, including its risk levels.

The remaining fields are intended for experts, who use them to collect information for analysis.