Automated means of reliably distilling raw visual input into high-level concepts such as objects, motion paths and object appearances are crucial analysis components of many vision systems (e.g. visual surveillance, media analysis). Every stage of such a visual processing chain is subject to ambiguity, which poses the primary scientific challenge when developing visual analytics functionality.

Application example: People counting in a crowd

A Bayesian part-based pedestrian detection framework using shape and motion cues has been developed. It relies on a maximum a posteriori (MAP) solution for the configuration of humans in a crowd. The spatial configuration of detection hypotheses is optimized greedily with respect to the posterior probability, taking the occlusion status of individual hypotheses into account. Our patent-protected, integral contour-based methodology provides an extensible framework into which variable shape templates for new object classes can be quickly integrated. The human detection framework is coupled with analytics functionality that quantifies spatio-temporal human density, information that is relevant in many application domains (e.g. retail, transportation).
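The greedy optimization idea can be illustrated with a small sketch. This is not the patent-protected contour-based method itself, only a simplified stand-in under stated assumptions: hypotheses are axis-aligned bounding boxes, each carries a hypothetical log-likelihood-ratio score, occlusion is approximated by the largest overlap with an already accepted hypothesis, and a constant log-prior penalty is charged per added person. All names (Hypothesis, greedy_map, log_prior_per_person) are illustrative and do not come from the framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    box: tuple                    # (x1, y1, x2, y2), assumed image coordinates
    log_likelihood_ratio: float   # assumed detector score (person vs. background)


def area(box):
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def intersection(a, b):
    # Area of the overlap rectangle; zero if the boxes are disjoint.
    return area((max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3])))


def visible_fraction(h, selected):
    # Crude occlusion model (assumption): the hypothesis is hidden by the
    # single accepted hypothesis that overlaps it the most.
    a = area(h.box)
    if a == 0.0:
        return 0.0
    covered = max((intersection(h.box, s.box) for s in selected), default=0.0)
    return 1.0 - covered / a


def greedy_map(hypotheses, log_prior_per_person=-1.0):
    """Greedily add the hypothesis with the largest gain in log-posterior.

    Gain = score weighted by visible fraction + per-person log-prior.
    Stops as soon as no remaining hypothesis increases the log-posterior.
    """
    selected = []
    remaining = list(hypotheses)
    while remaining:
        gains = [
            (h.log_likelihood_ratio * visible_fraction(h, selected)
             + log_prior_per_person, i)
            for i, h in enumerate(remaining)
        ]
        best_gain, best_index = max(gains)
        if best_gain <= 0.0:
            break
        selected.append(remaining.pop(best_index))
    return selected


# Toy scene: two near-duplicate detections, one separate person, one weak response.
candidates = [
    Hypothesis((0, 0, 10, 20), 5.0),
    Hypothesis((1, 0, 11, 20), 4.0),   # mostly occluded by the first box
    Hypothesis((30, 0, 40, 20), 3.0),
    Hypothesis((60, 0, 70, 20), 0.5),  # too weak to outweigh the prior
]
people = greedy_map(candidates)
print(len(people))  # estimated people count for this frame
```

Counting the accepted hypotheses per frame, and accumulating those counts over image regions and time, is one simple way such a detector could feed a spatio-temporal density estimate.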

Application example: Content filtering and search

The Internet contains vast amounts of visual data. Driven primarily by social responsibility, Internet providers and authorities strive to pinpoint data whose visual content can be considered critical. Typical critical images involve nudity or unwanted (e.g. right-wing extremist) propaganda. We have developed an automated recognition framework that employs a visual model of high representational power, yielding highly accurate filtering and search results when analyzing image data from the Internet.