synERGY - AIT Austrian Institute Of Technology

Common security solutions such as firewalls, intrusion detection systems (IDS) and antivirus programmes mainly apply blacklisting approaches. These solutions process signatures that are created in malware labs or derived from community data collected in cloud-based malware detection systems, and that are regularly distributed to endpoints to enable detection of known malicious activity. While they work reliably for known attack attempts, they fail when new attack techniques are used or unknown vulnerabilities or weaknesses are exploited. As a result, anomaly detection is required, which can detect even minor deviations in system behaviour that may be traces of unknown attacks.

Unfortunately, most modern anomaly detection solutions are not readily applicable to cyber-physical systems (CPS) and operational technology (OT), which are fundamentally different from enterprise IT networks in terms of complexity, size and widely distributed installations. In addition, the few OT security solutions available are not designed to work across different infrastructure layers (i.e. correlating information from OT and IT systems as well as data from the network layer and log data from endpoints) and are therefore mostly unsuitable for detecting modern multi-stage cyber attacks.

Another reason why today's anomaly detection systems are largely unsuitable for CPS is that CPS not only differ from companies' IT networks, but also differ greatly from one another. On the one hand, CPS have variable network infrastructures and, on the other, the integrated components are diverse. Therefore, machine learning approaches are required that do not depend on the specifics of a particular system to recognise attacks, but adapt to different usage scenarios. In addition, the operational characteristics of CPS differ from corporate IT. The processes are usually based on highly deterministic machine-to-machine communication, which allows for more sensitive anomaly detection with lower thresholds for system behaviour. However, existing anomaly detection approaches hardly utilise this. Attackers can exploit vulnerabilities and weaknesses at different levels and in different areas of complex CPS as entry points for successful multi-step attacks. To counter these often advanced malicious activities, we argue that an appropriate composition of different detection approaches for individual infrastructure layers (WAN, LAN, field level) improves the effectiveness of anomaly detection in CPS. These cross-layer solutions correlate several interesting data streams measured at different locations and OSI layers of the protocol stack. With the right combination and methods of analysis, a cross-layer approach can greatly increase overall security awareness in CPS. Nevertheless, the selection of suitable data streams and observation parameters (e.g. observation points, time periods, granularity and layers) remains a challenging task. In addition to the quality of detection, the resources required for detection (time, budget, effort) are also important.
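To illustrate why deterministic machine-to-machine communication permits tighter thresholds, the following minimal Python sketch learns the inter-arrival times of a periodically polled station and flags any gap that deviates from the learned rhythm. It is not the synERGY implementation; the function names, the 2-second polling interval, the factor `k` and the sample timestamps are all assumptions made for illustration.

```python
from statistics import mean, stdev


def learn_baseline(timestamps):
    """Learn mean and standard deviation of inter-arrival times (seconds)."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return mean(gaps), stdev(gaps)


def find_anomalies(timestamps, baseline, k=4.0, min_sigma=0.01):
    """Return (timestamp, gap) pairs whose gap deviates more than k sigma."""
    mu, sigma = baseline
    # Guard against a zero-width band when the training data is perfectly regular.
    sigma = max(sigma, min_sigma)
    anomalies = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if abs(gap - mu) > k * sigma:
            anomalies.append((cur, gap))
    return anomalies


# Hypothetical training data: a station polled every 2 s with small jitter.
training = [0.00, 2.01, 4.00, 5.99, 8.00, 10.01, 12.00, 13.99, 16.00]
baseline = learn_baseline(training)

# Hypothetical live data: one extra message injected between two regular polls.
live = [100.00, 102.00, 104.01, 104.60, 106.00, 108.00]
flagged = find_anomalies(live, baseline)

for timestamp, gap in flagged:
    print(f"unexpected gap of {gap:.2f} s before message at t={timestamp:.2f}")
```

Because the learned jitter is only a few hundredths of a second, the injected message stands out clearly, both through the short gap before it and the shortened gap after it. In a typical office network, with far more irregular traffic, the same threshold would produce a flood of false alarms.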

The synERGY project aimed to develop an architecture and demonstrate a proof-of-concept implementation of a modern reactive security solution specifically designed for large-scale and complex CPS. The design criteria of this architecture are:

It utilises operational data sources of all kinds, especially network and endpoint data, as well as different areas of a CPS from the field level to the enterprise level.
It applies best-in-class anomaly detection mechanisms, not just signature-based solutions, and uses machine learning to optimise them.
It utilises cross-correlation techniques to increase confidence in results and detect new multi-stage attacks progressing through an infrastructure.
It facilitates the interpretation of detected anomalies using contextual data from the organisation.

The synERGY project investigated how to design, develop and validate an adaptive, self-learning, cross-layer anomaly detection system based on open standards that can be applied to a range of CPS in a vendor-independent manner and detect the traces of a variety of modern cyber-attacks with limited human effort by applying cross-correlation techniques of numerous data streams.

Working on the associated challenges as part of a cooperative research project was important in order to develop a vendor-independent solution. For this reason, open source solutions and open standards were selected for development.

In the synERGY project, we worked on a specific use case motivated by the increasing number of cyber attacks on energy suppliers and other operators of critical infrastructure, including Stuxnet, CrashOverride, BlackEnergy and Petya. In terms of power distribution networks, a modern substation comprises a range of security, monitoring and control equipment. For the cross-correlation use case, we assume that all technologically possible security measures are implemented and function correctly. However, since the life cycle of industrial components in the energy industry is very long compared to standard IT, legacy equipment with additional protection requirements is quite common. Under these conditions, anomaly detection is a promising means to further protect such systems. When applying advanced anomaly detection systems, the primary protection goals of industrial security, namely availability and integrity, are of utmost importance.

In particular, the real-time properties of industrial systems must not be restricted. This is a significant difference to the classic office IT world, where confidentiality is usually the top priority. A key concept in the field of industrial security is "defence-in-depth", which is based on the realisation that protection against cyber attacks on industrial plants, such as the power grid in our use case, involves all stakeholders, including operators, integrators and manufacturers.

In addition to building a defence concept, the correlation of detected anomalies from the different layers of the shell model is crucial to detect well-hidden attackers earlier and to increase the quality of alarms, i.e. to reduce the false positive rate.

From an overall security perspective, correlating detected anomalies in network traffic with physical data sources such as access control, work order database, etc. has great potential. The synERGY use case therefore takes these factors into account.
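The idea of correlating anomalies across layers and with organisational context can be sketched in a few lines of Python. The example below is an illustration under stated assumptions, not the synERGY correlation engine: the `Anomaly` and `WorkOrder` types, the site names, the 300-second window and the rule "alert only when at least two distinct layers report unexplained anomalies at the same site" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anomaly:
    time: float   # seconds since an arbitrary epoch
    site: str     # hypothetical substation identifier
    layer: str    # "network", "log" or "physical"
    kind: str


@dataclass(frozen=True)
class WorkOrder:
    site: str
    start: float
    end: float


def covered_by_work_order(anomaly, work_orders):
    """True if a scheduled work order explains activity at that site and time."""
    return any(
        w.site == anomaly.site and w.start <= anomaly.time <= w.end
        for w in work_orders
    )


def correlate(anomalies, work_orders, window=300.0):
    """Alert per site when unexplained anomalies from >= 2 layers fall in one window."""
    unexplained = sorted(
        (a for a in anomalies if not covered_by_work_order(a, work_orders)),
        key=lambda a: a.time,
    )
    by_site = {}
    for a in unexplained:
        by_site.setdefault(a.site, []).append(a)

    alerts = []
    for site, events in by_site.items():
        for i, first in enumerate(events):
            group = [e for e in events[i:] if e.time - first.time <= window]
            if len({e.layer for e in group}) >= 2:
                alerts.append((site, group))
                break  # one alert per site is enough for this sketch
    return alerts


anomalies = [
    Anomaly(1000.0, "SS-1", "physical", "door opened"),
    Anomaly(1060.0, "SS-1", "network", "interface down"),
    Anomaly(1090.0, "SS-1", "network", "unknown MAC address"),
    Anomaly(2000.0, "SS-2", "network", "interface down"),  # isolated event
    Anomaly(3000.0, "SS-3", "physical", "door opened"),
    Anomaly(3050.0, "SS-3", "network", "interface down"),
]
work_orders = [WorkOrder("SS-3", 2900.0, 4000.0)]  # scheduled maintenance

alerts = correlate(anomalies, work_orders)
for site, group in alerts:
    print(site, [e.kind for e in group])
```

Only the first site raises an alert: the isolated interface failure at the second site and the maintenance-related events at the third are not escalated, which is how correlation can lower the false positive rate. Completely suppressing anomalies during a work order is a deliberate simplification; a real deployment would more likely lower their score than discard them, since an attacker could hide inside a maintenance window.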

Unauthorised access to the process network is one of the biggest threats to the overall system and forms the basis for a variety of threat scenarios. Such access can occur either physically or logically via the network to practically every component and every area of the network. Unauthorised access to components in the field is particularly critical, as an attacker may find little or no physical access protection there. Access allows the attacker to copy, intercept or modify the transmitted data and use it for their own purposes. This use case deals specifically with the following four attacks (also labelled as stars in Fig. 1):

1. The attacker physically breaks into a transformer station (secondary substation) and gains access to an RTU.

2. The attacker physically breaks into a substation (primary substation) and gains access to an MTU.

3. The attacker gains access to the network via remote maintenance access.

4. The attacker gains access to the network via a compromised device, e.g. a PC (engineering workstation) or a maintenance notebook.

Figure 1: A typical infrastructure of a utility provider. The yellow stars mark weak spots on the attack surface.

Multiple types of anomalies are already detectable with state-of-the-art technologies. Some of the relevant anomalies for the system shown in Fig. 1 are listed in Table 1. Various, often specialised, solutions exist for discovering traces of intrusions and detecting single anomalies as described in Table 1. The focus of synERGY, however, is on correlating such 'simple' anomalies across layers and components in order to detect complex (potentially multi-stage) attacks more precisely at a higher level.

traces/sources	further description and remarks
failed logins	series of failed login attempts (especially those where a successful login follows immediately)
permission changes	changes of r/w permissions, creation of a new administrator account
configuration changes	configuration changes (i.e., date, user who made the change and what has been changed) on switches, particularly the creation of a new VLAN or other security-related configurations, including changes to cryptographic parameters (updates of keys, roles and access permissions)
implausible data	comparison of sensor values with historical data may lead to the detection of deviations
time base settings	deviations of the timestamps of log data of the RTUs and the SCADA server
user authentication	registration of a successful authentication of a default or administrator user, deviating parameters of the authentication, e.g. used authentication method (password, SSH), SSH key fingerprint, protocol settings used (hmac-sha1, aes, ...), session settings (TERM variable, ...)
authorization	execution of commands that according to the user role concept are not allowed
altered or deleted log entries	manipulations on the network devices (router, switch, firewall) or host systems
device boot up	visible in the log entries due to numerous unique startup events
traffic statistics	anomalous traffic patterns captured with tools such as Flowmon, for instance communication from one MTU to another MTU
comparison of traffic profiles	time-window-based comparison of current traffic volumes (netflows) with historical data; classification of flows based on number and size of packets. Note that traffic profiles of similar substations can be compared to one another if they possess similar sensors and serve similar actuators
broadcast	broadcast storm directly on the switch
device authentication	failed authentication attempts of fake devices in the process network through the NAC of the switches (MAC-based and where technically possible over 802.1X)
ARP spoofing	independently detected by the switch and relayed as alarms
loss of communication	An interface, which goes down, is quite common during normal operations, but together with other anomalies a good indicator that something is odd.
overload of a communication interface	e.g., by a DoS attack. This can easily be detected if a station is not reachable (failure messages in the SCADA system).
changes in the protocol 104	injection of custom non-standard packets used to manipulate stations
data from protocols other than 104	other ports or packet formats are used
newly added or missing devices	Devices whose sender IP address is visible in the network (either via ARP requests or DHCP requests) but which cannot be found in the asset DB.
network port goes down	All unnecessary ports are disabled by default; however, ports often also go down temporarily in normal operation.
ethernet connection parameters change	e.g. “Eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None”. Since embedded devices in particular often do not use the maximum power settings, the temporary attachment of a notebook instead of the original device can be detected, even if the MAC has been correctly spoofed.
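As an illustration of how one of the 'simple' anomalies in the table could be detected, the following Python sketch flags a successful login that immediately follows a series of failed attempts (the "failed logins" row). It is a minimal example, not a synERGY component: the event tuple format, the user names, and the thresholds `min_failures` and `window` are assumptions.

```python
from collections import defaultdict, deque


def detect_failed_then_success(events, min_failures=3, window=60.0):
    """Flag successful logins preceded by >= min_failures failures within window.

    events: iterable of (time, user, outcome) sorted by time,
            where outcome is "fail" or "ok".
    Returns a list of (time, user, number_of_preceding_failures).
    """
    recent_failures = defaultdict(deque)
    hits = []
    for time, user, outcome in events:
        failures = recent_failures[user]
        # Forget failures that are older than the observation window.
        while failures and time - failures[0] > window:
            failures.popleft()
        if outcome == "fail":
            failures.append(time)
        elif outcome == "ok":
            if len(failures) >= min_failures:
                hits.append((time, user, len(failures)))
            failures.clear()
    return hits


# Hypothetical authentication log, already parsed into tuples.
events = [
    (0.0, "operator", "fail"),
    (5.0, "operator", "ok"),    # a single typo, not suspicious
    (100.0, "admin", "fail"),
    (102.0, "admin", "fail"),
    (104.0, "admin", "fail"),
    (106.0, "admin", "fail"),
    (107.0, "admin", "ok"),     # success right after four failures
]

hits = detect_failed_then_success(events)
for time, user, count in hits:
    print(f"t={time}: login of '{user}' succeeded after {count} failed attempts")
```

On its own, such a detector only yields a single-source anomaly. In the spirit of the approach described here, its output would be one of several inputs that are subsequently correlated across layers and components.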

The frontend of synERGY is provided by the SecurityAdvisor (SA). SA is a SIEM solution that is used for collecting and correlating all relevant network, log, context and anomaly data. This data is displayed in a web-based graphical user interface (GUI). To support the operator or analyst, the 'Browse module' within the SA GUI offers various search, filter and visualisation options, e.g. a full-text search that helps to find specific messages or events within the whole dataset. Besides the textual representation, the data is also displayed on a timeline, which makes it possible to spot outliers in the collected and analysed data at first sight. Additionally, the operator can use a dashboard to create specific graphical representations (widgets) to gain further insights. The browse and dashboard modules are both useful for analysing huge amounts of data and for giving an overview of the whole dataset and infrastructure. For deeper analysis of a specific event, the SecurityAdvisor offers a 'Detail-View', as shown in Figure 2. This view displays a confidence and risk score, data related to the affected assets, and contextual data, e.g. data collected from the organisation's time-tracking and task-tracking databases, as well as from a CMDB. For a deeper understanding of the anomaly, all the data (network traffic, logs) that triggered the anomaly is correlated by the backend and displayed to the operator.

Figure 2: Screenshot of SecurityAdvisor’s (SA) anomaly details.

In order to realize the synERGY approach, we implemented a proof-of-concept system following the designed architecture, and challenged it in a realistic, yet illustrative, way to show its potential with respect to intrusion detection in a utility provider’s infrastructure. Within the project, we discussed the numerous pitfalls when it comes to the implementation and deployment of such an anomaly detection (AD) system and highlighted the importance of cross-correlating AD results from distinct detection systems, as well as the correlation with organizational context to enable the proper interpretation of security events and reinforce the significance of reports.

Even after the three-year project, numerous open research challenges remain. Smaller near-term goals relate to how such a system can be put into operation more efficiently, i.e. how to create parsers and rules and set parameters with minimum human involvement. This is important to lower the entry barrier for interested stakeholders of the synERGY system. Currently, experts are required to validate and tune the parsers for complex log data, train the different machine learning models and come up with initial rule sets for the detection engines. The medium-term goal is to make the operational phase of the synERGY system economically feasible by involving even more organisational knowledge to interpret anomalies more accurately, which is the basis for justified decision making. Our long-term goals are centred on improving synERGY's ability to adapt to changes in the monitored environment, which means that the detection and interpretation must change too (including models, learned parameters, weights etc.), preferably in a (semi-)automatic manner.

Partners: TU Wien – Institut für Telekommunikation, Huemer IT-Solution, Universität Klagenfurt – Forschungsgr. Systemsicherheit, MOOSMOAR Energies OG, Energie AG Oberösterreich Telekom GmbH, LINZ STROM GmbH, Bundesministerium für Landesverteidigung
Project duration: 01/2017 – 12/2019
Funding programme: FFG IKT der Zukunft, 4th call 2015