Choosing an ICS cybersecurity monitoring system: evaluation criteria


Tuesday, 25 May, 2021

The changing cyber-threat landscape is having a large effect on how industrial systems are managed.

A rapidly changing threat landscape, combined with convergence between traditional IT and industrial control networks, is having a dramatic impact on the risk to industrial systems. Industrial control systems (ICS) include national critical infrastructure ranging from energy production and distribution to transportation, manufacturing and building control systems that are all important underpinnings to global business and everyday life. Risks to these systems range from disruption to destruction of critical assets with the potential for collateral impact to the safety of employees and citizens.

Historically, ICS threats emanated from well-equipped nation states that steered clear of destructive attacks, which were viewed as ‘red lines’ that couldn’t be crossed without a retaliatory response. With Stuxnet and attacks on the Ukrainian power grid, to name just a few, the proverbial red lines have clearly been crossed. But new threats, beyond nation states, are also emerging.

Cybercriminals are exploring ways to extend ransomware campaigns, from grandma’s photos, which may command a few hundred dollars, to holding a production line hostage, which can be much more lucrative. Meanwhile, terrorist organisations can now engage a burgeoning underground market to buy or rent the skills, tools and infrastructure necessary to launch a campaign focused on destroying infrastructure or harming people.

One clear effect has been a dramatic uptick in the attention corporate boards and C-level executives now pay to the cyber risks associated with the industrial assets they are responsible for. This attention has spurred many organisations to review governance plans, with a clear trend towards assigning accountability for protecting ICS networks to a chief information security officer (CISO). These events have spawned keen interest in tools that can provide CISOs and security teams with the same level of visibility into ICS networks that they are accustomed to with traditional IT networks.

Key requirements — an outcomes perspective

As with any technology selection, there are layers of requirements that teams need to wade through before choosing an ideal solution. Selecting a monitoring system to protect some of the most critical assets the organisation owns is no different. It is, however, important to focus on the most important criteria first so the team can narrow the field and choose a solution that addresses the essential objectives. Otherwise it is easy to get lost in a plethora of low-level requirements and miss the proverbial “forest for the trees”.

A particularly useful exercise in refining the list of requirements is to flip the discussion from a ‘requirements view’ to an ‘outcomes perspective’. That is, express the key outcomes you want to ensure that the system can deliver. This view will help the team cut through the clutter and focus in on the important goals.

The five most important outcomes for ICS monitoring systems are:

  1. Do no harm and require no downtime.
  2. Give complete visibility.
  3. Provide early warning.
  4. Detect malicious and accidental threats.
  5. Enable rapid response (reduce mean time to resolution).

Do no harm and require no downtime

To mix metaphors: if the proverbial pill is worse than the ill, then “Houston, we have a problem”. Many industrial networks are 15, 20 or even 30 years old. The industrial assets and the underlying networks in many of these environments are quite brittle by today’s standards. And while newer plants (with modern network equipment and contemporary industrial assets) are often more robust and resilient to network traffic delays or other unexpected interruptions, even modern ICS networks can be finicky.

Therefore, ICS monitoring systems need to be designed to ensure they don’t harm industrial networks or adversely impact the industrial process. Many traditional IT vendors and security teams learned hard lessons when they tried to run, for example, vulnerability scanning systems that added traffic to the network and queried control assets that were simply not designed to be interrogated. ICS devices sometimes failed, and not gracefully, occasionally taking plants offline in the process.

Systems that actively poll (or query) endpoints such as controllers can harm the process. This can be a somewhat less severe risk in modern networks or at the upper layers of the Purdue Model where the workstations or servers are more resilient to interrogation.

Another important consideration is whether the solution requires plant downtime for installation or maintenance. Systems that need to be installed on endpoints or systems that need to be placed ‘in line’ on the ICS network require downtime to set up or must be installed, configured and modified only during plant maintenance windows. This can cause significant implementation project delays or emergency downtime to fix issues.

Give complete visibility

First, the solution must be able to monitor both TCP/IP and non-IP nodes (for example, serial connections such as Profibus or Modbus) that are a critical part of many industrial environments. Further, the system needs to be able to understand network topology and provide visibility into ‘the other side’ of gateway devices. For example, this could be a PLC with a network card that is a gateway to a potentially expansive segment of the network on the other side of the device. IT teams will often assume that they have a secure network perimeter, but fail to understand that assets on the other side of the gateway may be connected to a network that is connected directly to the internet or DMZ. In a nutshell, blind spots are bad because they limit the team’s visibility into potential attack vectors and make good hiding places for adversaries seeking persistence.

Secondly, the system must cover the full range of ICS protocols that are present in the specific environment. This does not mean cursory coverage where the system can simply identify the presence of a specific ICS protocol, or systems that only understand the network address of the nodes involved in the conversation. This means a deep understanding of the open and proprietary protocols, so the tool can discern types of devices that are communicating and understand the actual conversations. Otherwise the system will not be able to provide important insights, detect anomalies or provide rich alerts and the contextual information you will need to meet other very important objectives noted below.
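As a rough illustration of the difference between noticing a protocol and understanding the conversation, the sketch below decodes a single Modbus/TCP request taken from passively captured traffic. The function name, the returned fields and the sample frame are hypothetical and chosen only for this example; the header layout and function codes follow the public Modbus specification. A real monitoring product would cover far more function codes and many other protocols.

```python
import struct

# A few common Modbus function codes (from the public Modbus specification).
FUNCTION_NAMES = {
    3: "Read Holding Registers",
    6: "Write Single Register",
    16: "Write Multiple Registers",
}
# Function codes that change state on the device.
WRITE_FUNCTIONS = {5, 6, 15, 16}


def decode_modbus_tcp(frame: bytes) -> dict:
    """Decode the MBAP header and function code of one Modbus/TCP request.

    Hypothetical helper for illustration; it only parses bytes that were
    already captured and never sends anything to the device.
    """
    if len(frame) < 8:
        raise ValueError("frame too short for Modbus/TCP")
    # MBAP header: transaction id, protocol id, length, unit id, then the function code.
    _transaction_id, protocol_id, _length, unit_id, function = struct.unpack(
        ">HHHBB", frame[:8]
    )
    if protocol_id != 0:
        raise ValueError("not a Modbus/TCP frame")
    decoded = {
        "unit_id": unit_id,
        "function": FUNCTION_NAMES.get(function, f"function {function}"),
        "is_write": function in WRITE_FUNCTIONS,
    }
    # Write Single Register carries a register address and the new value.
    if function == 6 and len(frame) >= 12:
        register, value = struct.unpack(">HH", frame[8:12])
        decoded["register"] = register
        decoded["value"] = value
    return decoded


# Hypothetical captured request: unit 1 is told to write 1500 to register 40.
sample = bytes.fromhex("000100000006010600280" + "5dc")
print(decode_modbus_tcp(sample))
```

A tool that stops at the network addresses would record only that two nodes exchanged Modbus traffic. Decoding the payload shows that a register was written and with what value, which is the kind of detail that anomaly detection and rich alerts depend on.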

Many IT security vendors have tried to repurpose traditional security tools for use in ICS networks (eg, IDS/IPS, next-gen firewalls, etc). For these limited-protocol devices, being implemented in an ICS network is like being dropped into a United Nations session without the special translation headphones. You know there are many conversations going on but can only really understand one or maybe two. A tool that is blind to ICS protocols cannot understand the important industrial control conversations that need to be monitored for anomalous behaviour or other activities that may pose a risk. The degree to which the system can provide insights into network configuration issues or build a fine-grained anomaly detection model is directly proportional to the depth of understanding that the given system has of the protocols being used in the network. Without significant protocol inspection depth, the monitoring system’s detection and alerting capability will be conspicuously limited.

Provide early warning

An attack on a system requires the adversary to successfully execute multiple steps in a process. The steps an adversary takes to execute a cyber attack were well documented by a team at Lockheed Martin, using the established kill chain model employed by the military. A key premise of the Lockheed Martin Cyber Kill Chain is that if you can detect a threat early in the chain, you can disrupt (kill) it before it has its intended impact.

With this as a premise, it is very important for the ICS monitoring technology to be able to identify an attack as early as possible in the kill chain — for example, when the attacker is trying to establish a foothold on the industrial network or is working to enumerate the network to identify key targets such as controllers. Because early detection is not ensured and response times often vary, it is also important for a monitoring tool to be able to detect adversary activity all along the kill chain — up to and including attempts to manipulate settings to impact the underlying process (such as changing a controller’s settings).

Detect malicious and accidental threats

Industrial systems and the processes they control face many different types of issues. As any well-schooled ICS security practitioner (unfortunately there are not enough of them) or plant floor operator can tell you, human error and accidents are far more common (at least today) than actual cyber attacks trying to inflict harm. Thus, ICS monitoring solutions must be capable of alerting security and shop floor teams about both malicious activity and other actions that could potentially harm assets, process or people.

On the malicious side of the equation there are two main categories: external and insider threats. External threats must gain a foothold on the network or co-opt an insider to have an adverse impact. The former case is addressed by detection throughout the kill chain, as discussed above. The insider is the most difficult case, whether an external actor co-opts an insider or a disgruntled employee acts on their own. In this situation, the insider would typically have legitimate credentials and be authorised to make changes to the network environment or controllers that could harm the process. Thus, the monitoring tool must be able to detect obviously malicious activity as well as high-risk changes that may be perfectly legitimate but could also be initiated by an insider attempting to do harm.

Enable rapid response (reduce mean time to resolution)

This is an easily stated, but often very hard-to-deliver outcome for many systems. To achieve this outcome there are a few underlying requirements that need to be addressed.

First, the system needs to provide security operations centre (SOC) analysts — typically the primary user — immediate ‘situational awareness’. To achieve this requirement the system must provide concise, well-crafted, human-understandable alerts that shorten the time and effort needed to investigate and resolve alerts.

Rather than a single, consolidated alert that indicates exactly what is going on, far too many systems provide SOC analysts with a long stream of anomalous events — often requiring significant effort just to understand what is happening and whether it is important. Assembling a stream of alerts for an ICS network that SOC analysts are often unfamiliar with, into something meaningful, is beyond the skill set of most analysts. Even if the team has the skills, the extra cycles are something that these teams certainly don’t have and the lost time can prove critical during an attack.
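To make the contrast concrete, here is a minimal sketch of one possible way to fold a stream of raw events into consolidated alerts. It assumes that events from the same source arriving within a short window belong to the same activity. The `Event` fields, the `consolidate` function and the five-minute window are all hypothetical choices for this example, not a description of how any particular product correlates events.

```python
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: float   # seconds since some reference point
    source: str        # asset that triggered the event
    description: str   # what was observed


def consolidate(events, window_seconds=300.0):
    """Group events from one source into a single alert while they keep
    arriving within window_seconds of the previous event from that source."""
    alerts = []
    open_alerts = {}  # source -> the alert currently being extended
    for event in sorted(events, key=lambda e: e.timestamp):
        alert = open_alerts.get(event.source)
        if alert is None or event.timestamp - alert["last_seen"] > window_seconds:
            # Start a new consolidated alert for this source.
            alert = {
                "source": event.source,
                "first_seen": event.timestamp,
                "last_seen": event.timestamp,
                "steps": [],
            }
            alerts.append(alert)
            open_alerts[event.source] = alert
        alert["last_seen"] = event.timestamp
        alert["steps"].append(event.description)
    return alerts


# Hypothetical event stream from two assets.
events = [
    Event(0, "eng-workstation-3", "new connection to controller"),
    Event(30, "historian-1", "unusual login time"),
    Event(60, "eng-workstation-3", "controller placed in program mode"),
    Event(120, "eng-workstation-3", "set point written"),
    Event(1000, "eng-workstation-3", "new connection to controller"),
]
for alert in consolidate(events):
    print(alert["source"], "->", "; ".join(alert["steps"]))
```

With this grouping, an analyst sees three alerts rather than five separate events, and the first alert reads as a short ordered story of what one workstation did.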

Secondly, in addition to immediate situational awareness, advanced systems will provide the contextual security data that SOC teams need when leading the investigation of alerts in the early part of the kill chain. SOC teams understand security events associated with adversaries gaining a foothold, enumerating the network and attempting to move laterally. This activity is directly in their ‘wheelhouse’, and the more contextual information the analyst has regarding the attack, the more quickly and efficiently they can investigate and resolve the issue.

The same applies for alerts associated with the latter stages of the kill chain; for example, where an adversary may be trying to change controller settings to impact the underlying industrial process. In these cases, the SOC team will most likely need to interact with their counterparts at the plant. For this conversation to be efficient and productive, the SOC analyst will need to provide shop floor personnel contextual information associated with the process itself.

For example, instead of relaying network addresses of the assets involved, which plant personnel may or may not be able to quickly associate, the SOC analyst can improve the situation by providing asset names. Further, to ensure that plant personnel can efficiently investigate issues, SOC teams can provide information such as the set points being manipulated and other contextual information the operations and engineering teams need. Armed with actual process-related context, plant personnel can readily query SCADA and DCS systems, understand the potential impact and take steps to mitigate the attack.
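The enrichment described above can be sketched as a lookup against an asset inventory. Everything in this example is hypothetical: the inventory contents, the alert fields and the function names are assumptions made for illustration, and a real system would draw this context from its own asset discovery and protocol inspection.

```python
# Hypothetical asset inventory keyed by network address.
ASSET_INVENTORY = {
    "10.0.5.21": {"name": "Boiler 2 PLC", "area": "Utilities"},
}


def enrich(alert, inventory):
    """Return a copy of the alert with the asset name and plant area added."""
    asset = inventory.get(alert["target_ip"])
    enriched = dict(alert)
    enriched["asset_name"] = asset["name"] if asset else "unknown asset"
    enriched["area"] = asset["area"] if asset else "unknown area"
    return enriched


def summarise(alert):
    """Build a one-line summary that plant personnel can act on."""
    return (
        f'{alert["asset_name"]} ({alert["area"]}): {alert["parameter"]} '
        f'changed from {alert["old_value"]} to {alert["new_value"]}'
    )


# Hypothetical raw alert as a network-centric tool might report it.
raw_alert = {
    "target_ip": "10.0.5.21",
    "parameter": "steam pressure set point",
    "old_value": 40,
    "new_value": 65,
}
print(summarise(enrich(raw_alert, ASSET_INVENTORY)))
```

The summary names the asset and the set point rather than an address, so operations and engineering staff can go straight to the relevant SCADA or DCS screen.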

To help SOC and shop-floor teams collaboratively investigate and solve these later-stage, process-affecting alerts, the system must provide detailed process context. In all scenarios the ability to rapidly understand the situation that caused the alarm and the ability to quickly and efficiently investigate the issue is crucial.


With the list of the top five outcomes ICS monitoring systems need to deliver, teams can focus on the most important requirements — those that deliver true risk management ROI.


  • All content Copyright © 2024 Westwick-Farrow Pty Ltd