Introduction
Industrial control systems (ICS) are employed to control equipment and safeguard people, plant and the environment. They are connected to computer networks, which in turn are connected to the internet. This arrangement creates the need for cybersecurity risk assessments, whose objective is to record the possible business impacts of a successful cyber-attack against ICS and to ensure that threat vectors are identified and understood and that countermeasures are considered.
Unlike cybersecurity incidents on IT systems, which tend to result almost exclusively in economic consequences, OT cybersecurity incidents can have a range of consequences. Depending on the objectives of the attacker, these include process impacts on safety, the environment and the business. The organization’s risk matrix should also be modified to consider other impacts, such as reputational damage or loss of intellectual property or competitive advantage.
The objective of a cybersecurity risk assessment is to evaluate attack scenarios, determine security level targets and provide cost-benefit information to justify investment in security countermeasures. Conducting the assessment requires information such as zone and conduit drawings, cybersecurity policies and procedures, corporate risk criteria, an inventory of cyber assets, manufacturer (i.e., device) cybersecurity manuals and lists of third-party connections.
Cybersecurity Risk
Cybersecurity risk is concerned with intentional or unintentional interference that has the potential to compromise ICS by means of computer-connected systems. Cyber risk assessments are now considered an essential activity in any project, much like safety risk assessments.
International standards such as IEC 62443 provide many different approaches for conducting cyber risk assessments. A cybersecurity event can be a malicious attack (intentional) or an operator error (unintentional), and its likelihood is the product of threats, vulnerabilities and target attractiveness. The consequence severity of a cybersecurity incident depends on the inherent characteristics of the target (e.g., quantity and type of hazardous chemicals), the objective of the attacker and the components of the process control system that are compromised.
Understanding how the consequence, the type of threat, the attractiveness of the target and the number of vulnerabilities affect the likelihood of a cybersecurity attack is a key step in performing a cybersecurity risk assessment.
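As a rough illustration of likelihood as the product of threats, vulnerabilities and target attractiveness, the sketch below multiplies three scores. The 0-to-1 scoring scale, the function name likelihood_score and the sample values are assumptions made for illustration only; they are not taken from IEC 62443 or from any corporate risk criteria.

```python
def likelihood_score(threat: float, vulnerability: float, attractiveness: float) -> float:
    """Combine three hypothetical 0-1 scores into one relative likelihood score."""
    factors = {
        "threat": threat,
        "vulnerability": vulnerability,
        "attractiveness": attractiveness,
    }
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    # Likelihood is modelled as the plain product of the three factors.
    return threat * vulnerability * attractiveness


# Hypothetical example values, not real assessment data.
exposed = likelihood_score(threat=0.5, vulnerability=0.5, attractiveness=1.0)
isolated = likelihood_score(threat=0.5, vulnerability=0.0, attractiveness=1.0)
print(exposed, isolated)  # 0.25 0.0
```

Because the factors multiply, driving any one of them to zero (for example, removing the vulnerability) drives the overall score to zero, which is why each factor has to be understood separately.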
Why We Need Cyber Risk Assessment
Previous cybersecurity attacks demonstrate that both insider and outsider cybersecurity risks have the potential to significantly impact both the continued operation and the safety of ICS. A cyberattack does not need to initiate an event directly; rather, it can disable a protective measure and wait for an initiating event to occur.
For example, a Safety Instrumented System (SIS), an ICS designed to prevent or mitigate the consequences of a hazard scenario arising from initiating events such as a Distributed Control System (DCS) failure, stands by passively until a specific process deviation occurs and then takes action.
Cyber-attacks on SIS need not be as sophisticated as the one seen with TRITON / TRISIS. Simply changing an interlock trip point or placing an interlock in bypass can allow a process upset to propagate into a high-consequence loss-of-containment event with safety, environmental and financial implications.
1.) Identify the System Under Consideration and select Zone / Conduit
The first step is to identify the System Under Consideration (SUC), i.e. the scope of the assessment, to clearly define which systems are being reviewed. Zone and conduit diagrams provide the basis for grouping the assets into zones and for evaluating the security requirements for assets based on the network segmentation strategy.
Once a zone or conduit has been selected for review, the assets within the zone should be evaluated as cybersecurity nodes to consider the impact of a cybersecurity threat on each asset within the zone.
2.) System Screening
Screening is performed based on the degree of programmability, the use of removable media or portable computers, connectivity to other OT networks (e.g. L1 serial links, L3 Process Control Network (PCN) or L3.5 Process Control Access Domain (PCAD)) and connectivity to non-OT networks (e.g. L4 corporate, virtual private network (VPN) or internet connections).
Once the scope is clearly defined, a specific device is selected for further analysis, for example an engineering workstation, operator workstation, programmable logic controller, server or item of network equipment.
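The screening criteria in this step can be captured as a simple checklist. The sketch below is a minimal illustration: the class and function names are hypothetical, the degree of programmability is simplified to a yes/no answer, and the rule that a single "yes" carries a system forward to detailed analysis is an assumption rather than something the method prescribes.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ScreeningAnswers:
    """Yes/no answers to the four screening criteria for one system."""
    programmable: bool
    removable_media_or_portable_computers: bool
    connected_to_other_ot_networks: bool  # e.g. L1 serial links, L3 PCN, L3.5 PCAD
    connected_to_non_ot_networks: bool    # e.g. L4 corporate, VPN, internet


def triggered_criteria(answers: ScreeningAnswers) -> list[str]:
    """Return the names of the criteria answered 'yes', in declaration order."""
    return [f.name for f in fields(answers) if getattr(answers, f.name)]


def needs_detailed_assessment(answers: ScreeningAnswers) -> bool:
    # Assumption: any single triggered criterion is enough to screen the system in.
    return bool(triggered_criteria(answers))


# Hypothetical PLC: programmable and linked to another OT network.
plc = ScreeningAnswers(
    programmable=True,
    removable_media_or_portable_computers=False,
    connected_to_other_ot_networks=True,
    connected_to_non_ot_networks=False,
)
print(triggered_criteria(plc), needs_detailed_assessment(plc))
```

Recording which criteria were triggered, rather than only the yes/no outcome, keeps the screening decision traceable when the assessment is reviewed later.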
3.) Determine the Worst-Case Consequence
Evaluate what events could occur if system function is lost. For example, the system could be powered off for a period of five days, or it could be compromised so that a skilled, knowledgeable and motivated attacker has full administrative access to it.
The severity of the consequence is determined for each applicable type of criteria (e.g., safety, environmental, financial) without considering any countermeasures, i.e. assuming that all non-mechanical countermeasure devices (e.g., firewalls, anti-virus scanning) have failed. Below is an example of a consequence matrix:
4.) Determine and Record the Threat Vectors & Likelihood
Threat agents can exploit many threat vectors. These range from introducing malware via a USB port, adding an unauthorized wireless access point or attaching an unauthorized computer to a network, to connecting control networks with business or other networks.
This step documents the pathways by which a threat could reach the OT system, such as direct physical access to a system (including access to system consoles, network devices, cabling, removable media, portable computers and asset disposal) or remote access via hardwired network connections to other IT/OT systems (including copper/fiber network connections, serial links and VPNs).
Each threat vector is drawn on the network architecture diagram and its initiating event likelihood is recorded. The threat likelihood is defined considering only the threat source, the level of skill required and the pathway. At this stage the unmitigated system risk is being assessed, so risk reduction from countermeasures is not considered.
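One way to keep these records consistent is a small threat vector register. The sketch below is illustrative only: the field names, the two pathway labels ("physical" and "remote"), and the sample entries with their likelihood letters are all hypothetical, and a real study would use the rankings from the organization's own risk criteria.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreatVector:
    description: str
    pathway: str         # "physical" or "remote" (labels assumed for this sketch)
    threat_source: str   # e.g. "insider" or "outsider"
    skill_required: str  # e.g. "low", "medium", "high"
    likelihood: str      # unmitigated initiating event likelihood ranking


def group_by_pathway(vectors: list[ThreatVector]) -> dict[str, list[ThreatVector]]:
    """Group recorded threat vectors by the pathway used to reach the OT system."""
    grouped: dict[str, list[ThreatVector]] = defaultdict(list)
    for vector in vectors:
        grouped[vector.pathway].append(vector)
    return dict(grouped)


# Hypothetical register entries; the likelihood letters are placeholders.
register = [
    ThreatVector("Malware introduced via USB port", "physical", "insider", "low", "E"),
    ThreatVector("Unauthorized wireless access point", "physical", "insider", "medium", "D"),
    ThreatVector("VPN link from corporate network", "remote", "outsider", "high", "C"),
]

for pathway, vectors in group_by_pathway(register).items():
    print(pathway, [v.description for v in vectors])
```

No countermeasure field is included on purpose, since the likelihood recorded at this stage is the unmitigated one.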
5.) Identify and Record the Countermeasures & Determine the Security Target Level
The consequence severity and frequency or likelihood rankings based on the organization’s corporate risk criteria determine whether there is any risk tolerance gap and define the Security Level Target for the OT system.
NR in the table below means that a security level is not required or not defined for that zone or system. IEC 62443 defines five security levels, ranging from 0 to 4, with SL 0 identified as the minimum level of risk and SL 4 as the maximum or ‘most vulnerable’ level, which requires more significant countermeasures.
For each threat vector, the following risk details are documented and countermeasures are identified to decrease the likelihood of the attack being successful or reduce the severity of the consequence.
- Unmitigated risk combines the dominant risk category with the initiating event likelihood. From the risk matrix table, if the consequence rating of a system is 3 and the frequency is E, then the Target Security Level (SL-T) of that system is 2. In other words, the LOPA gap is 2, and 2 independent countermeasures need to be designed for the different threat vectors.
- Inherent Security Risk is the risk left after taking credit for countermeasures that are inherent in the design, such as the DCS, SIS or PSV, for enabling events such as time at risk, and for conditional modifiers such as ignition sources and occupancy. These countermeasures may apply only to certain risk categories (e.g., occupancy only reduces likelihood for the H&S risk category), which may change the dominant risk category to be used.
If a system is compromised but another independent system, such as the plant SIS, can prevent the hazardous consequence, then we can take 1 credit, so the Inherent Risk in the above example is reduced by 1 and the LOPA gap becomes 1.
- Tolerable Security Risk is the risk left after taking credit for preventive countermeasures added externally, such as firewalls, unidirectional gateways/data diodes, access control, system hardening and encryption, or for mitigative countermeasures such as intrusion detection/prevention, administrative procedures, cybersecurity alarms and a Security Operations Centre. As per table 5, we had an inherent security risk of 1; if we add a firewall to the network design and take this credit, the LOPA gap becomes 0.
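The arithmetic in the worked example above can be sketched as follows. The function name lopa_gap and the rule that each countermeasure contributes a whole number of credits are assumptions for illustration; the numbers (SL-T of 2, one credit for the plant SIS, one credit for a firewall) are the ones used in the text.

```python
def lopa_gap(sl_target: int, credits: dict[str, int]) -> int:
    """Return the gap left after subtracting countermeasure credits (never below 0)."""
    return max(sl_target - sum(credits.values()), 0)


SL_TARGET = 2  # from the example: consequence rating 3 and frequency E give SL-T 2

# Unmitigated: no credits taken.
unmitigated_gap = lopa_gap(SL_TARGET, {})
# Inherent: one credit for the independent plant SIS.
inherent_gap = lopa_gap(SL_TARGET, {"plant SIS": 1})
# Tolerable: a firewall added externally takes one further credit.
tolerable_gap = lopa_gap(SL_TARGET, {"plant SIS": 1, "firewall": 1})

print(unmitigated_gap, inherent_gap, tolerable_gap)  # 2 1 0
```

Keeping the credits in a named mapping makes it explicit which countermeasure closes which part of the gap, which helps when a credit is later challenged for independence.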
6.) Determine if the Tolerable Risk Criteria are Met
For each device, the current risk ranking is reviewed to determine whether it is tolerable. Devices determined to have minimal risk do not require further investigation, whereas devices with significant or severe risks require additional treatment to reduce risk further, such as redesigning the network architecture to eliminate threat vectors, reducing the initiating event likelihoods of existing threat vectors, or modifying the process, equipment or operational mode. The security level targets can be used to group assets into zones based on cybersecurity criticality and to support the determination of network segmentation.
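The tolerability check can be sketched as a simple triage over the device rankings. The three ranking labels follow the text (minimal, significant, severe); the function name and the device names in the example are hypothetical.

```python
# Whether each risk ranking requires additional treatment, per the text above.
TREATMENT_REQUIRED = {"minimal": False, "significant": True, "severe": True}


def devices_needing_treatment(rankings: dict[str, str]) -> list[str]:
    """Return the devices whose current risk ranking is not tolerable."""
    flagged = []
    for device, ranking in rankings.items():
        if ranking not in TREATMENT_REQUIRED:
            raise ValueError(f"unknown risk ranking for {device}: {ranking}")
        if TREATMENT_REQUIRED[ranking]:
            flagged.append(device)
    return flagged


# Hypothetical assessment results for three devices.
current_rankings = {
    "engineering workstation": "severe",
    "operator workstation": "minimal",
    "historian server": "significant",
}
print(devices_needing_treatment(current_rankings))
# ['engineering workstation', 'historian server']
```

Rejecting unknown ranking labels, rather than silently treating them as tolerable, avoids a device dropping out of the review because of a typing error.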
Conclusion
Accurately assessing cybersecurity risk poses a number of unique challenges. Adopting commonly used process safety techniques and calibrating them for cybersecurity studies can help to provide consistent risk assessment scenarios, and reduce the necessary development time for the cybersecurity risk identification and analysis method.
Strategies to mitigate cybersecurity risks will require continual assessment and the implementation of comprehensive standards like IEC 62443, which provide the necessary guidance.
Identifying potential cybersecurity hazards and estimating risk can be a difficult undertaking for OT systems. This is due to a number of factors, including misconceptions about cybersecurity as it relates to the process industry, limited industry databases on cybersecurity events, rapidly changing technology and a continually evolving threat landscape.
Useful Acronyms
DCS – Distributed Control System
H&S – Health & Safety
ICS – Industrial Control System
IEC – International Electrotechnical Commission
LOPA – Layers of Protection Analysis
OT – Operational Technology
PCAD – Process Control Access Domain
PCN – Process Control Network
PSV – Pressure Safety Valve
SIS – Safety Instrumented System
SL-T – Target Security Level
SUC – System Under Consideration
VPN – Virtual Private Network