As cybersecurity for industrial automation continues to evolve, it becomes increasingly important to fundamentally understand, evaluate, and manage cybersecurity risks. Recent attacks such as the one on the Oldsmar Water Treatment Facility further emphasize the need for cybersecurity risk management and demonstrate how cyber incidents have the potential to cause not just financial, but also significant safety and environmental consequences.
The objective of effective cybersecurity management should be to maintain the industrial automation system in a manner consistent with corporate risk criteria. In many organizations, ownership of industrial automation cybersecurity falls to controls engineers or people in similar positions who may have limited time to focus on security, making it essential that cybersecurity risk is managed in a manner that is both time-efficient and effective.
The first step in managing risk is to understand the current level of risk within a system. The process for conducting a cybersecurity risk assessment as outlined in the ISA/IEC 62443-3-2 standard is split into two parts:
- Initial Risk Assessment
- Detailed Risk Assessment
Initial Risk Assessment
The Initial Risk Assessment (previously referred to as the High-Level Cybersecurity Risk Assessment) is the starting point for risk analysis activities. Its purpose is to define the scope of future assessments, establish the zone and conduit diagram, establish initial security level targets for devices, and identify high-risk areas for further analysis.
The steps for completing these objectives for a major process area are detailed in the workflow below.
[Figure: Initial Risk Assessment workflow]
The fundamental method behind the Initial Risk Assessment is that it assumes a threat likelihood of one and focuses on evaluating the worst-case scenario if a cyber asset is compromised. This allows for a relatively quick method to determine the highest areas of risk within an automation system. This method provides easy progression from defining device security level targets to establishing an effective network segmentation strategy by grouping devices with like security requirements into zones and separating zones with boundary devices such as firewalls or data diodes. Combining the results of the Initial Risk Assessment with the operability requirements of the automation system leads to a network architecture that supports both efficient and secure communication between devices.
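The ranking and grouping idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1 to 5 consequence scale, the thresholds that map a consequence rank to a security level target, and the asset names are all hypothetical and are not taken from ISA/IEC 62443-3-2.

```python
from collections import defaultdict
from dataclasses import dataclass

# The Initial Risk Assessment assumes a threat likelihood of one.
ASSUMED_LIKELIHOOD = 1


@dataclass(frozen=True)
class Asset:
    name: str
    worst_case_consequence: int  # hypothetical scale: 1 (minor) to 5 (catastrophic)


def initial_risk(asset: Asset) -> int:
    """With likelihood fixed at one, risk reduces to the worst-case consequence."""
    return ASSUMED_LIKELIHOOD * asset.worst_case_consequence


def security_level_target(consequence: int) -> int:
    """Hypothetical thresholds; a real project would use its corporate risk criteria."""
    if consequence >= 4:
        return 3
    if consequence == 3:
        return 2
    return 1


def group_into_zones(assets: list[Asset]) -> dict[int, list[str]]:
    """Group assets with the same security level target into candidate zones."""
    zones: dict[int, list[str]] = defaultdict(list)
    for asset in assets:
        zones[security_level_target(asset.worst_case_consequence)].append(asset.name)
    return dict(zones)


# Hypothetical asset inventory for one process area.
assets = [
    Asset("safety_plc", 5),
    Asset("bpcs_controller", 4),
    Asset("hmi", 3),
    Asset("historian", 2),
]

ranked = sorted(assets, key=initial_risk, reverse=True)
zones = group_into_zones(assets)

for level, names in sorted(zones.items(), reverse=True):
    print(f"Security level target {level}: {names}")
```

Each resulting group is a candidate zone, and the boundaries between groups are where boundary devices such as firewalls or data diodes would be considered.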
Although establishing effective network segmentation as described above is easier for new projects, the results of the Initial Risk Assessment still provide benefits to existing facilities by providing an understanding of the highest risk cyber assets in the automation system. This narrows the focus of the Detailed Risk Assessment to the areas that most need it, leading to a reduction in the overall cost and time required for cybersecurity risk assessment activities.
Detailed Risk Assessment
The Detailed Cybersecurity Risk Assessment is the second risk analysis performed for cybersecurity. Its purpose is to gain a definite understanding of the current level of risk within a facility considering potential threat vectors and existing/planned countermeasures, ensure that corporate risk criteria are met, and provide detailed cybersecurity requirements for each zone.
The steps for completing a Detailed Risk Assessment for a major process area are detailed in the following workflow.
[Figure: Detailed Risk Assessment workflow]
The starting point for the Detailed Risk Assessment is the output of the Initial Risk Assessment. In addition to the Initial Risk Assessment results, the full process hazard analysis (PHA) hazards and corporate risk criteria should be available in case further questions regarding consequence ranking arise for the site.
The other input for the Detailed Risk Assessment is the vulnerability analysis. This can be done either as part of the Detailed Risk Assessment or before it begins.
The vulnerability analysis reviews the existing network, connected devices, configurations, software versions, and additional factors to identify what vulnerabilities are currently present within a facility and could be targeted by attackers. This provides an important input to the Detailed Risk Assessment when considering the entry points into the system and evaluating how likely a successful attack is—and how easily an attacker can move between devices in the control network.
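One way to make "how easily an attacker can move between devices" concrete is to count the hops from an entry point to each device. The sketch below uses a breadth-first search over a hypothetical connectivity map; the device names and links are invented for illustration, and a real analysis would build the map from the actual network drawings and firewall rules.

```python
from collections import deque

# Hypothetical connectivity: each key can initiate communication with the listed devices.
network = {
    "corporate_lan": ["dmz_historian"],
    "dmz_historian": ["engineering_workstation"],
    "engineering_workstation": ["bpcs_controller", "safety_plc"],
    "bpcs_controller": [],
    "safety_plc": [],
}


def hops_from(entry_point: str, links: dict[str, list[str]]) -> dict[str, int]:
    """Breadth-first search: minimum number of hops from the entry point to each reachable device."""
    hops = {entry_point: 0}
    queue = deque([entry_point])
    while queue:
        node = queue.popleft()
        for neighbor in links.get(node, []):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


hops = hops_from("corporate_lan", network)
for device, distance in hops.items():
    print(f"{device}: {distance} hop(s) from corporate_lan")
```

Devices that are only a hop or two from an exposed entry point are natural candidates for closer scrutiny during the Detailed Risk Assessment.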
The first step in the completion of the Detailed Risk Assessment is documenting the potential threat vectors that would provide attackers entry into the system. Depending on the approach taken, this can be a daunting task. I have seen evaluations where asset owners were given a list of more than 300 threat vectors to review and identify which could provide entry into the system.
Although this approach attempts to be comprehensive by considering detailed threat vectors, it fails to be effective for three fundamental reasons. First, breaking threat vectors down into so many parts greatly increases the time required to complete the assessment, because every fine-grained item must be considered even when the threat vector does not apply to the system under consideration. Second, the level of detail overwhelms plant personnel, who are not familiar with the ins and outs of cybersecurity analysis and have now been given hundreds of new terms that they do not understand. Third, it does not result in a more complete analysis of the system, because the plant personnel with the knowledge required to evaluate the system can’t speak to the same level of granularity as the selected threat vectors.
Instead of overcomplicating the first portion of the risk assessment with hundreds of individual threat vectors, it is helpful to look at manageable categories of attacks. This method provides a complete look at the ways attackers could enter the system while remaining understandable to the plant personnel involved in the risk assessment. The Common Attack Pattern Enumeration and Classification (CAPEC™) database provides common areas of attack that can greatly assist with this process.
If you are thinking that nothing about the “Common Attack Pattern Enumeration and Classification” database sounds less confusing, don’t worry. It provides six areas of attack that can be understood by anyone, regardless of their level of cybersecurity experience:
- Social Engineering: getting into the system by manipulating or exploiting people
- Supply Chain: altering the system during production of components, storage, or delivery
- Communications: blocking, manipulating, or stealing communications
- Physical Security: getting into the system by overcoming weak physical security measures
- Software: getting into the system via vulnerabilities in software applications
- Hardware: getting into the system by manipulating the physical hardware of network devices
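The six areas above can serve as a simple checklist during the workshop. The following sketch files each candidate threat vector under one area and reports any area that has not been discussed yet. The threat vector descriptions are hypothetical examples, not entries from the CAPEC database.

```python
from enum import Enum


class AttackDomain(Enum):
    SOCIAL_ENGINEERING = "Social Engineering"
    SUPPLY_CHAIN = "Supply Chain"
    COMMUNICATIONS = "Communications"
    PHYSICAL_SECURITY = "Physical Security"
    SOFTWARE = "Software"
    HARDWARE = "Hardware"


# Hypothetical threat vectors raised during an assessment workshop.
threat_vectors = [
    ("Phishing email sent to a controls engineer", AttackDomain.SOCIAL_ENGINEERING),
    ("Tampered firmware loaded before delivery", AttackDomain.SUPPLY_CHAIN),
    ("Unencrypted traffic intercepted on the control network", AttackDomain.COMMUNICATIONS),
    ("Unlocked network cabinet on the plant floor", AttackDomain.PHYSICAL_SECURITY),
    ("Unpatched remote-access software on the HMI", AttackDomain.SOFTWARE),
]


def vectors_by_domain(vectors):
    """Group threat vector descriptions under each of the six areas of attack."""
    grouped = {domain: [] for domain in AttackDomain}
    for description, domain in vectors:
        grouped[domain].append(description)
    return grouped


grouped = vectors_by_domain(threat_vectors)

# Areas with no entries yet are a prompt for further discussion, not proof of safety.
uncovered = [domain.value for domain, items in grouped.items() if not items]
print("Areas not yet discussed:", uncovered)
```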
By starting with broad categories and then moving to the level of detail necessary to evaluate the threats, a Detailed Risk Assessment can be both more efficient and more complete because the personnel with the critical knowledge for the control system will be able to actively contribute to the discussion and provide their valuable knowledge about the system. The level of granularity required is a key difference between the high-level (Initial) and Detailed Risk Assessments.
Another difference from the Initial Risk Assessment, where the likelihood was assumed to be one, is that the likelihood of a threat must be considered. When determining the likelihood of an attack, after considering the area of attack, it is typically helpful to start by asking key questions about the threat agent:
- What threat agents could execute this attack?
- Internal or external?
- Skilled or unskilled?
- Are nation-state-level resources required?
The above questions can be helpful for understanding how likely the attack would be. It is also important to understand the differences between likelihood from a functional safety perspective and cybersecurity perspective.
A control engineer must consider a loss of containment event that has a tolerable frequency of 10⁻⁴ per year (once in 10,000 years), whereas IT personnel must consider the hundreds of thousands of attempted cybersecurity intrusions each year. Because current, well-maintained cybersecurity incident repositories are lacking, it is difficult to estimate the likelihood of cybersecurity events with the same level of confidence as the causes in a safety risk assessment. As a result, the security community is somewhat split on the best approach for determining likelihood in the Detailed Risk Assessment.
Some experts believe that, because the likelihood cannot be accurately determined, it should be set to one, and only consequence severity should be used to prioritize between risks. The other approach is to make conservative estimates that consider the level of skill and access required to execute the attack. There is no single simple answer, but when adopting either approach, it is important to maintain focus on the objective of cybersecurity risk assessment: providing an accurate picture of relative cybersecurity risk so that resources can be focused where they are most effective.
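The practical difference between the two approaches is that they can rank the same threats differently. The sketch below compares them using a hypothetical 1 to 3 likelihood scale in which attacks that need little skill or can be launched externally are treated as more likely. The scale, the scoring rule, and the example threats are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    name: str
    consequence: int      # hypothetical scale: 1 (minor) to 5 (catastrophic)
    skill_required: str   # "low" or "high"
    access_required: str  # "external" or "internal"


def conservative_likelihood(threat: Threat) -> int:
    """Hypothetical 1 to 3 estimate: less skill and external access widen the pool of threat agents."""
    score = 1
    if threat.skill_required == "low":
        score += 1
    if threat.access_required == "external":
        score += 1
    return score


def risk_likelihood_one(threat: Threat) -> int:
    """Approach 1: likelihood fixed at one, so consequence alone drives the ranking."""
    return 1 * threat.consequence


def risk_conservative(threat: Threat) -> int:
    """Approach 2: conservative likelihood estimate multiplied by consequence."""
    return conservative_likelihood(threat) * threat.consequence


# Hypothetical threats for comparison.
threats = [
    Threat("Logic solver reprogramming", 5, "high", "internal"),
    Threat("HMI ransomware", 3, "low", "external"),
]

by_consequence = sorted(threats, key=risk_likelihood_one, reverse=True)
by_estimate = sorted(threats, key=risk_conservative, reverse=True)

print("Likelihood fixed at one:", [t.name for t in by_consequence])
print("Conservative estimate:  ", [t.name for t in by_estimate])
```

With these made-up numbers, the first approach puts the logic solver attack at the top, while the second puts the ransomware scenario first. Neither result is more correct in itself; the point is to be aware of which question the chosen approach answers.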
In many cases, the consequences identified in the Initial Risk Assessment can be directly applied to the Detailed Risk Assessment, but they should be reviewed to ensure that they are accurate and that no other consequences could potentially result in a higher risk.
After identifying threat-consequence pairs for a system, the next step is to identify what countermeasures are in place to prevent a successful attack. A countermeasure is any protection that reduces the likelihood of a successful attack.
Countermeasures can reduce the likelihood of a successful attack in three ways:
- Reducing the potential for an attacker to enter the system (e.g., by using properly configured firewalls/managed switches, devices with better security capabilities/features, or the least privilege method for assigning access accounts)
- Increasing the likelihood that an attack would be identified and stopped before its final objective (e.g., by reviewing firewall logs for unusual access patterns, implementing intrusion detection systems, and verifying code signatures before downloading to the logic solver)
- Having measures to stop or mitigate the end objective of an attack, or means of safety risk reduction not susceptible to attack (e.g., hard-coded endpoints in logic solver configurations, pressure relief valves, and pneumatic control loops)
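A simple way to picture the effect of countermeasures is to credit each one against the unmitigated likelihood and compare the residual risk with the corporate tolerable risk. The sketch below assumes a hypothetical qualitative scale in which each countermeasure lowers the likelihood rank by a fixed credit. The scale, the credits, and the tolerable risk threshold are illustrative assumptions, not values from any standard.

```python
def mitigated_likelihood(unmitigated: int, credits: dict[str, int]) -> int:
    """Lower the likelihood rank by the total credit taken, never below the minimum rank of 1."""
    return max(1, unmitigated - sum(credits.values()))


def residual_risk(unmitigated: int, consequence: int, credits: dict[str, int]) -> int:
    """Residual risk = mitigated likelihood rank x consequence rank."""
    return mitigated_likelihood(unmitigated, credits) * consequence


# Hypothetical threat-consequence pair and countermeasure credits.
unmitigated_likelihood = 4  # hypothetical scale: 1 (remote) to 5 (frequent)
consequence = 4             # hypothetical scale: 1 (minor) to 5 (catastrophic)
countermeasures = {
    "properly configured firewall": 1,
    "intrusion detection system": 1,
}
TOLERABLE_RISK = 8          # hypothetical corporate risk criterion

before = unmitigated_likelihood * consequence
after = residual_risk(unmitigated_likelihood, consequence, countermeasures)
meets_criteria = after <= TOLERABLE_RISK

print(f"Risk before countermeasures: {before}")
print(f"Risk after countermeasures:  {after}")
print(f"Meets corporate risk criteria: {meets_criteria}")
```

If the residual risk still exceeds the tolerable level, the gap indicates where additional countermeasures or a higher security level target may be needed.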
Once the Detailed Risk Assessment has been completed, the zone and conduit diagrams and security level targets from the Initial Risk Assessment should be finalized. The Initial Risk Assessment serves to provide a quick understanding of high-risk areas, and the Detailed Risk Assessment provides a robust understanding of what the threats and countermeasures in those high-risk areas are. The results of the risk assessments provide the key inputs for defining security requirements and the subsequent design phase of the IACS, including the security level verification. They also promote the effective flow of information between lifecycle steps. By understanding the level of risk for a facility, informed decisions addressing cybersecurity concerns can be made to promote safe and secure operation.
Common Attack Pattern Enumeration and Classification: A Community Resource for Identifying and Understanding Attacks, MITRE Corporation, 2018.