The following blog is adapted from the book Industrial Cybersecurity Case Studies and Best Practices, authored by Steve Mustard. This is the fifth and final excerpt in a series. See Excerpt #1 here. See Excerpt #2 here. See Excerpt #3 here. See Excerpt #4 here.
NEW: See Steve Mustard's October 2022 appearance on NasdaqTV here.
Safeguarding Operational Support
One of the distinguishing features of operational technology (OT) is the operational life of the equipment. Information technology (IT) is refreshed every 18 months to 3 years to keep pace with the demands of users and their applications. Conversely, OT equipment is designed for a specific, limited set of functions. Once deployed, there is little desire to change it.
Safety is a major concern in industrial environments, yet cybersecurity, despite being a potential initiating cause of safety hazards, does not receive the same respect. Many organizations begin meetings or presentations with the refrain that safety is the number one concern. But in those same meetings, there may be comments to the effect that “We have more important priorities than cybersecurity.” Clearly, there is still much to do before cybersecurity receives the attention it requires in operational environments.
Some important operational considerations are:
- Monitoring the effectiveness of cybersecurity controls
- People management
- Inventory management
- Incident response
- Suppliers, vendors, and subcontractors
Monitoring the Effectiveness of Cybersecurity Controls
Barrier model analysis is widely used in process industries to help analyze and visualize the status of the layers of protection required to maintain a safe operation.
Organizations may use different means to visualize their layers of protection. One approach is to use bowtie diagrams. Another is to use the Swiss cheese model, shown in Figure 1. The methodology in either case is simple: identify a set of barriers to prevent and mitigate an incident or accident. A failure of one of the barriers may not be sufficient to cause an accident; however, should a series of failures occur across several barriers, there is the potential for an incident to occur. The bowtie or Swiss cheese model is used in organizations to answer the question “Are we still safe to operate?” by interrogating data related to each barrier.
Integrating cybersecurity into such a reporting tool helps to make cybersecurity a key factor. The barrier representation being reviewed now clearly shows the status of cybersecurity at the facility, so the question “Are we still safe to operate?” includes the status of cybersecurity.
Figure 1. Barrier representation of cybersecurity controls.
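The barrier review described above can be sketched in a few lines of Python. This is an illustrative sketch only: the barrier names, the three status levels, and the decision rule (any failed barrier, or more than one degraded barrier, means the facility is not considered safe to operate) are assumptions made for the example, not a rule taken from the book or from any standard.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    HEALTHY = 1
    DEGRADED = 2
    FAILED = 3


@dataclass(frozen=True)
class Barrier:
    name: str
    status: Status


def safe_to_operate(barriers, max_degraded=1):
    """Return (decision, reasons) for a list of barriers.

    Assumed policy: any FAILED barrier, or more than `max_degraded`
    DEGRADED barriers, means the answer is "no".
    """
    failed = [b.name for b in barriers if b.status is Status.FAILED]
    degraded = [b.name for b in barriers if b.status is Status.DEGRADED]
    reasons = []
    if failed:
        reasons.append("failed: " + ", ".join(failed))
    if len(degraded) > max_degraded:
        reasons.append("too many degraded: " + ", ".join(degraded))
    return (not reasons, reasons)


# Hypothetical barrier data; cybersecurity is reported alongside the
# traditional protective layers rather than in a separate report.
barriers = [
    Barrier("Process alarms", Status.HEALTHY),
    Barrier("Safety instrumented system", Status.HEALTHY),
    Barrier("Relief valves", Status.DEGRADED),
    Barrier("Cybersecurity controls", Status.DEGRADED),
]

ok, reasons = safe_to_operate(barriers)
print("Safe to operate:", ok)
for reason in reasons:
    print(" -", reason)
```

In this example a single degraded barrier would be tolerated, but the degraded cybersecurity controls combine with the degraded relief valves to tip the answer to "no", which is the Swiss cheese idea of weaknesses lining up across layers.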
People Management
Employers need a means of overseeing their employees so that they can quickly identify any issues that may lead to a cybersecurity incident, particularly those involving disgruntled individuals.
- Background checks – The depth of the background check should be appropriate to the role being filled. Background checks must be conducted in accordance with relevant employment laws. Some form of ongoing or continuous screening may be required, along with strict oversight.
- Separation of duties – This involves ensuring that more than one person is required to complete a particular task where safety or security might be at risk. This approach reduces the risk of fraud, theft, and human error. Typical separation of duties may involve separate electronic authorization for actions such as changing set points in a control system, or the use of multiple security keys (physical or electronic) held by separate personnel.
- Joiners, Movers, and Leavers – User roles should be sufficiently granular that no person has access to data or functions they do not need to do their job. Once someone is in a role, a periodic review process should confirm that access is still required. Changes should be made with immediate effect. Most importantly, an individual leaving an organization should trigger prompt action to remove all physical and electronic access.
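The separation of duties control above, where a set point change needs electronic authorization from a second person, might look like the following sketch. The class name, tag name, and user names are hypothetical, and a real control system would enforce this inside its own access control and audit mechanisms rather than in application code like this.

```python
from dataclasses import dataclass, field


@dataclass
class SetPointChange:
    """A requested set point change that must be approved by someone else."""
    tag: str
    new_value: float
    requested_by: str
    approvals: set = field(default_factory=set)

    def approve(self, user: str) -> None:
        # The requester can never count as their own second person.
        if user == self.requested_by:
            raise PermissionError("requester cannot approve their own change")
        self.approvals.add(user)

    def is_authorized(self, required_approvals: int = 1) -> bool:
        return len(self.approvals) >= required_approvals


# Hypothetical tag and users, for illustration only.
change = SetPointChange(tag="NAOH_DOSE_SP", new_value=110.0, requested_by="operator_a")

try:
    change.approve("operator_a")
except PermissionError as exc:
    print("rejected:", exc)

print("authorized before second approval:", change.is_authorized())
change.approve("supervisor_b")
print("authorized after second approval:", change.is_authorized())
```

Storing approvals as a set means that the same approver clicking twice still counts once, so raising `required_approvals` really does require additional people.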
Inventory Management
When a product vulnerability is announced, the first question to answer is: “Does this affect my organization, and if so, where, and how much?” It is impossible to answer this question without an accurate and up-to-date equipment inventory. An equipment inventory can be as simple as an Excel spreadsheet or can be a purpose-made relational database and application. IT and OT security vendors offer inventory management systems.
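As a rough illustration of how even a spreadsheet-style inventory answers that question, the sketch below filters a CSV export for devices affected by an advisory. The column names, vendors, models, and firmware versions are all invented for the example, and the version comparison assumes purely numeric dotted versions, which many real devices do not use.

```python
import csv
import io

# Hypothetical inventory export; a real one would be read from a file.
INVENTORY_CSV = """asset_id,vendor,model,firmware,location
PLC-001,ExampleCo,X100,2.1.0,Pump station A
PLC-002,ExampleCo,X100,2.3.1,Pump station B
HMI-001,OtherCo,ViewPanel,5.0.2,Control room
"""


def parse_version(text):
    """Turn '2.10.0' into (2, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def affected_assets(rows, vendor, model, fixed_in):
    """Return inventory rows for the given product running firmware older than `fixed_in`."""
    fixed = parse_version(fixed_in)
    return [
        row for row in rows
        if row["vendor"] == vendor
        and row["model"] == model
        and parse_version(row["firmware"]) < fixed
    ]


rows = list(csv.DictReader(io.StringIO(INVENTORY_CSV)))

# Hypothetical advisory: ExampleCo X100 is vulnerable before firmware 2.3.0.
for row in affected_assets(rows, "ExampleCo", "X100", fixed_in="2.3.0"):
    print(row["asset_id"], row["location"], row["firmware"])
```

The query is only as good as the data behind it: a device missing from the inventory, or recorded with a stale firmware version, silently drops out of the answer.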
Consider the following points when creating an OT device inventory:
- The range of device types is much larger than in IT environments and includes many firmware and software solutions that are not designed to interact with asset management solutions.
- Many devices that are networked may only respond to the most basic industrial protocol commands. Rarely do these commands support the return of configuration information.
- There is no guarantee that devices are accessible on a common communications network. Many installations will contain serially connected (RS-232, RS-485, RS-422) devices that only respond to basic industrial protocol commands.
- In more modern OT networks, there may be industrial firewalls or data diodes that isolate devices from the wider network. This design limits communications to very few industrial protocol commands.
Figure 2. Legacy Devices Can Be Hard To Identify in Inventory Systems.
Incident Response
Incident response planning is not just about preparing for the inevitable incident. Considering plausible scenarios facilitates a review of business risk and the identification of additional mitigations to reduce this risk.
Consider the Oldsmar example. In early February 2021, an operator at a water treatment plant in Oldsmar, Florida, noticed someone remotely accessing an HMI at the plant. Later that day the operator noticed a second remote access session on the HMI. This time, the remote user navigated through various screens and eventually modified the set point for sodium hydroxide (lye) to a level that would be toxic to humans. The remote user logged off, and the operator immediately reset the sodium hydroxide level to normal.
Although it resulted in a near miss, the Oldsmar incident highlighted gaps in process and people elements:
- The operator should have known who was initially accessing the HMI, and whether they were authorized to do so. Unauthorized access should have triggered an immediate incident response.
- The company that developed the SCADA system used in the facility exhibited poor information security behavior. It maintained a page on its website displaying a screen from the HMI, providing details of plant processes. It was easy to see the button that would enable navigation to the sodium hydroxide page. Such a screenshot is extremely valuable in terms of planning a potential attack. 1
- There appeared to be no assessment of the remote access requirement, or the cybersecurity risks associated with it. Was remote access necessary, or was it nice to have? If remote access was required for viewing process state, why was read-only access not enforced in the remote access scenario?
- The functionality of the SCADA system should have prevented a user from setting dangerous levels in any part of the treatment process. This should have been risk assessed and mitigated during the design stage. Addressing the remote access risk does not remove the risk of unauthorized physical access to the same system.
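Two of the gaps above, read-only remote access and limits on dangerous set points, can be illustrated with a short sketch. The tag name and limit values are placeholders invented for the example; they are not the Oldsmar plant's actual values, and real limits would come from the process hazard analysis carried out at the design stage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SetPointLimits:
    low: float
    high: float


# Placeholder engineering limits, not real process values.
LIMITS = {"NAOH_PPM_SP": SetPointLimits(low=0.0, high=150.0)}


class SetPointRejected(Exception):
    """Raised when a requested set point change is refused."""


def apply_set_point(current, tag, value, *, remote_session):
    """Return a new state dict with the change applied, or raise SetPointRejected."""
    if remote_session:
        # Assumed policy: remote sessions may view process state but not change it.
        raise SetPointRejected("remote sessions are read-only")
    limits = LIMITS.get(tag)
    if limits is None:
        raise SetPointRejected(f"no limits defined for {tag}")
    if not (limits.low <= value <= limits.high):
        raise SetPointRejected(
            f"{value} is outside the allowed range {limits.low}-{limits.high} for {tag}"
        )
    updated = dict(current)
    updated[tag] = value
    return updated


state = {"NAOH_PPM_SP": 100.0}
attempts = [(120.0, False), (11000.0, False), (120.0, True)]
for value, remote in attempts:
    try:
        state = apply_set_point(state, "NAOH_PPM_SP", value, remote_session=remote)
        print("accepted:", value)
    except SetPointRejected as exc:
        print("rejected:", exc)
```

The range check runs for local sessions as well, which reflects the point that addressing the remote access risk alone does not remove the risk of unauthorized physical access to the same system.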
Suppliers, Vendors, and Subcontractors
In many cases, personnel from third-party organizations are in place so long that they become indistinguishable from asset-owner personnel. Few asset owners properly manage the cybersecurity risks arising from these arrangements:
- Third-party computers may not have the necessary security controls, yet they may be connected to business-critical systems or networks.
- Vendors may not have sufficient controls in place to manage user credentials for their clients’ systems.
- Vendors may not have procedures in place to manage system backups.
- Suppliers, vendors, and subcontractors may not have adequate security management systems in place in their organization.
- Suppliers, vendors, and subcontractors may not provide adequate security awareness training to their personnel.
A key step to establishing control is contract management. Contracts should be tailored to specific arrangements. Contract clauses should reflect the controls required to manage cybersecurity risks.
Insurance
There is an established cyber insurance market focused on IT cybersecurity risks, and insurers and brokers are now developing policies to cover threats to OT infrastructure. Insurers and brokers are still learning what risks an asset owner is exposed to from an OT cybersecurity incident. Tom Finan of Willis Towers Watson, a global insurance broking company, points out that “having a cyber insurance policy does not make a company safer. Instead, an enhanced cybersecurity posture results from going through the cyber insurance application and underwriting process.” 2
Conclusion
Although OT environments have a different operational support culture from IT environments, several factors can give OT cybersecurity the management attention it requires.
- The safety culture that is ingrained in all OT environments can incorporate cybersecurity, treating it as another initiating cause of high-impact incidents that can occur.
- The use of management monitoring tools, such as the barrier representation, can ensure that cybersecurity is considered at the same level as other protective layers.
Technology is not the only element of the cybersecurity challenge. People and process are critical weak points. Much of what happens in operational environments revolves around people. Cybersecurity relies on training and awareness, and the adherence to strict processes and procedures. Gaps in training and awareness or in processes and procedures create vulnerabilities that can be as severe as any technical issue.
An incident response plan is one of the most important plans to have in place. With the growth in high-profile cybersecurity incidents and the knowledge of the costs of dealing with them, it is harder for organizations to ignore the need for good preparation.
There is still work to be done to educate asset owners that good incident response planning does not begin and end in their own organization. The use of suppliers, vendors, and subcontractors means that cybersecurity risks, and their remediation, rely on the cooperation of all parties.
One key control that asset owners can use is contract management. A set of model clauses that represent good cybersecurity management should be included in all third-party contracts.
Although insurance can be a useful tool for an asset owner, it cannot replace effective identification and proactive management of risk.
As with all other aspects of cybersecurity management, there is still much to do in operational support, but the elements are in place to improve the cybersecurity posture of all organizations.