To secure critical infrastructure against attackers, the recommended approach is to think like them. Vulnerable Operational Technology (OT) assets are low-hanging fruit for bad actors. When patches are released to the public, the corresponding vulnerabilities are often disclosed through the National Vulnerability Database (NVD). If attackers do break into the OT network, would they spend weeks or months trying to find a zero-day vulnerability? Or would they perform reconnaissance of the OT network to identify publicly announced vulnerabilities and exploit them? If OT asset owners don’t remediate OT vulnerabilities, attackers will take advantage of them; it’s a matter of who finds them first.
However, unlike in IT, not every missing patch can be installed on OT assets. There are several reasons, including:
- ICS or Operating System (OS) patches cannot be installed because the hardware is not compatible.
- The ICS vendor did not approve the OS patches.
- The asset owner did not approve the patches, e.g., because a patch crashed the OT asset during testing.
In such scenarios, the patches either cannot be installed at all or must wait until the next shutdown, so the vulnerabilities remain until then or indefinitely. If a patch cannot be deployed, other controls need to be applied to reduce the risk to an acceptable level. Such controls include but are not limited to:
- Disconnect the OT asset from the business LAN or DMZ
- Restrict users’ permissions on the OT asset
- Use application whitelisting or application baselining so that only the required services run and all others are blocked
Moreover, patching in the OT environment is expensive. If the risk can be reduced to an acceptable level by applying alternative controls, meaning the attackers can be prevented from reaching the vulnerable assets, then the cost and effort of patching should be compared with the cost and effort of applying those controls to decide which approach is best. It is not feasible to patch all OT assets; thus, it is recommended to patch smartly.
Top 7 OT Patch Management Best Practices
OT asset owners have to work with a highly diverse set of systems, and the job becomes more difficult when Industrial Control Systems (ICS) such as DCS, SIS, and PLC from multiple vendors are installed in the same OT environment. That is why an effective patch management approach is important for identifying vulnerabilities and reducing the risk to an acceptable level before attackers find them. The following sections discuss seven best practices for a smooth patch management process.
1. Maintain a comprehensive and evergreen inventory
A comprehensive inventory of all software, firmware, and hardware within the OT environment, covering every asset from the industrial demilitarized zone (level 3.5) down to the cell/area zone (levels 2 to 0) in the ISA/IEC-62443 Purdue model, is a critical piece of any OT patch management process. Once there is a clear picture of what is present, it is easier to compare known vulnerabilities against the inventory and quickly discover which patches matter to the OT environment.
To maintain a comprehensive inventory, it is important to create a database of all assets and keep it organized by device type, operating system and version, and software application and version for all the computers, network devices, and industrial control systems present in the organization. A device or software package that is not visible in the inventory management system is a risk, because its vulnerabilities will be missed.
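As a minimal sketch of such a database, assuming a simple in-memory structure, the inventory can be modelled as a list of asset records that can be grouped by any attribute. The field names (`asset_id`, `purdue_level`, and so on), the product names, and the sample entries are hypothetical and are not taken from any particular inventory tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """One inventory record; all field names are illustrative assumptions."""
    asset_id: str
    device_type: str       # e.g. "historian", "engineering workstation"
    os_name: str
    os_version: str
    software: tuple        # (name, version) pairs installed on the asset
    purdue_level: float    # 3.5 for the industrial DMZ, 0 to 2 for cell/area


def group_by(assets, key):
    """Organize asset IDs by any attribute, e.g. device type or OS."""
    groups = defaultdict(list)
    for asset in assets:
        groups[getattr(asset, key)].append(asset.asset_id)
    return dict(groups)


def software_versions(assets):
    """Map each software name to the set of versions found across assets."""
    versions = defaultdict(set)
    for asset in assets:
        for name, version in asset.software:
            versions[name].add(version)
    return dict(versions)


# Hypothetical sample data
inventory = [
    Asset("ENG-01", "engineering workstation", "Windows", "10",
          (("SafetySuite", "4.9"),), 2),
    Asset("ENG-02", "engineering workstation", "Windows", "7",
          (("SafetySuite", "4.2"),), 2),
    Asset("HIST-01", "historian", "Windows Server", "2016",
          (("HistorianX", "8.1"),), 3.5),
]

by_type = group_by(inventory, "device_type")
versions = software_versions(inventory)
```

Here `by_type` lists asset IDs per device type, and `versions` shows which versions of each package are in use. A package that maps to more than one version is worth a closer look.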
Many plants within the same organization end up using multiple versions of the same control system software. Multiple versions complicate patching because they make it harder to determine whether a vulnerable asset or software version is present in the organization. For example, ICS Advisory ICSA-20-205-01 was released with a CVSS v3 score of 10.0, the highest possible score, and the vulnerable asset was a safety system. Many OT organizations had difficulty determining whether this vulnerability existed in any of their plants, both because they lacked an in-depth OT inventory (safety system software versions, operating system details, application versions, communication module details, etc.) and because different plants ran different versions. Standardizing on a single, current version of the IACS software across all plants makes the patching process smoother.
All the OT applications across all endpoints need to be monitored. If an OT application is rarely or never used, it is recommended to decommission it; there is no need to patch what is not present in the OT environment. A typical OT environment contains many highly vulnerable software packages, such as Adobe Flash Player and document readers, that have no reason to be there. Uninstalling Flash, for example, leaves much less to patch. This system hardening approach is known as “baselining inventory”. An accurate baseline inventory makes patch discovery and management easier.
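One way to find decommissioning candidates is to compare what is installed on each asset with what that asset's baseline actually requires. The sketch below assumes both lists are already available as plain dictionaries; the asset IDs and the HMI and engineering package names are hypothetical.

```python
def removal_candidates(installed_by_asset, required_by_asset):
    """List software installed on each asset that its baseline does not require."""
    candidates = {}
    for asset_id, installed in installed_by_asset.items():
        # An asset with no baseline entry is treated as requiring nothing
        extra = set(installed) - set(required_by_asset.get(asset_id, ()))
        if extra:
            candidates[asset_id] = sorted(extra)
    return candidates


# Hypothetical installed software and baselines
installed = {
    "OPS-01": ["HMI Client", "Adobe Flash Player", "PDF Reader"],
    "ENG-01": ["Engineering Suite"],
}
required = {
    "OPS-01": ["HMI Client"],
    "ENG-01": ["Engineering Suite"],
}

to_review = removal_candidates(installed, required)
```

The result only flags candidates. Whether a package can really be removed is still a decision for the asset owner.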
Having a consolidated Industrial Control System (ICS) software version is important; however, it is more important to have an evergreen inventory. If the consolidated inventory is not updated regularly, it does no good for IACS patch management. Moreover, because manual inventory tracking is prone to errors, it is better to use an automated tool such as PAS Cyber Integrity™ to keep the inventory current.
2. Assign criticality to OT assets
To assign criticality to OT assets, a system for assigning criticality scores needs to be established. One may already exist because of safety system regulation. Criticality needs to be assigned based on business impact, i.e., the impact that a loss of accessibility, reliability, integrity, etc. has on safety, profitability, and so on. For example, the criticality of a workstation used to configure and monitor a safety system should not be based on the cost of that workstation. The workstation itself may cost less than $1,000; however, being unable to access, monitor, or make changes to the safety system can lead to a catastrophic incident costing millions of dollars. The easiest way to assign criticality by business impact is to determine the Recovery Time Objective (RTO: how long the asset can be down without impacting the business) and the Recovery Point Objective (RPO: how much data, measured in time, can be lost without impacting the business). Lower RTO and RPO values indicate more critical assets.
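The RTO/RPO rule can be sketched as a small rating function. The hour thresholds and labels below are assumptions chosen for illustration; each organization would set its own.

```python
from dataclasses import dataclass


@dataclass
class RecoveryTargets:
    asset_id: str
    rto_hours: float   # how long the asset can be down
    rpo_hours: float   # how much data, in hours, can be lost


def criticality(targets, thresholds=((1, "critical"), (8, "high"), (24, "medium"))):
    """Rate an asset by its tighter recovery target (hypothetical thresholds).

    The lower of RTO and RPO drives the rating, since lower targets
    indicate a more critical asset.
    """
    tightest = min(targets.rto_hours, targets.rpo_hours)
    for limit_hours, label in thresholds:
        if tightest <= limit_hours:
            return label
    return "low"


# Hypothetical assets: a safety system workstation and a training PC
sis_workstation = RecoveryTargets("SIS-ENG-01", rto_hours=0.5, rpo_hours=4)
training_pc = RecoveryTargets("TRAIN-01", rto_hours=72, rpo_hours=168)

ratings = {t.asset_id: criticality(t) for t in (sis_workstation, training_pc)}
```

The purchase price of the asset does not appear anywhere in the rating. Only the tolerance for downtime and data loss does.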
Moreover, an incident can adversely affect brand image and client relationships, and these should also be considered when calculating business impact. It is sometimes difficult to assign a dollar value to such impacts; in those cases, a qualitative risk assessment is recommended.
Maintenance and replacement costs are also part of the criticality assessment. Unlike IT assets such as laptops and network devices, ICS assets such as DCS and SIS are expensive and time-consuming to maintain and replace because they are tied to safety and business continuity. For example, replacing a DCS engineering server requires synchronization with the DCS controllers. Failing to synchronize the critical configuration files may result in a plant upset. It goes without saying that no changes can be made in the control system while the DCS engineering server is being replaced.
3. Seek new patches and OT vulnerabilities
OT asset owners should actively look for new patches and vulnerabilities. The organization can sign up for patch announcements from the respective ICS vendors, although notifications from multiple vendors can be difficult to manage. Signing up for US-CERT/ICS-CERT vulnerability announcements provides immediate notification of newly announced vulnerabilities.
NVD publishes more than 350 vulnerabilities a week, and it is difficult to determine which of them, and which patches, apply to the OT organization. An automated tool such as the PAS Vulnerability Management asset model, which is updated daily with the vulnerabilities announced by NVD, helps by comparing the in-depth inventory with the announced vulnerabilities to identify the applicable ones and their patches. Identifying vulnerabilities passively is a huge advantage for OT asset owners; it can be achieved with tools that compare the current OT inventory against NIST’s National Vulnerability Database and ICS-CERT advisories to determine which assets are affected and whether a patch is available.
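The core of such a comparison can be sketched in a few lines. This is an illustration only: the record layout, the product names, and the advisory ID are hypothetical, and real feeds describe affected products in more detail than a single "fixed in" version.

```python
def parse_version(text):
    """Turn a dotted version string such as '4.9.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def affected_assets(inventory, advisory):
    """Return IDs of assets running the advisory's product at an affected version.

    An asset counts as affected when its installed version is lower than
    the first fixed version named in the advisory.
    """
    fixed = parse_version(advisory["fixed_in"])
    hits = []
    for asset in inventory:
        installed = asset["software"].get(advisory["product"])
        if installed is not None and parse_version(installed) < fixed:
            hits.append(asset["asset_id"])
    return hits


# Hypothetical inventory and advisory records
inventory = [
    {"asset_id": "ENG-01", "software": {"SafetySuite": "4.9.0"}},
    {"asset_id": "ENG-02", "software": {"SafetySuite": "4.12.1"}},
    {"asset_id": "HIST-01", "software": {"HistorianX": "8.1"}},
]
advisory = {"id": "EXAMPLE-0001", "product": "SafetySuite", "fixed_in": "4.10.0"}

matches = affected_assets(inventory, advisory)
```

Versions are compared as integer tuples rather than strings, because a plain string comparison would rank "4.9.0" above "4.10.0". The match is only as good as the inventory: an asset with no recorded software version cannot be matched at all.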
After a vulnerability is identified, check whether it can be mitigated more smartly. For example, if a Google Chrome vulnerability is found on 10 OT servers, the question to ask before applying the patch is: is Google Chrome required on all 10 servers? Uninstalling it from the servers that do not need it not only eliminates the vulnerability there but also saves time on future patching.
4. Prioritize deployment of patches
It is not possible to deploy all patches to all OT assets at the same time, nor is it practical to patch assets one by one. Instead, patch deployment should be prioritized in a way that suits the OT environment. The CVSS score, which is calculated from multiple factors such as access vector, access complexity, authentication, integrity, and availability, is a good starting point for deciding what to patch. However, if there is a critical vulnerability (CVSS score of 9.0 or higher) on a low-criticality training asset and a high vulnerability (CVSS score between 7.0 and 8.9) on a critical safety system that monitors poisonous gas, it is recommended to patch the safety system first even though its CVSS score is lower, because exploitation of that vulnerability would cost the organization more: it has a direct impact on people, the environment, and the plant.
However, some safety systems are installed for non-critical processes, such as monitoring the pressure of a water supply pipeline. In other words, not all safety systems monitor poisonous gas, and thus not all safety systems are assigned the same criticality. The safety systems that are critical to the plant need to be prioritized for patching.
Effective and efficient patch management is risk-based. If losing either of two assets would have the same impact on the business, in other words, if the two assets have the same criticality, then the probability of the vulnerability being exploited can be used to prioritize patch deployment. For example, if one critical server sits in the DMZ (Level 3.5) facing the external network, and another critical server sits in Level 2, disconnected from the business LAN, air-gapped/islanded, and physically secured, then the DMZ server should be patched first. This is because the probability of reaching the Level 2 server and exploiting its vulnerability is lower than the probability of reaching and exploiting the DMZ server.
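The ordering described in this section can be sketched as a sort with a composite key: asset criticality first, then how reachable the asset is, then the CVSS score. The ranking tables and sample findings are assumptions for illustration, and using CVSS as the final tie-breaker is one possible choice rather than a rule stated above.

```python
# Lower rank means patch sooner
CRITICALITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}
# Hypothetical exposure ranking: lower means easier for an attacker to reach
EXPOSURE_RANK = {"dmz": 0, "connected": 1, "air-gapped": 2}


def patch_order(findings):
    """Sort findings by asset criticality, then exposure, then CVSS score."""
    return sorted(
        findings,
        key=lambda f: (
            CRITICALITY_RANK[f["criticality"]],
            EXPOSURE_RANK[f["exposure"]],
            -f["cvss"],  # higher CVSS first among otherwise equal findings
        ),
    )


# Hypothetical findings
findings = [
    {"asset": "TRAIN-01", "criticality": "low", "exposure": "connected", "cvss": 9.8},
    {"asset": "SIS-GAS-01", "criticality": "critical", "exposure": "air-gapped", "cvss": 7.5},
    {"asset": "DMZ-SRV-01", "criticality": "critical", "exposure": "dmz", "cvss": 7.5},
]

order = [f["asset"] for f in patch_order(findings)]
```

With this key, the training asset ends up last despite having the highest CVSS score, and of the two critical assets the DMZ server comes before the air-gapped one.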
5. Assess and reduce risk for exempted patches
Unlike in IT, where all Windows operating system (OS) patches can be installed, installing every OS patch in OT can crash a critical server or workstation. This has a direct impact on plant operation because access to, or monitoring of, critical assets is interrupted. As a result, even the Microsoft patches approved by the ICS vendor need to be tested by the OT organization on a non-critical test setup before they are pushed to production workstations.
If patches cannot be installed on the OT assets, then alternative controls need to be applied to reduce the risk to an acceptable level. Risk is the product of impact and probability. The business impact, or criticality, of the OT asset cannot be changed; however, the probability can be reduced by applying alternative controls. At a minimum, the following controls need to be applied:
- Boundary Protection: Controls such as network segregation, zoning, and access control to prevent both physical and digital intrusion
- Network Segmentation: Segmentation and zoning to contain a breach and limit the spread of malware
- Security Information and Event Management: A SIEM solution can aggregate data and detect potential threats
- Integrity Monitoring: An automated tool to monitor the integrity of critical ICS configuration files and report changes
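The effect of alternative controls on risk can be illustrated with a small calculation. All figures below are assumed for the example, and treating each control as an independent fractional reduction in probability is a simplification, not an established model.

```python
def risk(impact, probability):
    """Risk as the product of impact and probability."""
    return impact * probability


def residual_probability(probability, control_reductions):
    """Apply each control's assumed fractional reduction in turn."""
    for reduction in control_reductions:
        probability *= (1 - reduction)
    return probability


impact = 2_000_000          # assumed cost of an incident, in dollars
baseline_probability = 0.2  # assumed annual likelihood with no controls
# Assumed reduction per control (0.5 means the likelihood is halved)
controls = {"network segmentation": 0.5, "integrity monitoring": 0.5}

before = risk(impact, baseline_probability)
after = risk(impact, residual_probability(baseline_probability, controls.values()))
```

The impact term is the same in both calculations; only the probability changes. Under these assumptions the risk falls from about $400,000 to about $100,000.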
6. Patch as part of the change management process
Industrial standards such as IEC-62443 recommend having a change management process. As part of this process, it is recommended to baseline, record, review, and document changes and to have a rollback plan. Because deploying patches is a change to the OT environment, the same change management process should be followed when deploying patches.
- Baseline patches: Getting a baseline from all the OT systems in the OT environment provides a starting point for comparing any future changes. For patch management, the baseline should list the patches that must be installed when a new system is commissioned. The baseline needs to be updated regularly as new patches are released.
- Record installed patches: OT asset owners need to ensure that the baseline and changes are recorded. It is crucial to record the current patch status, in other words, the list of all patches currently installed on each asset. As new patches are deployed, they need to be recorded as well. If a patch is not installed, the alternative controls applied for the exempted patch need to be recorded too. These records make it possible to audit whether any patches that are not vendor-approved or not tested have been pushed to the OT assets.
- Review installed patches: Changes in the OT environment need to be tested and reviewed both before and after they are made. Accordingly, patches need to be tested and reviewed before and after installation.
- Document the patch process: It is recommended to have a documented change management process. Documentation should not be left until last; it should be produced in parallel as the changes are made, so that it can be referred to for future troubleshooting and forensics if needed.
- Have a rollback plan: Sometimes a patch that was approved or supplied by the vendor and tested by the end user still crashes a critical OT asset, because the test environment can differ from the production environment. It is always recommended to have a rollback plan with complete, tested backups before patching critical OT assets.
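The baseline and record steps lend themselves to a simple audit. The sketch below assumes patch records are kept as sets of patch IDs per asset; the IDs and the structure of the exemption record are hypothetical.

```python
def audit_asset(installed, baseline, approved, exemptions):
    """Compare one asset's recorded patch state with the baseline.

    installed:  set of patch IDs recorded as installed on the asset
    baseline:   set of patch IDs the baseline says should be installed
    approved:   set of patch IDs that are vendor-approved and tested
    exemptions: dict mapping an exempted patch ID to its alternative control
    """
    exempted_ids = set(exemptions)
    missing = baseline - installed
    return {
        # Installed patches that were never approved and tested
        "unapproved": sorted(installed - approved),
        # Missing patches with no alternative control on record
        "missing_without_control": sorted(missing - exempted_ids),
        # Missing patches covered by a recorded alternative control
        "exempted": {p: exemptions[p] for p in sorted(missing & exempted_ids)},
    }


# Hypothetical records for a single asset
report = audit_asset(
    installed={"P-001", "P-002", "P-900"},
    baseline={"P-001", "P-002", "P-003", "P-004"},
    approved={"P-001", "P-002", "P-003", "P-004"},
    exemptions={"P-004": "application whitelisting"},
)
```

In this example the report flags `P-900` as installed without approval and `P-003` as missing with no compensating control, while `P-004` is missing but covered by a recorded exemption.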
7. Create a patch management policy
As the organization matures, it needs a documented policy that its employees follow and that is updated when required. A well-defined patch management policy and process is one of the best practices for efficient patch management. A patch management policy guides staff on the acceptable OT patch management process. Such a process does not need to be defined from scratch because standards such as IEC-62443 are available. ANSI/ISA-TR62443-2-3-2015, which covers patch management in the IACS environment, is a very good starting point for creating a systematic patch process.
The patch management policy should at least include:
- Policy for defining a scanning schedule
- Policy for categorizing and separating patches
- Policy for prioritizing and deploying patches
- Policy for exempted patches
In conclusion, effective patch management is needed because attackers will exploit vulnerabilities from missing patches if they break into a critical infrastructure’s OT network. At the same time, not all patches can be installed in the OT network, because of several restrictions. Knowing the inventory in depth and matching it against known vulnerabilities is a good starting point. However, installing patches is not the only way to mitigate vulnerabilities. Because not all patches can be installed, whether for lack of vendor approval or because of non-compatible hardware, the risk needs to be evaluated and reduced to an acceptable level by applying other hardening mechanisms such as boundary protection and network segmentation. In other words, if the patches cannot be installed, then the probability that attackers can find the vulnerabilities needs to be lowered by applying appropriate controls.
Patching is a change to the OT environment, and patch management needs to follow change management best practices: baseline, record, review, document, and rollback plan. To achieve higher maturity, the organization needs to create and follow a patch management policy and process covering practices such as maintaining a comprehensive inventory and assessing and reducing the risk of exempted patches.