This is Part 4 of The OT Security Dozen – a 12-part series on building an OT/ICS cybersecurity program for an industrial operations environment.
Note: You may have noticed that operational technology (OT)/industrial control system (ICS) cybersecurity awareness is a common theme across “The OT Security Dozen,” which is why no part is devoted exclusively to awareness. The series aims to raise awareness of each type of control it covers, so awareness is treated as an integral element of all 12 parts.
This part helps industrial end user/operator organizations understand the typical challenges and drivers behind selecting, implementing, and maintaining an OT intrusion detection (OT IDS) or anomaly detection (AD) solution for OT networks. It also explains how such solutions address contextual visibility and situational awareness challenges by identifying assets, traffic flows, vulnerabilities, and risks, and how they aid continuous monitoring and incident response activities.
This part assumes you have performed the assessments described in OT Security Dozen Part 1: A Year of OT/ICS Cybersecurity Assessments, including asset discovery and a network diagram. Ideally, you also have OT Security Dozen Part 3: Network Security Architecture & Segmentation in place; it is not mandatory, but it could help lower the cost of solution implementation. With these in place, you have the essential site and technical information required to evaluate, select, implement, and run an OT IDS/AD solution to enhance an OT/ICS cybersecurity program.
Historically, because of information technology (IT)-OT convergence, OT/ICS or production control networks have lacked contextual visibility into what is connected to the network and how traffic flows between assets. Growing business demands for efficiency, productivity, and connectivity, driven by Industry 4.0/industrial internet of things (IIoT)-related digital transformation initiatives, have compounded these visibility challenges. Until a few years ago, only a handful of solutions existed, and they had very limited or no capability to interpret a wide range of industrial protocols and detect anomalies, so they required a lot of customization and advanced skill sets.
Other challenges associated with OT asset and traffic visibility include, but are not limited to:
Knowing what needs to be protected (asset discovery) and what the risks are (vulnerabilities and threats) is crucial for any IT or OT cybersecurity program. According to almost all international standards and best practices, both are foundational controls that should be put in place.
The industry initially responded with point products addressing specific needs: for example, original equipment manufacturer (OEM) vendor products for OT asset discovery/inventory, and specialized products for anomaly detection. Later, specialized security solutions emerged as vendors recognized the market demand to bundle visibility and detection capabilities. In the last 6 years or so, the number of such security vendor solutions has increased dramatically, as niche players have entered this space and raised millions in funding, and as established global networking and software vendors have built or acquired such specialized solutions and integrated them into their product portfolios.
The last 2 years of the pandemic saw accelerated growth in the maturity of such solutions: expanded OT protocol coverage, greater accuracy in asset, vulnerability, and anomaly/threat detection, and other added capabilities (e.g., internet of things [IoT], IIoT, or internet of medical things [IoMT] device visibility). These solutions are now available in different forms, such as on-premises hardware, software-based solutions, or containers running on networking gear, and can be managed via software as a service (SaaS)-based portals.
The diagram below depicts a few cybersecurity challenges faced by an industrial organization and, at a high level, how OT IDS/AD solutions address them across the PREDICT, PREVENT, DETECT, and RESPONSE cycles:
There are several key prerequisites for implementing an OT IDS/AD solution for OT environments (e.g., manufacturing). Some important considerations include:
Several OT IDS/AD solutions are available in the market that provide coverage across IT, OT, IoT, and IIoT devices/systems. Below is a high-level list of OT IDS/AD solution evaluation and selection criteria (in no particular order).
Note: While comparisons are good, conducting proof of concept (POC) and viewing the outcome is the best way to select a solution; narrow down to at least the top two solutions for POC.
There are a few different methods that can be used by OT IDS/AD solutions, including passive, active, and configuration file methods. Each of these methods has its own unique characteristics and advantages, and they can be used alone or in combination depending on the specific goals and objectives defined.
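As a rough illustration of using these methods in combination, the sketch below merges asset records gathered by passive, active, and configuration file methods into one inventory. Everything in it is hypothetical: the record fields, the sample values, and the `SOURCE_PRIORITY` ordering (configuration files trusted most, passive observation least) are assumptions made for the example, not how any particular product reconciles its data.

```python
# Hypothetical trust order: a lower number wins when sources disagree.
SOURCE_PRIORITY = {"config_file": 0, "active": 1, "passive": 2}


def merge_inventory(records_by_source):
    """Merge per-source asset records (dicts keyed by 'ip') into one inventory."""
    inventory = {}
    # Apply the least trusted source first so more trusted ones overwrite it.
    ordered = sorted(records_by_source, key=lambda s: SOURCE_PRIORITY[s], reverse=True)
    for source in ordered:
        for record in records_by_source[source]:
            asset = inventory.setdefault(record["ip"], {"ip": record["ip"], "sources": []})
            asset["sources"].append(source)
            for key, value in record.items():
                # Skip the key field and any attribute the source could not determine.
                if key != "ip" and value is not None:
                    asset[key] = value
    return inventory


# Illustrative sample data only.
records = {
    "passive": [
        {"ip": "10.0.1.5", "protocol": "modbus", "firmware": None},
        {"ip": "10.0.1.9", "protocol": "dnp3"},
    ],
    "active": [{"ip": "10.0.1.5", "firmware": "2.1"}],
    "config_file": [{"ip": "10.0.1.5", "firmware": "2.3", "model": "PLC-X"}],
}

inventory = merge_inventory(records)
for asset in inventory.values():
    print(asset)
```

In this sketch, the asset seen by all three methods ends up with the firmware value from the configuration file, while the asset seen only passively keeps just the attributes that traffic observation could provide.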
OT IDS/AD solutions may leverage a combination of statistical analysis, machine learning, and artificial intelligence (AI) techniques for enhanced detection and alerting capabilities.
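To make the statistical analysis idea concrete, here is a minimal sketch of a baseline-and-deviation check using a z-score. It assumes a single metric (packets per minute to one controller), a learned baseline window, and a threshold of three standard deviations; all of these, including the function name `find_anomalies` and the sample numbers, are illustrative assumptions. Real solutions typically combine many signals and far richer models.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observations, threshold=3.0):
    """Return (index, value) pairs whose z-score against the baseline exceeds threshold."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation; needs at least two points
    anomalies = []
    for index, value in enumerate(observations):
        if sigma == 0:
            # A perfectly flat baseline: any change at all is a deviation.
            is_anomaly = value != mu
        else:
            is_anomaly = abs(value - mu) / sigma > threshold
        if is_anomaly:
            anomalies.append((index, value))
    return anomalies


# Illustrative numbers: packets per minute seen during a learning period.
baseline_packets_per_minute = [120, 118, 122, 121, 119, 120, 123, 117]
observed = [121, 119, 480, 120]

anomalies = find_anomalies(baseline_packets_per_minute, observed)
print(anomalies)  # [(2, 480)]
```

OT traffic is often highly repetitive, which is one reason simple baselines like this can be informative, but a fixed threshold will still produce false positives during legitimate changes such as maintenance windows.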
The following diagram highlights a few examples of both on-premises and hybrid implementations in 2-tier or 3-tier architecture models for single-site and/or multi-site global deployments.
Define success criteria early in the project lifecycle across the following:
The implementation of an OT IDS/AD solution typically involves several steps or stages. Some of the key steps involved in both running a POC and/or deploying/implementing an OT IDS/AD solution are depicted in the following diagram.
Note: Two types of POC approaches can be adopted: Offline POC and Online POC. The key difference between the two is that the offline POC is run in a lab environment using PCAPs, while the online POC is performed on site at a production facility.
OT IDS/AD solutions, once implemented, become one of the key OT log sources, providing comprehensive details of network-based activities/events and generating alerts. Organizations need a plan for handling those alerts effectively, which should include:
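For a sense of what working with PCAPs in an offline POC involves, the sketch below reads a capture and counts TCP conversations by source, destination, and destination port. It is a simplified illustration under stated assumptions: it only handles the classic little-endian pcap format with Ethernet/IPv4/TCP frames (no pcapng, no VLAN tags), and it builds a small synthetic capture in memory so it runs without a real file. The helper names and sample addresses are hypothetical.

```python
import struct
from collections import Counter

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap, microsecond timestamps


def iter_pcap_frames(data):
    """Yield raw frames from classic little-endian pcap bytes."""
    if struct.unpack_from("<I", data, 0)[0] != PCAP_MAGIC:
        raise ValueError("unsupported capture format")
    offset = 24  # skip the 24-byte global header
    while offset + 16 <= len(data):
        _sec, _usec, incl_len, _orig_len = struct.unpack_from("<IIII", data, offset)
        offset += 16
        yield data[offset:offset + incl_len]
        offset += incl_len


def tcp_flow(frame):
    """Return (src_ip, dst_ip, dst_port) for an Ethernet/IPv4/TCP frame, else None."""
    if len(frame) < 34 or frame[12:14] != b"\x08\x00":  # EtherType must be IPv4
        return None
    ihl = (frame[14] & 0x0F) * 4  # IPv4 header length in bytes
    if frame[23] != 6 or len(frame) < 14 + ihl + 4:  # IP protocol 6 is TCP
        return None
    src = ".".join(str(b) for b in frame[26:30])
    dst = ".".join(str(b) for b in frame[30:34])
    dst_port = struct.unpack_from("!H", frame, 14 + ihl + 2)[0]
    return src, dst, dst_port


def summarize_flows(data):
    """Count packets per (src_ip, dst_ip, dst_port) conversation."""
    flows = Counter()
    for frame in iter_pcap_frames(data):
        flow = tcp_flow(frame)
        if flow is not None:
            flows[flow] += 1
    return flows


def build_frame(src, dst, dst_port):
    """Build a minimal synthetic Ethernet/IPv4/TCP frame (demo helper only)."""
    eth = b"\x00" * 12 + b"\x08\x00"
    ip = (bytes([0x45, 0]) + struct.pack("!H", 40) + b"\x00" * 5
          + bytes([6]) + b"\x00\x00" + bytes(src) + bytes(dst))
    tcp = struct.pack("!HH", 49152, dst_port) + b"\x00" * 16
    return eth + ip + tcp


def build_pcap(frames):
    """Wrap frames in a classic pcap container (demo helper only)."""
    out = struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, 65535, 1)
    for frame in frames:
        out += struct.pack("<IIII", 0, 0, len(frame), len(frame)) + frame
    return out


# Synthetic capture: two packets to TCP 502 (Modbus/TCP), one to TCP 20000 (DNP3).
capture = build_pcap([
    build_frame((10, 0, 1, 20), (10, 0, 1, 5), 502),
    build_frame((10, 0, 1, 20), (10, 0, 1, 5), 502),
    build_frame((10, 0, 1, 20), (10, 0, 1, 9), 20000),
])
flows = summarize_flows(capture)
for flow, count in flows.items():
    print(flow, count)
```

With a real capture, the bytes would come from reading the file in binary mode instead of from `build_pcap`. A summary like this can serve as a rough ground truth to compare against the assets and flows a candidate solution reports from the same PCAP.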
After implementation, organizations can take several steps to run and improve OT cybersecurity programs, which may include:
Document the entire project lifecycle: It’s critical to document the discovery, design/architecture, implementation details, and standard operating procedures (SOPs) for managing the solution. The following diagram highlights the essential elements (as an example only, not an exhaustive list) to be documented, maintained, and kept up to date (create and maintain a single document or a set of documents based on organizational practices).
After the OT IDS/AD solution has been implemented, ensure that there’s a handover between the implementation team and the operations team that will be running, managing, and monitoring the solution. A good way to do this is to arrange a knowledge transfer session between the teams covering the following topics:
Note: This session is not a substitute for product training. For product training, look for OT IDS/AD vendor-specific training options.
Avoid common failures in asset visibility, solution selection, implementation, and operationalization by:
An OT IDS/AD solution is also a key security control for any OT cybersecurity program, directly or indirectly improving or facilitating the following security processes:
For your industrial operations, select, design, and implement an OT IDS/AD solution for contextual visibility of OT network environments. If you are unsure where to start, engaging an expert is your best bet to help you select and implement the right OT IDS/AD solution.
A version of this article originally appeared on LinkedIn. The author will feature the series on that platform first and encourages everyone to follow along in the SecuringThings newsletter.