Providing Out-of-Band Connectivity to Mission-Critical IT Resources

Implementing a Network Modernization Strategy for Large-Scale Organizations

Two engineers plan a network modernization strategy from a platform overlooking racks of data center infrastructure.
The COVID-19 pandemic forced many large-scale organizations to decentralize their business operations to enable remote work, which shined a spotlight on how outdated their enterprise networks are. As other world events like wars, a recession, and virus resurgences continue to impact business, organizations must modernize their network infrastructure if they want to survive. However, their survival is also contingent on their ability to meet SLAs and maintain 24/7 availability, so it’s crucial to minimize the disruption caused by infrastructure upgrades. This blog provides advice to large-scale organizations on how to implement a network modernization strategy that minimizes disruptions while leaving room for future growth and innovation.

The importance of network modernization

Network infrastructure updates are expensive and can be disruptive, leaving many large companies wondering if the payoff is worth the risks. However, when COVID-19 struck, these organizations were left scrambling to replace their outdated and insecure VPN solutions with more robust remote connectivity technology. Similarly, in the current recession, enterprises that put off network modernization in the past are now finding themselves without the remote management and orchestration capabilities they need to keep their infrastructure running optimally with reduced staff.

Even without the looming threat of major world disruptions, outdated network infrastructure poses a risk to large-scale organizations. Obsolete devices are no longer patched by the vendor, which means any vulnerabilities that exist will remain open for hackers to exploit. Older equipment is also more likely to break, and may not be supported by the provider, making it more difficult and expensive to recover from a failure. Plus, outdated infrastructure hampers an enterprise’s ability to innovate with new technologies to stay competitive in the market.

Upgrading network infrastructure is expensive, time-consuming, and requires careful planning to prevent business interruption. However, investing in network modernization now will save you from more costly disruptions in the future.

A network modernization strategy for large-scale organizations

Enterprises need to carefully plan their path to network modernization to ensure they can meet their customer SLAs by avoiding outages and performance degradation. Here are some tips for implementing a network modernization strategy that minimizes disruption while leaving room for future growth.

Bridge the gap with a vendor-agnostic platform

To ensure a smooth upgrade process, organizations typically upgrade their infrastructure gradually, replacing individual solutions one at a time. This leaves an extended window of time in which both legacy and modern devices need to be monitored, managed, and supported. That creates additional complexity for administrators, who need to learn how to use the new solutions, integrate them with the existing infrastructure, and ensure there’s little-to-no impact on end users. It’s especially challenging when they need to use a different management platform to access and control each solution.

That’s why it’s important to implement a vendor-agnostic network management platform that supports legacy and multi-vendor solutions. A vendor-agnostic platform gives administrators a single pane of glass from which to control the entire heterogeneous network architecture, simplifying day-to-day management and allowing them to focus on optimizing performance and implementing future upgrades. Plus, a unified platform makes it possible to extend new technological capabilities (like remote OOB management and automation) to older infrastructure, accelerating network modernization efforts.

Reduce downtime with remote out-of-band management

Any experienced admin knows that installations and updates are risky procedures. Even with the best-laid plan, errors can occur that prevent new systems from coming online, cause integration issues with existing infrastructure, or even take down dependent network services. The risk is even greater when the upgrades occur remotely without any technicians on-site to power cycle devices or reconfigure systems offline. What if there’s an outage or severe disruption, but COVID lockdowns or natural disasters prevent staff from entering these locations?

Remote out-of-band (OOB) management creates an alternative path that admins use to access remote infrastructure. It creates an out-of-band network that’s dedicated to infrastructure management and orchestration and that doesn’t rely on the availability of the production network. That means administrators can access and troubleshoot offline devices remotely, reducing the duration and impact of downtime. Remote OOB management makes it safer for large-scale organizations to implement a network modernization strategy and ensures the continued stability and availability of enterprise infrastructure.
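As a rough illustration of that alternative path, the sketch below picks a management route for a device: it tries the production network first and falls back to the out-of-band address when the production address is unreachable. The Device fields, the addresses, and the is_reachable callback are hypothetical stand-ins for illustration only, not part of any real product API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Device:
    name: str
    production_addr: str  # in-band address on the production network (assumed)
    oob_addr: str         # address on the dedicated out-of-band network (assumed)


def pick_management_path(device: Device,
                         is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Return the first reachable management address, preferring in-band."""
    for addr in (device.production_addr, device.oob_addr):
        if is_reachable(addr):
            return addr
    return None  # neither path answers


# Simulate a production outage: only the OOB network responds.
core_switch = Device("core-sw-01", "10.0.0.10", "192.168.100.10")
reachable_now = {"192.168.100.10"}
path = pick_management_path(core_switch, lambda addr: addr in reachable_now)
print(path)
```

In practice the reachability check would be a real probe (for example, a TCP connection attempt with a timeout). It is injected as a function here so the selection logic can be exercised without a network.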

Streamline deployments with automation

Even when new infrastructure deployments run smoothly, they take considerable time and effort on the part of network administrators. Large, global organizations have complex and highly distributed network architectures with thousands of moving parts that need to be upgraded or replaced. Just configuring and installing all of these new solutions can add significant delays to the network modernization process. Plus, configuring so many devices is tedious and prone to human error, causing more delays as admins troubleshoot and fix deployment failures. For example, a typo in an IP address on one device could prevent dependent services from deploying correctly, forcing teams to retrace their steps and waste time identifying the error.

Automation is the key to streamlining device deployments and reducing configuration errors. For example, Zero Touch Provisioning (ZTP) allows admins to provision new devices automatically over the network using definition files. These files can be reused as many times as needed to deploy many identical solutions across the enterprise network, significantly reducing the time and effort required to modernize infrastructure. Plus, configuration files can be tested pre-deployment to ensure there are no errors or security vulnerabilities.

Vendor-agnostic network management platforms, OOB management, and automation are crucial components of a smooth network modernization strategy. Implementing this strategy is easier if you choose a management solution that integrates all these capabilities into a single, unified platform.
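The IP address typo described above is exactly the kind of error a pre-deployment check can catch. Here is a minimal sketch, assuming a device definition is a simple dictionary with hostname, mgmt_ip, and gateway fields (these field names are made up for illustration and do not reflect any particular ZTP format). It uses Python's standard ipaddress module to reject malformed addresses before anything is pushed to a device.

```python
import ipaddress


def validate_definition(definition: dict) -> list[str]:
    """Return a list of problems found in a device definition (empty if none)."""
    problems = []
    # Hypothetical required fields for this example.
    for field in ("hostname", "mgmt_ip", "gateway"):
        if not definition.get(field):
            problems.append(f"missing field: {field}")
    for field in ("mgmt_ip", "gateway"):
        value = definition.get(field)
        if value:
            try:
                ipaddress.ip_address(value)
            except ValueError:
                problems.append(f"invalid IP address in {field}: {value!r}")
    return problems


good = {"hostname": "branch-ap-01", "mgmt_ip": "10.20.30.40", "gateway": "10.20.30.1"}
typo = {"hostname": "branch-ap-02", "mgmt_ip": "10.20.30.400", "gateway": "10.20.30.1"}

print(validate_definition(good))
print(validate_definition(typo))
```

Running a check like this in a test pipeline means the bad definition fails fast, before dependent services ever try to deploy against it.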

Make Nodegrid a part of your network modernization strategy

The Nodegrid platform from ZPE Systems delivers vendor-agnostic control, Gen 3 OOB management, and end-to-end network automation capabilities in a single box. Nodegrid has helped large-scale organizations like the Internet Association of Australia update their network infrastructure without disrupting business.

Nodegrid serial consoles support both legacy and modern Cisco pinouts, allowing them to connect to legacy and modern devices across your network infrastructure. That means you can use the ZPE Cloud solution to extend automation and orchestration to your entire heterogeneous architecture, supercharging your network modernization efforts.

Nodegrid uses high-speed OOB interfaces (e.g., 5G/4G cellular) to provide admins with a fast and reliable connection for remote upgrades, management, and orchestration. Nodegrid allows you to power cycle devices, enter BIOS menus, manage power load distribution, and more from anywhere in the world with an internet connection. This makes it easier and safer for large-scale organizations to remotely upgrade their network infrastructure and ensures continuous management availability to prevent downtime in the future.

The vendor-agnostic Nodegrid platform also allows you to extend automation features like ZTP to both legacy and modern solutions in your network infrastructure. Nodegrid supports integrations with your choice of third-party automation tools, or you can use Nodegrid hardware to directly host custom scripts and automation apps. This both streamlines the network modernization process and gives you the ability to grow and evolve your network with emerging automation technologies like AIOps.

Nodegrid streamlines network modernization strategies by providing vendor-agnostic management, remote OOB management, and end-to-end automation support in a single platform.

Want to learn more about Nodegrid’s role in enterprise network modernization?

To learn more about Nodegrid’s role in an enterprise network modernization strategy, contact ZPE Systems today.

Using AIOps and Machine Learning To Manage Automated Network Infrastructure

Automation is the key to maintaining optimal network performance and availability during tumultuous times. A resilient, automated network keeps functioning even if administrators can’t physically access the infrastructure or when a recession forces companies to reduce their IT workforce. A network automation framework includes all the tools, technologies, and practices required to build a resilient and fully automated enterprise network infrastructure.

The four building blocks of a resilient network automation framework include:

  1. IT/OT production infrastructure
  2. Automation infrastructure
  3. Orchestration infrastructure
  4. AIOps

In previous blogs, we focused on the building blocks that enable network automation and orchestration. In this blog, we’ll discuss how AIOps and machine learning help teams manage their automation and orchestration—and the massive amounts of data produced by their automated systems—more efficiently.

What is AIOps?

AIOps—artificial intelligence for IT operations—was originally introduced by Gartner in 2017. It uses AI technologies like machine learning (ML) and natural language processing (NLP) to analyze IT operations data. This data is pulled in from many different sources, including monitoring and visibility platforms, environmental monitoring sensors, event logs, and firewalls. AIOps utilizes that data to automate tasks like event correlation, anomaly detection, and root cause analysis (RCA) as well as to predict future outcomes and provide valuable business insights.

What’s the difference between AI and machine learning?

Before we delve any deeper into the specific uses for and benefits of AIOps, it’s important to clarify what we mean when we talk about technologies like AI and machine learning.

AI stands for artificial intelligence, which is defined as a computer’s ability to display human-like intelligence through behaviors like learning from new data, drawing conclusions based on that data, and coming up with solutions to problems.

Machine learning, on the other hand, describes a computer’s ability to process large quantities of data and learn from it. Learning is a major requirement for AI, which means that all machine learning applications could be considered AI. However, not all AI is machine learning—artificial intelligence uses additional technology to make decisions, solve problems, and perform other automated functions.

Essentially, AI describes a broad range of technologies, whereas machine learning is a more specific subset of technologies under the AI umbrella. In the context of AIOps, however, machine learning is often the only artificial intelligence technology in use.

Using AIOps and machine learning to manage automated network infrastructure

In an automated enterprise network, AIOps and machine learning use advanced algorithms to provide in-depth analysis of all the data collected from production infrastructure, automation components, and orchestration systems. AIOps solutions can even take things a step further by making decisions and solving problems based on the results of that data analysis.

Some examples of how AIOps and machine learning can be used to manage automated network infrastructure include:

Security

Cyberattacks and data breaches are major threats to the reliability and performance of network infrastructure. In addition to the financial losses caused by sensitive data exfiltration and reputation loss, security breaches are also a leading cause of downtime, which directly impacts business revenue. According to the ITIC’s 2022 Global Server Hardware Security survey, 76% of enterprises cited security breaches as the top cause of downtime. That means network security is paramount to the resilience of an automated infrastructure.

For many years, network security relied on signature-based detection for jobs like intrusion prevention, antivirus, and spam filtering. Signature-based detection involves comparing an incoming request to a database of known threats to see if it matches—if not, it’s assumed to be safe and allowed into the network. This approach only works if the database is kept up to date and if all incoming threats have been identified in the past. Signature-based detection often fails to catch zero-day exploits or novel malware that it hasn’t seen before, plus it tends to generate a lot of false positives.
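To make that limitation concrete, here is a deliberately simplified sketch of signature matching. It assumes the "signature" is just a SHA-256 digest of a known-bad payload, and the payload strings and database are invented for the example. Real signature engines match on richer patterns, but the core weakness is the same: anything not already in the database is allowed through.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of known-bad payloads.
KNOWN_BAD = {
    hashlib.sha256(b"malicious-payload-v1").hexdigest(),
}


def signature_verdict(payload: bytes) -> str:
    """Block payloads whose digest is in the database; allow everything else."""
    digest = hashlib.sha256(payload).hexdigest()
    return "block" if digest in KNOWN_BAD else "allow"


known = signature_verdict(b"malicious-payload-v1")    # already in the database
variant = signature_verdict(b"malicious-payload-v2")  # slightly modified variant
print(known, variant)
```

The modified variant is allowed simply because nobody has added its signature yet, which is the gap that learning-based approaches try to close.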

AIOps security solutions overcome this problem by learning from past experiences. Machine learning is able to extract information from past threats and then develop algorithms to recognize, predict, and categorize a new threat that it’s never seen before. This makes AIOps adept at preventing new threats as well as detecting ones already on the network.

You can also use AIOps to analyze data from infrastructure logs and other security solutions to spot the more subtle signs of a breach that’s already happened or that’s currently taking place. For example, AIOps and machine learning may detect an unusually large amount of data leaving the network, which could indicate that a malicious actor is exfiltrating sensitive information. Another security use for AI is called User and Entity Behavior Analytics (UEBA), which inspects account activity on a network and reports anomalous behavior that could indicate an account has been compromised.
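The data exfiltration example can be sketched with a simple statistical baseline. This is far simpler than the machine learning models an AIOps product would use, and the traffic figures and the three-standard-deviation cutoff are assumptions chosen for illustration. The idea is to learn what "normal" outbound volume looks like for a host and flag readings that sit far above it.

```python
from statistics import mean, stdev


def is_unusual(history: list[float], observed: float, threshold: float = 3.0) -> bool:
    """Flag `observed` if it is more than `threshold` standard deviations above the mean."""
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: anything above it counts as unusual.
        return observed > baseline
    return (observed - baseline) / spread > threshold


# Hypothetical hourly outbound traffic in GB for one host.
outbound_gb = [1.1, 0.9, 1.0, 1.2, 1.0, 0.8, 1.1, 1.0]

normal_hour = is_unusual(outbound_gb, 1.3)
suspicious_hour = is_unusual(outbound_gb, 9.5)
print(normal_hour, suspicious_hour)
```

Unlike a fixed rule, the baseline here comes from the host's own history, so the same code adapts to a busy file server and a quiet workstation without separate thresholds.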

AIOps improves upon automated network security solutions by using adaptive learning and predictive analysis to detect new and unusual threats with a greater degree of accuracy. It also takes advantage of the massive amounts of data produced by security appliances and network infrastructure to identify the subtle clues left behind by sophisticated cybercriminals. This makes AIOps a valuable tool for maintaining the security and availability of an automated network infrastructure.

Monitoring

An automated network infrastructure generates a massive quantity of logs that can be used to assess health and performance as well as to identify potential issues before they cause any outages or downtime. However, humans aren’t very good at sifting through large amounts of data to figure out what’s relevant and what isn’t.

Many monitoring solutions use basic automation to help surface important data, for example by letting admins set performance thresholds that generate automatic alerts when devices fall out of the optimal operating range. However, this kind of automation creates a lot of false positives, which are tedious to sort through and could lead to admin neglect or complacency. It can also only detect specific symptoms and issues that fall within the scope of the monitoring thresholds programmed by a sysadmin, which means it can’t adapt to changing circumstances or predict new problems that weren’t anticipated by the admin in advance.

An AIOps monitoring solution collects all the logs produced by automated infrastructure and analyzes them in real time. Sysadmins can still set performance thresholds and program automatic alerts, but AIOps also uses machine learning to “think outside the box” by recognizing patterns and detecting anomalies it wasn’t programmed to look for. That means issues are identified faster, potentially before they cause any noticeable problems for end-users.

Machine learning also gives AIOps monitoring solutions the ability to track performance over time and predict future outcomes based on historical data. For example, organizations can use AIOps analysis to plan infrastructure upgrade schedules based on when device performance is predicted to start degrading, or in advance of a predicted spike in demand for a particular location. This gives CIOs and IT managers the ability to make smarter decisions about where and when to invest money and how to prioritize new initiatives.

AIOps monitoring solutions work well with data lakes, which are large repositories for unstructured data. Data lakes are an efficient way to store large quantities of raw data, such as monitoring and security logs, so that AIOps and other big data tools can process it.

AIOps transforms the flood of logs generated by complex, automated network infrastructures into actionable data. Enterprises can use AIOps and machine learning to catch subtle issues before they turn into major problems, improving the performance and availability of network resources. AIOps also provides valuable business intelligence that organizations can use to make smarter and more cost-effective decisions during recessions and other tumultuous events.

Root cause analysis (RCA)

When there’s an outage or other business interruption, the main priority is fixing whatever is preventing systems from operating normally so that systems can get back online. Often, this means fixing the symptoms of some deeper underlying problem. If that core problem isn’t addressed, it’s likely to cause another outage in the future. That means administrators must perform a root cause analysis (RCA) to discover the source, come up with a fix, and document everything for future reference.

Root cause analysis involves digging through device, application, and service logs, which human engineers can’t do as efficiently as AI solutions. AIOps can comb through all the relevant logs to determine the most likely cause of the problem as well as recommend the best solution to fix it. Incidents are automatically generated, prioritized, and assigned to the correct team for resolution, ensuring the core problem is quickly and thoroughly fixed to prevent future outages.
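One way to picture the correlation step is with a dependency map. The sketch below is a rule-based stand-in for what an AIOps tool would infer from logs, and the component names and dependencies are hypothetical. Given a set of components that are currently alerting, it keeps only those whose own dependencies are healthy, since those are the most likely origin of the cascade.

```python
# Hypothetical dependency map: each component lists what it depends on.
DEPENDS_ON = {
    "web-app": ["api"],
    "api": ["database", "auth"],
    "auth": ["database"],
    "database": ["storage-array"],
    "storage-array": [],
}


def likely_root_causes(failed: set[str]) -> set[str]:
    """Failed components with no failed dependencies are the best RCA candidates."""
    return {
        component
        for component in failed
        if not any(dep in failed for dep in DEPENDS_ON.get(component, []))
    }


alerts = {"web-app", "api", "auth", "database"}
candidates = likely_root_causes(alerts)
print(candidates)
```

Four alerts collapse into a single candidate, which is the kind of noise reduction that lets an incident be routed to the right team instead of paging everyone whose service happens to sit downstream.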

Some AIOps solutions can even automatically resolve some issues without waiting for a human engineer to receive an alert, log in to the system, identify the problem, and implement a solution. This can significantly reduce the mean time to resolution (MTTR) and minimize expensive business interruptions.

Sorting through data is what AIOps does best, which makes it the perfect tool for RCA. AIOps can determine the root cause of automated infrastructure failures much faster than human admins, making it easier to fix these underlying problems before they cause future downtime. AI can even proactively implement fixes while issues are ongoing, allowing businesses to recover faster and reduce the cost of outages.

Implementing AIOps and machine learning in a resilient network automation framework

AIOps is the final layer of the network automation framework because it reduces the management complexity involved in monitoring, troubleshooting, and optimizing automated network infrastructure. Because AIOps needs to collect logs from every single component of the network automation framework, it must be a vendor-neutral solution that has access to your orchestration platform as well as all your management hardware and software. This will be much easier if your orchestration, automation infrastructure, and IT/OT management infrastructure are also vendor-neutral.

For example, the Nodegrid platform from ZPE Systems includes management devices like Gen 3 OOB serial consoles and integrated network edge routers that can bring your entire mixed-vendor environment under a single management umbrella. Nodegrid hardware is truly vendor-neutral, which means it can directly host your AIOps applications to help consolidate devices in your rack. The ZPE Cloud infrastructure orchestration platform also supports integrations with third-party and cloud-based AIOps solutions. Either way, you get network infrastructure management, monitoring, automation, orchestration, and AIOps in a single platform.

ZPE’s Network Automation Blueprint

AIOps works together with IT/OT production infrastructure, automation infrastructure, and orchestration to ensure network resiliency during uncertain times. The Network Automation Blueprint from ZPE Systems provides a reference architecture for achieving Gartner’s definition of hyperautomation as well as meeting the Open Networking User Group (ONUG) Orchestration and Automation recommendations.

Download the Network Automation Blueprint today and see how all these building blocks fit together to ensure network resiliency.

Ready to learn more about implementing AIOps and machine learning?

To learn more about implementing AIOps and machine learning with Nodegrid, contact ZPE Systems today.

A Guide to Infrastructure Orchestration and Automation

As the recession continues to affect businesses across all industries, enterprise network resilience has never been more critical. The typical outage costs at least $100,000—a price tag that most companies can’t easily absorb in the current economic climate. However, decreasing business revenues have caused many companies, especially in the tech industry, to lay off large portions of their key IT staff. That means there are fewer administrators to monitor and manage network infrastructure and fewer engineers available to respond to issues and recover from outages.

Network automation is the key to ensuring 24/7 availability and optimal performance with less human interaction. A network automation framework provides all the tools and guidance needed to create a fully-automated network infrastructure that’s resilient to failure.

The four building blocks of a resilient network automation framework include:

  1. IT/OT production infrastructure
  2. Automation infrastructure
  3. Orchestration infrastructure
  4. AIOps

In previous blogs we discussed the role of IT/OT production infrastructure in network automation and how an IT/OT convergence strategy accelerates network automation. We also described the automation infrastructure components that enable end-to-end network automation. In this post, we’ll explain how infrastructure orchestration and automation build upon the previous two layers to enable streamlined, hyperautomated network resiliency. Our final blog in the series will conclude with a guide to using AIOps and other machine learning technologies to complete the network automation framework.

What is infrastructure orchestration and automation?

The infrastructure orchestration and automation layer contains the tools and paradigms used to efficiently manage and control the automation provided by the automation infrastructure layer. The core components of infrastructure orchestration and automation include:

Version control

The automation infrastructure layer uses infrastructure as code (IaC) to decouple device configurations from the underlying hardware so they can be written as scripts or definition files that automatically provision network resources. In addition, this layer uses software-defined networking (SDN) to create a virtual control plane that overlays the production network infrastructure, allowing network management and optimization tasks to be written as automated scripts.

The goal of IaC and SDN is to reduce human error, speed up device provisioning, and build a more streamlined and resilient network infrastructure. However, IaC and SDN programming can be very complex, and not all sysadmins and network administrators are expert coders. In addition, an automated enterprise network has hundreds or even thousands of these definition files and scripts to store, manage, and deploy.

This is why a network automation framework should include version control in the orchestration and automation layer. Version control is a very familiar concept to programmers, especially in DevOps environments, but not all network and infrastructure teams have used it before. Version control involves storing all code in a centralized repository and then tracking and managing changes to that code.

Let’s say one administrator is responsible for configuring and maintaining the IaC definition file used to provision a particular model of Meraki AP. Here are some examples of how that workflow could break down when that one admin is out of the office for an extended period of time due to COVID-19 or gets laid off due to cutbacks in the organization:

  • Twenty new Meraki APs need to be deployed to a new site with identical configurations.
  • The existing definition needs to be updated and pushed out ASAP to patch a security vulnerability.
  • Someone discovers an error in the current version and they need to roll back to a previous configuration.

A version control system for IaC and SDN acts as the single source of truth for the entire automated infrastructure. All automation scripts and definition files are stored in one centralized location, so anyone with authorization can deploy identical devices with the push of a button. When an admin needs to change the code, those changes are tracked and can be rolled back at any time if a mistake is made. Version control systems even allow admins to leave notes explaining the reasoning or logic behind individual changes, so other team members can pick up where they left off, or in their absence, identify the root cause of issues.
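The tracking, notes, and rollback behavior described above can be sketched with a toy in-memory history for a single definition file. This is only an illustration of the concept: the class, the YAML-like content, and the notes are all made up, and a real team would use an actual version control system such as Git rather than anything like this.

```python
from dataclasses import dataclass, field


@dataclass
class DefinitionHistory:
    """Toy version history for one definition file (illustrative only)."""
    versions: list[tuple[str, str]] = field(default_factory=list)  # (content, note)

    def commit(self, content: str, note: str) -> int:
        """Store a new version with an explanatory note; return its 1-based number."""
        self.versions.append((content, note))
        return len(self.versions)

    def current(self) -> str:
        return self.versions[-1][0]

    def rollback(self, version: int, note: str) -> int:
        """Re-commit an earlier version so the rollback itself is recorded."""
        content, _ = self.versions[version - 1]
        return self.commit(content, note)


ap_config = DefinitionHistory()
ap_config.commit("ssid: corp\nchannel: 36", "Initial AP definition")
ap_config.commit("ssid: corp\nchannel: 360", "Change channel plan")  # typo slips in
ap_config.rollback(1, "Revert bad channel value")

notes = [note for _, note in ap_config.versions]
print(ap_config.current())
print(notes)
```

Because the rollback is stored as a new version with its own note, a colleague covering for an absent admin can see both what was undone and why.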

Another key benefit of version control is that it facilitates the use of automated testing. QA and security analysts can run automated scans on code in the version control repository pre-production, so any misconfigurations or security vulnerabilities are identified and fixed before deployment. This reduces the risk of human error and improves the security and resiliency of the automated network infrastructure.

Version control is a core component of infrastructure orchestration and automation because it serves as the single source of truth for the entire automated network architecture.

Orchestrator

Automation is meant to make life easier, but it can be very complicated to manage on a large scale. Modern enterprise network architectures include thousands of moving parts in locations around the world and in the cloud. Automating each of these workflows means writing, testing, deploying, managing, and troubleshooting many different definition files and automation scripts. Doing all of that manually adds more work to overloaded and under-resourced network infrastructure teams, which increases the risk of something going wrong. Simply put, organizations need a way to automate their automation.

An orchestrator is a tool used to control all of the automated workflows on an enterprise network, just like a conductor orchestrates many different instruments and musicians into one cohesive symphony. An orchestrator uses management devices, like Gen 3 OOB serial consoles and SD-WAN gateway routers, to gain control over the physical and virtual network infrastructure. Administrators program the orchestrator to automatically deploy definition files or networking scripts (which it pulls from the version control system) in response to certain triggers. That means admins could potentially automate every step in every workflow, removing the need for human intervention and reducing the chance of errors.

Plus, an orchestrator can react to events much faster than even the best administrator. For example, if a spike in demand is overloading resources at one regional data center, the orchestrator can instantly deploy automated load-balancing workflows to reroute traffic before end-users notice any performance issues. This allows enterprises to maintain 24/7 network availability and performance even with reduced IT staff.
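The trigger-and-response pattern can be sketched as a small event dispatcher. Everything here is hypothetical (the Orchestrator class, the "high_load" trigger name, and the placeholder workflow); a real orchestrator would pull definition files or scripts from version control and push them through management devices, but the control flow is similar: register workflows against triggers, then run them when a matching event arrives.

```python
from typing import Callable

Workflow = Callable[[dict], str]


class Orchestrator:
    """Maps trigger names to the workflows that should run in response."""

    def __init__(self) -> None:
        self._workflows: dict[str, list[Workflow]] = {}

    def on(self, trigger: str, workflow: Workflow) -> None:
        """Register a workflow to run whenever `trigger` fires."""
        self._workflows.setdefault(trigger, []).append(workflow)

    def handle(self, trigger: str, event: dict) -> list[str]:
        """Run every workflow registered for this trigger and collect the results."""
        return [workflow(event) for workflow in self._workflows.get(trigger, [])]


def rebalance_traffic(event: dict) -> str:
    # Placeholder: a real workflow would deploy a load-balancing script here.
    return f"rerouting traffic away from {event['site']}"


orchestrator = Orchestrator()
orchestrator.on("high_load", rebalance_traffic)

results = orchestrator.handle("high_load", {"site": "dc-east", "cpu": 97})
print(results)
```

Keeping the trigger table separate from the workflows is what lets admins add or swap automated responses without rewriting the workflows themselves.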

As part of a resilient network automation framework, the orchestrator should be vendor-agnostic (vendor-neutral). It needs to be compatible with all of the automation infrastructure components, as well as the production IT/OT solutions. It also needs to support all of the major third-party automation vendors, such as Ansible and Gluware, to give infrastructure teams the flexibility to use the tools they’re most comfortable with and that work best in their enterprise’s unique environment. Finally, the orchestrator needs to integrate with other tools within the orchestration and automation layer, including the version control system and the monitoring and analytics platform.

The orchestrator is what gives the “orchestration and automation” layer its name. It provides admins with the ability to automatically manage all the automated workflows that make up a resilient network infrastructure. An orchestrator reduces the risk of outages caused by human error and can automatically respond to and prevent potential issues.

Visibility & insights

It’s tempting to think of infrastructure orchestration and automation as a “set it and forget it” solution that can perfectly manage an enterprise network without any human oversight, but the technology isn’t quite there yet. Administrators need a way to monitor all the automated workflows, identify problems the orchestrator may have missed, and analyze the health and performance of the network infrastructure.

A visibility and insights platform collects logs from all the various components of the automated network infrastructure and aggregates the data in one centralized location. It provides visualizations of current device health and network performance, and may even include predictive analysis to power business insights. This gives administrators a big-picture overview of distributed, complex, and automated network architectures so they can ensure continuous availability and optimal performance.
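At its simplest, the aggregation step looks something like the sketch below, which rolls up log records from several devices into per-device severity counts. The record format and device names are assumptions for the example; a real platform would ingest far more fields and sources, but the principle of collecting everything in one place and summarizing it is the same.

```python
from collections import Counter, defaultdict

# Hypothetical log records already collected from different sources.
logs = [
    {"source": "switch-01", "level": "info"},
    {"source": "switch-01", "level": "error"},
    {"source": "router-02", "level": "warning"},
    {"source": "switch-01", "level": "error"},
    {"source": "router-02", "level": "info"},
]


def summarize(records: list[dict]) -> dict[str, Counter]:
    """Count log records per severity level for each source device."""
    summary: dict[str, Counter] = defaultdict(Counter)
    for record in records:
        summary[record["source"]][record["level"]] += 1
    return dict(summary)


health = summarize(logs)
# The device with the most error records is a reasonable place to look first.
noisiest = max(health, key=lambda device: health[device]["error"])
print(noisiest, health[noisiest]["error"])
```

A summary like this is the raw material for the dashboards and visualizations an admin would actually look at.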

As with the version control system and the orchestrator, the visibility and insights solution needs to be vendor-agnostic so it can dig into every single hardware and software solution in the automated network infrastructure. In a resilient network automation framework, the vendor-neutral version control, orchestrator, and visibility solutions are all combined in a single platform.

Infrastructure orchestration and automation with a single platform

A unified infrastructure orchestration and automation platform like ZPE Cloud simplifies the control and management of a fully-automated enterprise network. ZPE Cloud uses Nodegrid hardware—such as Gen 3 OOB serial consoles and integrated network edge routers—to deliver orchestration and automation to large, distributed, multi-vendor network infrastructures. The ZPE Cloud management app supports integrations with your choice of third-party version control and infrastructure automation solutions, or you can use Nodegrid hardware to directly host your automation software.

With ZPE Cloud, you also get comprehensive monitoring data on all connected infrastructure. Plus, you can use Nodegrid environmental monitor sensors to gain insights on conditions in remote data centers and network closets.

ZPE’s Network Automation Blueprint

Infrastructure orchestration and automation works together with IT/OT production infrastructure, automation infrastructure, and AIOps to ensure network resiliency during uncertain times. The Network Automation Blueprint from ZPE Systems provides a reference architecture for achieving Gartner’s definition of hyperautomation as well as meeting the Open Networking User Group (ONUG) Orchestration and Automation recommendations.

In a future blog post, we’ll discuss the remaining building block of the Network Automation Blueprint in depth. In the meantime, you can read about IT/OT production infrastructure and automation infrastructure, or get a sneak peek of the blueprint, which includes a 10-step checklist to get started with automation now.

Ready to learn more about infrastructure orchestration and automation?

To learn more about infrastructure orchestration and automation with ZPE Cloud and Nodegrid, contact ZPE Systems today.

Contact Us

Key Automation Infrastructure Components That Enable End-to-End Network Automation

A resilient network containing automation infrastructure components and concepts overlays a busy industrial plant that uses OT automation.

As inflation rises, new business declines, and another COVID-19 surge looms on the horizon, many organizations are bracing for a recession. CIOs and IT managers are having to do more with less—less staff, less budget for upgrades and repairs, and less access to on-site infrastructure. Despite these restrictions, they still need to ensure the 24/7 availability and optimal performance of enterprise network resources as any amount of downtime could severely impact business revenue.

The ability to continue providing digital services in less-than-ideal situations is known as network resiliency. Network automation is a key tool for ensuring resiliency during staffing shortages and lockdowns, and a network automation framework provides the tools and methodologies needed to create a fully-automated network infrastructure.

The four building blocks of a resilient network automation framework include:

  1. IT/OT production infrastructure
  2. Automation infrastructure
  3. Orchestration infrastructure
  4. AIOps

We’ve previously discussed the role of IT/OT production infrastructure in network automation and how an IT/OT convergence strategy accelerates network automation. In this post, we’ll describe the automation infrastructure components that enable end-to-end network automation. Future blogs will explain how the orchestration infrastructure layer and AIOps layer build upon these components to ensure business resiliency.

What is automation infrastructure?

Automation infrastructure is composed of all the hardware and software solutions that enable automation to occur. These solutions target the IT and OT production infrastructure and automate some or all of their workflows.

Key automation infrastructure components

There are a variety of hardware and software solutions that provide automation capabilities for specific workflows, use cases, and deployment models. As part of a resilient network automation framework, the most important automation infrastructure components include:

Gen 3 OOB serial consoles

Serial consoles are typically installed in data centers and used to manage other devices over a serial cable connection. They create an out-of-band management (OOBM) network that’s dedicated to troubleshooting, management, and orchestration traffic, and which is accessible via a secondary internet connection (often using cellular). This secondary connection ensures administrators always have remote management access to critical data center infrastructure even when the primary ISP, WAN link, or production LAN goes down. That means businesses can recover from outages faster and without dispatching expensive truck rolls.

The latest generation of serial consoles, Gen 3, gives administrators the ability to automate workflows on all data center infrastructure. Gen 3 serial consoles are vendor-neutral, which means they can extend their automated management capabilities to any vendor’s device. That vendor neutrality also means that Gen 3 serial consoles support custom scripts and third-party automation tools in addition to whatever automation capabilities are built-in.

For peak resiliency, data center deployments should follow a two-tier OOB architecture. That means each rack of IT/OT production infrastructure should connect to its own Gen 3 serial console, which provides OOB management access and automation. These top-of-rack serial consoles should then connect to an OOB appliance in the middle or end of the row. This ensures OOBM access for the top-of-rack appliances and creates an additional layer of redundancy and resiliency.
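The two-tier layout described above can be sketched as a small data model that flags racks missing either tier. This is a minimal illustration, not a Nodegrid or ZPE Cloud API; the names `Rack` and `validate_two_tier`, and the console and appliance labels, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rack:
    name: str
    console: Optional[str] = None        # tier 1: top-of-rack serial console
    row_appliance: Optional[str] = None  # tier 2: middle/end-of-row OOB appliance


def validate_two_tier(racks):
    """Return a list of human-readable gaps in the two-tier OOB layout."""
    problems = []
    for rack in racks:
        if rack.console is None:
            problems.append(f"{rack.name}: no top-of-rack serial console")
        elif rack.row_appliance is None:
            problems.append(
                f"{rack.name}: console {rack.console} has no row-level OOB appliance"
            )
    return problems


# Hypothetical inventory: only rack-01 is fully covered by both tiers.
racks = [
    Rack("rack-01", console="tor-console-01", row_appliance="row-a-oob"),
    Rack("rack-02", console="tor-console-02"),
    Rack("rack-03"),
]
for problem in validate_two_tier(racks):
    print(problem)
```

A check like this could run against an inventory export before a change window to confirm that every rack still has both layers of out-of-band access.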

Another important aspect of Gen 3 serial consoles is security. Since serial consoles provide comprehensive management access to critical infrastructure, they’re a tempting target for cybercriminals. A secure Gen 3 OOBM solution includes:

  • Integration support for third-party security solutions like next-generation firewalls (NGFWs), security service edge (SSE), and SAML 2.0
  • An up-to-date operating system (OS) kernel that’s frequently patched by the vendor when vulnerabilities are identified
  • Onboard firewall functionality to inspect traffic on both the OOB network and the production network
  • Hardware security features like encrypted boot sequences and BIOS protection to prevent unauthorized access on stolen serial consoles

Gen 3 OOB serial consoles are the automation infrastructure components that enable automation and resiliency for data center deployments at the core of enterprise networks.

SD-WAN gateway routers

A gateway router is used to connect a LAN infrastructure to the internet and the enterprise WAN architecture. As part of a resilient network automation framework, all gateway routers should support SD-WAN (software-defined wide area networking).

SD-WAN separates the control and management processes from underlying WAN hardware and virtualizes them as software. SD-WAN uses features like application awareness and guaranteed minimum bandwidth to automatically optimize network performance. An SD-WAN solution can also use automatic load balancing and failover to ensure continuous availability in the event of a localized failure or data center outage.
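The path-selection idea behind minimum bandwidth guarantees and automatic failover can be sketched in a few lines. This is a simplified illustration under stated assumptions, not any vendor's SD-WAN algorithm; `Link`, `select_link`, and the link names and figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Link:
    name: str
    up: bool
    latency_ms: float
    available_mbps: float


def select_link(links, min_mbps):
    """Pick the lowest-latency healthy link that meets the bandwidth floor.

    Falls back to any healthy link if none meets the floor, and returns
    None when every link is down.
    """
    healthy = [link for link in links if link.up]
    if not healthy:
        return None
    qualified = [link for link in healthy if link.available_mbps >= min_mbps]
    candidates = qualified or healthy
    return min(candidates, key=lambda link: link.latency_ms)


# Hypothetical WAN links at one site.
links = [
    Link("mpls", up=True, latency_ms=12, available_mbps=50),
    Link("broadband", up=True, latency_ms=25, available_mbps=200),
    Link("lte", up=True, latency_ms=60, available_mbps=20),
]
print(select_link(links, min_mbps=100).name)
```

In this sketch, an application that needs 100 Mbps is steered to the broadband link even though MPLS has lower latency, and traffic fails over to whichever links remain when one goes down.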

SD-WAN is usually a cloud-based service that delivers centralized management and orchestration of automated workflows. This service runs on top of the gateway routers deployed at each site.

An SD-WAN gateway router is a key automation infrastructure component for the main office, data center, branch, and edge deployments because it enables automated WAN management and orchestration. An all-in-one cloud-managed gateway router is particularly useful for OT automation in remote facilities like warehouses and factories because it provides SD-WAN capabilities, OOBM, and routing in one multi-function device.

Monitoring, visibility, and analytics

Monitoring and visibility solutions give administrators virtual eyes and ears on remote network infrastructure. As part of a resilient network automation framework, a visibility solution should be vendor-neutral so it can dig its probes into any device in a mixed-vendor environment. It should also include environmental monitoring sensors that collect data on conditions in the rack.

Device monitoring and environmental sensors give administrators the ability to detect potential issues and respond quickly to prevent outages. Monitoring and visibility solutions also collect valuable data that can feed into the AIOps building block of the network automation framework.

Infrastructure as Code

Infrastructure as Code, or IaC, uses software abstraction to decouple infrastructure configurations from the underlying hardware. Configurations are written as scripts or definition files that automatically provision virtual machines (VMs), containers, or software-defined networking (SDN) devices. An IaC definition file can be deployed repeatedly, which means many identical resources can be spun up quickly while ensuring consistent configurations. An IaC config can also undergo automatic security testing before it’s deployed to any devices to prevent vulnerabilities from affecting production.

Another important aspect of IaC is automatic configuration management. Configuration management solutions like Red Hat Ansible allow administrators to define the desired state of a system or network resource. The configuration management tool continuously monitors the resource to detect unauthorized changes, which might be made by a careless sysadmin or could be a sign of a malware infection. As soon as a change is detected, the configuration management solution uses a programmatic playbook to take whatever actions are needed to restore the system to its proper state.
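The desired-state idea can be illustrated with a short drift check. This is a conceptual sketch only, not how Ansible or any particular tool is implemented; the functions `find_drift` and `remediate` and the setting names are hypothetical, and the "device" is just an in-memory dictionary.

```python
def find_drift(desired, actual):
    """Return {setting: (desired_value, actual_value)} for every mismatch."""
    drift = {}
    for key, want in desired.items():
        have = actual.get(key)
        if have != want:
            drift[key] = (want, have)
    return drift


def remediate(desired, actual):
    """Reset drifted settings to their desired values; return the keys changed."""
    drift = find_drift(desired, actual)
    for key, (want, _have) in drift.items():
        actual[key] = want
    return sorted(drift)


# Hypothetical desired state and a device that has drifted from it.
desired = {"ssh_root_login": "no", "ntp_server": "10.0.0.1", "snmp_version": "v3"}
actual = {"ssh_root_login": "yes", "ntp_server": "10.0.0.1", "snmp_version": "v3"}

print(find_drift(desired, actual))  # shows the unauthorized change
print(remediate(desired, actual))   # restores the desired value
print(find_drift(desired, actual))  # empty once the device matches again
```

Note that this sketch only compares settings named in the desired state; a real tool would also need a policy for settings that exist on the device but are not declared.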

IaC helps ensure network resiliency by reducing human error in device configurations and updates, as well as by enabling the use of pre-production automated security vulnerability scanning and configuration management. Infrastructure as Code also facilitates another key automation infrastructure component—immutable infrastructure.

Immutable infrastructure

In-place system and device updates are a common cause of hangs or failures which can be challenging to resolve remotely. Immutable infrastructure resolves this problem by eliminating updates and configuration changes altogether. Immutable infrastructure refers to virtual systems and network resources that are never changed in place. If an immutable resource has an issue or vulnerability, or if its OS is out of date, an entirely new resource is spun up and the old one is simply deleted.
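The replace-rather-than-update pattern can be sketched as follows. This is a toy model under stated assumptions, not a real provisioning API; `Instance`, `provision`, `roll_out`, and the image names are hypothetical, and "provisioning" here simply creates a new Python object.

```python
import itertools
from dataclasses import dataclass

_ids = itertools.count(1)


@dataclass(frozen=True)  # frozen: an instance cannot be modified after creation
class Instance:
    instance_id: int
    image: str


def provision(image):
    """Create a fresh instance from an image definition."""
    return Instance(next(_ids), image)


def roll_out(fleet, new_image):
    """Replace every instance not on new_image instead of updating it in place."""
    replaced = []
    for old in fleet:
        if old.image == new_image:
            replaced.append(old)
        else:
            replaced.append(provision(new_image))  # the old instance is discarded
    return replaced


# Hypothetical fleet with two instances on an outdated image.
fleet = [provision("router-os:1.0"), provision("router-os:1.0"), provision("router-os:1.1")]
new_fleet = roll_out(fleet, "router-os:1.1")
print([(i.instance_id, i.image) for i in new_fleet])
```

Because each `Instance` is frozen, the only way to change its image is to build a replacement, which mirrors the constraint that immutable infrastructure places on administrators.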

IaC is an immutable infrastructure best practice because it gives administrators the ability to provision many devices very quickly and with identical configurations. Immutable infrastructure is secure, easy to deploy, and resilient to failure, making it an important part of the network automation framework.

Why Nodegrid is a key automation infrastructure component

The automation infrastructure building block of the network automation framework relies on vendor-neutral OOBM devices like gateway routers and Gen 3 serial consoles that extend automation to converged IT/OT production infrastructure. These devices must also support monitoring and visibility solutions, Infrastructure as Code with configuration management, and immutable infrastructure.

For example, the Nodegrid platform from ZPE Systems includes OOB management hardware for a variety of data center, branch, and edge deployments. Nodegrid serial consoles, such as the NSCP, can dig their hooks into any device in your data center to enable end-to-end network automation. A Nodegrid Gen 3 OOB serial console can even extend IaC and immutable practices to legacy devices to ensure resiliency without expensive forklift upgrades.

Nodegrid services routers, such as the Mini SR, are compact edge gateways that deliver SD-WAN support, OOBM, and cloud management capabilities to IT/OT infrastructure in smaller branch office and edge data center deployments. Nodegrid SRs can help you consolidate an entire rack of branch infrastructure into a single device to reduce management complexity, CapEx, and OpEx.

Nodegrid out-of-band is delivered via WiFi, Ethernet, or 5G/4G LTE to ensure administrators have fast and reliable access to remote infrastructure. All Nodegrid OOB devices are protected by robust hardware security features like BIOS protection, UEFI Secure Boot, geofencing, disk encryption, and TPM 2.0. Plus, Nodegrid supports integrations with Zero Trust Security solutions like identity and access management (IAM) and SAML 2.0, as well as providing an on-ramp to SSE.

Nodegrid serial consoles and services routers also include interfaces for environmental monitoring sensors to collect crucial data about conditions in your rack. These sensors, as well as any other connected devices, can all be observed and managed from a single, centralized monitoring and reporting platform.

What makes Nodegrid a crucial element of automation infrastructure is its ability to directly host Infrastructure as Code and automated configuration solutions, including Ansible, Chef, Puppet, SaltStack, Monit, and Docker. Nodegrid appliances can then extend the capabilities of the IaC solution to any of the modern, legacy, and mixed-vendor devices it manages.

ZPE’s Network Automation Blueprint

Automation infrastructure works together with IT/OT production infrastructure, orchestration, and AIOps to ensure network resiliency during uncertain times. The Network Automation Blueprint from ZPE Systems provides a reference architecture for achieving Gartner’s definition of hyperautomation as well as meeting the Open Networking User Group (ONUG) Orchestration and Automation recommendations.

In future blog posts, we’ll discuss the remaining two building blocks of the Network Automation Blueprint in depth. In the meantime, you can read about IT/OT production infrastructure or click here to get a sneak peek of the blueprint, which includes a 10-step checklist to get started with automation now.

Want to learn more about key automation infrastructure?

To learn more about Nodegrid as a key automation infrastructure component, contact ZPE Systems today.


How an IT/OT Convergence Strategy Accelerates Network Automation

An IT/OT convergence strategy visualized with many digital services organized together in a data center.
In the face of a looming recession, COVID-19 uncertainty, global political instability, and an increasing frequency of natural disasters, network resiliency should be on every organization’s mind. Network resiliency is the ability to continue providing services and connectivity even during disruptions, such as when buildings are locked down or layoffs reduce the number of staff available to maintain or operate the technology. Network automation is the key to ensuring continuous, consistent, and streamlined management during tumultuous times.

A network automation framework provides all the tools and processes needed to create an efficient, resilient, fully automated network infrastructure. The four building blocks of a resilient network automation framework include:

  1. IT/OT production infrastructure
  2. Automation infrastructure
  3. Orchestration infrastructure
  4. AIOps

In this blog, we’ll discuss why an IT/OT convergence strategy is critical for forming the foundation of a network automation framework. Future posts will discuss the other three building blocks and how they work together to ensure business resiliency.

What is IT/OT convergence?

IT/OT convergence is exactly what it sounds like: bringing your information technology (IT) and operational technology (OT) together under unified management.

Operational technology, or OT, controls equipment interacting with the physical world, such as industrial machinery or HVAC systems. OT automation runs on specialized industrial computers, such as programmable logic controllers (PLCs) and supervisory control and data acquisition (SCADA) systems. Those computers are usually completely isolated from IT networks, which means operators have no way to access them remotely. If operators can’t get onsite, whether due to a COVID-19 lockdown or natural disaster, they lose the ability to manage OT.

For example, Southern California is home to many high-tech manufacturing plants, especially in the aerospace and defense industries. Due to the effects of climate change, there’s been an increase in the frequency and severity of wildfires in this region, leading to more frequent evacuation orders and plant closures. That means operators can’t access their computer systems to control and monitor OT devices, forcing these businesses to pause their operations.

In addition, OT control systems aren’t usually within the purview of IT management because they use specialized computers and automation software that needs to be operated and supported by OT experts. That means IT infrastructure automation and OT infrastructure automation are siloed, which can lead to cost and management inefficiencies. With recession anxieties running high, many organizations are looking for ways to reduce such inefficiencies by converging their IT and OT infrastructure.

IT/OT convergence involves bringing your operational technology under the same management and automation umbrella as your IT network infrastructure. In a converged IT/OT infrastructure, OT control systems like PLCs and SCADAs connect to the same management hardware (e.g., serial consoles or cloud-managed gateway routers) as IT servers and network devices. This gives administrators a single platform from which to orchestrate automation across both IT and OT infrastructure.

What does IT/OT convergence look like?

First, you have the IT and OT equipment being managed. On the IT side, this includes things like servers, storage, security appliances, and SD-WAN devices. On the OT side, you have devices like environmental sensors, cameras, and power distribution units, as well as industrial computers used to monitor and control physical equipment. Some examples of those industrial systems include:

  • Programmable logic controllers (PLCs), which control industrial machines, robotic devices, and other manufacturing processes.
  • Supervisory control and data acquisition (SCADA), which is a control system for high-level supervision of industrial processes, including PLCs.
  • Building management systems (BMSs) which manage building equipment such as HVAC, fire suppression, lighting, and automatic doors.

These IT devices and OT computers all connect to common management hardware. For large deployments, these might be high-density serial consoles; in smaller deployments, these might be network edge routers with integrated serial console management functionality. This management hardware then connects to an orchestration platform that’s used to monitor, deploy, and manage automation across the converged IT/OT infrastructure.

How an IT/OT convergence strategy accelerates network automation

Bringing operational technology onto IT networks makes it possible for operators to remotely access their OT systems when they’re unable to come onsite. That means that your business can continue to function even during pandemic lockdowns, extreme weather events, or wars that prevent your staff from entering the building.

IT/OT convergence also allows you to bring operational technology under the same management umbrella as IT, so you can use the automation tools you’re already familiar with on the IT side to automate your OT. This reduces the overall management complexity of the IT/OT infrastructure and facilitates holistic orchestration of a fully automated—or even hyperautomated—enterprise network. This level of automation can help organizations reduce wasteful processes, eliminate redundancies, and increase operational efficiency so they can weather recessions and other economic difficulties.

Building IT/OT convergence into a resilient network automation framework

Your IT and OT infrastructure represent the target devices that are automated as part of a network automation framework. For maximum resiliency, your IT/OT convergence strategy should include:

Out-of-band (OOB) connectivity

Out-of-band (OOB) connectivity provides an alternative path to remote IT and OT infrastructure when the primary ISP connection goes down. In addition, OOB management devices (like serial consoles) directly connect to IT/OT devices, so administrators can manage them without an IP address or LAN connectivity. While OOB is not itself a component of IT/OT infrastructure, it’s a crucial element of the management devices and orchestration solution you’ll use to converge your IT and OT infrastructure.

Wired and wireless connectivity

Your converged IT/OT management solution also needs to support a variety of wired and wireless connectivity options to ensure resilience and flexibility. For example, if the ISP’s wired network infrastructure is disrupted due to extreme weather or warfare, you should be able to fail over to a 5G or 4G cellular connection. Or you may have some devices that lack RJ-45 ports, which means you need a management solution that supports USB. The goal is for your management solution to be adaptable to any scenario so that sudden changes or unforeseen issues don’t cripple your network operations.

Power control with UPS backup

When you manage remote network infrastructure, one of the most frustrating issues to deal with is a device that locks up after a system crash or failed firmware update. Often, a power cycle is all that’s needed to fix the problem, but that requires an on-site technician, which means an expensive and time-consuming truck roll. To ensure network resiliency while reducing the incidence of truck rolls, you need an IT/OT management solution that includes rack PDUs and IPMI options to facilitate remote power control of all connected devices.
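The remote recovery workflow described above can be sketched as a simple retry loop. This is an illustration under stated assumptions, not a real PDU or IPMI client; `recover_device` and `FakeOutlet` are hypothetical, and the health check and power control are passed in as plain callables so the logic stays independent of any particular hardware interface.

```python
def recover_device(is_responsive, power_cycle, max_attempts=2):
    """Power-cycle an unresponsive device up to max_attempts times."""
    if is_responsive():
        return "healthy"
    for attempt in range(1, max_attempts + 1):
        power_cycle()
        if is_responsive():
            return f"recovered after {attempt} power cycle(s)"
    return "escalate: dispatch technician"


class FakeOutlet:
    """Stand-in for a switched PDU outlet; the device recovers after one cycle."""

    def __init__(self):
        self.cycles = 0

    def power_cycle(self):
        self.cycles += 1

    def is_responsive(self):
        return self.cycles >= 1


outlet = FakeOutlet()
print(recover_device(outlet.is_responsive, outlet.power_cycle))
```

Capping the number of attempts keeps automation from power-cycling a failed device indefinitely; once the limit is reached, the sketch escalates to a human instead.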

In addition, an uninterruptible power supply (UPS) improves resiliency by providing backup power in case of an outage. This gives network teams time to investigate the problem and (hopefully) implement a fix before losing power. As part of the network resilience framework, all UPS units should hook into the management solution to allow for automated monitoring, optimization, and troubleshooting.

Environmental Sensors

Environmental sensors are used to monitor conditions in the location where IT and OT infrastructure is deployed. Traditionally, these sensors monitor racks in remote data centers, but they’re especially critical for IT/OT infrastructure that resides in less-ideal locations. For example, environmental sensors can provide data on the temperature and humidity levels in remote warehouses, offshore oil rigs, outdoor “smart city” deployments, and other locations where environmental conditions can’t be controlled.

Environmental sensors alert administrators when conditions grow too extreme for IT/OT equipment to function optimally. That means that teams can respond quickly and prevent equipment failures from bringing down critical resources. In addition, your infrastructure orchestration solution can analyze the data from these sensors to predict future issues or recommend optimizations to improve efficiency and resiliency.
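The alerting behavior can be sketched as a threshold check on each sensor reading. This is a minimal example, not a Nodegrid sensor API; the function `check_reading`, the metric names, and the limits are hypothetical, and the ranges shown are placeholders rather than recommended values.

```python
# Illustrative limits only; use the ranges your equipment vendor specifies.
LIMITS = {
    "temperature_c": (10.0, 35.0),
    "humidity_pct": (20.0, 80.0),
}


def check_reading(reading, limits=LIMITS):
    """Return alert strings for each metric that is missing or out of range."""
    alerts = []
    for metric, (low, high) in limits.items():
        value = reading.get(metric)
        if value is None:
            alerts.append(f"{metric}: no data")
        elif value < low:
            alerts.append(f"{metric}: {value} below {low}")
        elif value > high:
            alerts.append(f"{metric}: {value} above {high}")
    return alerts


# Hypothetical reading from a sensor in a remote warehouse.
print(check_reading({"temperature_c": 41.5, "humidity_pct": 45.0}))
```

Treating a missing value as an alert matters in remote locations, where a silent sensor can be as significant as an extreme reading.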

How Nodegrid accelerates IT/OT convergence

The most successful IT/OT convergence strategy relies on vendor-agnostic platforms that can connect to both IT and OT infrastructure. For example, the Nodegrid solution includes management hardware that can connect to modern and legacy devices in a mixed-vendor IT/OT infrastructure, such as the Nodegrid Serial Console Plus (NSCP) for large and hyperscale data center deployments and the Nodegrid Net Services Router (NSR) for flexible edge and branch deployments. These devices allow you to use the ZPE Cloud management platform to extend automation and orchestration to all your IT and OT targets to create a unified, efficient, and resilient converged network infrastructure.

ZPE’s Network Automation Blueprint

IT/OT production infrastructure works together with automation infrastructure, orchestration, and AIOps to ensure network resiliency during uncertain times. The Network Automation Blueprint from ZPE Systems provides a reference architecture for achieving Gartner’s definition of hyperautomation as well as meeting the Open Networking User Group (ONUG) Orchestration and Automation recommendations.

In future blog posts, we’ll discuss the remaining three building blocks of the Network Automation Blueprint in depth. In the meantime, click here to get a sneak peek of the blueprint, which includes a 10-step checklist to get started with automation now.

Ready to learn more about implementing an IT/OT convergence strategy?

To learn more about implementing an IT/OT convergence strategy with Nodegrid, contact ZPE Systems today.
