SD-WAN Benefits: Your Definitive Guide

SD-WAN, or software-defined wide area networking, is on the rise as organizations grow more distributed and networks become more complex. The SD-WAN market was valued at an estimated $3.4 billion in 2022 and is predicted to reach $13.7 billion by 2027. Organizations use SD-WAN to reduce MPLS costs, improve WAN performance, gain greater automation and orchestration capabilities, and strengthen their security posture. This post explains how SD-WAN works and examines its benefits.

How does SD-WAN work?

SD-WAN uses software abstraction to decouple WAN control functions from the underlying hardware. When possible, it leverages traditional MPLS to handle requests for enterprise resources in the data center, but it can also use less-expensive cellular and public internet links to handle cloud-destined traffic. SD-WAN uses virtualized and cloud-based security technologies to securely connect remote sites to SaaS, web, and cloud resources, reducing MPLS bandwidth and eliminating the need for VPNs.

SD-WAN gateways
Organizations install SD-WAN gateways at campuses, branches, data centers, and any other business locations accessing the WAN architecture. These gateways virtualize WAN management at their sites, giving admins control via centralized software (which is often cloud-based).

Regional points-of-presence (PoPs) act as SD-WAN gateways for employees working from home, giving them access to enterprise network resources without a VPN. Often, major SD-WAN providers have an existing network of regional PoPs to take advantage of, but large or especially geographically diverse organizations may also wish to deploy their own PoPs in specific areas.

There are several different SD-WAN deployment architectures for companies to choose from depending on their specific requirements and capabilities. Learn more in A Guide to SD-WAN Deployment Models.

SD-WAN benefits guide

SD-WAN benefits organizations with complex, highly distributed networks in the following ways:

SD-WAN Benefits

Reduces costs

  • MPLS bandwidth reduction
  • Fewer circuit installations
  • Faster branch deployments

Improves performance

  • Fewer data center bottlenecks
  • Faster issue response
  • Holistic performance monitoring

Enables automation & orchestration

  • Automated configurations
  • Policy-based workflow automation
  • Centralized orchestration

Enhances branch security capabilities

  • On-ramp to SSE technology
  • Enterprise policy extension
  • Secure access for remote users

SD-WAN reduces costs

MPLS bandwidth is far more expensive than standard broadband, fiber, or cellular, often hundreds of dollars per megabit per month. For branches with existing MPLS circuits installed, SD-WAN reduces bandwidth costs by redirecting traffic that’s destined for the cloud or internet across less-expensive channels, reserving the MPLS for enterprise traffic alone.

For some branch networking use cases, such as IoT (internet of things) deployments relying entirely on cloud-based software and data processing, organizations may opt to forgo a new MPLS installation and rely solely on SD-WAN and cloud-based security solutions. Not only does this save money on installation costs and bandwidth, but it significantly reduces the time it takes to spin up a new branch, enabling that branch to generate revenue sooner.

SD-WAN improves performance

SD-WAN uses technologies like application awareness and guaranteed minimum bandwidth to provide efficient, intelligent routing for improved performance. For example, in organizations with SASE (secure access service edge) deployments, SD-WAN automatically separates cloud- and SaaS-destined traffic to flow through the cloud-based SASE stack instead of the central firewall. This reduces the load on the firewall and ensures improved performance for users who do need to access enterprise resources, while at the same time providing a “shortcut” for remote users trying to reach the cloud.
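The path-selection logic described above can be sketched in a few lines. This is a minimal illustration, not a vendor implementation; the application categories, path names, and policy table are hypothetical:

```python
# Minimal sketch of application-aware path selection, as an SD-WAN
# controller might perform it. The categories, path names, and policy
# table below are invented for illustration, not any vendor's API.

POLICY = {
    "saas":       "internet_broadband",  # cloud/SaaS traffic bypasses the data center
    "cloud":      "internet_broadband",
    "enterprise": "mpls",                # data-center-bound traffic keeps the MPLS circuit
}

def select_path(app_category: str, mpls_available: bool = True) -> str:
    """Pick an egress path for a flow based on its application category."""
    path = POLICY.get(app_category, "internet_broadband")
    # Fail over to broadband if the preferred MPLS circuit is down.
    if path == "mpls" and not mpls_available:
        return "internet_broadband"
    return path
```

In a real deployment this decision is made per-flow by the gateway, using deep packet inspection or SNI/DNS data to classify the application, but the policy table idea is the same.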

SD-WAN also responds to availability and performance issues much faster than human admins are capable of, automatically redirecting traffic to avoid bottlenecks or downed nodes to ensure a seamless end-user experience. In addition, SD-WAN’s software abstraction makes it easier to centralize WAN management, giving admins full visibility into every part of the WAN architecture for holistic performance monitoring.

SD-WAN enables automation & orchestration

SD-WAN’s software abstraction opens up many automation opportunities because WAN configurations and workflows are no longer tied to the underlying hardware. For example, device, system, and service configurations can be written as scripts or playbooks and deployed automatically to reduce the time and effort required to spin up a new branch. Policy-based automation can handle additional tasks, such as load balancing, failover, and routing, faster and more efficiently than humans can.
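As a rough sketch of scripted provisioning, a site definition can be rendered into a device configuration from a reusable template. The template fields and config syntax here are invented for illustration, not any vendor's actual format:

```python
# Sketch of template-driven branch provisioning: one definition per
# site is rendered into a device configuration, which an orchestrator
# would then push to the gateway. Field names and config syntax are
# hypothetical.

TEMPLATE = (
    "hostname {hostname}\n"
    "interface wan0\n"
    "  ip address {wan_ip}\n"
    "  bandwidth-min {min_mbps} mbps\n"
)

def render_config(site: dict) -> str:
    """Render a device config from a site definition dict."""
    return TEMPLATE.format(**site)

# One reusable template, many branches:
branch = {"hostname": "branch-042", "wan_ip": "203.0.113.7/30", "min_mbps": 50}
config = render_config(branch)
```

The same template serves every branch, so standing up a new site reduces to writing one small definition rather than hand-configuring each device.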

SD-WAN also makes it possible to bring the WAN under one management platform for holistic monitoring and centralized orchestration. This gives admins control over large, distributed, and complex WAN architectures. For example, a centralized SD-WAN platform makes it easier to orchestrate traffic across hybrid cloud architectures because admins can monitor and manage WAN workflows across their various branches, private clouds, and public clouds.

SD-WAN automation and orchestration reduce the number of tedious, manual workflows that fall on overworked networking teams. This helps to decrease the rate of human error in device and security configurations, which in turn decreases the risk that mistakes will cause outages or be exploited by cybercriminals. Centralized SD-WAN orchestration also helps organizations improve their security posture by providing more holistic visibility into things like patch statuses, system changes, and traffic patterns.

SD-WAN enhances branch security capabilities

Another way SD-WAN improves network security is by making it easier to enforce security policies at branches and edge sites without deploying additional hardware or backhauling traffic through a central firewall. Since the SD-WAN control plane is decoupled from the underlying WAN hardware, organizations can also deploy advanced security technologies with fewer device compatibility issues.

SD-WAN enables organizations to use cloud-based security solutions like SSE (security service edge). SD-WAN’s intelligent, application-aware routing can automatically separate traffic from branches and other remote sites based on whether it’s destined for enterprise data center resources or resources that live in the cloud. Cloud-destined traffic is then diverted through the SSE stack, bypassing the main firewall and reducing bottlenecks at the data center.

SSE’s security stack typically includes Firewall-as-a-Service (FWaaS) technology, which provides the same (or better) capabilities as a hardware appliance. SSE is also used to extend enterprise security and access control policies to traffic between remote sites and the cloud.

In addition, the SSE stack supports Zero Trust Network Access (ZTNA), which provides secure remote access to enterprise and cloud resources to WFH employees and other systems outside the WAN. In this way, ZTNA is similar to a VPN, only more secure. ZTNA only lets remote users see and interact with one specific resource at a time, and makes them re-authenticate if they wish to access something else. If a remote user account is compromised, this re-authentication step increases the chances that unusual behavior or failed multi-factor authentication (MFA) attempts will trigger an account lock, decreasing the blast radius of an attack.

When SD-WAN and SSE are unified under a single orchestration platform, the result is SASE (secure access service edge). Organizations can purchase a complete SASE solution, or use a vendor-neutral platform to combine the SD-WAN and SSE solutions of their choice for greater customization.

Learn more about SD-WAN benefits

SD-WAN helps organizations reduce MPLS-related costs, improve WAN performance, enable greater automation and orchestration capabilities, and improve overall network security. Learn more about SD-WAN from the branch networking experts at ZPE Systems.


Ready to learn more about SD-WAN benefits?

To see how a vendor-neutral orchestration platform simplifies branch management and accelerates SD-WAN benefits, request a free demo of the Nodegrid solution from ZPE Systems.

Simplifying Retail Network Management


Fast and reliable networks are critical to the success of retail operations. Without network access, stores can’t process payments, handle customer data, or update inventory, which makes outages highly disruptive. According to a recent study, downtime could cost over $300,000 per hour in lost business, which is why it’s crucial that admins have the necessary tools to effectively monitor, manage, and optimize retail networks. This blog discusses some of the specific challenges involved in retail network management and how the right edge gateway solution can help overcome these difficulties.

Retail network management challenges

Managing a retail network comes with unique challenges, especially as the size and geographical distribution of the organization grows. Examples of these challenges include:

  1. Extending fast, reliable connectivity to the entire store for payment processing machines, inventory scanners, and other crucial devices. This is especially challenging in big box stores and other locations with large footprints, as well as in service-based chains with mechanics’ bays, drive-thrus, and other outdoor or semi-outdoor devices.
  2. Maintaining optimal environmental conditions for networking equipment that’s often installed in closets, storage rooms, warehouses, and other out-of-the-way locations. The priority is typically to keep these devices hidden from customers, so they’re kept in areas that may not be climate controlled and may not have staff physically checking them every day. This increases the risk of environmental issues (like heat and humidity) causing a device failure and means no one is likely to notice the issue until it’s too late.
  3. Remotely troubleshooting and recovering from issues without any on-site technicians. If the ISP connection, WAN, or LAN goes down, there’s often no way to remotely access on-site equipment to diagnose and fix the problem. That means network outages require truck rolls to solve, with stores losing money waiting for technicians to travel on-site.
  4. Efficiently monitoring and managing a distributed retail network architecture made up of many different network solutions and platforms. The lack of centralized management increases the risk of human error and makes it difficult to preemptively address potential problems or optimize the speed and performance of the network.

Retail network management teams need a robust solution that addresses these particular challenges. For example, they need small and powerful network devices that use centralized management to reduce management complexity. They also need a way to monitor environmental conditions and recover from outages without having to be on-site.

Simplifying retail network management

Now, let’s discuss how a robust branch gateway solution can help organizations address these challenges.

Compact, all-in-one networking

The layout of a retail store is carefully planned to ensure an optimal experience for customers, which means networking devices need to be as unobtrusive as possible. The ideal branch gateway for retail is compact and combines multiple networking functions, reducing the number of devices that need to be installed. Retail notoriously operates on a small profit margin, so the branch gateway also needs to be affordable without sacrificing performance.

Environmental monitoring

Environmental monitoring sensors collect data on conditions like temperature, humidity, and air quality in the location where networking equipment is installed. These sensors typically connect to the retail branch gateway via USB and report back to the management platform, giving admins the ability to remotely monitor the environment. This is crucial when most retail networks are managed by admins in a centralized office which may be hundreds or thousands of miles away from the stores themselves. Environmental monitoring allows them to identify and resolve potential problems before they cause device failures and outages. For example, if environmental sensors detect high temperatures, admins can get on-site personnel to turn up the air conditioning or call in an HVAC repair before devices overheat and bring down the network.
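The kind of threshold check a management platform might run on reported sensor data can be sketched as follows. The metrics, limits, and reading format are illustrative assumptions, not a real sensor API:

```python
# Sketch of the threshold check a management platform might run on
# sensor readings reported by a branch gateway. The metric names,
# limits, and reading format are illustrative assumptions.

THRESHOLDS = {"temperature_c": 35.0, "humidity_pct": 80.0}

def check_reading(reading: dict) -> list:
    """Return a list of alert strings for any metric over its threshold."""
    alerts = []
    for metric, limit in THRESHOLDS.items():
        value = reading.get(metric)
        if value is not None and value > limit:
            alerts.append(f"{metric} at {value} exceeds limit {limit}")
    return alerts
```

A real platform would also track trends over time, but even a simple check like this gives remote admins early warning before equipment overheats.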

Out-of-band (OOB) management

Out-of-band (OOB) management uses redundant network interfaces (often cellular) to provide an alternative path to remote infrastructure. A branch gateway with OOB allows admins to remotely connect to devices in the store without relying on an IP address from the LAN, which means they’ll always have access even if the production network goes down. Without OOB management, a retail location can stay offline for hours or even days while waiting for a technician to arrive on-site, diagnose, and repair the issue. With OOB, admins can remotely access the infrastructure to restore services, often so fast that customers don’t even notice. That means teams can recover from more outages without truck rolls, saving time and money.

Vendor-neutral orchestration

A vendor-neutral branch gateway can interface with all the other devices in a retail network infrastructure, even if they’re from a different vendor’s ecosystem. This gives admins a single platform from which to monitor and manage every device in the store. Even better is when all of the branch gateways in the entire retail network architecture hook into a single, centralized, cloud-based orchestration platform. Admins can then monitor, control, and optimize network infrastructure for all the retail locations from one place for ultimate efficiency.

In addition, a vendor-neutral retail network management platform enables the use of third-party automation solutions. Automation reduces the risk of human error and makes it easier for teams to effectively manage and optimize even complex retail network architectures.

Retail network management with Nodegrid

Compact, all-in-one branch gateways like Nodegrid use environmental monitoring, OOB management, and vendor-neutral platforms to simplify retail network management. The Nodegrid Mini SR, for example, is an inexpensive retail branch gateway that’s roughly the size of an iPhone, so you can easily deploy it anywhere in your store without disrupting the customer experience. Despite its small size and low price point, the Mini SR still delivers Gen 3 OOB management capabilities while supporting Nodegrid environmental monitoring sensors and third-party automation. The Nodegrid platform is also completely vendor-neutral, giving retail network admins a single pane of glass from which to monitor, orchestrate, and optimize the entire distributed network architecture.

Ready to learn more about Nodegrid?

To learn more about simplifying retail network management with Nodegrid, click here to download the Mini SR datasheet, or contact ZPE Systems today.

Implementing a Network Modernization Strategy for Large-Scale Organizations

The COVID-19 pandemic forced many large-scale organizations to decentralize their business operations to enable remote work, which shined a spotlight on how outdated their enterprise networks are. As other world events like wars, a recession, and virus resurgences continue to impact business, organizations must modernize their network infrastructure if they want to survive. However, their survival is also contingent on their ability to meet SLAs and maintain 24/7 availability, so it’s crucial to minimize the disruption caused by infrastructure upgrades. This blog provides advice to large-scale organizations on how to implement a network modernization strategy that minimizes disruptions while leaving room for future growth and innovation.

The importance of network modernization

Network infrastructure updates are expensive and can be disruptive, leaving many large companies wondering if the payoff is worth the risks. However, when COVID-19 struck, these organizations were left scrambling to replace their outdated and insecure VPN solutions with more robust remote connectivity technology. Similarly, in the current recession, enterprises that put off network modernization in the past are now finding themselves without the remote management and orchestration capabilities they need to keep their infrastructure running optimally with reduced staff.

Even without the looming threat of major world disruptions, outdated network infrastructure poses a risk to large-scale organizations. Obsolete devices are no longer patched by the vendor, which means any vulnerabilities that exist will remain open for hackers to exploit. Older equipment is also more likely to break, and may not be supported by the provider, making it more difficult and expensive to recover from a failure. Plus, outdated infrastructure hampers an enterprise’s ability to innovate with new technologies to stay competitive in the market.

Upgrading network infrastructure is expensive, time-consuming, and requires careful planning to prevent business interruption. However, investing in network modernization now will save you from more costly disruptions in the future.

A network modernization strategy for large-scale organizations

Enterprises need to carefully plan their path to network modernization to ensure they can meet their customer SLAs by avoiding outages and performance degradation. Here are some tips for implementing a network modernization strategy that minimizes disruption while leaving room for future growth.

Bridge the gap with a vendor-agnostic platform

To ensure a smooth upgrade process, organizations will gradually upgrade their infrastructure by replacing individual solutions one at a time. There’s typically an extended window of time in which there are both legacy and modern devices that need to be monitored, managed, and supported. This creates additional complexity for administrators who need to learn how to use the new solutions, integrate them with the existing infrastructure, and ensure there’s little-to-no impact on end users. It’s especially challenging when they need to use different management platforms to access and control each solution.

That’s why it’s important to implement a vendor-agnostic network management platform that supports legacy and multi-vendor solutions. A vendor-agnostic platform gives administrators a single pane of glass from which to control the entire heterogeneous network architecture, simplifying day-to-day management and allowing them to focus on optimizing performance and implementing future upgrades. Plus, a unified platform makes it possible to extend new technological capabilities (like remote OOB management and automation) to older infrastructure, accelerating network modernization efforts.

Reduce downtime with remote out-of-band management

Any experienced admin knows that installations and updates are risky procedures. Even with the best-laid plan, errors can occur that prevent new systems from coming online, cause integration issues with existing infrastructure, or even take down dependent network services. The risk is even greater when the upgrades occur remotely without any technicians on-site to power cycle devices or reconfigure systems offline. What if there’s an outage or severe disruption, but COVID lockdowns or natural disasters prevent staff from entering these locations?

Remote out-of-band (OOB) management creates an alternative path that admins use to access remote infrastructure. It creates an out-of-band network that’s dedicated to infrastructure management and orchestration and that doesn’t rely on the availability of the production network. That means administrators can access and troubleshoot offline devices remotely, reducing the duration and impact of downtime. Remote OOB management makes it safer for large-scale organizations to implement a network modernization strategy and ensures the continued stability and availability of enterprise infrastructure.

Streamline deployments with automation

Even when new infrastructure deployments run smoothly, they take considerable time and effort on the part of network administrators. Large, global organizations have complex and highly distributed network architectures with thousands of moving parts that need to be upgraded or replaced. Just configuring and installing all of these new solutions can add significant delays to the network modernization process. Plus, configuring so many devices is tedious and prone to human error, causing more delays as admins troubleshoot and fix deployment failures. For example, a typo in an IP address on one device could prevent dependent services from deploying correctly, forcing teams to retrace their steps and waste time identifying the error.

Automation is the key to streamlining device deployments and reducing configuration errors. For example, Zero Touch Provisioning (ZTP) allows admins to provision new devices automatically over the network using definition files. These files can be reused as many times as needed to deploy many identical solutions across the enterprise network, significantly reducing the time and effort required to modernize infrastructure. Plus, configuration files can be tested pre-deployment to ensure there are no errors or security vulnerabilities.

Vendor-agnostic network management platforms, OOB management, and automation are crucial components of a smooth network modernization strategy. Implementing this strategy is easier if you choose a management solution that integrates all these capabilities into a single, unified platform.
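Pre-deployment testing of a definition file might look something like this sketch, which checks for required fields and a malformed IP address (the typo example above). The field names are hypothetical; real ZTP definition formats vary by vendor:

```python
# Sketch of pre-deployment validation for a ZTP definition file.
# The required fields are illustrative assumptions; real ZTP
# definition formats vary by vendor.

import ipaddress

REQUIRED = ("hostname", "mgmt_ip", "firmware")

def validate_definition(defn: dict) -> list:
    """Return a list of problems; an empty list means safe to deploy."""
    problems = [f"missing field: {f}" for f in REQUIRED if f not in defn]
    if "mgmt_ip" in defn:
        try:
            ipaddress.ip_address(defn["mgmt_ip"])
        except ValueError:
            # Catches typos like an out-of-range octet before deployment.
            problems.append(f"bad mgmt_ip: {defn['mgmt_ip']}")
    return problems
```

Running a check like this against every definition file before pushing it out catches configuration typos in seconds instead of during a failed rollout.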

Make Nodegrid a part of your network modernization strategy

The Nodegrid platform from ZPE Systems delivers vendor-agnostic control, Gen 3 OOB management, and end-to-end network automation capabilities in a single box. Nodegrid has helped large-scale organizations like the Internet Association of Australia update their network infrastructure without disrupting business.

Nodegrid serial consoles support both legacy and modern Cisco pinouts, allowing them to dig their hooks into any device in your network infrastructure. That means you can use the ZPE Cloud solution to extend automation and orchestration to your entire heterogeneous architecture, supercharging your network modernization efforts.

Nodegrid uses high-speed OOB interfaces (e.g., 5G/4G cellular) to provide admins with a fast and reliable connection for remote upgrades, management, and orchestration. Nodegrid allows you to power cycle devices, enter BIOS menus, manage power load distribution, and more from anywhere in the world with an internet connection. This makes it easier and safer for large-scale organizations to remotely upgrade their network infrastructure and ensures continuous management availability to prevent downtime in the future.

The vendor-agnostic Nodegrid platform also allows you to extend automation features like ZTP to both legacy and modern solutions in your network infrastructure. Nodegrid supports integrations with your choice of third-party automation tools, or you can use Nodegrid hardware to directly host custom scripts and automation apps. This both streamlines the network modernization process and gives you the ability to grow and evolve your network with emerging automation technologies like AIOps.

Nodegrid streamlines network modernization strategies by providing vendor-agnostic management, remote OOB management, and end-to-end automation support in a single platform.

Want to learn more about Nodegrid’s role in enterprise?

To learn more about Nodegrid’s role in an enterprise network modernization strategy, contact ZPE Systems today.

Using AIOps and Machine Learning To Manage Automated Network Infrastructure


Automation is the key to maintaining optimal network performance and availability during tumultuous times. A resilient, automated network keeps functioning even if administrators can’t physically access the infrastructure or when a recession forces companies to reduce their IT workforce. A network automation framework includes all the tools, technologies, and practices required to build a resilient and fully automated enterprise network infrastructure.

The four building blocks of a resilient network automation framework are:

  1. IT/OT production infrastructure
  2. Automation infrastructure
  3. Orchestration infrastructure
  4. AIOps

In previous blogs, we focused on the building blocks that enable network automation and orchestration. In this blog, we’ll discuss how AIOps and machine learning help teams manage their automation and orchestration—and the massive amounts of data produced by their automated systems—more efficiently.

What is AIOps?

AIOps—artificial intelligence for IT operations—was originally introduced by Gartner in 2017. It uses AI technologies like machine learning (ML) and natural language processing (NLP) to analyze IT operations data. This data is pulled in from many different sources, including monitoring and visibility platforms, environmental monitoring sensors, event logs, and firewalls. AIOps utilizes that data to automate tasks like event correlation, anomaly detection, and root cause analysis (RCA) as well as to predict future outcomes and provide valuable business insights.

What’s the difference between AI and machine learning?

Before we delve any deeper into the specific uses for and benefits of AIOps, it’s important to clarify what we mean when we talk about technologies like AI and machine learning.

AI stands for artificial intelligence, which is defined as a computer’s ability to display human-like intelligence through behaviors like learning from new data, drawing conclusions based on that data, and coming up with solutions to problems.

Machine learning, on the other hand, describes a computer’s ability to process large quantities of data and learn from it. Learning is a major requirement for AI, which means that all machine learning applications could be considered AI. However, not all AI is machine learning—artificial intelligence uses additional technology to make decisions, solve problems, and perform other automated functions.

Essentially, AI describes a broad range of technologies, whereas machine learning is a more specific subset of technologies included in the AI umbrella. In the context of AIOps, however, machine learning is often the only artificial intelligence technology in use.

Using AIOps and machine learning to manage automated network infrastructure

In an automated enterprise network, AIOps and machine learning use advanced algorithms to provide in-depth analysis of all the data collected from production infrastructure, automation components, and orchestration systems. AIOps solutions can even take things a step further by making decisions and solving problems based on the results of that data analysis.

Some examples of how AIOps and machine learning can be used to manage automated network infrastructure include:

Security

Cyberattacks and data breaches are major threats to the reliability and performance of network infrastructure. In addition to the financial losses caused by sensitive data exfiltration and reputation loss, security breaches are also a leading cause of downtime, which directly impacts business revenue. According to the ITIC’s 2022 Global Server Hardware Security survey, 76% of enterprises cited security breaches as the top cause of downtime. That means network security is paramount to the resilience of an automated infrastructure.

For many years, network security relied on signature-based detection for jobs like intrusion prevention, antivirus, and spam filtering. Signature-based detection involves comparing an incoming request to a database of known threats to see if it matches—if not, it’s assumed to be safe and allowed into the network. This approach only works if the database is kept up to date and if all incoming threats have been identified in the past. Signature-based detection often fails to catch zero-day exploits or novel malware that it hasn’t seen before, plus it tends to generate a lot of false positives.

AIOps security solutions overcome this problem by learning from past experiences. Machine learning extracts information from past threats and builds models that can recognize, predict, and categorize new threats it has never seen before. This makes AIOps adept at preventing new threats as well as detecting ones already on the network.

You can also use AIOps to analyze data from infrastructure logs and other security solutions to spot the more subtle signs of a breach that’s already happened or that’s currently taking place. For example, AIOps and machine learning may detect an unusually large amount of data leaving the network, which could indicate that a malicious actor is exfiltrating sensitive information. Another security use for AI is called User and Entity Behavior Analytics (UEBA), which inspects account activity on a network and reports anomalous behavior that could indicate an account has been compromised.
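A simplified version of this kind of anomaly detection can be sketched with a statistical baseline on outbound traffic volume. Production AIOps tools use learned models rather than a fixed z-score threshold, so treat this purely as an illustration:

```python
# Sketch of statistical anomaly detection on outbound traffic volume:
# flag a sample far above the historical baseline, which could indicate
# data exfiltration. Real AIOps platforms use learned models rather
# than a fixed z-score; this is an illustration only.

import statistics

def is_anomalous(history: list, sample: float, z_limit: float = 3.0) -> bool:
    """Flag a sample more than z_limit standard deviations above the mean."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return sample != mean
    return (sample - mean) / stdev > z_limit
```

Fed with, say, gigabytes transferred per hour, a check like this flags a sudden surge in outbound data that a static threshold tuned to normal peaks might miss.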

AIOps improves upon automated network security solutions by using adaptive learning and predictive analysis to detect new and unusual threats with a greater degree of accuracy. It also takes advantage of the massive amounts of data produced by security appliances and network infrastructure to identify the subtle clues left behind by sophisticated cybercriminals. This makes AIOps a valuable tool for maintaining the security and availability of an automated network infrastructure.

Monitoring

An automated network infrastructure generates a massive quantity of logs that can be used to assess health and performance as well as to identify potential issues before they cause any outages or downtime. However, humans aren’t very good at sifting through large amounts of data to figure out what’s relevant and what isn’t.

Many monitoring solutions use basic automation to help surface important data, for example by letting admins set performance thresholds that generate automatic alerts when devices fall out of the optimal operating range. However, this kind of automation creates a lot of false positives, which are tedious to sort through and could lead to admin neglect or complacency. It can also only detect specific symptoms and issues that fall within the scope of the monitoring thresholds programmed by a sysadmin, which means it can’t adapt to changing circumstances or predict new problems that weren’t anticipated by the admin in advance.

An AIOps monitoring solution collects all the logs produced by automated infrastructure and analyzes them in real time. Sysadmins can still set performance thresholds and program automatic alerts, but AIOps also uses machine learning to “think outside the box” by recognizing patterns and detecting anomalies it wasn’t programmed to look for. That means issues are identified faster, potentially before they cause any noticeable problems for end-users.

Machine learning also gives AIOps monitoring solutions the ability to track performance over time and predict future outcomes based on historical data. For example, organizations can use AIOps analysis to plan infrastructure upgrade schedules based on when device performance is predicted to start degrading, or in advance of a predicted spike in demand for a particular location. This gives CIOs and IT managers the ability to make smarter decisions about where and when to invest money and how to prioritize new initiatives.
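As a rough sketch of this kind of predictive analysis, the example below fits a least-squares trend line to historical utilization and estimates when a capacity limit will be reached. Real AIOps tools use far richer models; the utilization figures here are illustrative:

```python
def forecast_capacity(history, limit):
    """Fit a least-squares trend line to historical utilization (%)
    and estimate the period index at which `limit` will be reached."""
    n = len(history)
    x_mean = (n - 1) / 2
    y_mean = sum(history) / n
    denom = sum((x - x_mean) ** 2 for x in range(n))
    slope = sum((x - x_mean) * (y - y_mean)
                for x, y in zip(range(n), history)) / denom
    intercept = y_mean - slope * x_mean
    if slope <= 0:
        return None  # utilization isn't trending upward
    return (limit - intercept) / slope

# Monthly storage utilization creeping up about 2% per month
utilization = [40, 42, 44, 46, 48, 50]
print(forecast_capacity(utilization, limit=90))  # → 25.0
```

Here the model predicts the 90% limit is hit around month 25, which is the kind of signal that feeds upgrade planning.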

AIOps monitoring solutions work well with data lakes, which are large repositories for unstructured data. Data lakes are an efficient way to store and stage large quantities of data, such as monitoring and security logs, so the data can be consumed by AIOps and other big data tools.

AIOps transforms the flood of logs generated by complex, automated network infrastructures into actionable data. Enterprises can use AIOps and machine learning to catch subtle issues before they turn into major problems, improving the performance and availability of network resources. AIOps also provides valuable business intelligence that organizations can use to make smarter and more cost-effective decisions during recessions and other tumultuous events.

Root cause analysis (RCA)

When there's an outage or other business interruption, the immediate priority is fixing whatever is preventing normal operation so that systems can get back online. Often, this means treating the symptoms of some deeper underlying problem. If that core problem isn't addressed, it's likely to cause another outage in the future. That means administrators must perform a root cause analysis (RCA) to discover the source, come up with a fix, and document everything for future reference.

Root cause analysis involves digging through devices, applications, and service logs, which human engineers can’t do as efficiently as AI solutions. AIOps can comb through all the relevant logs to determine the most likely cause of the problem as well as recommend the best solution to fix it. Incidents are automatically generated, prioritized, and assigned to the correct team for resolution, ensuring the core problem is quickly and thoroughly fixed to prevent future outages.
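A toy version of this log-correlation idea: given error logs from several devices, treat the earliest error in the window before the outage as the probable root cause. Real AIOps RCA incorporates topology and dependency data, not just timestamps, and the device names and messages below are invented:

```python
from datetime import datetime, timedelta

def probable_root_cause(logs, outage_time, window_minutes=15):
    """Naive RCA heuristic: the earliest ERROR logged in the window
    before the outage is the most likely root cause."""
    window_start = outage_time - timedelta(minutes=window_minutes)
    candidates = [e for e in logs
                  if e["level"] == "ERROR"
                  and window_start <= e["time"] <= outage_time]
    return min(candidates, key=lambda e: e["time"], default=None)

logs = [
    {"time": datetime(2023, 3, 1, 9, 50), "device": "core-sw-1",
     "level": "ERROR", "message": "BGP peer 10.0.0.2 down"},
    {"time": datetime(2023, 3, 1, 9, 52), "device": "edge-fw-3",
     "level": "ERROR", "message": "upstream unreachable"},
    {"time": datetime(2023, 3, 1, 9, 53), "device": "app-lb-2",
     "level": "ERROR", "message": "health checks failing"},
]
cause = probable_root_cause(logs, outage_time=datetime(2023, 3, 1, 9, 55))
print(cause["device"])  # → core-sw-1
```

The later firewall and load-balancer errors are downstream symptoms; the switch's BGP failure is the earliest event in the window.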

Some AIOps solutions can even automatically resolve some issues without waiting for a human engineer to receive an alert, log in to the system, identify the problem, and implement a solution. This can significantly reduce the mean time to resolution (MTTR) and minimize expensive business interruptions.

Sorting through data is what AIOps does best, which makes it the perfect tool for RCA. AIOps can determine the root cause of automated infrastructure failures much faster than human admins, making it easier to fix these underlying problems before they cause future downtime. AI can even proactively implement fixes while issues are ongoing, allowing businesses to recover faster and reduce the cost of outages.

Implementing AIOps and machine learning in a resilient network automation framework

AIOps is the final layer of the network automation framework because it reduces the management complexity involved in monitoring, troubleshooting, and optimizing automated network infrastructure. Because AIOps needs to collect logs from every single component of the network automation framework, it must be a vendor-neutral solution that has access to your orchestration platform as well as all your management hardware and software. This will be much easier if your orchestration, automation infrastructure, and IT/OT management infrastructure are also vendor-neutral.

For example, the Nodegrid platform from ZPE Systems includes management devices like Gen 3 OOB serial consoles and integrated network edge routers that can bring your entire mixed-vendor environment under a single management umbrella. Nodegrid hardware is truly vendor-neutral, which means it can directly host your AIOps applications to help consolidate devices in your rack. The ZPE Cloud infrastructure orchestration platform also supports integrations with third-party and cloud-based AIOps solutions. Either way, you get network infrastructure management, monitoring, automation, orchestration, and AIOps in a single platform.

ZPE’s Network Automation Blueprint

AIOps works together with IT/OT production infrastructure, automation infrastructure, and orchestration to ensure network resiliency during uncertain times. The Network Automation Blueprint from ZPE Systems provides a reference architecture for achieving Gartner’s definition of hyperautomation as well as meeting the Open Networking User Group (ONUG) Orchestration and Automation recommendations.

Download the Network Automation Blueprint today and see how all these building blocks fit together to ensure network resiliency.

Ready to learn more about implementing AIOps and machine learning?

To learn more about implementing AIOps and machine learning with Nodegrid, contact ZPE Systems today.

Contact Us

A Guide to Infrastructure Orchestration and Automation

infrastructure orchestration and automation
As the recession continues to affect businesses across all industries, enterprise network resilience has never been more critical. The typical outage costs at least $100,000—a price tag that most companies can’t easily absorb in the current economic climate. However, decreasing business revenues have caused many companies, especially in the tech industry, to lay off large portions of their key IT staff. That means there are fewer administrators to monitor and manage network infrastructure and fewer engineers available to respond to issues and recover from outages.

Network automation is the key to ensuring 24/7 availability and optimal performance with less human interaction. A network automation framework provides all the tools and guidance needed to create a fully automated network infrastructure that's resilient to failure.

The four building blocks of a resilient network automation framework include:

  1. IT/OT production infrastructure
  2. Automation infrastructure
  3. Orchestration infrastructure
  4. AIOps

In previous blogs we discussed the role of IT/OT production infrastructure in network automation and how an IT/OT convergence strategy accelerates network automation. We also described the automation infrastructure components that enable end-to-end network automation. In this post, we’ll explain how infrastructure orchestration and automation build upon the previous two layers to enable streamlined, hyperautomated network resiliency. Our final blog in the series will conclude with a guide to using AIOps and other machine learning technologies to complete the network automation framework.

What is infrastructure orchestration and automation?

The infrastructure orchestration and automation layer contains the tools and paradigms used to efficiently manage and control the automation established by the lower layers of the framework. The core components of infrastructure orchestration and automation include:

Version control

The automation infrastructure layer uses infrastructure as code (IaC) to decouple device configurations from the underlying hardware so they can be written as scripts or definition files that automatically provision network resources. In addition, this layer uses software-defined networking (SDN) to create a virtual control plane that overlays the production network infrastructure, allowing network management and optimization tasks to be written as automated scripts.

The goal of IaC and SDN is to reduce human error, speed up device provisioning, and build a more streamlined and resilient network infrastructure. However, IaC and SDN programming can be very complex, and not all sysadmins and network administrators are expert coders. In addition, an automated enterprise network has hundreds or even thousands of these definition files and scripts to store, manage, and deploy.

This is why a network automation framework should include version control in the orchestration and automation layer. Version control is a very familiar concept to programmers, especially in DevOps environments, but not all network and infrastructure teams have used it before. Version control involves storing all code in a centralized repository and then tracking and managing changes to that code.

Let's say one administrator is responsible for configuring and maintaining the IaC definition file used to provision a particular model of Meraki AP. Here are some examples of how that workflow could break down if that one admin is out of the office for an extended period due to COVID-19 or is laid off during organizational cutbacks:

  • Twenty new Meraki APs need to be deployed to a new site with identical configurations.
  • The existing definition needs to be updated and pushed out ASAP to patch a security vulnerability.
  • Someone discovers an error in the current version and they need to roll back to a previous configuration.

A version control system for IaC and SDN acts as the single source of truth for the entire automated infrastructure. All automation scripts and definition files are stored in one centralized location, so anyone with authorization can deploy identical devices with the push of a button. When an admin needs to change the code, those changes are tracked and can be rolled back at any time if a mistake is made. Version control systems even allow admins to leave notes explaining the reasoning or logic behind individual changes, so other team members can pick up where they left off, or in their absence, identify the root cause of issues.
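To make the concept concrete, here is a toy version-control store that records each config change with a content hash and an explanatory note, and supports rollback. In practice, teams use Git or a similar system rather than anything hand-rolled; the config text and notes below are illustrative:

```python
import hashlib

class ConfigRepository:
    """Toy version-control store: each commit records the config text,
    a content hash, and a note explaining the reasoning behind the change."""

    def __init__(self):
        self.history = []

    def commit(self, config_text, note):
        digest = hashlib.sha256(config_text.encode()).hexdigest()[:12]
        self.history.append({"hash": digest, "config": config_text, "note": note})
        return digest

    def current(self):
        return self.history[-1]["config"]

    def rollback(self):
        """Discard the latest commit and restore the previous version."""
        self.history.pop()
        return self.current()

repo = ConfigRepository()
repo.commit("ssid: corp\nvlan: 10", "initial AP definition")
repo.commit("ssid: corp\nvlan: 99", "move APs to quarantine VLAN")
print(repo.rollback())  # restores the vlan: 10 configuration
```

Because every version is retained with its note, anyone on the team can redeploy a known-good configuration or trace why a change was made.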

Another key benefit of version control is that it facilitates the use of automated testing. QA and security analysts can run automated scans on code in the version control repository pre-production, so any misconfigurations or security vulnerabilities are identified and fixed before deployment. This reduces the risk of human error and improves the security and resiliency of the automated network infrastructure.

Version control is a core component of infrastructure orchestration and automation because it serves as the single source of truth for the entire automated network architecture.

Orchestrator

Automation is meant to make life easier, but it can be very complicated to manage on a large scale. Modern enterprise network architectures include thousands of moving parts in locations around the world and in the cloud. Automating each of these workflows means writing, testing, deploying, managing, and troubleshooting many different definition files and automation scripts. Doing all of that manually adds more work to overloaded and under-resourced network infrastructure teams, which increases the risk of something going wrong. Simply put, organizations need a way to automate their automation.

An orchestrator is a tool used to control all of the automated workflows on an enterprise network, just like a conductor orchestrates many different instruments and musicians into one cohesive symphony. An orchestrator uses management devices, like Gen 3 OOB serial consoles and SD-WAN gateway routers, to gain control over the physical and virtual network infrastructure. Administrators program the orchestrator to automatically deploy definition files or networking scripts (which it pulls from the version control system) in response to certain triggers. That means admins could potentially automate every step in every workflow, removing the need for human intervention and reducing the chance of errors.
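A minimal sketch of the trigger-to-deployment idea follows. The event names, definition files, and deploy action are hypothetical; a real orchestrator would pull from an actual version-control system and push configurations to managed devices:

```python
class Orchestrator:
    """Minimal event-driven orchestrator: each trigger maps an event name
    to a definition (pulled from a version-control store) and a deploy action."""

    def __init__(self, repo):
        self.repo = repo      # maps definition name -> config text
        self.triggers = {}

    def on(self, event, definition, deploy_fn):
        self.triggers[event] = (definition, deploy_fn)

    def handle(self, event):
        if event not in self.triggers:
            return None
        definition, deploy_fn = self.triggers[event]
        return deploy_fn(self.repo[definition])

deployed = []
repo = {"load-balancer": "policy: round-robin\npool: dc-west"}
orch = Orchestrator(repo)
orch.on("cpu_overload", "load-balancer", deployed.append)
orch.handle("cpu_overload")
print(deployed)  # → ['policy: round-robin\npool: dc-west']
```

When the "cpu_overload" event fires, the orchestrator fetches the current load-balancer definition and deploys it without human intervention.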

Plus, an orchestrator can react to events much faster than even the best administrator. For example, if a spike in demand is overloading resources at one regional data center, the orchestrator can instantly deploy automated load-balancing workflows to reroute traffic before end-users notice any performance issues. This allows enterprises to maintain 24/7 network availability and performance even with reduced IT staff.

As part of a resilient network automation framework, the orchestrator should be vendor-agnostic (vendor-neutral). It needs to be compatible with all of the automation infrastructure components, as well as the production IT/OT solutions. It also needs to support all of the major third-party automation vendors, such as Ansible and Gluware, to give infrastructure teams the flexibility to use the tools they’re most comfortable with and that work best in their enterprise’s unique environment. Finally, the orchestrator needs to integrate with other tools within the orchestration and automation layer, including the version control system and the monitoring and analytics platform.

The orchestrator is what gives the “orchestration and automation” layer its name. It provides admins with the ability to automatically manage all the automated workflows that make up a resilient network infrastructure. An orchestrator reduces the risk of outages caused by human error and can automatically respond to and prevent potential issues.

Visibility & insights

It’s tempting to think of infrastructure orchestration and automation as a “set it and forget it” solution that can perfectly manage an enterprise network without any human oversight, but the technology isn’t quite there yet. Administrators need a way to monitor all the automated workflows, identify problems the orchestrator may have missed, and analyze the health and performance of the network infrastructure.

A visibility and insights platform collects logs from all the various components of the automated network infrastructure and aggregates the data in one centralized location. It provides visualizations of current device health and network performance, and may even include predictive analysis to power business insights. This gives administrators a big-picture overview of distributed, complex, and automated network architectures so they can ensure continuous availability and optimal performance.

As with the version control system and the orchestrator, the visibility and insights solution needs to be vendor-agnostic so it can dig into every single hardware and software solution in the automated network infrastructure. In a resilient network automation framework, the vendor-neutral version control, orchestrator, and visibility solutions are all combined in a single platform.

Infrastructure orchestration and automation with a single platform

A unified infrastructure orchestration and automation platform like ZPE Cloud simplifies the control and management of a fully automated enterprise network. ZPE Cloud uses Nodegrid hardware—such as Gen 3 OOB serial consoles and integrated network edge routers—to deliver orchestration and automation to large, distributed, multi-vendor network infrastructures. The ZPE Cloud management app supports integrations with your choice of third-party version control and infrastructure automation solutions, or you can use Nodegrid hardware to directly host your automation software.

With ZPE Cloud, you also get comprehensive monitoring data on all connected infrastructure, plus, you can use Nodegrid environmental monitor sensors to gain insights on conditions in remote data centers and network closets.

ZPE’s Network Automation Blueprint

Infrastructure orchestration and automation works together with IT/OT production infrastructure, automation infrastructure, and AIOps to ensure network resiliency during uncertain times. The Network Automation Blueprint from ZPE Systems provides a reference architecture for achieving Gartner’s definition of hyperautomation as well as meeting the Open Networking User Group (ONUG) Orchestration and Automation recommendations.

In a future blog post, we’ll discuss the remaining building block of the Network Automation Blueprint in depth. In the meantime, you can read about IT/OT production infrastructure and automation infrastructure, or click here to get a sneak peek of the blueprint, which includes a 10-step checklist to get started with automation now.

Ready to learn more about infrastructure orchestration and automation?

To learn more about infrastructure orchestration and automation with ZPE Cloud and Nodegrid, contact ZPE Systems today.

Contact Us

How to Implement Zero Trust: Technologies to Shield You From Million-Dollar Losses

Staff on laptop with zero trust security in place.

How to implement zero trust security is a growing focus of organizations across the globe. With cyberattacks frequently hitting some of the largest companies and threatening entire economies, it's no wonder that comprehensive network security is a top priority among public- and private-sector entities.

In this post, we’ll show you what you need to implement zero trust security, from big-picture items to individual technologies.

But first, here’s a recap of zero trust security and why your business won’t be safe without it.

Why you need Zero Trust Security

Imagine bringing a new hire into your department. Soon after, you notice suspicious computer slowdowns and applications that don't respond as usual. You dig into your program files and discover an unknown .exe file; digging deeper, you find attackers actively exploiting your resources. You quickly pull your team together to lock down your network, sanitize every computer and connection, and send out a company-wide instruction for every employee to reset their password.

It turns out, your newest employee unknowingly clicked a bad link and opened the door for a trojan horse attack. But because of your quick response, no significant damage was done and you can rest easy again.

Months later, you come in for your normal workday only to find all your systems locked and unresponsive. Dave, a senior engineer, retired on the day of the attack and never reset his password. The hackers stole his credentials and have gone unnoticed for months. Now your company and its customers are compromised, and the consumer markets you serve are in a frenzy due to a shortage of goods. You can’t help but feel somewhat responsible for the entire ordeal.

This example mimics recent real-world cyberattacks and highlights the importance of moving away from traditional security approaches.

Traditional architecture uses the castle-and-moat security approach. Once a user gains access (crosses the moat), they become trusted to use your organization’s resources (the castle). Aside from the occasional password reset or other authentication protocol, this approach leaves plenty of opportunities for outsider and insider attacks. Zero trust security, however, places a moat around every node and user. This means that no matter how often a system or user needs to access a resource, they always have to verify their identity and intent.

In other words: never trust, always verify. In our example above, implementing simple two-factor authentication could have alerted Dave to his stolen credentials, which would have prevented the attack.

The need for zero trust stems from the explosion of distributed networking. Communications used to be straightforward and centralized: a trusted user on a trusted device would connect from a trusted office location to the data center. Apps and data were securely transmitted between parties, and sealing out attackers could be as simple as deploying a new point solution or product. But user expectations changed all this: users now need to connect from anywhere using a variety of devices, which means the modern network includes SaaS, cloud, and third-party platforms. This hybrid infrastructure means there are now more nodes and lines of communication than ever — and each is vulnerable to attack.

If the recent attacks on SolarWinds, Microsoft Exchange, and Colonial Pipeline aren't convincing enough, consider the latest hack involving Kaseya, an American company that specializes in IT and network management software. By exploiting Kaseya's Virtual System Administrator (VSA) product, attackers were able to compromise up to 1,500 of Kaseya's customers, shutting down educational services, law firms, and an outpatient surgical center in South Carolina.

Pervasive attacks like these have prompted political action, with the President signing a cybersecurity executive order this past May. Read our breakdown of the legislation and how it aims to improve cybersecurity across public and private sectors.

Now that you know why you need better security, how do you implement zero trust?

How to implement Zero Trust: The big picture

Zero trust is merely a concept; implementing Zero Trust Network Access (ZTNA) means putting this concept to work. Implementing ZTNA involves two parts:

  • The processes, which we covered in a previous post, and
  • The technologies, which we’ll talk about in this post

At a high level, this diagram shows the components you need when considering how to implement zero trust.

A high level diagram of the three main components of zero trust security, including the enterprise resource, policy enforcement point, and policy decision point.

There are three major components to look at in the big picture of zero trust security:

  1. Enterprise resource — This includes all the IT assets you need to protect and that your business relies on, like hardware, software, and network equipment. In simple terms, this is like the gold that you keep carefully guarded in the center of your castle.
  2. Policy enforcement point — This is the datapath element that enables, monitors, and terminates connections between users / devices / applications and enterprise resources. Simply put, this is like the guard that accompanies those wishing to access your gold.
  3. Policy decision point — This is the layer that decides who / what is safe and grants / revokes access accordingly. In other words, this is the gatekeeper who determines who is allowed into your castle.

To better understand these, here’s a closer look at each:

Enterprise resource

This component is pretty straightforward, and consists of elements you need to operate and manage IT environments. These elements can include hardware like computers and data storage devices; software such as web servers, content management systems, and operating systems; and network equipment like servers, routers, firewalls, and out-of-band devices.


Policy enforcement point

This component consists of the datapath elements that enable, monitor, and terminate connections between subjects (users / devices / applications) and your enterprise resources. Though this is represented as one component, it comprises two parts, both of which are typically used in deployments. These parts are:

  • A client-side agent, usually deployed on a laptop or server.
  • A resource-side gateway, which controls access in cases where a client-side agent is not used. Examples where gateways are used include regulated healthcare equipment, ATMs, and operational technology equipment.


Policy decision point

This component is the management and orchestration layer. It checks identities to verify who is safe and assigns policies to determine who gets access and to what. This is also represented as one component but comprises two parts:

  • Policy engine — This is the engine that decides whether a machine or web traffic is safe. To accomplish this, the engine uses a variety of data sources when making its determination, such as PKIs and identity management providers, CDM systems, and activity logs.
  • Policy administrator — This administrator uses the policy engine’s determination to grant or revoke access to a machine or web traffic.

There are many tools available to help you monitor and visualize traffic, so you can create policies and configure your policy decision point to meet your zero trust outcomes.
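The division of labor between the policy engine and policy administrator can be sketched as follows. The identity store, device posture data, and resource names are all invented for illustration; real deployments rely on PKIs, identity providers, and CDM systems rather than in-memory dictionaries:

```python
def policy_engine(request, identity_store, device_posture):
    """Decide whether the request is safe, based on identity and device posture."""
    user_ok = identity_store.get(request["user"]) == request["credential"]
    device_ok = device_posture.get(request["device"], False)
    return user_ok and device_ok

def policy_administrator(request, decision, sessions):
    """Act on the engine's decision: open the session or revoke access."""
    if decision:
        sessions[request["user"]] = request["resource"]
        return "granted"
    sessions.pop(request["user"], None)
    return "revoked"

identity_store = {"alice": "token-123"}
device_posture = {"laptop-42": True}  # e.g., patched and running endpoint protection
sessions = {}

request = {"user": "alice", "credential": "token-123",
           "device": "laptop-42", "resource": "payroll-db"}
decision = policy_engine(request, identity_store, device_posture)
print(policy_administrator(request, decision, sessions))  # → granted
```

Note that the engine only renders a verdict; the administrator is what actually opens or tears down the session, mirroring the split described above.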

To create your zero trust configuration, you need to deploy several essential technologies.

How to implement Zero Trust: Essential technologies

Zero trust is a complete re-imagining of network security, and implementing it can be a daunting task. But when you add its fundamental technologies to your toolkit, you can effectively build the three components described above and achieve Zero Trust Network Access (ZTNA). Here are the essential technologies you need to accomplish this.


Identity and access management

A big part of zero trust security relies on verifying that a device or user really is who they claim to be. For this, you need an identity management solution from a trusted provider and a public key infrastructure (PKI). Together, these let you create and issue a digital fingerprint for every user that includes information such as their username, role, and other unique data. Multi-factor authentication, which requires users to present two or more pieces of identification before being granted access, is also a critical component of identity verification.

Additionally, access management is an important piece that determines a user’s authorization level, or in other words, which resources they can access. Identity and access management both feed information into your zero trust model’s policy engine.
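As an example of the kind of verification MFA relies on, here is a minimal time-based one-time password (TOTP) generator per RFC 6238, using only the Python standard library. This is a sketch for illustration, not a production MFA implementation; the secret below is the RFC's published test key:

```python
import hashlib, hmac, struct, time

def totp(secret, at=None, step=30, digits=6):
    """Time-based one-time password per RFC 6238 (HMAC-SHA1 variant)."""
    counter = int((time.time() if at is None else at) // step)
    msg = struct.pack(">Q", counter)
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

secret = b"12345678901234567890"  # RFC 6238's published test key
print(totp(secret, at=59))  # → 287082 (matches the RFC test vector)
```

Because the server and the user's authenticator app share the secret, both can compute the same short-lived code, giving the user a second proof of identity beyond their password.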


Policy management

Another essential technology to have is a policy management solution. This is integrated into your security stack and serves as a single policy creation point. This allows you to define access and authentication policies for your entire organization.

You can specify data access rules for users, devices, and roles, which is vital to achieving micro-segmentation, limiting lateral movement, and enforcing least-privilege access. All of these feed into your policy engine and are used by your policy enforcement point to validate whether a session is allowed to continue.


Zero trust equipment and applications

Tying everything together requires equipment and applications that are able to enforce your policies. These are physical or virtual solutions that sit in front of servers and serve as your enforcement points. For example, this could be your next-gen firewall (NGFW) that initiates the multi-factor authentication protocol, verifies a user’s identity, and uses your defined policies to restrict the user’s access to a specific segment of your network.

Where can you get these essential Zero Trust technologies?

When considering how to implement zero trust, keep in mind that there are many vendors who can provide you with the essential technologies.

  • Obtaining an identity and access management solution is the easiest task when implementing zero trust. Many organizations offer an identity store, such as Azure Active Directory or Google Cloud Identity. You can also use companies dedicated to identity management, such as Duo, Okta, or Ping Identity. Keep in mind that if you need to control third-party access, such as for customers or equipment management contractors, you’ll need a solution that can access multiple identity stores simultaneously.
  • Obtaining a policy management solution requires careful consideration and should be part of your overall security stack. Look for a solution that allows you to create policies and set up datapath enforcement points. An adequate framework enables you to create authentication and post-authentication access rules, with an enforcement point that segments your network and continuously authenticates sessions. This security stack can be an on-prem NGFW, or delivered via the cloud using a Secure Access Service Edge (SASE) model, both of which are available from trusted providers like Palo Alto Networks.
  • Regardless of whether you use an on-prem or SASE model, you need an edge infrastructure platform to sit in front of servers and host the enforcement point. For on-prem, this platform must be able to host an NGFW to secure network segments and VLANs. For SASE, this platform must be able to create VPN tunnels to your SASE platform, which can be used for inline inspection and policy enforcement. Either approach requires powerful computing capabilities and a flexible operating system to accommodate workloads for detecting, analyzing, and automatically responding to threats, which few vendors offer.

Here are examples of what proper zero trust implementations look like, with ZPE Systems’ Nodegrid as the edge infrastructure platform:

Implementation diagram showing how to implement ZTNA at the data center using Nodegrid.

In this diagram, you can see where ZTNA and Nodegrid fit into the scheme at the data center. The user connects via Internet, and the Nodegrid SR device serves as the Policy Enforcement Point hosting a VM. This VM communicates with the Policy Engine to authenticate the user, and then grants access to the data center application.

Implementation diagram showing how to implement ZTNA at a branch, edge, or other distributed location.

In this diagram, the user tries to connect to an application at a branch, edge, or other distributed location. The user connects via Internet, where SASE and ZTNA provide secure connectivity. The Nodegrid SR device connects via VPN to the Policy Engine for authentication, and then grants access to the branch application.

How to implement Zero Trust: A recap

To protect your organization, implementing zero trust requires you to build out the main components. With the policy decision point and policy enforcement point in place, you can secure your enterprise resources from outsider and insider attacks. Ensuring these components work like a well-oiled machine means you need the proper identity and access management tools, a complete policy management solution built into your security stack, and equipment and applications that can enforce your zero trust security policies.

Because user expectations have caused infrastructure to become incredibly distributed and complex, the attack surface has increased dramatically. The traditional castle-and-moat approach to security is no longer adequate, and recent newsworthy cyberattacks showcase the network vulnerabilities that even the largest companies still struggle to address. The President’s latest cybersecurity executive order is a step in the right direction to bolster infrastructure protection for public and private sector entities, and you can use this blog as a starting point to begin your zero trust journey.

Don’t get caught without these 5 security must-haves

Watch our webinar, Cyberattacks: 5 Security Must-Haves for Hybrid Infrastructure Gateways, and learn how to lay a solid foundation that makes implementing zero trust easier. Our experts will talk you through how to:

  • Keep edge networks and users fully protected
  • Make smart buying decisions
  • Get complete security and control for years of serviceability

Watch now to protect your business from growing cybercrime.