Providing Out-of-Band Connectivity to Mission-Critical IT Resources

Part 1: Immutable Infrastructure: Challenges Your Company Needs to Be Aware of


Immutable infrastructure refers to the critical network resources and systems that make up your infrastructure and that are never updated, changed, or fixed in place after deployment. If something needs to be modified, the entire system or device is replaced by a new one. While this approach has many advantages for organizations, there are still some immutable infrastructure challenges you’ll need to overcome.

Mutable vs immutable infrastructure

Traditional infrastructure deployments are mutable and continuously change in place. Sysadmins and network engineers will constantly deploy patches, modify configurations, and install new software on systems and devices while they’re actively in use. The benefit of this approach is that you don’t need to create entirely new server instances or network deployments every time you want to change something.

However, mutable infrastructure does create some risk. For example, what if you deploy a patch that breaks a core function? What if some new code introduces a security vulnerability to the system? How about if an in-place upgrade fails halfway through and you end up with an unplanned version of the configuration? With mutable infrastructure, you’re stuck troubleshooting the issues and attempting to deploy fixes on systems and devices actively in use.

On the other hand, immutable infrastructure is frequently copied, deleted, and recreated without making changes to the systems currently in use. Configurations are abstracted as software code and managed from a centralized location that’s physically and logically separate from the target infrastructure. This code can be copied and deployed to many different targets as frequently as necessary. The environments themselves are virtualized (and often containerized) which creates an additional abstraction layer from the underlying hardware. This also makes it possible to copy, delete, and recreate instances as needed.

When an infrastructure as code (IaC) or software-defined networking (SDN) configuration needs to be updated, a new version of the code is written, deployed to a new instance, and tested to ensure functionality and security. Then, traffic is redirected to the new instance and the old one is simply deleted. If a virtualized or containerized environment fails, or is compromised by a hacker, you can delete it and replace it with an exact copy with minimal hassle.
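
To make the replace-rather-than-patch cycle concrete, here is a minimal Python sketch. The `Instance` class and `healthcheck` hook are hypothetical stand-ins for a real orchestrator's APIs, not any particular product's interface:

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen mirrors immutability: no in-place changes
class Instance:
    version: str
    config: str

def deploy_new_version(live, new_config, version, healthcheck):
    """Immutable rollout: build a new instance, test it, then cut over.

    The live instance is never modified; if the candidate fails its
    health check, traffic simply stays on the old instance.
    """
    candidate = Instance(version=version, config=new_config)
    if not healthcheck(candidate):
        return live  # rollout aborted; the old instance is untouched
    # Traffic is redirected here; the old instance is then deleted.
    return candidate

# Example: roll out a new config, rejecting empty configs at the health check
live = Instance("v1", "router bgp 65000")
updated = deploy_new_version(live, "router bgp 65001", "v2",
                             healthcheck=lambda i: bool(i.config))
```

Notice that a failed rollout never leaves you with a half-upgraded system, which is exactly the risk the mutable approach carries.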

Immutable infrastructure is becoming popular among DevOps and NetDevOps organizations that use IaC and SDN to integrate resource provisioning directly into the software development pipeline. While this approach has clear advantages over mutable infrastructure—including improved security, reduced IT complexity, fewer failures, and easier troubleshooting—there are also some immutable infrastructure challenges.

Immutable infrastructure challenges

The immutable infrastructure paradigm was initially conceptualized for hyperscale and enterprise data center deployments. It relies on software-defined technology stacks and orchestration solutions that automate deployment and provisioning. The challenge comes when you need to venture outside of this ideal deployment, as is the case for many organizations.

Modern enterprise networks are shifting away from massive, centralized data centers because modern enterprises are themselves less centralized than they used to be. As operations become more globalized and remote, distributed workforces become the norm, and enterprises deploy infrastructure closer to the network edge. Edge network infrastructure is deployed to small local data centers, branch offices, remote warehouses, and other distributed locations. Often, these smaller deployments rely on hardware-based appliances, servers, and legacy equipment.

This creates some significant challenges when you try to shift to immutable infrastructure, including:

  1. Extending the software-defined network automation and orchestration to remote locations outside your enterprise network.
  2. Bringing the orchestrator’s hooks into all of your disparate legacy hardware solutions.
  3. Finding a way to apply immutable principles to this mutable hardware-based infrastructure.

Solving immutable infrastructure challenges

Immutable infrastructure requires centralized orchestration of software-defined technology, so you need to apply SDN principles to your WAN architecture to bring immutability to the edge. This is called SD-WAN, or software-defined wide area networking. SD-WAN decouples the management of your WAN from the underlying hardware, so you can use orchestration to control distributed WAN architecture.

However, SD-WAN only gets you to the perimeter of your edge networks. To use immutable infrastructure effectively, you also need to extend the orchestrator’s reach into the branch and edge LANs. You can achieve this through SD-Branch technology, which gives you software-defined control over the internal networking infrastructure of remote architectures.

The second goal is to ensure that your orchestration solution can see and control every piece of your edge architecture, even legacy systems not designed with automation in mind. The SD-WAN/SD-Branch gateways and console servers you install at the edge need to support legacy pinouts and integrate with third-party hardware and software. If the edge connectivity solution can’t say yes to every component of your distributed network infrastructure, you’ll have gaps in the software-defined orchestration coverage.

The third task is to turn mutable hardware into immutable infrastructure, which you can accomplish through virtualization. In the same way that a single physical server can be turned into many different virtual machines, you can use network functions virtualization (NFV) to turn physical networking appliances into virtualized solutions. NFV creates an abstraction layer that separates the underlying hardware’s routing, switching, load-balancing, and other management functions. This allows your orchestrator to manage these functions automatically and create, copy, delete, and recreate network configurations at will without worrying about the mutable hardware.
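
As a rough illustration of the NFV idea, the orchestrator can create, copy, and delete function instances from code-defined templates at will. The template names and fields below are hypothetical, and a real NFV platform instantiates VMs or containers rather than Python dicts:

```python
import copy

# Code-defined templates for virtual network functions (VNFs).
# Names and fields are illustrative examples only.
VNF_TEMPLATES = {
    "router":        {"function": "routing", "rules": []},
    "load_balancer": {"function": "load-balancing", "backends": []},
}

def instantiate(kind, **overrides):
    """Create a fresh VNF instance from its template.

    Because instances are cheap copies of code-defined templates, a
    failed or compromised instance is deleted and recreated from the
    template, never repaired in place.
    """
    vnf = copy.deepcopy(VNF_TEMPLATES[kind])
    vnf.update(overrides)
    return vnf

# Two identical load balancers from one template: deleting and recreating
# an instance yields an exact copy, which is the immutable property at work.
lb = instantiate("load_balancer", backends=["10.0.0.1", "10.0.0.2"])
replacement = instantiate("load_balancer", backends=["10.0.0.1", "10.0.0.2"])
```

The deep copy matters: it keeps the template pristine, so every recreated instance starts from the same known-good definition regardless of what happened to its predecessor.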

The tricky thing about solving each of these challenges is that you need a truly vendor-neutral solution to make it all work. For example, if you have different branch gateways in different locations, you need to ensure that the SD-WAN/SD-Branch platform will integrate with all of them. Otherwise, you’ll need to manage multiple software-defined technology stacks, or you’ll lose the ability to apply immutable principles consistently across your entire distributed network.

The network functions virtualization platform also needs to support all of your disparate vendor hardware and legacy architecture; otherwise, you won’t be able to turn all mutable infrastructure into virtualized, immutable solutions. Plus, the orchestrator needs to integrate with your NFV platform as well as all edge hardware and software, to have full coverage.

Many immutable infrastructure solutions fall short of true vendor-neutrality. That means, to use them effectively, you have to upgrade your edge infrastructure hardware and software to compatible versions. This is an expensive and time-consuming endeavor and one that creates a massive roadblock for globally distributed enterprises hoping to adopt immutable principles.

Nodegrid brings immutable infrastructure to edge networks

ZPE Systems can help you bring immutable infrastructure to your edge networks with the vendor-neutral Nodegrid platform. Nodegrid’s powerful, all-in-one branch gateways give you the best of both worlds: you can use our powerful SD-WAN and SD-Branch technology or directly host your choice of third-party software-defined networking solutions. The modular design of the Nodegrid Net Services Router (NSR) also gives you added capabilities like edge compute, terminal server, NetDevOps, and more.

The vendor-neutral ZPE Cloud orchestration platform can say yes to every component of your distributed network architecture, including legacy hardware appliances and systems. ZPE Cloud gives you complete control over your mutable hardware, making it possible to apply software-defined orchestration to even the smallest branch deployments.

Plus, all Nodegrid devices run on the vendor-neutral, Linux-based Nodegrid OS with support for NFV. You can use Nodegrid OS to virtualize every piece of the edge networking stack, turning mutable branch hardware into immutable, automated solutions.

Learn how Nodegrid can solve your immutable infrastructure problems.

Call 1-844-4ZPE-SYS to see a demo.

Contact Us

Why cybersecurity can make you feel lost in space


Cybersecurity has been a hot topic for years. With many high-profile breaches, malware attacks, and pricey payouts, it’s no wonder why companies continue to add more and more protection for their IT systems.

Despite this, hackers continue to succeed at exploiting vulnerabilities. Why are there still vulnerabilities in the first place? All it takes is one weak spot and one bad actor (looking at you, HAL 9000) to lock you out and leave you scrambling to regain control.

In this post, we’ll cover how network infrastructure has evolved in the past decade, why cybersecurity can make you feel lost in space, and why recovery is crucial to modern cybersecurity.

 

How network infrastructure evolved

The movie 2001: A Space Odyssey predicted that by the year 2001, technological advancements would enable things like space travel and virtual conferencing. In reality, we were still rolling around in gas-powered cars or waiting for 56kbps dial-up connections to load our email inboxes.

Times were simpler, but that also meant that network infrastructure and cybersecurity were simpler. Most people would go to work at a physical location like an HQ or branch office, and distributed or remote work technologies were very much in their infancy. This meant that network infrastructures were more simple and localized, usually requiring a simple MPLS connection from their off-site data center (if they had one) to their branch offices. Cybersecurity was simple: it was either inherent to the connection type (like MPLS), or required something like a basic firewall or encryption method.

Network architecture showing simplicity of data center connected via MPLS to branch office

Fast forward more than 20 years, and the network infrastructure common to 2001 is barely recognizable. With customers and employees demanding companies adapt to their on-the-go and remote-work lifestyles, the network infrastructure exploded, causing a sort of Big Bang of cybersecurity as we know it today.

Network architecture showing complexity of data center, CDN, remote user, branch office, all connected via many paths

Modern networks need to serve many branch offices and remote locations, and the only way to succeed is by incorporating a myriad of on-prem, cloud, and SaaS solutions. This creates a hybrid infrastructure of data, security, networking, and computing distributed everywhere. In other words, the attack surface continues to expand much like the universe itself, and security professionals have been struggling to contain all the vulnerabilities left in its wake.

 

Why cybersecurity makes you feel lost in space

You might relate to Frank Poole. In the movie, the HAL 9000 supercomputer leads Frank to perform a spacewalk in order to repair a portion of their ship. While Frank floats toward the ship, the corrupted HAL takes control of an EVA pod and slams it into Frank, causing him to tumble helplessly through the black void of space and eventually meet his demise.


Trying to secure your IT infrastructure can make you feel just as helpless and out of control. That’s because cybersecurity presents several challenges that make it difficult to gain your footing. And with 2021’s executive order regarding zero trust security, cybersecurity seems even more daunting as previous protection methodologies are becoming wholly obsolete.

Here’s a brief look at some of the challenges of modern cybersecurity.

 

Too many products

Regardless of your industry, there are so many security products to choose from that it can easily feel like you’re floating amongst an endless sky of stars. It’s difficult enough choosing properly secured servers, routers, storage devices, and other physical equipment. Add on the other crucial pieces of the modern network architecture, and it’s easy to make a full time job of researching, comparing, and selecting the right cloud and SaaS security products. Here’s a list that barely scratches the surface of different types of security products to choose from:

  • Firewalls & next-gen firewalls (NGFWs)
  • Security information and event management (SIEM) systems
  • Identity and access management (IAM) products
  • Pen testers
  • Data analytics
  • Intrusion detection and prevention systems (IDPS)
  • Endpoint protection apps
  • Database security solutions
  • Ransomware/malware detection and removal
  • Authentication and single sign-on

 

Too many vendors

All of these products have to originate from somewhere, which brings us to the next challenge: there are too many cybersecurity vendors to choose from. This isn’t necessarily a bad thing, since competition creates better products, but it does complicate the cybersecurity professional’s journey to achieving holistic protection.

At RSA Conference 2022, for example, there were 450 security exhibitors present, 70 of which were funded well enough to afford the cost of a booth. During the show, many attendees discussed how roughly 1,800 new cybersecurity vendors had received funding in the previous 18 months. The TL;DR: this multi-vendor ecosystem will persist (and probably grow even more), and so will the challenge of achieving holistic security.

Of course, everyone wants the best of the best, which might draw your attention to staples like Cisco, Fortinet, and Palo Alto Networks. But because the modern hybrid infrastructure is so diverse, there are now countless niche products available from thousands of vendors. In fact, CyberDB compiled a database that includes more than 3,500 security companies from the United States alone.

Here’s a graphic that puts into perspective just a fraction of the available vendors:


 

Too many gaps

The third and most important challenge stems from the first two above: there are just too many security gaps to address. Part of this problem is due to the diversity of hybrid infrastructure. But once you’re able to identify the gaps, you’ll find that addressing these will more often than not create even more gaps.

That’s because there’s no single vendor or suite of products that provides holistic cybersecurity. You deploy a variety of products but inevitably run into interoperability issues, which only perpetuates more vulnerabilities as you add more solutions to address these gaps.

What you end up with is a plethora of solutions that are secure themselves, but that don’t provide protection for your infrastructure as a whole.

 

Why recovery is key to modern cybersecurity

According to a Sophos survey, 66% of surveyed organizations suffered ransomware attacks in 2022. And when attacks happened, 70% of organizations needed more than two weeks to recover. Ransomware is the modern disaster, which makes minimizing recovery times an essential part of modern cybersecurity.

Recall the FortiOS 7.0 CVE from 2022. Customers upgrading to the then-latest release of FortiOS suddenly found themselves vulnerable to an authentication bypass, where attackers could gain admin access using certain HTTP/S requests. Scenarios like this leave IT teams waiting for a fix while their business remains vulnerable. What’s needed is the ability to recover quickly and automatically, whether from an active attack or an at-risk configuration.

 

Get the blueprint for fast recovery times

Big Tech companies have spent years building this capability into their infrastructure. At ZPE Systems, we’ve directly collaborated with these companies and have created best practices based on these proven architectures. This Network Automation Blueprint details the components and practical steps to take, from automating IT/OT production infrastructure, to implementing an effective design for orchestration and automation environments.

The blueprint is your template to achieving fast recovery times and reducing your risk of attack. Download the blueprint now.

 

Watch the blueprint recover a failed upgrade

Watch this tech demo from Tech Field Day 26, where Rene Neumann shows how the blueprint helps you recover a failed device upgrade in minutes.

Supply Chain Security Risk Management Best Practices


A supply chain attack is when cybercriminals breach your network by compromising an outside vendor or partner. Often, these attacks exploit a weak link in your trusted ecosystem of third-party software, hardware, and integrations. A hacker will, for example, use a compromised vendor service account—one using the same username and password across many different client systems—to infiltrate all the third-party networks with privileged access for the vendor.

Several high-profile incidents like the SolarWinds attack and the Microsoft Exchange exploit illustrated how even the largest and most respected firms can introduce risk to your supply chain. In this post, we’ll use these examples to highlight the security challenges posed by supply chain attacks before providing supply chain security risk management best practices and solutions to protect your enterprise.

Supply chain security risk management challenges

→  SolarWinds attack uses trusted infrastructure monitoring software to compromise customer systems

In early 2020, an advanced persistent threat—a highly sophisticated group of hackers allegedly acting on behalf of a foreign state—infiltrated a SolarWinds update server. The hackers injected malicious code into new builds of SolarWinds’ Orion platform, which thousands of customers use to monitor their critical IT infrastructure. These infected updates were unknowingly pushed out to over 18,000 customers, creating 18,000 backdoors for hackers to exploit.

Once the compromised software was installed on target networks—including U.S. government agencies like the Department of Homeland Security and tech giants like Microsoft—attackers used these backdoors to steal identities and tokens to impersonate real users. Then, they were able to sidestep multi-factor authentication and spread laterally within affected networks, causing untold damage in their wake.

The full fallout and consequences of the SolarWinds attack are still unfolding more than two years later. This supply chain attack was especially devastating because the exploited software was itself used to monitor network infrastructure, meaning hackers had privileged access to the most sensitive, vulnerable, and critical systems on affected networks. In addition, the advanced persistent threat used sophisticated techniques to bypass MFA and impersonate authorized users, making it extraordinarily difficult to track and prevent their movements.

The SolarWinds attack proved a few critical things about supply chain security:

  1. The infected software could contaminate customer systems, meaning intrusion prevention and anti-malware software weren’t advanced enough to detect the malicious code.
  2. The hackers were sophisticated enough to bypass MFA and other advanced authentication technologies, showing that these measures alone aren’t sufficient to prevent accounts from being compromised.
  3. The hackers were able to use those compromised accounts to freely move around on breached networks, illustrating the need for internal defenses and trust verification.

 

→  Microsoft Exchange attack exploits vulnerabilities on legacy on-premises systems

In early 2021, hackers used multiple zero day exploits to attack the on-premises version of Microsoft Exchange. They compromised servers at over 30,000 organizations in the United States, accessing email accounts and installing web shell malware. They used this malware to remotely access server functions and jump to other connected systems. In addition, hackers could use compromised email accounts as conduits to infect other organizations. They would look for a high-value contact in a compromised account’s contact list (for example, an executive at a major financial firm) and then send phishing messages and infected attachments to that target, extending their reach even further.

One of the reasons this attack was so successful is that hackers targeted on-premises Exchange implementations on legacy systems. Many organizations that still use legacy Exchange servers are less technically savvy than those that jumped to the cloud. They may not have a large team of specialist admins and security engineers monitoring servers and applying regular updates to legacy systems. That made it easier for hackers to exploit unpatched vulnerabilities, and gave them more time to execute their endgame (infecting higher-value targets at connected organizations) before being detected.

So, what lessons have we learned from these two incidents, and how can we apply them to supply chain security risk management?

Supply chain security risk management best practices

These high-profile events illustrated a few key challenges:

  1. Many signature-based intrusion detection and security monitoring solutions aren’t sophisticated enough to detect zero-day exploits and novel malware.
  2. Cybercriminals can outsmart MFA and other advanced authentication methods, so they must be layered with other security controls.
  3. Organizations must not neglect internal defenses (including trust re-verification and network segmentation) because they’re crucial for preventing the spread of infections and compromised accounts.
  4. Organizations must have a plan for adequately monitoring, patching, and controlling legacy systems on their enterprise network.

Zero trust security

Zero trust security is a supply chain security risk management best practice due to its guiding principle of “never trust, always verify.” Zero trust creates a multi-layered defense of highly specific security policies and controls that focus on preventing breaches and limiting their damage once they’ve already occurred.

Next-generation firewalls (or NGFWs) use advanced machine learning and artificial intelligence technology to monitor network traffic for threats. Rather than relying on a signature database of known threats (which can’t account for zero-day exploits and novel malware), they use deep learning and other AI technology to analyze traffic with greater accuracy. NGFWs also enable network microsegmentation and may even include UEBA.

Microsegmentation is the zero trust practice of grouping systems and resources into small logical network segments. Microsegmentation allows you to create highly specific micro-perimeters of security policies and controls around each network segment. This ensures that all network resources are accounted for and adequately protected, and also allows you to reverify an account’s trust as they move from microsegment to microsegment.
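
As a simplified sketch of how a micro-perimeter re-verifies trust at every boundary (the segment names and policy fields below are hypothetical examples, not any vendor's schema):

```python
# Each microsegment carries its own micro-perimeter policy; crossing a
# boundary always re-verifies the entity instead of trusting a prior check.
SEGMENT_POLICIES = {
    "finance-db": {"allowed_roles": {"dba"}, "require_mfa": True},
    "web-tier":   {"allowed_roles": {"dba", "dev"}, "require_mfa": False},
}

def may_enter(segment, role, mfa_verified):
    """Evaluate one segment's policy for an entity requesting access."""
    policy = SEGMENT_POLICIES[segment]
    if role not in policy["allowed_roles"]:
        return False  # role is never trusted in this segment
    if policy["require_mfa"] and not mfa_verified:
        return False  # trust must be re-established before entry
    return True
```

Because the check runs at every segment boundary, an account that passed one perimeter still cannot roam freely, which is exactly the containment behavior described above.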

User and Entity Behavior Analytics (or UEBA) technology monitors the behavior of entities (accounts, devices, applications, etc.) on your network. It uses machine learning to establish baselines of normal behavior, allowing it to analyze entity activity in real-time contextually. If an account or device behaves suspiciously, UEBA can block access, alert security, and/or force that entity to re-establish trust before letting it access another microsegment.

Even if the malicious code injected into the SolarWinds Orion updates made it past your NGFW’s initial defenses, zero trust security tools and practices will limit the attacker’s movement inside your enterprise network. Microsegmentation, aided by technology such as UEBA, would force a compromised account to re-establish trust before accessing additional resources while alerting security personnel to a potential breach.

Legacy modernization

Sounds scary and expensive, but it’s a critical process for securing on-premises and hybrid network environments. Obviously, the best-case scenario would be to replace your existing out-of-date hardware with newer systems or to migrate all your legacy services to the cloud, but that’s not realistic for many organizations. A more cost-effective way to modernize legacy systems is centralized infrastructure management, monitoring, and orchestration.

An infrastructure management platform that can hook into all your legacy, on-premises, data center, and cloud systems will help you ensure your entire architecture is always patched and secure. Your engineers won’t have to jump from box to box or switch between on-premises and cloud monitoring systems, increasing the efficiency with which they can maintain and control every piece of your infrastructure. Legacy modernization with unified infrastructure orchestration would have enabled engineers to patch on-premises Exchange vulnerabilities and detect the signs of a breach much faster.
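
A centralized patch sweep across a mixed inventory might be sketched as follows. The inventory entries and version map are hypothetical; a real management platform would pull this data from its agents and device hooks:

```python
def find_unpatched(inventory, latest_versions):
    """Return names of hosts whose installed version lags the latest patch.

    `inventory` and `latest_versions` are illustrative data structures,
    not a real platform's API.
    """
    stale = []
    for host in inventory:
        latest = latest_versions.get(host["product"])
        if latest is not None and host["version"] != latest:
            stale.append(host["name"])
    return stale

# One legacy Exchange server is lagging behind the latest patch level
inventory = [
    {"name": "exch-01", "product": "exchange", "version": "15.0"},
    {"name": "exch-02", "product": "exchange", "version": "15.2"},
]
stale = find_unpatched(inventory, {"exchange": "15.2"})  # ["exch-01"]
```

The value of the unified view is that one sweep covers every system, so a forgotten legacy box cannot silently fall behind on patches.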

Zero trust security and legacy modernization would have reduced the impact of these supply chain attacks and are critical for preventing similar events from occurring in the future. ZPE Systems can help you implement supply chain security risk management best practices through our Nodegrid family of secure infrastructure management solutions.

All Nodegrid hardware and software are protected by the Zero Trust Security Framework Foundation, with features like secure boot, geofencing, and up-to-date OS kernels and encryption modules. Nodegrid is vendor-neutral and supports integrations with your choice of NGFW and security software. Plus, Nodegrid supports legacy pinouts, so you can connect your on-premises infrastructure to the Nodegrid Manager or ZPE Cloud network orchestration solutions.

Learn more about supply chain security risk management best practices:

→   What Are the Key Zero Trust Security Principles?
→   The Importance of Micro-Segmentation for Zero Trust Networks
→   Data Center Modernization Strategy: How to Streamline Your Legacy Environment

Learn how Nodegrid supports supply chain security risk management best practices.

Call 1-844-4ZPE-SYS or contact us to view a demo.

Contact Us

Why You Need a Next-Gen OOB Console Server


An OOB (out-of-band) console server is a fundamental data center tool that allows you to view, manage, and troubleshoot critical remote infrastructure on a dedicated network connection.

While the functionality of generation 1 console servers is limited, generation 2 models evolved to include features like automation and security. Now, as more enterprises embrace NetDevOps, there’s a need for greater automation and orchestration, which is why next-generation or generation 3 console servers are emerging.

In this post, we’ll discuss the advantages of a next-gen OOB console server and how these devices address the challenges and limitations of previous generations.

The importance of an OOB console server

An out-of-band console server may also be referred to as a serial console, serial console server, or serial console switch. There are also OOB serial console routers, which include gateway routing functionality for small branch office and edge data center use cases.

OOB console servers are fundamental tools for data center infrastructure management; they connect to all your remote network devices and give you the ability to control them remotely over a dedicated management network. This network is completely separate from the WAN circuit and internal LAN, and is typically accessed via cellular, dial-up, or DSL modem.

Out-of-band data center access is crucial for a few key reasons:

  1. It provides 24/7 remote access to your critical data center infrastructure even if your WAN link goes down, allowing you to troubleshoot and recover without expensive truck rolls.
  2. You can still view and manage remote devices, without exposing yourself, even if malicious actors compromise your production network or data center infrastructure.
  3. Conducting resource-intensive network orchestration on a dedicated management plane reduces the performance impact on your production network and end-users.
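
The failover behavior behind the first point can be sketched in a few lines of Python. The path names and probe callables are hypothetical stand-ins for real reachability checks:

```python
def select_path(device, paths):
    """Return the name of the first management path that can reach a device.

    `paths` is ordered by preference: the production network first, then
    the dedicated OOB link (e.g. a cellular modem).
    """
    for name, probe in paths:
        if probe(device):
            return name
    return "unreachable"

# Simulate a WAN outage: the in-band probe fails, the OOB link still works
paths = [
    ("in-band", lambda d: False),     # production WAN link is down
    ("out-of-band", lambda d: True),  # cellular OOB link is still up
]
path = select_path("core-switch-01", paths)  # "out-of-band"
```

The ordering encodes the key idea: the OOB network is a fallback that stays reachable precisely when the production path is not.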

Why do you need a next-gen OOB console server?

As modern enterprise networks have grown more complex and distributed, so have network and data center management workflows. This complexity makes it harder for engineers to efficiently manage their workloads and increases the risk of human error, especially with multi-vendor and hybrid network infrastructures.

These pain points led to the evolution of automated network management tools and solutions. Automation increases the speed and efficiency with which network administrators can provision, monitor, and optimize an infrastructure while reducing the risk of human error. Gen 2 OOB console servers have automation capabilities and scripting support that help fill the gap for data center management. Plus, Gen 2 serial consoles automate tasks like infrastructure provisioning (via zero touch provisioning, or ZTP) and basic troubleshooting (such as refreshing DNS or power-cycling) to reduce the amount of tedious manual work.
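
This kind of scripted remediation can be illustrated with a short sketch. The device fields and the `ping`/`power_cycle` hooks are hypothetical; real console servers expose CLIs or REST APIs for these steps:

```python
def auto_remediate(device, ping, power_cycle):
    """Gen 2-style scripted troubleshooting: probe first, power-cycle on failure."""
    if ping(device["address"]):
        return "healthy"
    power_cycle(device["outlet"])       # basic automated remediation step
    if ping(device["address"]):
        return "recovered"
    return "needs-manual-intervention"  # escalate to an engineer

# Simulate a hung device that comes back after a power cycle
state = {"up": False}
result = auto_remediate(
    {"address": "10.0.20.7", "outlet": 4},
    ping=lambda addr: state["up"],
    power_cycle=lambda outlet: state.update(up=True),
)  # "recovered"
```

Scripts like this handle the routine failures automatically, leaving only the genuinely stuck cases for a human.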

However, the needs and pain points of modern enterprises continue to evolve. It’s not enough to use individual, disparate scripts and solutions to automate specific tasks or workloads, especially to achieve NetOps or NetDevOps transformation. Gen 2 OOB console servers offer some automation support, but typically limit you to a particular vendor ecosystem or API library. Since enterprise networks consist of many different vendor solutions and devices, this rigidity leaves you with gaps in your automation coverage.

That’s why a new generation of console servers is rising to meet this challenge. Next-gen OOB console servers, also known as Gen 3, promise to deliver end-to-end automation and NetDevOps data center orchestration.

What to look for in a next-gen OOB console server

For an OOB console server to be truly next-gen, it must be able to dig its automation hooks into every device and solution in your rack. That means it needs to be vendor-neutral and include support for legacy systems not originally designed for automation.

In addition, a next-gen OOB serial console switch should support integrations with the third-party automation and orchestration tools of your choosing. That means both the hardware and software need to be vendor-neutral.

A next-gen console server should also provide high-speed OOB access and failover. Many Gen 1 and Gen 2 solutions use dial-up or 3G cellular connections, which can be slow and unreliable. Plus, 3G networks are being phased out in the United States. This leads to frustration when engineers try to troubleshoot and restore remote data center infrastructure as quickly as possible, and also hampers automation and orchestration efforts.

Another issue to consider is scalability. A next-gen OOB console server needs to provide enough managed ports for you to grow your data center infrastructure without needing to upgrade your management device continuously. You can even get modular serial consoles that allow you to expand or swap out port configurations as needed.

Last but not least, your next-gen console server needs to include and support advanced security controls. Imagine installing a preconfigured device that has unknowingly been infected. This could be like installing a trojan horse into your infrastructure. A next-gen OOB console server should include enterprise-grade security features and integrate with zero trust security controls and policies.

Orchestrating critical data center infrastructure with a next-gen OOB console server

Next-gen or Gen 3 OOB console servers deliver end-to-end automation and orchestration capabilities, so you can efficiently control complex data center infrastructure. A next-gen solution includes vendor-neutral hardware and software, high-speed OOB access and failover, the ability to scale up or down as needed, and enterprise security features and functionality.

The Nodegrid next-gen OOB console server solution from ZPE Systems delivers true end-to-end automation for critical data center infrastructure. Nodegrid’s vendor-neutral hardware and software can control all your vendor solutions, so there are no barriers to automating anything and everything. For example, Nodegrid zero touch provisioning (ZTP) can extend to all connected devices, allowing you to deploy remote data center infrastructure with the push of a button.
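Nodegrid's ZTP specifics aside, zero touch provisioning commonly starts with DHCP: a factory-default device boots, receives DHCP options pointing it at a configuration server, and pulls its config automatically. A generic ISC dhcpd fragment illustrating the pattern might look like this (all addresses and filenames are placeholders):

```
subnet 192.0.2.0 netmask 255.255.255.0 {
  range 192.0.2.100 192.0.2.200;
  # Option 66: server the booting device fetches its config/image from
  option tftp-server-name "192.0.2.10";
  # Option 67: which file to request
  filename "ztp-config.cfg";
}
```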

The Nodegrid Serial Console S Series can even control legacy and mixed environments, so you can upgrade your data center infrastructure at your own pace without losing automation capabilities. The open architecture, Linux-based Nodegrid OS supports integrations with third-party automation solutions so you can create a customized orchestration platform that suits your enterprise’s unique use cases and staff skillsets.

Nodegrid delivers high-speed remote out-of-band access and failover via two dual-SIM slots for 4G/LTE/5G cellular, and you can move to 5G without a forklift upgrade. With up to 96 managed ports in a streamlined 1U rack-mounted device, the Nodegrid Serial Console Plus can handle enterprise-scale deployments or scale with you as you grow. The Nodegrid next-gen OOB console server also keeps management and orchestration secure, with onboard security features like UEFI secure boot, TPM 2.0, encrypted solid-state disks, and geofencing.

The Nodegrid Serial Console from ZPE Systems is a true next-gen OOB console server. It delivers end-to-end automation, high-speed OOB access and failover, scalable port configurations, and enterprise-grade zero trust security features.

Learn more about OOB console servers:

★  Comparing the Best Console Servers for Data Centers in 2022
★  Out-of-Band Network Management: Fundamental Principles & Use Cases
★  How to Choose Secure Out-of-Band Management

See the Nodegrid OOB console server at work.

Call 1-844-4ZPE-SYS to request a demo

Watch A Demo

Network Disaster Recovery Plan Checklist

Your organization may feel secure now, but a disaster could occur at any moment. For example, the war in Ukraine took the world by surprise and left many organizations scrambling to protect and recover critical infrastructure, applications, and data from Ukrainian facilities.

To ensure you’re ready to weather any crisis, you need a robust disaster recovery (DR) plan that accounts for many different scenarios and challenges. This blog provides a network disaster recovery plan checklist to help you establish protocols for protecting your systems, data, and business.

Your network disaster recovery plan checklist

Identify potential disasters

There’s no one-size-fits-all disaster recovery plan—recovering from ransomware is a much different process than recovering from a tornado. You need to determine what types of disasters are most likely to occur and assess each scenario’s individual risk to your facilities, systems, and data.

Network disaster recovery plan checklist:

  Make a list of disasters (natural, man-made, and otherwise) that could pose a threat to your organization.

  Briefly describe what each disaster would look like and how it would impact your company.

  Prioritize your list of disasters based on how likely they are to occur.
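A simple, widely used way to do this prioritization is a risk score of likelihood times impact, each rated on a small scale. The scenarios and ratings below are made-up examples, not recommendations:

```python
# Illustrative risk scoring: rank disaster scenarios by likelihood x impact,
# each on a 1-5 scale. The scenarios and ratings are placeholder examples.
scenarios = [
    ("Ransomware attack",   4, 5),  # (name, likelihood, impact)
    ("Regional power loss", 3, 4),
    ("Tornado",             1, 5),
]

ranked = sorted(scenarios, key=lambda s: s[1] * s[2], reverse=True)
for name, likelihood, impact in ranked:
    print(f"{name}: risk score {likelihood * impact}")
```

Note how a low-likelihood, high-impact event (the tornado) still ranks below likelier scenarios; a real analysis might weight catastrophic impacts more heavily.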

Establish the potential impact of a disaster

You should conduct what’s known as a business impact analysis to define how each of these disaster scenarios would impact your organization.

Network disaster recovery plan checklist:

  Determine which business processes, systems, and data are affected by each disaster scenario on your list.

★  Tip: Don’t forget your cloud and edge resources

  Outline precisely how operations will be affected by the loss or disruption of critical business services.

  Analyze the impact on every aspect of your organization, including productivity, revenue, reputation, etc.

  Calculate the estimated cost of each disaster, both in terms of lost revenue and recovery costs.

Create recovery protocols

What steps do you need to take to recover from a disaster, and what technology will you use to do it? You should create specific recovery protocols for each high-priority disaster scenario on your list.

Network disaster recovery plan checklist:

  Make a detailed list of all recovery procedures and who is responsible for each.

  Make a list of all the technology that will be leveraged in a disaster (e.g., backup data solutions, network failover).

  Outline instructions for every step in every recovery procedure, including branching recovery paths in case one or more of your recovery systems is unavailable.
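One way to encode the "branching recovery paths" above is to give each runbook step a primary action plus an explicit fallback for when its supporting system is down. The step names and systems here are hypothetical:

```python
# Sketch of a recovery runbook step with a fallback branch. The system
# names and actions are hypothetical examples, not a real runbook.
def run_step(step: dict, available: set) -> str:
    """Return the action to take: the primary action if its supporting
    system is available, otherwise the fallback (or an escalation)."""
    if step["system"] in available:
        return step["action"]
    if step.get("fallback"):
        return step["fallback"]
    return "escalate to on-call lead"

restore_db = {
    "system": "backup-appliance",
    "action": "restore latest snapshot",
    "fallback": "restore from offsite cloud copy",
}
```

Writing fallbacks down explicitly, rather than leaving them to judgment mid-incident, is what keeps recovery moving when the first-choice system is part of the outage.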

Set expectations and timelines

Once you know how you’ll recover from each potential disaster scenario, you need to determine the realistic timeline for recovery. This timeline should be based on data and information from the individual team members involved in recovery efforts, as well as the business impact analysis you performed earlier.

Network disaster recovery plan checklist:

  Define how long it would take to complete the recovery procedures for each disaster.

  Compare this to the business impact analysis showing the estimated cost of a disaster to see if your recovery protocols will work quickly enough to prevent unacceptable losses.

★  Tip: If your recovery protocols are too time-consuming, you may need to revisit your recovery protocols and re-evaluate your technologies and procedures.
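The comparison in this step can be sketched with placeholder numbers: if you know what an hour of downtime costs, the maximum tolerable recovery time falls out directly from the maximum loss you can accept.

```python
# Compare estimated recovery time against the maximum tolerable downtime
# implied by downtime cost. All figures below are placeholder examples.
def recovery_ok(recovery_hours: float, cost_per_hour: float,
                max_acceptable_loss: float) -> bool:
    max_tolerable_hours = max_acceptable_loss / cost_per_hour
    return recovery_hours <= max_tolerable_hours

# e.g., an 8-hour recovery at $50k/hour against a $500k loss ceiling
# stays within the 10-hour tolerance; a 12-hour recovery would not.
print(recovery_ok(8, 50_000, 500_000))
```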

Define individual roles and responsibilities

When disaster strikes, it's crucial to take action immediately. This is only possible if everyone involved in disaster recovery clearly knows their responsibilities and who is in charge of decision-making.

Network disaster recovery plan checklist:

  Identify disaster recovery team members and determine how they should be contacted when there’s an emergency.

  List the stakeholders who must be kept updated on the recovery status.

  Assign a person (or team) responsible for monitoring the business impact of an ongoing disaster.

  Assign people at each site who will decide on evacuation or relocation of staff and assets.

  Identify the people who have access to secure systems and/or can grant access to others.

Establish lines of communication

Everyone in your organization needs to know who’s in charge of communicating vital information and how to get in touch with key members of the disaster recovery team. You should also identify a single person (or small team of people) responsible for communicating relevant updates to the public to ensure consistent messaging.

Network disaster recovery plan checklist:

  Determine how to communicate with the disaster recovery team (and the rest of the organization) if email and phones are down.

  Create a flowchart outlining who should be contacted in what order for each specific disaster scenario and recovery step.

  Identify a single point of contact responsible for disseminating critical information to staff.

  Make a list (in multiple locations to ensure constant availability) of vendor and support phone numbers to call in case of a cloud or service-related outage.

★  Tip: Also include the support numbers for all your recovery-related technology.

  Identify a single point of contact through which all information about your disaster will be disseminated to the public/customers.

Create a disaster recovery playbook

You should collect all of the information gathered and analyzed in the previous steps into a single playbook that will act as the source of truth for your disaster recovery efforts. This playbook should be made readily available to everyone involved in the disaster recovery plan and duplicated across redundant systems to ensure it’s accessible when a disaster occurs. Essential information from the playbook (such as points of contact) should be shared with everyone in your organization, even if they don’t have a role to play in recovery.

Test your plan regularly

How do you know your plan actually works? You need to test your plan after implementation and then test again on a regular basis. Conduct employee drills to make sure everyone involved knows what they need to do if a disaster occurs. Test your processes and technologies to make sure they still function correctly and that you can recover within the timeline outlined above. Regular testing will let you know if any processes, instructions, or contact points are outdated.

The challenge of network disaster recovery

Even with the most robust network disaster recovery plan, you’re likely to face some hurdles when it comes time to execute your protocols.

For example, what if a disaster occurs at a remote branch office or data center? If you lose network access to your remote infrastructure, do you have a way to remotely troubleshoot and recover, or do you need to lose time and money to truck rolls or local consultants?

How do you deploy replacement devices if remote hardware fails or is irreparably damaged? Do you have staff on-site who can install and configure new devices? If you stage new equipment at HQ and then ship it to the remote site, what happens if a malicious actor intercepts the package?

Do you have a way to monitor your infrastructure centrally and orchestrate your disaster recovery efforts? Can that system dig its hooks into every network architecture component, including legacy systems?

How ZPE Systems empowers streamlined network disaster recovery

The Nodegrid solution from ZPE Systems helps you execute your disaster recovery plan while avoiding all the most common challenges. Remote out-of-band management gives you access to all your remote network infrastructure via a dedicated link so you can still view, troubleshoot, and recover systems during an outage.

Ultra-secure zero touch provisioning (ZTP) allows you to ship factory-default equipment to remote sites and deploy configurations in a matter of moments, so you can recover faster. Plus, the vendor-neutral ZPE Cloud management platform gives you complete control and visibility over your distributed network infrastructure, so you can monitor for issues and implement recovery protocols from anywhere in the world.

Learn more about network disaster recovery:

★  Customer Strategies in Ukraine to Protect Privacy and IP
★  Data Center Environmental Monitoring: How to Stop Disaster Before It Strikes
★  3 Tips to Improve Edge Network Resilience

Execute your network disaster recovery plan checklist with the Nodegrid solution from ZPE Systems.

Get in contact with us or call 1-844-4ZPE-SYS for a free demo.

Contact Us