Providing Out-of-Band Connectivity to Mission-Critical IT Resources

Dissecting the MGM Cyberattack: Lions, Tigers, & Bears, Oh My!

This article was written by James Cabe, CISSP, whose cybersecurity expertise has helped major companies including Microsoft and Fortinet.

The recent MGM cyberattack reportedly caused the company to lose millions in revenue per day. The successful kill chain attack — originally a military tactic used to accomplish a particular objective — granted inside access to the attackers, who encrypted and held for ransom some of MGM’s most prized assets. These ‘crown jewel’ assets, as they’re called in the cybersecurity realm, are most critical to the accomplishment of an organization’s mission. Because ransomware attacks persist in corporate networks until fully cleared, organizations must be ready to “fight through” an attack using resilient systems and effective procedures. This should involve identifying these crown jewels and designing them in a way that ensures they can operate through attacks.

When these types of high-profile attacks occur, many cast their eyes at cybersecurity leaders for failing to fend off the bad guys. The reality is that these leaders struggle to get the budget, corporate buy-in, and digital assets required to build a strong defense for business continuity. For MGM, it’s likely they also faced difficulty operationalizing current assets across a gigantic digital estate, and ultimately lacked a plan to recover from a total outage of crown jewel assets.

From the attacker’s perspective, an exceptional level of intelligence and preparation is required to understand a target’s internal operations and architecture and execute a successful kill chain. Successfully attacking a sophisticated organization like MGM requires rapid information stealing to capture and leverage cloud credentials, as well as to lock up those resources and lock out the most important support staff in an organization. This is the crux of the issue: infostealers and ransomware automate the mass grabbing of resources and quickly set up a denial of service for the stakeholders that are responsible for fixing these systems.

How did the MGM cyberattack start? After MGM discovered the breach, how did the attacker stay one step ahead? What approach should organizations take to ensure they can recover if they’re targeted?

Who Started The MGM Cyberattack, and How?

The MGM cyberattack began after an adversary group named “Scattered Spider” used phishing over the phone, an approach called ‘vishing,’ to trick an MGM customer support rep into granting them access with elevated privileges. Scattered Spider is the same group responsible for the SIM-swapping campaign that happened a few months ago, where they successfully subverted multifactor authentication. Their primary tactic involves social engineering, which they use to steal personal information from employees.

MGM and many other casinos currently use advanced Zero Trust identity security from Okta. However, the attacker was able to trick the service desk into resetting a password to gain access into the network. Even with newer Zero Trust identity solutions, most organizations unravel once attackers get to the real “chewy center” of the network: the humans operating it.

Okta is quoted saying, “In recent weeks, multiple US-based Okta customers have reported a consistent pattern of social engineering attacks against their IT service desk personnel, in which the caller’s strategy was to convince service desk personnel to reset all multi-factor authentication (MFA) factors enrolled by highly privileged users.” Okta further warned, “The attackers then leveraged their compromise of highly privileged Okta Super Administrator accounts to abuse legitimate identity federation features that enabled them to impersonate users within the compromised organization.” 

The MGM cyberattack and those like it are more about processes than technology. Let’s explore how the attack progressed, and how the criminals were successful at staying persistent and ultimately hitting their goal. 

How Did A Simple Authentication Attack Morph Into a Complex Attack?

The Scattered Spider threat actors use a platform written by UNC3944 or AlphaV (known by several names). This is a middleware developer for attack platforms that allow criminals to follow a specific set of instructions (a kill chain) to gain access and ultimately encrypt and exfiltrate data from a targeted company. AlphaV’s platform is called BlackCat, which they use to establish a foothold, establish Command and Control (C2) for the malware, and exfiltrate data, to ultimately get paid.

With elevated Okta privileges at MGM, Scattered Spider deployed a file containing a Java-based remote access trojan, which became a “vending machine” for other remote access trojans (RATs) that sought out other nearby machines to spread quickly. The AlphaV RAT would ‘pwn’ MGM’s Azure virtual servers to gain access, then sniff for more user passwords and create dummy accounts.

These RATs leveraged a built-in tool called “POORTRY,” the Microsoft Serial Console driver turned malicious, to terminate selected processes on Windows systems (e.g., Endpoint Detection and Response (EDR) agents on endpoints). AlphaV, the platform maintainer, signed the POORTRY driver with a Microsoft Windows Hardware Compatibility Authenticode signature. This helped the malware to evade most Endpoint Detection software. 

This tool was used to get elevated and persistent access to the Okta Proxy servers that were in the scope of the attack and accessible remotely by the attacker. This attack can evade a lot of detection tools. This access allowed them to capture AM\IAM accounts that allowed them greater access to the organization. This stealing of credentials from the Okta Proxy servers was confirmed by Okta responders as well as the threat actor on their blog. This is called a “living off the land” attack. 

Alphv statement on MGM

How Did MGM Discover the Cyberattack?

The first notification of the hack was dropped on the VXUnderground forums. The staff there verified it through chat contact with the threat group UNC3944\AlphaV, who works in conjunction with the Scattered Spider threat actor. The attacker also confirmed this on their blog on the darknet.

On September 11, 2023, anyone attempting to visit MGM’s website was greeted by a message stating that the website was currently unavailable. The attack also stopped hotel card readers, gaming machines, and other equipment critical to MGM’s day-to-day operations and revenue generating activities. 

Screenshot showing MGM casino's website down.

How Did the Attacker Maintain Control?

The initial attack allowed AlphaV, who runs the C2 (Command and Control) networks for the RattyRat trojan, to have remote access to the VMware server farm that services the guest systems, the gaming control platforms, and possibly the payment processing systems. They maintained control despite all of MGM’s attempts to mitigate the problem, because they were able to establish elevated access in places the organization could not easily remove them from without removing access to the whole organization. They established something called “persistence.”

From the attacker’s blog on the darknet, “MGM made the hasty decision to shut down every one of their Okta Sync servers after learning that we had been lurking on their Okta Agent servers sniffing passwords of people whose passwords couldn’t be cracked from their domain controller hash dumps. At this point MGM being completely locked out of their local environment. Meanwhile the attacker continued having super administrator privileges to their Okta, along with Global Administrator privileges to their Azure tenant. They made an attempt to evict us after discovering that we had access to their Okta environment, but things did not go according to plan. On Sunday night, MGM implemented conditional restrictions that barred all access to their Okta (MGMResorts.okta.com) environment due to inadequate administrative capabilities and weak incident response playbooks. Their network has been infiltrated since Friday. Due to their network engineers’ lack of understanding of how the network functions, network access was problematic on Saturday. They then made the decision to ‘take offline’ seemingly important components of their infrastructure on Sunday. After waiting a day, we successfully launched ransomware attacks against more than 100 ESXi hypervisors in their environment on September 11th after trying to get in touch but failing.”

MGM tried many things to remove the attackers’ access to their network. However, because the attackers had installed a shadow identity provider in MGM’s own identity solution, the attackers were able to maintain access long enough to redeploy access to most of the assets they found to be the backbone of the company. AlphaV was then able to encrypt most of the crown jewels of MGM’s operations network.

Is There a Way to Stop These Types of Attacks? 

The MGM cyberattack required physical reconnaissance, patience, and a lot of planning to set up the kill chain. Playbooks that can protect against this kind of attack are hard to create, because it can mean taking all guest services offline for a period, which requires very high authority in the organization. One of the comments from the attacker was that the organization did not act fast enough to take all remote access offline to their management framework that consisted of Okta Proxy Servers. When they did, the adversary was then able to lock them out by submitting a Multifactor Authentication Reset. To stall the attacker, they would have had to induce a full outage of their crown jewels while a formal assessment of all assets could be performed. Taking assets offline requires buy-in at the board level and executive level, which are difficult to come by even if an organization emphasizes its operational excellence, detection, and defense.

Organizations should have a plan to quickly recover from a total loss of a site, outside of backups (which can be lost) and disaster recovery sites. Organizations need to be properly hard-segmented into a full IMI (Isolated Management Infrastructure). Keeping crown jewels safe from an attacker that targets the chewiest part of an organization should be top of any list going from 2023 budget to 2024 planning.

The following is a light version of what can be done in a fully-automated response that can take mere hours instead of days for an outage (a full operations blueprint will be out in the near future).

Isolated Management Infrastructure diagram

An IMI can host an IRE (Isolated Recovery Environment), which is used to cut off all user data and remote access (except for OOB) to an entire infected site. A properly implemented recovery environment should automate most of these activities to speed up the recovery. One of the first considerations is the requirement for a secondary organization in your IAM that is not attached to normal operations. This is what is known as a set of “Break the Glass” accounts. These are known in military circles but have made it into formal practice as part of a strong playbook for ransomware. Once you do this, you can instantiate selected Zero Trust remote access to the site using credentials that are not in the scope of the attack, and then bring up a communications channel for a virtual war room using software like Rocket Chat, Jitsi, Slack, or other standalone communications tools that are installable on the IRE environment. 

Avoiding normal authentication methods or IAM and normal communication channels is required for the integrity of the recovery and strengthens the recovery playbook. During this time, no email may be used that is associated directly with the organization. Ideally, email should never touch an account that is associated with it either.

The next step is to create a new set of clean-side networks that do not directly connect to the main backbone, or to put them behind another firewall so that traffic can be triaged as good or bad. Using sniffer software running on the IRE, the recovery team can then run a passive scan or an active scanner against all machines that are still trying to send email to Exchange/M365. You can give access to people whose machines are deemed good (not sending traffic) but lock off (with an EDR) the ability to open Outlook for a while, keeping them on web email. From there, continue working through to find all the sending machines and check whether each has a good backup. If not, back up the infected drive for offline data retrieval later. Then reimage while scanning the UEFI BIOS during boot (if needed, run an IPMI scan). If the site has a list of assets that are considered crown jewels, prioritize these.
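The triage step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `FlowRecord` format, the host names, and the choice of SMTP ports as the "still sending email" signal are all hypothetical, and a real site would likely need a richer signal (for example, connections to its actual mail endpoints).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """One connection seen by the passive sniffer (hypothetical format)."""
    src_host: str
    dst_port: int


# Ports treated as "mail-sending" in this sketch; an assumption, adjust per site.
MAIL_PORTS = {25, 465, 587}


def triage(hosts, flows, crown_jewels=frozenset()):
    """Split hosts into (clean, suspect) lists.

    A host is suspect if it was seen opening a connection to a mail port.
    Suspect crown-jewel hosts are sorted first so they are handled first.
    """
    senders = {f.src_host for f in flows if f.dst_port in MAIL_PORTS}
    clean = sorted(h for h in hosts if h not in senders)
    suspect = sorted(
        (h for h in hosts if h in senders),
        key=lambda h: (h not in crown_jewels, h),
    )
    return clean, suspect


if __name__ == "__main__":
    hosts = ["pos-01", "kiosk-07", "db-01", "hr-12"]
    flows = [
        FlowRecord("kiosk-07", 587),
        FlowRecord("db-01", 25),
        FlowRecord("hr-12", 443),
    ]
    clean, suspect = triage(hosts, flows, crown_jewels={"db-01"})
    print("clean:", clean)
    print("suspect, in reimage order:", suspect)
```

Sorting crown-jewel hosts to the front of the suspect list mirrors the advice to prioritize those assets during reimaging.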

Once you have a segmented “clean side” established with all the network services required to operate the site (DNS, IAM, DHCP), Internet access can be restored to this site on a limited basis, which means only outbound communications and nothing inbound. Restorative operations can continue apace, making sure that the infected-side assets are captured in backup for later forensics following chain-of-custody, in case damages are found to exceed insurance limits. This is decided in the war room.
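As a rough illustration of the "outbound only, nothing inbound" rule for the clean side, the check below decides whether a new connection crossing the clean-side boundary should be permitted. The subnet and addresses are made up for the example, and return traffic for permitted connections is assumed to be handled by a stateful firewall.

```python
import ipaddress

# Hypothetical clean-side subnet; replace with the site's real addressing.
CLEAN_SIDE = ipaddress.ip_network("10.99.0.0/24")


def allow_new_connection(src: str, dst: str) -> bool:
    """Permit only connections initiated from the clean side to the outside.

    Anything initiated from outside the clean side (the Internet or the
    infected side of the site) is refused.
    """
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    return src_ip in CLEAN_SIDE and dst_ip not in CLEAN_SIDE


if __name__ == "__main__":
    print(allow_new_connection("10.99.0.5", "203.0.113.10"))  # clean side going out
    print(allow_new_connection("203.0.113.10", "10.99.0.5"))  # Internet coming in
    print(allow_new_connection("10.20.0.8", "10.99.0.5"))     # infected side coming in
```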

Get the Blueprint for Isolated Management Infrastructure

Maintaining control of critical systems is something security practitioners deal with in the Operational Technology (Industrial Control Systems) side of an organization. For them, the critical and most impactful part of the problem is the loss of control rather than the loss of data, a problem highlighted by the MGM cyberattack. Operational Technology Safety and Security teams set up and maintain Safety Systems as a fallback measure in case of any kind of disaster. This automation allows services to fall back safely, from which point teams can recover operations. In 2023, most of our business is done on computers and networks, so this is how to plan for business continuity. Now is the time for IT to start following this safety system blueprint as well.

Download the Network Automation Blueprint now, which helps you lay the groundwork for your IMI so you can recover from any attack.

Get in touch with me!

True security can only be achieved through resilience, and that’s my mission. If you want help shoring up your defenses, building an IMI, and implementing a Resilience System, get in touch with me. Here are links to my social media accounts:

The Biggest Ransomware Attack You Haven’t Heard of…Yet

This article was written by James Cabe, CISSP, whose cybersecurity expertise has helped major companies including Microsoft and Fortinet.

MOVEit over SolarWinds — The largest and most successful ransomware attack ever recorded is happening. Right now. It’s attacking healthcare and financial institutions with high rates of success, and recently stole sensitive data of 4 million more healthcare patients. It uses something called CL0P ransomware, and the threat actor is a well-known criminal group with the name FIN11. Many organizations are finding it difficult to stop the attack because they have no way to access infected devices, take them offline, patch, or even replace them. So, what exactly is going on?

The group responsible for the attack

FIN11 is a cybercriminal group that has been active since 2016 or before, originating from the Commonwealth of Independent States (CIS). While the group has historically been associated with widespread phishing campaigns, their focus has shifted towards other initial access vectors. FIN11 often runs high-volume operations targeting industries in North America and Europe for data theft and ransomware deployment, primarily leveraging CL0P (aka CLOP).

FIN11 is responsible for multiple widespread, high-profile intrusion campaigns leveraging zero-day vulnerabilities, and the group likely has access to the networks of many more organizations than it is able to successfully monetize. Despite this, they’re currently attacking MOVEit, a well-known SaaS provider that relies on a file transfer appliance called Accellion File Transfer Appliance (FTA). This legacy product remains unpatched, which has led to the breach of many Fortune 100 companies and state and federal agencies.

How did the ransomware attack start?

The ransomware attack began with several Accellion FTA customers, including those in industries like healthcare, legal, finance, retail, and telecom. Companies such as Jones Day Law, Kroger, Singtel, and many others had no idea that they had been attacked, because the initial breach was quiet and headless.

Their only indication came after receiving a threatening email aimed at extortion. 

In this email, the group threatened to publish stolen data on the “CL0P^_- LEAKS” .onion website, according to an investigation from Accellion. The Federal Bureau of Investigation (FBI) and the Cybersecurity and Infrastructure Security Agency (CISA) have since released a joint Cybersecurity Advisory (CSA) to disseminate known CL0P ransomware IOCs and TTPs identified through FBI investigations as recently as June 2023.

According to the investigation, four zero-day security holes were exploited in the attacks:

  • CVE-2021-27101 – SQL injection via a crafted Host header
  • CVE-2021-27102 – OS command execution via a local web service call
  • CVE-2021-27103 – SSRF via a crafted POST request
  • CVE-2021-27104 – OS command execution via a crafted POST request

The published victim data appears to have been stolen using a web shell. Web shells give remote administrative access to the web server and create a jumping-off point to attack the rest of the internal network. Mandiant, a well-known cyber investigation arm of Google, added, “The exfiltration activity has affected entities in a wide range of sectors and countries” (Threatpost). Exfiltration is the unauthorized removal of important or damaging data from an organization.

However, the biggest problem is that these web shells give the attacker what researchers call “persistence.” This means that an attacker can remain in your network indefinitely to continue damaging and attacking your resources. Researchers call attackers who operate this way “APTs,” or Advanced Persistent Threats.

Why is the ransomware attack still going strong?

The ransomware attack is still going strong because there’s no patch available. According to open source information, beginning on May 27, 2023, the CL0P Ransomware Gang began exploiting a previously unknown SQL injection vulnerability (CVE-2023-34362) in Accellion’s appliance that is the backbone of a solution known as Progress Software’s MOVEit Transfer service. Internet-facing MOVEit Transfer web applications were infected with a web shell named LEMURLOOT, which was then used to steal data from underlying MOVEit Transfer databases. In similar spates of activity, TA505, which is the group responsible for the Dridex trojan and Locky ransomware, conducted zero-day-exploit-driven campaigns against Accellion FTA devices in 2020 and 2021, and Fortra/Linoma GoAnywhere MFT servers in early 2023.

What most organizations want to know is: How do you quickly respond to issues like these? How can you be properly prepared to respond to an issue you didn’t cause or didn’t expect?

Patching is a good response. However, it takes an average of 205 days to patch a recently known zero-day exploit like the MOVEit vulnerability. While patching alone is typically the ideal response, it isn’t automatic nor can it be done quickly.

Another approach involves removing the offending software or appliance, or cutting off access to the software or appliance. But once you remove this access, how do you continue normal operations, and how can you easily bring the software/appliance back online? Without adequate infrastructure in place, physically deploying to each site is not practical, especially for distributed organizations.

CISA and the FBI encourage organizations to implement the recommendations in the Mitigations section of the joint CSA to reduce the likelihood and impact of CL0P ransomware and other ransomware incidents. The Mitigations section describes many approaches, including patching, removing software/appliance access, and implementing a recovery plan. But all of these take too much time and too many resources, which leaves organizations vulnerable as they scramble to create an adequate response.

The great news is, organizations can cover all their bases without having to reinvent the wheel. This approach is recommended in one of CISA’s recent directives, and gives organizations somewhat of a silver bullet that allows them to quickly defeat ransomware and remain prepared for any future attack.

What approach does CISA recommend to address ransomware attacks?

CISA’s recent directive (23-02), which addresses the vulnerability of Internet-exposed management interfaces, calls for organizations to create an isolated management infrastructure (IMI) via out-of-band connectivity. This is a drop-in solution that the military, telcos, and hyperscalers/cloud companies use to respond to widespread ransomware and other issues impacting security and resilience. This approach — which ZPE Systems has perfected in the last decade with the help of Big Tech — gives organizations a completely separate control plane through which they can monitor and manage their entire IT infrastructure in a safe and dedicated fashion.

What is isolated management infrastructure?

Isolated management infrastructure consists of the hardware and software that create a management network that’s fully separate from other production and management networks. The key to this is in out-of-band connectivity, which is defined as connectivity other than TCP/IP. Out-of-band can include direct USB, serial, or even non-routed zero-trust connections to crown-jewel assets.

Essentially, the IMI gives an organization complete oversight and control of their widespread IT infrastructure, in a way that is secure and accessible only to their IT teams.

In this diagram, the production infrastructure (blue ring) sits at each distributed location. The out-of-band infrastructure for LAN (OOBI-LAN) is the green ring and surrounds the production infrastructure with one layer of isolated management. The OOBI-WAN (orange ring) is what provides a second layer of isolated management, which teams can access from a central or remote location, to gain access to the OOBI-LAN and ultimately the production infrastructure.

ZPE Automation

Knowing these assets and providing access across the organization can be easy and does not have to disrupt current operations. 

How can IMI stop the FIN11 ransomware attack?

In the ongoing FIN11 ransomware attack, Internet-facing applications are targets of the zero-day exploit. This means that no amount of security solutions can pre-mitigate the attack (i.e., there’s nothing you can do to stop it). This is where IMI shines.

Isolated Management Network diagram sitting beside production infrastructure

Remember the OOBI-LAN/OOBI-WAN diagram? Here’s a zoomed-in view of the isolated management infrastructure sitting beside the production infrastructure. The IMI connects via serial, Ethernet, and USB to production gear, and provides the necessary functions (routing, storing golden images, hosting jumpbox tools, etc.) to recover from attack. But how?

IT teams can use OOBI-WAN to remotely access their OOBI-LAN and production gear. They can pull affected devices offline and bring them in for forensics, which takes place in an Isolated Recovery Environment (IRE). This means these assets and networks are still reachable by analysts and responders, but isolated from other vulnerable assets. This allows an organization to quickly and even automatically deploy tools and resources inside of this environment through devices like ZPE Systems’ Nodegrid.

To combat the FIN11 attack, organizations don’t need to unplug cables or shut their devices off. They can instead deploy their IMI as the framework for closing the attack surface while maintaining access and critical data to aid in recovery.

Get the blueprint for isolated management infrastructure

Don’t wait until the next attack to shore up your defenses. ZPE Systems has worked with Big Tech for ten years developing the isolated management infrastructure. It’s now available inside the Network Automation Blueprint, and walks you through how to implement your own IMI. Download the blueprint now to stay ready for any attack.

Atsign: Why Choose ZPE Systems to Host IoT Security?

A Conversation with Atsign CTO & Co-Founder, Colin Constable

This is a guest post composed by Atsign, creators of zero-attack-surface solutions including atProtocol.

We recently sat down with our CTO and Mariposa Rotary Club extraordinaire, Colin Constable, to discuss our partnership with our friends over at ZPE Systems. Let’s explore the driving force behind this powerful partnership, and how together we’re securing IoT devices and the data shared between them.

Why is this partnership strategically important?

We are a software company that helps people connect beyond the edge of the Internet. And as a software company, we need to have hardware to run our software on. After looking at a number of hardware platforms, ZPE stood out as an organization that provides a strong array of network connectivity options. Our software running on ZPE’s hardware serves as an edge platform that gives customers reliable access to edge-generated data.

What are some of the synergies between Atsign and ZPE?

First and foremost, ZPE’s hardware was designed from scratch to provide the openness and flexibility that we were looking for in a hardware platform. If I were going to design something like this myself, it would look very much like a ZPE box! It is incredibly easy to drop our Docker containers straight onto the platform, and they just simply work, which is quite a joy. To have a Docker container environment on an edge box is really the thing that makes ZPE stand out as a platform. Combine that with the fact that ZPE boxes are running x86, which makes things easy–plus actually having dual SIM cards–we can work with our MVNO partners to provide constant connectivity; even if hardlines go down, there’s cellular backup. The thing we can offer ZPE and their customers is if the box can see the Internet, then you’ll be able to address it, get data to and from it, and actually even log into it, and get hold of the built-in UI on the box.

Tell us about ZPE’s Docker Container support

Our docker containers literally just ran perfectly on the ZPE hardware. I went into the UI, selected my docker container, and it just ran. It doesn’t get much easier than that. Plus, there’s the promise of being able to have the docker container talk to connected devices like V.24 cables to provide connectivity to IoT devices.

Once IoT devices become directly addressable, then it opens up all kinds of opportunities for more efficient delivery or sharing of information that can save customers tons of money by eliminating a lot of the current infrastructure they currently use to do that job.

What are some real-world use cases for Atsign and ZPE Systems?

Because ZPE boxes have lots of connectivity options (e.g. serial ports, 4/5G backhaul, and ethernet–with more coming!) for connecting IoT devices, you can have always-on devices at the edge, and be able to address and get data to and from them. For example, a radio station that has DSL connectivity and cellular backup would be able to automatically move over to cellular backup, notify the radio station that it’s on cellular backup, and use that connectivity until the DSL line comes back online, while at all times being able to get information from the equipment at the radio station. This is critical for radio stations, as it eliminates “dead air,” that moment when the transmitter is not transmitting. Sponsors rely on radio stations to put out notifications for what their businesses are doing, so having constant, uninterrupted connectivity is essential.
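The radio station failover described above can be sketched as a small selection loop. This is a hypothetical illustration, not ZPE's or Atsign's implementation: the link names and the shape of the health samples are assumptions, and a real device would probe its links rather than read them from a list.

```python
def select_uplink(dsl_up: bool, cellular_up: bool) -> str:
    """Prefer the wired line; fall back to cellular; otherwise no uplink."""
    if dsl_up:
        return "dsl"
    if cellular_up:
        return "cellular"
    return "none"


def failover_events(samples):
    """Return a notification for every change of active uplink.

    `samples` is an iterable of (dsl_up, cellular_up) health checks,
    taken in order (hypothetical input format).
    """
    active = None
    events = []
    for dsl_up, cellular_up in samples:
        chosen = select_uplink(dsl_up, cellular_up)
        if chosen != active:
            events.append(f"{active or 'start'} -> {chosen}")
            active = chosen
    return events


if __name__ == "__main__":
    # DSL healthy, DSL drops for two checks, DSL recovers.
    checks = [(True, True), (False, True), (False, True), (True, True)]
    for event in failover_events(checks):
        print(event)
```

Only changes of uplink produce a notification, so the station is told once when it moves to cellular and once when the wired line takes over again.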

Do Atsign & ZPE Systems improve sustainability?

Traditional solutions would have you installing many different boxes. What we really like about the ZPE platform is that the hardware provides lots of connectivity options, which reduces the footprint for starters. There’s no need to have different modems and firewalls, and any other services can be added via Docker containers, so you actually have an environment where a single box can do multiple functions at the edge.

What are your final thoughts on the partnership between Atsign and ZPE Systems?

As a software company, we need hardware to deploy on. We especially need hardware that can sit on the edge with all the right connectivity points. Atsign and ZPE Systems is really a perfect combination of great software and great hardware at the edge.

Bonus: What is Colin’s favorite firewall configuration for a ZPE box?

My favorite firewall rule is the one that costs the least money, and is ultimately the most secure firewall ruleset: Deny All. If you’ve got Deny All, that means that you don’t have to deal with the pain and complexities of firewall rules in order to address devices, which is what the real cost of networking is these days; it’s not necessarily the hardware, it’s actually having people to administer firewall rulesets. Having zero network attack surfaces, having a Deny All ruleset, just means you don’t have to have people changing rulesets all the time, which is a good thing.
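To make the Deny All idea concrete, here is a minimal sketch of first-match rule evaluation for inbound traffic with an implicit default of deny. The `Rule` shape is invented for illustration and does not correspond to any particular firewall's configuration syntax.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str                     # "allow" or "deny"
    dst_port: Optional[int] = None  # None matches any port


def evaluate(rules, dst_port: int) -> str:
    """Return the action of the first matching rule, or "deny" if none match."""
    for rule in rules:
        if rule.dst_port is None or rule.dst_port == dst_port:
            return rule.action
    return "deny"  # implicit default: anything not matched is refused


# The whole inbound ruleset from the answer above: a single catch-all deny.
DENY_ALL = [Rule("deny")]

if __name__ == "__main__":
    for port in (22, 443, 8080):
        print(port, evaluate(DENY_ALL, port))
```

With a single catch-all rule there is nothing to maintain per device, which is the administrative saving the answer points to.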

Best Intel NUC Alternatives

Intel NUC Alternatives

Service providers often struggle with the hybrid nature of their business. Even as they transition more towards a consumable service-based model that’s decoupled from traditional hardware solutions, there’s still a need for some sort of box to be deployed physically at a customer’s premises. Providers frequently rely on COTS (Commercial Off-The-Shelf) hardware to reduce costs and simplify the deployment process.

One commonly used COTS device is the Intel NUC, or “Next Unit of Computing,” which is a small appliance-like mini computer. Some service providers utilize Intel NUC devices as jump boxes, while others use them as a platform to deploy their services on-site. While these mini-computers are relatively inexpensive and easy to install, they create added security risks and management headaches that service providers need to be aware of.

This post highlights the challenges and security risks involved in relying on Intel NUC devices before discussing enterprise-grade Intel NUC alternatives that solve these problems.

Why is Intel NUC so popular in IT infrastructure?

Managed Service Providers (MSPs) and Managed Security Service Providers (MSSPs) often use Intel NUC jump boxes to remotely access the control plane of critical client infrastructure. These mini PCs typically run bare bones software to reduce licensing costs, which means they are unpatched, unmonitored, and unsecured. This lack of oversight and management makes Intel NUCs popular access points for hackers to breach client networks.

Why consider Intel NUC alternatives?

Service providers like to use Intel NUC boxes because they’re cheaper, faster to install, and take up less space than a full PC or server. NUCs are often deployed without antivirus, monitoring agents, or other security software installed, which excludes them from the service provider’s security coverage. Plus, clients are frequently unaware that these devices are in their racks accessing their infrastructure, so they don’t assess them in security and compliance audits. Other Intel NUC challenges include:

  • Lack of centralized management – Each Intel NUC is an island that’s managed and accessed individually, which makes it impossible to efficiently deploy updates, install new tools, or monitor for problems.
  • Insecure, unpatched OS – Operating systems and software contain thousands of potential vulnerabilities that hackers can exploit, so a lack of monitoring and patch management creates a huge security risk.
  • No hardware security – Intel NUC boxes lack any hardware security, which means someone could steal the device and use it to deploy malware or access client resources – or even just pawn the hardware.
  • Regulatory issues – When providers use unmanaged jump boxes to access client infrastructure, they expose their customers to potential noncompliance with privacy laws like HIPAA that require strict data access controls.
  • Affects insurance eligibility – Using an unsecured Intel NUC may also disqualify customers from receiving cybersecurity insurance benefits in the event of a successful breach.

While Intel NUCs are a quick and inexpensive way for MSPs, MSSPs, and other service providers to remotely access client infrastructure, they also make it easier for cybercriminals to breach enterprise networks. To reduce the attack surface without increasing the cost, hassle, or footprint of deploying jump boxes, you need an enterprise-grade solution that combines networking functions, security, and remote out-of-band access to the control plane to eliminate the need for a separate device.

Intel NUC alternatives from ZPE Systems

The Nodegrid product line from ZPE Systems simplifies the tech stack in data centers and network closets with all-in-one infrastructure management solutions. Nodegrid devices roll up gateway routing, switching, Wi-Fi, and 5G/4G/LTE out-of-band management to cut down on the number of boxes in the rack. They’re also enterprise solutions, which means they can be onboarded with your security team and covered by your monitoring, intrusion detection, antivirus, and other security controls.

In addition, all Nodegrid boxes are protected by hardware security features such as BIOS protection, self-encrypted disk (SED), UEFI Secure Boot, and Signed OS. Plus, Nodegrid’s hardware and software are completely vendor-neutral, allowing easy integrations with third-party security solutions and SAML 2.0 authentication. Nodegrid can even directly host other vendors’ security software to further reduce your tech stack.

Key Nodegrid features

 

All Nodegrid Devices Include:

Key features

  • Strong out-of-band management integration
  • Extensible applications with virtualization and containers
  • Zero Touch Provisioning (ZTP) over the WAN
  • Vendor-neutral, unified management via ZPE Cloud/Nodegrid Manager
  • Modern x86 64-bit Linux kernel
  • Extended automation based on actionable data
  • Failover to 4G/5G/LTE & Wi-Fi
  • Power control and monitoring
  • Orchestration support via Puppet, Chef, Ansible, and RESTful APIs

Security

  • BIOS protection
  • TPM 2.0
  • UEFI Secure Boot
  • Signed OS
  • Self-Encrypted Disk (SED)
  • Geofencing
  • X.509 SSH certificate support, 4096-bit encryption keys
  • Selectable cryptographic protocols for SSH and HTTPS (TLSv1.3)
  • Selectable cipher suite levels: high, medium, low, custom
  • SSL VPN (Client and Server)
  • IPsec, WireGuard, and strongSwan with support for multi-sites
  • Local, AD/LDAP, RADIUS, TACACS+, and Kerberos authentication
  • SAML support via DUO, OKTA, Ping Identity
  • Local, backup-user authentication support
  • User-access lists per port
  • Group/role-based authorization: AD/LDAP, RADIUS, TACACS+
  • Fine-grained and role-based access control
  • Firewall – IP packet and security filtering, IP forwarding support
  • MD5 / SHA System Configuration Checksum™
  • System event syslog
  • Custom security settings
  • Strong password enforcement
  • Two-Factor Authentication with RSA and DUO

Networking

  • IPv4 / IPv6 Support
  • Embedded Layer 2 switching
  • VLAN
  • Layer 3 Routing
  • BGP
  • OSPF
  • RIP (RIPv1, RIPv2)
  • QoS
  • DHCP (Client and Server)
  • VXLAN
  • DDNS
  • NTP

To learn more about the benefits of Nodegrid’s Intel NUC alternatives, contact ZPE Systems.

Nodegrid product comparison

The Nodegrid family of network edge routers delivers secure, Gen 3 OOB management for reliable remote access to distributed customer sites like branch offices or manufacturing centers.

Nodegrid Service Delivery Platform Family

 

Link SR

Bold SR

Hive SR

Gate SR

Net SR

Mini SR

CPU

X86-64bit Intel 

X86-64bit Intel

X86-64bit Intel 

X86-64bit Intel 

X86-64bit Intel 

X86-64bit Intel 

Cores

2

4 or 8

4 or 8

2, 4 or 8

2, 4, 8 or 16

4

Guest VM

1

1

1-2

1-3

1-6

1

Guest Docker

2+

2+

2+

2+

2+

2+

Storage

16GB – 128GB

32GB – 128GB

16GB – 128GB

32GB – 128GB

32GB – 128GB

14GB SED

Additional Storage

Up to 4TB

Up to 4TB

Up to 4TB

Up to 4TB

Up to 4TB

Wi-Fi

Yes

Yes

Yes

Yes

Yes

Yes

Cellular modem

1

1-2

1-2

1-2

1-6

1

5G

Yes

Dual 5G

Dual 5G

6x 5G

Sim slots

2

4

4

4

12

1

Serial Console Switch

1

8

Via USB

8

16-80

Via USB

Network

1x Gb ETH 1x SFP

5x Gb ETH

2x GbE ETH 2x 10 Gbps

4x 10/100/1000/2.5 Gbps RJ-45

2x SFP 5x Gb ETH

4x 1Gb ETH PoE+

2x 1Gb ETH 2x SFP+ Multiple expansion cards

2x 1Gb ETH

Data Sheet

Download

Download

Download

Download

Download

Download

The Nodegrid family of Intel NUC alternatives from ZPE Systems can help MSPs and MSSPs ensure secure, reliable remote management access to customer infrastructure without increasing costs.

Ready for a Demo?

To see one of ZPE’s Intel NUC alternatives in action, request a free Nodegrid demo! Request a Demo

Cisco 2900 EOL: Replacement Options


The Cisco ISR 2900 series of branch routers went EOS (end-of-sale) on the 9th of December 2017, and Cisco concluded support on the 31st of December 2022. In this guide, we’ll compare migration options for the Cisco ISR 2900 EOL models to help you select a solution that supports your business use case, deployment size, and future growth.

Disclaimer: This comparison was written by a third party in collaboration with ZPE Systems using data gathered from publicly available data sheets and admin guides, as of 5/12/2023. Please email us if you have corrections or edits, or want to review additional attributes: Matrix@zpesystems.com

 


Cisco ISR 2900 overview

The Cisco ISR 2900 is a line of enterprise gateway routers designed for branch and edge networking. It’s a modular solution that can be expanded with optional Network Interface Modules (NIMs) and Service Modules (SMs) for more functionality. There are two primary use cases for the 2900:

Converged branch networking – The ISR 2900 easily integrates with Cisco’s SD-WAN, SD-Branch, cloud security, and DNA network management software, can be extended with optional modules for added hardware capabilities, and supports NFV (network functions virtualization) for all-in-one branch networking.

Out-of-band (OOB) management – Using serial port modules, the ISR 2900 turns into an out-of-band (OOB) serial console solution that provides remote management access to the control plane of branch infrastructure.

The ISR 2900 is officially EOL as of the 31st of December 2022. The EOL models include all 2901, 2911, 2921, and 2951 ISR product SKUs.

Looking for replacement options for your other Cisco ISR EOL products? Read our guide to Cisco ISR EOL Replacement Options.

 

Cisco 2900 EOL replacement options

The discontinuation of the Cisco 2900 has left many organizations looking for migration options. Let’s compare two direct replacements from Cisco before discussing alternative options that deliver better branch management capabilities and greater opportunities for automation.

Cisco ISR 1100

Cisco ISR 1100 is a series of enterprise branch routers, though in this comparison we’re only looking at the models that support SD-WAN and thus serve as direct replacements for the discontinued 2900 models. The capabilities of the 1100 series vary, mostly because only some of the models are modular. For example, the fixed form-factor 1100-4G/4G LTE models have cellular functionality but offer fewer networking and security features. Conversely, the 1161X-8P and 112x-8P models are modular and can be extended with optional modules (like cellular for the 1161X or terminal server ports for the 112x-8P).

Even with these expansions, the compact ISR 1100s are best suited for smaller deployments in branch offices or small, provider-managed edge data centers. If your organization uses the ISR 2900 for converged branch networking, the 1100 series is the closest Cisco replacement, though it supports OOB serial modules as well.

Cisco Catalyst C8300

The Cisco Catalyst C8300 series is a modular branch and edge networking solution, though due to its large size, it’s sometimes used as a primary on-premises gateway router. There are four models to choose from: two 2RU units with 2 SM and 2 NIM slots, and two 1RU units with 1 SM and 1 NIM slot. Each chassis comes with 6 embedded Layer 3 Ethernet ports (1 Gbps and/or 10 Gbps) as well as a console port and USB port. All other port configurations and capabilities come via Cisco expansion modules, including options for 5G/4G cellular.

The Catalyst C8300 is a big, robust solution that’s designed for medium to large deployments such as campuses, colocation sites, and AI/machine learning data centers. The C8300 is primarily a converged branch networking solution like the ISR 1100 series, but it provides OOB management with optional serial cards.

Cisco 2900 replacement option comparison table

 

Cisco ISR 2900 (EOL)

Cisco ISR 1100

Cisco Catalyst 8300

Nodegrid Net SR

Nodegrid Serial Console Plus

Form Factor

1-2 RU

Desktop-1RU

1-2 RU

1 RU

1 RU

Max IPsec Throughput

Not defined

Up to 18.8 Gbps

Up to 18.8 Gbps

600 Mbps – 1.2 Gbps

600 Mbps

Total Onboard WAN or LAN 10/100/1000 Ports

2-3

4-6

4-6

2

2

Total Onboard WAN or LAN 10Gbps Ports

0

0

0-2

2

2

WAN Ports

2-3

0-6

2-6

1+, configurable

0-4

LAN Ports

2-3

0-6

2-6

4-84

0-4

Slots

2-3

0-1

2-4

5

0

Default Memory

512 MB

4 GB

8 GB

8 GB

4 GB

Max Memory

2 GB

8 GB

32 GB

64 GB

16 GB

Compute

UCS-E Card

On-board, Compute card

On-board

OOB Capabilities

Requires Serial Card

Requires Serial Card

Requires Serial Card

Included

Included

Environmental Monitoring

N/A

N/A

N/A

Included

Included

For users looking for a Cisco solution to replace their EOL ISR 2900, the ISR 1100 series and Catalyst C8300 are the closest direct replacements. However, both product lines suffer from a major limitation – they aren’t vendor-neutral.

While Cisco routers integrate with some third-party partners, they do not support custom or third-party applications for automation and orchestration, which limits you to the automation offered by Cisco’s software. This lack of open integrations increases the chances that a Cisco solution won’t be able to hook into all the hardware and software components of a distributed and multi-vendor network architecture.

For example, if you utilize different SD-WAN and next-generation firewall (NGFW) vendors at some of your remote sites, Cisco’s automation may not extend to these devices. That means you’ll need to send out technicians to all remote sites (which could number in the dozens or hundreds) just to set up these services when you otherwise could have deployed them automatically.

Want to learn more about breaking free of locked ecosystems? Read The Benefits of Vendor Agnostic Platforms in Network Management

When network solutions like the Cisco 2900 go EOL, it’s the perfect opportunity to look for alternative options that provide the functionality you need without locking you into an ecosystem or limiting your automation capabilities.

Cisco 2900 direct replacement options from ZPE Systems

ZPE Systems provides a line of vendor-neutral solutions for branch and edge networking called Nodegrid. The Nodegrid Net Services Router (NSR) and Nodegrid Serial Console Plus (NSCP) serve as direct replacements for Cisco 2900 EOL products.

Nodegrid Net Services Router (NSR)

The Nodegrid NSR is a modular branch networking solution that you can customize to increase your terminal server ports, storage space, processing power, or switch ports. The NSR delivers converged branch networking capabilities like SD-WAN, SD-Branch, and NFVs, plus it can host your choice of custom and third-party applications for automation, security, and more.

While the NSR is the perfect converged branch solution to replace the Cisco ISR 2900, it also provides 3rd generation (or Gen 3) OOB management. That means Nodegrid’s OOB network is completely vendor-neutral and can extend automation capabilities to all your legacy and mixed-vendor infrastructure for efficient deployments, management, and orchestration.

Want to see the Nodegrid converged branch networking solution in action? Watch a Demo

Nodegrid Serial Console Plus (NSCP)

The NSCP is a robust, scalable branch networking and out-of-band serial console solution. The NSCP comes in 16-, 32-, 48-, and 96-port models, so you can choose the solution that’s right-sized to your deployment and use case. Plus, you can get built-in 5G/4G LTE and Wi-Fi options for failover and out-of-band.

Like the NSR, the NSCP is an open platform that can run your choice of software to expand your capabilities and reduce your tech stack. It also delivers Gen 3 OOB management of all connected infrastructure, enabling true end-to-end automation in data centers, branches, and other remote sites. The NSCP is the perfect replacement for enterprises utilizing the Cisco 2900 for out-of-band management, though it also provides converged branch networking capabilities at any scale.

All Nodegrid devices run the open, Linux-based Nodegrid OS which can host your choice of third-party or custom applications, freeing you from vendor lock-in. You can even integrate infrastructure orchestration tools like Puppet, Chef, and Ansible to extend automation to end devices, regardless of vendor. This is what makes Nodegrid the world’s first Gen 3 branch networking solution.

Want to see how Nodegrid stacks up against Cisco’s replacement options? Click here to download the services routers comparative matrix.

Global support and supply chain

Leaving a trusted ecosystem behind to adopt alternative options can be risky, so it’s important to find a vendor that offers the support you need to make the transition and keep your operations running smoothly. ZPE Systems offers global product support using the “follow the sun” model, which means you get support when you need it, regardless of your timezone. You also won’t have to worry about supply chain issues causing stock shortages – ZPE supplies hyperscalers in 10K+ units per quarter and has great, consistent supply chain control.

Need to replace your Cisco 2900 EOL?

To learn more about replacing your Cisco 2900 EOL solution with the vendor-neutral Nodegrid platform, which can ship in as little as two weeks, contact ZPE Systems today. Contact Us

Cisco 2900 EOL product tables with migration SKUs

| Cisco 2900 EOL Model | In Scope Features | Replacement Product (modular form factor) |
|---|---|---|
| Cisco ISR 2901, Cisco ISR 2911, Cisco ISR 2921, Cisco ISR 2951 | Serial Console Module, Routing, 16 serial ports | ZPE-NSR-816-DAC with 1x 16-port serial module (1x ZPE-NSR-16SRL-EXPN) |
| Cisco ISR 2901, Cisco ISR 2911, Cisco ISR 2921, Cisco ISR 2951 | Serial Console Module, Routing, 32 serial ports | ZPE-NSR-816-DAC with 2x 16-port serial modules (2x ZPE-NSR-16SRL-EXPN) |
| Cisco ISR 2901, Cisco ISR 2911, Cisco ISR 2921, Cisco ISR 2951 | Serial Console Module, Routing, 48 serial ports | ZPE-NSR-816-DAC with 3x 16-port serial modules (3x ZPE-NSR-16SRL-EXPN) |
| Cisco ISR 2901, Cisco ISR 2911, Cisco ISR 2921, Cisco ISR 2951 | Serial Console Module, Routing, 64 serial ports | ZPE-NSR-816-DAC with 4x 16-port serial modules (4x ZPE-NSR-16SRL-EXPN) |
| 80 serial port option – no Cisco equivalent | Serial Console Module, Routing, 80 serial ports | ZPE-NSR-816-DAC with 5x 16-port serial modules (5x ZPE-NSR-16SRL-EXPN) |
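The modular rows above follow a simple pattern: one ZPE-NSR-816-DAC chassis plus one 16-port serial module per 16 console ports, up to the five modules shown for the 80-port option. As a rough sizing sketch in Python (the function names are hypothetical, and the five-module ceiling is an assumption read off the table rather than a documented limit), the arithmetic looks like this:

```python
import math

PORTS_PER_MODULE = 16  # each ZPE-NSR-16SRL-EXPN module adds 16 serial ports (per the table)
MAX_MODULES = 5        # assumption: largest configuration listed in the table (80 ports)


def serial_modules_needed(ports: int) -> int:
    """Return how many 16-port serial modules cover the requested port count."""
    if ports < 1:
        raise ValueError("at least one serial port is required")
    modules = math.ceil(ports / PORTS_PER_MODULE)
    if modules > MAX_MODULES:
        raise ValueError("more ports than the largest configuration in the table")
    return modules


def modular_replacement(ports: int) -> str:
    """Format the chassis-plus-modules combination the way the table lists it."""
    count = serial_modules_needed(ports)
    return f"ZPE-NSR-816-DAC with {count}x ZPE-NSR-16SRL-EXPN"


if __name__ == "__main__":
    for ports in (16, 40, 80):
        print(ports, "ports ->", modular_replacement(ports))
```

This only reproduces the table’s arithmetic for port counts that fall between the listed rows; actual part selection should be confirmed with ZPE Systems.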

 

| Cisco 2900 EOL Model | In Scope Features | Replacement Product (fixed form factor) |
|---|---|---|
| Cisco ISR 2901, Cisco ISR 2911, Cisco ISR 2921, Cisco ISR 2951 | Serial Console Module, Routing, 16 serial ports | ZPE-NSCP-T16R-STND-DAC |
| Cisco ISR 2901, Cisco ISR 2911, Cisco ISR 2921, Cisco ISR 2951 | Serial Console Module, Routing, 32 serial ports | ZPE-NSCP-T32R-STND-DAC |
| Cisco ISR 2901, Cisco ISR 2911, Cisco ISR 2921, Cisco ISR 2951 | Serial Console Module, Routing, 48 serial ports | ZPE-NSCP-T48R-STND-DAC |
| 96 serial port option – no Cisco equivalent | Serial Console Module, Routing, 96 serial ports | ZPE-NSCP-T96R-STND-DAC |

Want to see how Nodegrid compares to other serial console solutions?

Defusing Cisco SD-WAN Time-bomb requires out-of-band access

Viptela SD-WAN devices are used at large enterprise branches all around the world. SD-WAN succeeded by replacing dedicated, service provider-managed MPLS with customer-managed boxes that use commodity internet connectivity, giving leadership and engineering more options and more control. It solved the single-point-of-failure issues with internet connectivity and, by using overlay networking, created a secure WAN topology that was never thought possible over commodity internet links. The unsolved issue is the platform itself. Viptela SD-WAN vEdge devices, like many others, have a fatal flaw in their built-in encryption and authentication, and it reared its head on May 9th, 2023. The boxes shipped with a 10-year root certificate that was created in 2013. The flaw, designed in 10 years ago, is that this certificate is a single point of failure: once it expires, the platform can no longer be trusted to form encrypted connections. That takes down the entire control plane of the platform, which in turn takes down the entire data plane for all user traffic.

Designing PKI certificate management into the platform during the development cycle would have ensured that this never happened. The platform QA team would have been alerted that the certificate was about to expire and could have securely rotated it, or simply built a software patch with a new 10-year root certificate to push out the validity window. Problems like this are common and do fall through the cracks from time to time, taking down branch networks, and what IT teams need to do to fix them is even worse. Today we see many companies calling emergency meetings about “PKI lifecycle management.”
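The kind of expiry alerting described above can be sketched in a few lines of Python. This is a minimal illustration, not Cisco’s or ZPE’s tooling: the function names and the 90-day warning window are assumptions, and the example timestamp simply reuses the May 9th, 2023 date from this article with an assumed time of day.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days between `now` and a certificate's notAfter timestamp.

    `not_after` uses the text format returned by ssl.getpeercert(),
    for example "May  9 00:00:00 2023 GMT".
    """
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).days


def needs_rotation(not_after: str, now: datetime, warn_days: int = 90) -> bool:
    """True when the certificate expires within the warning window (assumed 90 days)."""
    return days_until_expiry(not_after, now) <= warn_days


if __name__ == "__main__":
    # Illustrative expiry date only; the exact time of day is an assumption.
    root_cert_not_after = "May  9 00:00:00 2023 GMT"
    today = datetime.now(timezone.utc)
    print("days left:", days_until_expiry(root_cert_not_after, today))
    print("rotate now:", needs_rotation(root_cert_not_after, today))
```

Run on a schedule against every certificate a platform depends on, a check like this gives a team months of warning instead of an outage.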

The fix requires upgrading the control software in the cloud or data center, which is not so bad. To automate and properly secure the certificate on the platform, however, the branch hardware also needs to be upgraded. This is due to limitations around secure chips, such as a TPM (Trusted Platform Module), that are needed to correctly secure the supply chain of the platform.

In the Viptela case, the SD-WAN device was the only way into the branch in most customer environments, so there is no way to upgrade it when it’s down. Cisco’s guidance requires an out-of-band device to have been previously installed at remote locations. If an out-of-band serial console device is not installed, the fix requires a costly truck roll at a rough cost of $1,200 per site, not counting the cost of downtime.
ZPE Systems has a cost calculator on its website that puts the cost of this Cisco outage for an organization with 400 branch locations at ~$5M USD. This covers truck rolls at $1,200 each and 8 hours of downtime at $1K/hour. ZPE Systems’ Cost of Downtime ROI Calculator
Many customers will suffer because it may not be possible to drive to 400 locations, and this problem will take down many branches for at least a week. Cisco Viptela was so successful with this product that thousands of customers are impacted, each with hundreds of locations, and there are not enough resources to fix this in a timely fashion.
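To make the downtime estimate above easier to reason about, here is a small Python sketch of the arithmetic using only the inputs quoted in this article. It is not the ZPE calculator: the names are hypothetical, and it assumes the $1K/hour downtime rate applies to each site. With only these inputs the sum comes to about $3.7M, so the calculator’s ~$5M figure presumably reflects additional inputs or a longer outage than the 8 hours quoted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutageInputs:
    sites: int                   # number of affected branch locations
    truck_roll_cost: int         # USD per site visit
    downtime_hours: int          # hours each site is down
    downtime_cost_per_hour: int  # USD per site per hour (assumed to be per site)


def outage_cost(inputs: OutageInputs) -> dict:
    """Break the outage cost into truck rolls and downtime, then total it."""
    truck_rolls = inputs.sites * inputs.truck_roll_cost
    downtime = inputs.sites * inputs.downtime_hours * inputs.downtime_cost_per_hour
    return {"truck_rolls": truck_rolls, "downtime": downtime, "total": truck_rolls + downtime}


# Figures quoted in the article: 400 sites, $1,200 per truck roll, 8 hours at $1K/hour.
costs = outage_cost(OutageInputs(sites=400, truck_roll_cost=1200,
                                 downtime_hours=8, downtime_cost_per_hour=1000))

if __name__ == "__main__":
    for item, usd in costs.items():
        print(f"{item}: ${usd:,}")
```

Changing `downtime_hours` shows how quickly the total grows when sites stay down for days rather than hours, which is the scenario the week-long estimate above describes.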

For this reason, Cisco requires out-of-band connectivity devices to recover from this issue. To be a completely touchless solution, the device should also support Zero Touch Provisioning (ZTP) so that it can simply be shipped in and physically connected by onsite staff. In the Cisco article below, you’ll see the note that the only way to recover from this issue is to have out-of-band connectivity serving as a dedicated control plane, so you can get back into your remote networks and remediate quickly and automatically.

Note Cisco’s caution below:
Caution: To recover these devices, out-of-band access is required.

Cisco

Source: https://www.cisco.com/c/en/us/support/docs/routers/sd-wan/220448-identify-vedge-certificate-expired-on-ma.html

At the time of purchase, it’s hard to sign a check for a device that may not be used very often. In practice, though, it is used often, and it actually saves money and increases productivity. Here’s how.

A resilience system with out-of-band management, such as the ZPE Systems Nodegrid Bold SR shown below, creates an isolated control plane network (left side of the graphic below) that can be accessed independently of the production network (right side of the graphic below). IT admins and automation systems connect to this network through ZPE Cloud to gain access to systems in the production network. This is a fundamental, validated reference design that is now the foundational requirement for resilient networks. This solution enables engineers to securely update the certificates on Cisco Viptela devices, and the automation built into the resilience system enables all branches to be updated simultaneously.

The Solution

ZPE Systems Out-of-Band Infrastructure Recovery Kit

ZPE is the leader in out-of-band serial consoles and services routers and directly addresses the resilience and uptime challenges this Cisco issue has caused. We are making our ZPE out-of-band recovery devices available as a subscription to help the community address this immediate issue.


Existing Viptela customers who are affected by the current issue and are struggling to recover their Viptela environments across the globe can use ZPE Systems’ “Out-of-Band Infrastructure Recovery Kit” to avoid truck rolls and bring sites up faster.

The kit contains a Nodegrid Mini SR with global LTE connectivity, a Cisco console cable, and all the connectivity and capabilities needed to recover your Viptela environment. Customers can order the kit directly from ZPE Systems, and we ship it to your HQ or any other location in the world. The unit automatically calls home to ZPE Cloud using its LTE connection. Using ZPE Cloud, you claim the Nodegrid Mini SR unit and gain access to the SD-WAN hardware console and management interface without having to set up a complex VPN connection or client. The setup is easy and presents zero attack surface in the remote location.

Administrators can easily distribute the Viptela firmware to all locations using ZPE Cloud storage and use the built-in tools to recover the Viptela appliances, without the need to send a highly skilled and overworked network admin on-site. We also have experts on standby to help you with the scripts you need to enable the recovery.

ZPE Systems Out-of-Band Infrastructure Recovery Kit – Overview


SKU: ZPE-MSR-24-4G-KIT

  • ZPE Systems Nodegrid Mini SR, with global LTE modem and global data SIM, allowing the unit to communicate with ZPE Cloud out of the box
  • Built-in global LTE modem
  • ZPE Cloud – provides global VPN and clientless communication with the Mini SR
  • ZPE Cloud Storage holds the vEdge images
  • USB Cisco console cable
  • All required tools to recover the Viptela appliance, including TFTP, Console access, connectivity testing and more

Get your Out-of-Band Recovery Kit to fix those ticking time bombs

Please get in touch with us if you need more details on the Out-of-Band Recovery Kit or want a trial unit. Send an email to info@zpesystems.com or use the form to get started.