Viptela SD-WAN devices are deployed at large enterprise branches all around the world. SD-WAN succeeded by replacing dedicated, service-provider-managed MPLS with customer-managed boxes that use commodity Internet connectivity, giving more options and power to leadership and engineering. It solved the single-point-of-failure issues with Internet connectivity and, using overlay networking, created a secure WAN topology never thought possible over commodity Internet connectivity. The unsolved issue is the platform itself. Viptela SD-WAN vEdge devices, like many others, have a fatal flaw in their built-in encryption and authentication. The issue reared its head on May 9th, 2023. The boxes shipped with a 10-year root certificate that was created in 2013. The flaw, designed in 10 years ago, is that this certificate is a single point of failure: once it expires, the platform can no longer be trusted to form encrypted connections. This takes down the entire control plane of the platform, which in turn takes down the entire data plane for all user traffic.

Designing PKI certificate management into the platform during the development cycle would have ensured that this never happened. The platform QA team would have been alerted that the certificate was about to expire and could have securely rotated it, or simply built a software patch with a new 10-year root certificate that pushes out the validity window. These are common problems that do fall through the cracks from time to time, taking down branch networks, and what IT teams need to do to fix them is even worse. Today we see many companies calling emergency meetings on “PKI lifecycle management.”
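
The kind of early warning described above can be sketched in a few lines of Python. This is a minimal illustration, not how any vendor actually implements it: the function names, the 365-day warning window, and the use of May 9th, 2023 as the exact expiry date are all assumptions made for the example.

```python
from datetime import datetime, timezone

# Assumed for illustration: the root certificate expires on the date the
# outage began, and we want a warning one year ahead (hypothetical window).
ROOT_CERT_NOT_AFTER = datetime(2023, 5, 9, tzinfo=timezone.utc)
WARN_DAYS = 365


def days_until_expiry(not_after, now):
    """Return whole days remaining before the certificate expires."""
    return (not_after - now).days


def expiry_status(not_after, now, warn_days=WARN_DAYS):
    """Classify a certificate as 'ok', 'rotate-now', or 'expired'."""
    remaining = days_until_expiry(not_after, now)
    if remaining < 0:
        return "expired"
    if remaining <= warn_days:
        return "rotate-now"
    return "ok"


if __name__ == "__main__":
    today = datetime.now(timezone.utc)
    print(expiry_status(ROOT_CERT_NOT_AFTER, today))
```

A check like this, run in a build pipeline or monitoring job against every certificate a platform ships, is what turns an expiry date into an alert long before it becomes an outage.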

The fix for this problem requires upgrading the control software in the cloud or datacenter, which is not so bad. To automate and properly secure the certificate on the platform, the branch hardware also needs to be upgraded. This is due to limitations around secure chips like a TPM (Trusted Platform Module) needed to correctly secure the supply chain of the platform.

In the Viptela case, the SD-WAN device in most customer environments was the only way to get into the branch, so there is no way to upgrade it when it's down. Cisco's guidance requires an out-of-band device to have been previously installed at remote locations. If an out-of-band serial console device is not installed, the fix requires a costly truck roll at a rough cost of $1200/site. This does not count the cost of downtime.
ZPE Systems has a cost calculator on its website showing that the cost of this Cisco outage for an organization with 400 branch locations is ~$5M USD. This covers the cost of truck rolls at $1200 each and the cost of downtime for 8 hours at $1K/hour. See ZPE Systems’ Cost of Downtime ROI Calculator.
Many customers will suffer, as it may not be possible to drive to 400 locations. This problem will take down many branches for at least a week. Cisco Viptela was so successful with this product that thousands of customers are impacted, each with hundreds of locations, and there are not enough resources to fix this in a timely fashion.
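
To see how the inputs quoted above combine, here is a rough back-of-the-envelope sketch in Python. It is not the ZPE calculator: the function name is hypothetical, and it assumes the 8 hours of downtime at $1K/hour applies to each site independently. With only these two inputs the total comes to about $3.7M, so the calculator's ~$5M figure presumably reflects additional inputs not listed here.

```python
def outage_cost(sites, truck_roll_usd, downtime_hours, downtime_usd_per_hour):
    """Return (truck_roll_total, downtime_total, grand_total) in USD.

    Assumes every site needs one truck roll and suffers the same
    number of downtime hours (an assumption for this sketch).
    """
    truck_roll_total = sites * truck_roll_usd
    downtime_total = sites * downtime_hours * downtime_usd_per_hour
    return truck_roll_total, downtime_total, truck_roll_total + downtime_total


if __name__ == "__main__":
    # Figures quoted in the article: 400 sites, $1200 per truck roll,
    # 8 hours of downtime at $1000 per hour.
    trucks, downtime, total = outage_cost(400, 1200, 8, 1000)
    print(f"Truck rolls: ${trucks:,}")
    print(f"Downtime:    ${downtime:,}")
    print(f"Total:       ${total:,}")
```

Note that downtime, not the truck rolls, dominates the total; if sites stay down for a week rather than 8 hours, the number grows accordingly.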

For this reason, Cisco requires out-of-band connectivity devices to recover from this issue. To be a completely touchless solution, the device should also support Zero Touch Provisioning (ZTP) so that it can simply be shipped in and physically connected by onsite staff. In the Cisco article below, you’ll see the note that the only way to recover from this issue is to have out-of-band connectivity to serve as a dedicated control plane to get back into your remote networks and remediate quickly and automatically.

Note the Cisco caution below:
Caution: To recover these devices, out-of-band access is required.


Source: https://www.cisco.com/c/en/us/support/docs/routers/sd-wan/220448-identify-vedge-certificate-expired-on-ma.html

At the time of purchase, it’s hard to sign a check for a device that may not be used that often. But not only is it used often, it actually saves money and increases productivity, and here’s how.

A Resilience System with out-of-band access, such as the ZPE Systems Nodegrid Bold SR shown below, creates an isolated control plane network (left side of graphic below) that can be accessed independently of the production network (right side of graphic below). IT admins and automation systems connect to this network through ZPE Cloud to gain access to the systems in the production network. This is a fundamental validated reference design that is now the foundational requirement for resilient networks. This solution enables engineers to securely update the certificates on Cisco Viptela. Automation built into the Resilience System enables all branches to be updated simultaneously.

The Solution

ZPE Systems Out-of-Band Infrastructure Recovery Kit

ZPE is the leader in out-of-band serial consoles and service routers and directly addresses the resilience and uptime challenges this Cisco issue has caused. We are making our ZPE out-of-band recovery devices available as a subscription to help the community address this immediate issue.

Existing Viptela customers who are affected by the current issue and are struggling to recover their Viptela environment across the globe can use ZPE Systems’ “Out-of-Band Infrastructure Recovery Kit” to avoid truck rolls and bring sites up faster.

The kit contains a Nodegrid Mini SR with global LTE connectivity, a Cisco console cable, and all the connectivity and capabilities needed to recover your Viptela environment. Customers can order the kit directly from ZPE Systems, and we ship it to your HQ or any other location in the world. The unit automatically calls home to ZPE Cloud using its LTE connection. Using ZPE Cloud, you claim the Nodegrid Mini SR unit and gain access to the SD-WAN hardware console and management interface without having to set up a complex VPN connection or client. The setup is easy and leaves zero attack surface in the remote location.

Administrators can easily distribute the Viptela firmware to all locations using ZPE Cloud storage and use the built-in tools to recover the Viptela appliances, without the need to send a highly skilled and overworked network admin on-site. We also have experts on standby to help you with the scripts you need to enable the recovery.

ZPE Systems Out-of-Band Infrastructure Recovery Kit – Overview

  • ZPE Systems Nodegrid Mini SR, with global LTE modem and global data SIM, allowing the unit to communicate with ZPE Cloud out of the box
  • Built-in global LTE modem
  • ZPE Cloud – provides global VPN and clientless communication with the Mini SR
  • ZPE Cloud Storage holds the vEdge images
  • USB Cisco console cable
  • All required tools to recover the Viptela appliance, including TFTP, Console access, connectivity testing and more

Get your Out-of-Band Recovery Kit to fix those ticking time bombs

Please get in touch with us if you need more details on the Out-of-Band Recovery Kit or want a trial unit.