Providing Out-of-Band Connectivity to Mission-Critical IT Resources

Data Center Environmental Monitoring: How to Stop Disaster Before It Strikes

Environmental threats—such as heat, moisture, power, smoke, and tampering—constitute a significant cause of data center downtime. According to a recent ITIC survey, a single hour of downtime could cost over $300,000 in lost business, which means you can’t afford to ignore the environmental risks in your data center.

Let’s discuss the most prominent environmental threats you need to prepare for, and how data center environmental monitoring can help you prevent expensive outages by detecting disasters before they strike.

What are the environmental risks in your data center?

When you host critical infrastructure in a data center or colocation facility, you may not be able to physically view your equipment or sense the conditions in your cabinet. This is especially true for highly distributed networks in which critical remote infrastructure resides in small data centers hundreds or thousands of miles away from your IT team’s headquarters. However, that infrastructure is still vulnerable to environmental damage, including:

Heat

A data center appliance (such as a switch, router, or console server) generally has an optimal temperature range in which it operates most efficiently. Keeping your data center within that range reduces your appliances’ energy consumption and helps control power costs. However, the more significant temperature risk is not a higher electricity bill but overheating. If the temperature rises beyond acceptable limits, for example because an AC unit fails, your critical infrastructure could overheat and malfunction. Detecting temperature fluctuations before systems overheat is thus crucial to preventing outages and costly equipment failures.
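As a rough illustration of detecting temperature problems early, the sketch below classifies the latest reading against warning and critical thresholds and flags a rising trend across a window of readings. The threshold values, the `assess_temperature` function, and the `TempStatus` type are assumptions made for this example. They are not taken from any vendor’s product or API, and real limits should come from your equipment’s specifications.

```python
from dataclasses import dataclass

# Illustrative thresholds in degrees Celsius (assumptions, not vendor values).
WARN_C = 27.0
CRITICAL_C = 32.0
RISE_C = 3.0  # rise across the window that may point to a cooling problem


@dataclass
class TempStatus:
    level: str    # "ok", "warning", or "critical"
    rising: bool  # True if the window rose by at least RISE_C


def assess_temperature(readings_c: list[float]) -> TempStatus:
    """Classify the latest reading and flag a rising trend.

    readings_c is a chronological window of inlet temperatures,
    oldest first.
    """
    if not readings_c:
        raise ValueError("at least one reading is required")

    latest = readings_c[-1]
    if latest >= CRITICAL_C:
        level = "critical"
    elif latest >= WARN_C:
        level = "warning"
    else:
        level = "ok"

    # Compare the newest reading with the oldest one in the window.
    rising = (latest - readings_c[0]) >= RISE_C
    return TempStatus(level=level, rising=rising)
```

In this sketch, a window such as `[22.0, 23.5, 25.5]` is still below the warning threshold but is flagged as rising. That is the kind of early signal that lets you react before anything actually overheats.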

Moisture

Moisture (from atmospheric humidity, water leaks, etc.) is essentially kryptonite to your data center infrastructure. When moisture collects in or on your appliances, it can cause corrosion, shorts, and component failures. It’s important to keep data center humidity within acceptable limits and to monitor for moisture within your cabinet.

Power

All data center equipment, including your appliances, climate controls, and physical security devices (biometric scanners, CCTV cameras, etc.), requires consistent and uninterrupted power. It’s important to monitor the flow of current coming into your facility, cabinet, or rack so you can detect outages as soon as possible and enable backup measures or an orderly shutdown.

Smoke

While data center fires may be rare, smoke is a more common risk to your critical infrastructure, and it can also be the first warning sign of another serious issue. For example, an uninterruptible power supply (UPS) may overheat and generate smoke, which can damage the sensitive internal components of your appliances and/or give you advance warning that a power outage is about to occur.

Tampering

When you have critical infrastructure hosted at a remote colocation facility, you need to prevent and detect unauthorized access to your equipment. Physical security controls like cabinet locks, CCTV cameras, and biometric doors will deter most malicious actors. However, someone could still make it past these barriers and access your equipment without authorization, so you also need to be notified whenever your cabinet door is opened or closed.

Your critical data center infrastructure is at risk from various environmental threats. Still, you may not have on-site staff available to monitor the conditions in your cabinet physically. That’s why you need a comprehensive data center environmental monitoring solution to prevent a disaster from causing expensive downtime.

Data center environmental monitoring prevents disaster before it strikes

A robust data center environmental monitoring solution consists of three key components:

  • Environmental sensors to collect data on the conditions in your cabinet. For example, the Nodegrid environmental monitoring solution includes sensors for temperature, humidity, airflow, particulates, smoke, and more.
  • A serial console server to connect those sensors to your data center infrastructure. A high-density serial console like Nodegrid gives you the ability to monitor and manage up to 96 devices in a single 1U appliance.
  • A data center infrastructure management (DCIM) solution to provide central management and analysis of your environmental data. For instance, ZPE Cloud gives you a powerful, web-based dashboard from which to monitor all your environmental sensors as well as manage your entire data center infrastructure.

A complete data center environmental monitoring solution like Nodegrid gives you all the tools you need to optimize the conditions in your rack and detect potential issues before they cause downtime.

Learn more about how Nodegrid’s data center environmental monitoring solution can help you prevent disaster before it strikes.

Contact ZPE Systems today or request a free demo.

Q&A With a 20-Year DCIM Expert

Data center infrastructure management, or DCIM, involves monitoring and controlling the infrastructure within a data center. That means supporting the appliances and the underlying infrastructure—everything from servers and switches to power and HVAC systems. 

Though DCIM is crucial for maintaining your business-critical systems and networks, it’s often misunderstood even by other IT professionals. That’s why we’ve asked two DCIM experts with 20+ years of experience to answer some of the most frequently asked questions about data center infrastructure management.

Tell us a bit about your experience in DCIM work. 

Expert 1: At the company I previously worked for, I was a Sales Engineer selling and supporting their DCIM solution, both pre- and post-sales.

Expert 2: I have 25 years of DCIM experience, working as the director of design-build for a colocation company, and as the director of engineering/CTO for a company named Global DCIM. 

Tell us about a typical day in the life of a DCIM engineer.

A typical DCIM engineer is responsible for the following day-to-day activities:

  • Expert 2: Asset management: recording the physical location, unit height, power control mapping, and other crucial information in the DCIM. 
    • Expert 1: Also, asset tagging appliances within the data center. 
  • Expert 1: CLR (Continuous License Reconciliation) management: keeping licenses up-to-date and optimizing subscription utilization.
  • Expert 2: Rack management: strategically organizing racks and monitoring and managing power loads, temperature, the weight of racks, and other physical and environmental factors.
    • Expert 1: This also includes changing rack and cabinet configurations as the environment changes.
  • Expert 2: New install planning: determining the best location and configuration for new equipment and physically installing it in the data center.
    • Expert 1: Recording new installs and configurations in the DCIM software, including the power control mapping.
  • Expert 2: Decomm planning: identifying end-of-life devices and organizing retirement activities such as uninstalling equipment and selling or disposing of it.
  • Expert 2: Change management compliance: implementing new equipment and solutions while maintaining compliance with relevant regulations or industry standards. 

Is there a part of DCIM work that people don’t hear about enough? Too much? 

Expert 1: Yes—in my opinion, there are three components of DCIM that often go unrecognized or overlooked:

Capacity planning

Capacity planning means you’re not just planning for your current setup, but also preparing to scale up in the future. For instance, let’s say you get four new Dell PowerEdge R730xd and six new HP ProLiant Gen10 servers. What if you placed them in rack-5, row-2? How much extra heat would they generate, and would you have enough cooling to support them? Do you have enough available power? DCIM capacity planning gives you the ability to prepare for these “what if” scenarios.
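A “what if” check like the one described above can be sketched as simple arithmetic. The example below is a minimal illustration, not how any particular DCIM product works: the `what_if` function, the per-server wattages, the rack power budget, and the cooling capacity are all hypothetical values chosen for the example. The one fixed fact it relies on is that one watt of electrical load produces roughly 3.412 BTU per hour of heat.

```python
WATTS_TO_BTU_PER_HOUR = 3.412  # 1 W of IT load is about 3.412 BTU/hr of heat


def what_if(rack_budget_w, current_load_w, cooling_btu_hr, additions):
    """Estimate power and heat for a rack after adding new servers.

    additions is a list of (count, watts_per_server) tuples.
    All inputs are planner-supplied estimates.
    """
    added_w = sum(count * watts for count, watts in additions)
    total_w = current_load_w + added_w
    heat_btu_hr = total_w * WATTS_TO_BTU_PER_HOUR
    return {
        "added_w": added_w,
        "total_w": total_w,
        "power_ok": total_w <= rack_budget_w,
        "heat_btu_hr": heat_btu_hr,
        "cooling_ok": heat_btu_hr <= cooling_btu_hr,
    }


# Hypothetical scenario: four servers drawing 500 W each and six drawing
# 450 W each, added to a rack that already draws 2,000 W, has an 8,000 W
# power budget, and has 25,000 BTU/hr of cooling available.
result = what_if(
    rack_budget_w=8000,
    current_load_w=2000,
    cooling_btu_hr=25000,
    additions=[(4, 500), (6, 450)],
)
print(result)
```

With these assumed numbers, the new servers add 4,700 W for a total of 6,700 W, which is about 22,860 BTU/hr of heat, so both the power and cooling checks pass. Shrinking the rack budget to 6,000 W would make the power check fail, which is exactly the kind of answer a planner wants before the hardware arrives.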

Project and workflow planning

It’s a DCIM expert’s responsibility to plan out new builds and installs and determine the workflow for all stakeholders in the project. How much time will the facilities team need to build out the racks? When will the rack and stack be ready for the IT team to jump in? This often involves using a tool such as a Gantt chart to plan and manage these workflows and timelines. 

Inventory management

Inventory management involves tracking the physical location of inventory in the data center using the following:

  • Asset tag number
  • IP address
  • Nameplate
  • “Friendly name”

A quality DCIM solution should allow you to see server or asset locations on the map of the data center. It should give you a 3D view that you can zoom into on the rack level to see the exact U-space location of the asset. You should also be able to “flip the rack” view from front to back. The DCIM should also give you a proper visual representation of each device.
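To make the inventory idea concrete, here is a small sketch of an asset record that carries the four identifiers listed above along with its physical location. The `Asset` class, its field names, and the `find_asset` and `u_range` helpers are hypothetical and invented for this illustration. A real DCIM tool will have its own schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Asset:
    asset_tag: str
    ip_address: str
    nameplate: str      # manufacturer label, e.g. model name
    friendly_name: str
    row: int
    rack: int
    u_start: int        # lowest rack unit the device occupies
    u_height: int       # number of rack units the device occupies


def find_asset(inventory: list[Asset], key: str) -> Optional[Asset]:
    """Return the first asset whose tag, IP, nameplate, or friendly name matches key."""
    for asset in inventory:
        if key in (asset.asset_tag, asset.ip_address,
                   asset.nameplate, asset.friendly_name):
            return asset
    return None


def u_range(asset: Asset) -> tuple[int, int]:
    """Return the (lowest, highest) rack units the asset occupies."""
    return (asset.u_start, asset.u_start + asset.u_height - 1)


# Hypothetical inventory with a single 2U server in row 2, rack 5.
inventory = [
    Asset("A-0001", "10.0.5.21", "PowerEdge R730xd", "db-primary",
          row=2, rack=5, u_start=10, u_height=2),
]

asset = find_asset(inventory, "db-primary")
if asset is not None:
    print(asset.row, asset.rack, u_range(asset))
```

Looking up a device by any of its identifiers and getting back a row, rack, and U-space range is the data that a rack-level visual view would be drawn from.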

Expert 2: Most people don’t realize that a proper DCIM expert tracks and labels every cable and cross-connect in the data center, which is a massive undertaking.

What are the major concerns for modern data centers?

Expert 1: Ensuring there’s available power, backup, and cooling will always be a concern.

Expert 2: Our biggest concerns are power, cooling, and space, which are the same issues we were worried about 20 years ago. The difference now is that the client typically wants to be part of this discussion, and in the past, they didn’t.

In your experience, do your clients approach your profession with realistic expectations of what you can do?

Expert 2: The client always says they want everything done right and documented thoroughly, but then when the timeline gets tight, they say, “just do it.” Then, when the install or maintenance window is complete, you’re not allowed to touch anything to label and document it correctly.

What do you think is the best thing data centers can do to reduce emissions and power/water consumption?

Expert 1: To reduce data center emissions, it’s essential to downsize, streamline, and organize. You should eliminate redundant and “ghost” servers, for instance, and consolidate your infrastructure.

Expert 2: As data center density increases, the power, cooling, and water requirements per rack will also increase. However, telemetry-based lighting and evaporative cooling will help reduce emissions.

Do you view DCIM cloud migration as a boon or a potential risk for your industry?

Expert 2: If anything, data center cloud migration makes the job on the ground more critical. However, moving to the cloud gives DCIM experts a single pane of glass to manage their assets, cabling, and power, which makes our jobs easier.

How have data centers changed over the last 20 years?

Expert 1: The adoption of cloud computing and of more streamlined, centralized management solutions has created a more efficient DCIM environment.

Expert 2: Compute power density has grown exponentially to support cloud services. This has led to higher power and cooling consumption.

What do you see as the future of data centers?

Expert 2: The future involves finding more efficient cooling techniques. For example, sites could be built on cold water streams or completely submerged in liquid. Another option is DC power bussing to eliminate thermal load at the gear level.

What advice would you give to young DCIM up-and-comers?

Expert 2: Be diligent in doing things right, or it will come back to bite you in the ass later!

What are the biggest DCIM industry trends coming in 2022?

Expert 2: DCIM typically revolves around layer 1—the physical layer—of the OSI model, which means it cannot be changed or replaced in the near future. However, as I mentioned before, I predict we’ll see a rise in alternative cooling strategies.

The right solution for your DCIM needs

A DCIM expert is responsible for many crucial tasks in your data center, including asset management, environmental monitoring, and strategic planning. One way you can make their job easier is by consolidating your infrastructure management behind one cloud-based pane of glass. For instance, Nodegrid gives your DCIM engineers complete control over your data center assets—including on-premises, colocation, and cloud infrastructure—from anywhere in the world. With additional features like remote OOB (out of band) management and a full range of environmental sensors, Nodegrid provides full data center visibility even during outages.

Give your DCIM experts the tools they need to succeed.

Contact ZPE Systems or request a demo of the Nodegrid infrastructure management solution today.
