Stratus Blog

Showing archives for category High Availability

Scenic and Critical Water Systems at Shanghai Disney Stay Safe with Stratus

4.26.2017 | High Availability, Wastewater

Imagine hosting 25 million theme park visitors a year, all expecting a magical and safe experience, and then imagine the public health consequences if water contamination occurred at the site. That’s why Shanghai Disney Resort, the newest and largest Disney theme park in Asia, takes no chances. They run their water monitoring and control systems on Stratus continuous availability solutions.

Shanghai Disney Resort is a massive theme park with six “lands” and shopping, dining, and entertainment venues. It spans a lush, 963-acre property rich with scenic lakes and public green spaces. Behind the scenes, a comprehensive water circulation and irrigation system keeps these natural areas safe and inviting for visitors by ensuring strict adherence to water quality standards.

The complex water treatment system has the capacity to treat 24,000 cubic meters of water per day. It relies on an automated software system to track, control, and analyze the equipment managing water flow, as well as data measuring water quality. Even brief unplanned system downtime could lead to water contamination and turn a nice day at the park into a public health nightmare.

To avoid such disasters, Stratus built the core infrastructure of Shanghai Disney Resort’s water treatment system on Stratus ftServers. Intelligent, self-healing ftServers prevent unplanned downtime from occurring in the first place. As a result, the park’s critical systems run continuously with no data loss.

In a Stratus case study about the park, Mr. Pang, Shanghai Disney Resort’s chief engineer, explains, “We can count on the Stratus ftServer solution to eliminate single points of technical failure so this critical system runs continuously—without unscheduled interruptions—for smooth, reliable operations.”

In fact, since going live, the ftServer solution has delivered continuous transmission of vital water quality and control system data with zero unplanned downtime. This allows the resort to consistently meet, and exceed, quality standards for second-class surface water throughout the property.

The Stratus solution also has been easy for Mr. Pang’s team to manage and maintain. They didn’t have to write any special failover scripts as in a traditional disaster recovery solution, or modify their applications to be cluster-aware. The software simply runs just as it would on any industry-standard server, except that the ftServer is continuously available.

As further testimony of the easy serviceability of Stratus solutions, the park’s IT staff uses a single ftServer dashboard to monitor the water treatment system infrastructure. The team can automatically restart applications if an unrecoverable error occurs and avoid any downtime or loss of in-flight data.

Stratus keeps water systems like the one at Shanghai Disney Resort up and running for organizations worldwide. That’s one of the key reasons the resort chose the ftServer. It’s proven to be reliable and easy to manage in numerous other water treatment facilities. By removing the complexity of preventing downtime, Stratus translates to low total cost of ownership and high return on investment. For Shanghai Disney Resort, that return is best demonstrated by the delight and safety of the tens of thousands of visitors who enjoy its vast facility every day.

Thinking of choosing virtualized high availability software over a continuous availability solution? So were 50% of Stratus customers until they learned this…

3.8.2017 | Cost of Downtime, High Availability

In a recent survey of over 500 Stratus customers, 50% responded that they were also considering virtualized high availability software during their buying process. “Aren’t HA (high availability) solutions good enough?”, “Can’t we just have a good DR system in place?”, and “Do we really need continuous availability?” are just a few questions that come to mind when exploring availability options.

Does this sound familiar? Here is what our customers learned as they compared options during their decision process:

Reaction is not prevention

Downtime prevention methods include a full range of options, from disaster recovery (backup and restoration), to data replication, to application failover and restart. Despite their differences, the one thing they hold in common is that they are reactive solutions – they cannot stop an outage from occurring – resulting in the very real risk of lost data and halted processing.

Choosing the right availability solution requires understanding your cost of downtime

A common way of distinguishing between availability solutions is by their percentage of availability. Traditional high availability solutions (such as clustering) can have an availability percentage of 99.95%, or 4.38 hours of downtime per year, whereas a continuous availability solution guarantees an availability percentage of 99.999+%, or between two and a half and five and a quarter minutes of unplanned downtime per year. These percentages may not seem substantially different, but they can mean the difference between hours and minutes of downtime – and big dollars. According to the Aberdeen Group, the average cost of downtime rose to $260,000 per hour between 2014 and 2016.
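The arithmetic behind those percentages is easy to check. The sketch below is only an illustration: the function names are made up for this example, a year is assumed to have 8,760 hours, and the $260,000-per-hour Aberdeen figure quoted above is treated as a flat hourly cost, which real outages rarely follow.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored in this sketch


def downtime_hours_per_year(availability_pct: float) -> float:
    """Expected downtime per year implied by an availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


def downtime_cost_per_year(availability_pct: float, cost_per_hour: float) -> float:
    """Yearly downtime cost, assuming a flat (illustrative) hourly cost."""
    return downtime_hours_per_year(availability_pct) * cost_per_hour


# Availability figures taken from the paragraph above.
for label, pct in [("HA cluster", 99.95), ("Continuous availability", 99.999)]:
    hours = downtime_hours_per_year(pct)
    cost = downtime_cost_per_year(pct, 260_000)
    print(f"{label}: {pct}% -> {hours:.2f} h/yr ({hours * 60:.1f} min), about ${cost:,.0f}/yr")
```

Swapping in your own hourly cost of downtime is the quickest way to see whether the gap between the two tiers matters for your application.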

The Stratus approach

While HA solutions can be sufficient in some instances, Stratus’ continuous availability solutions are vital for applications and systems that are business critical, especially when they reside outside of the datacenter. In industries where system failure, or even an hour of downtime, could mean destroyed reputations, damaged environments, the loss of thousands of dollars or, even worse, the loss of lives, anything less than continuous availability is simply not an option.

That is why our customers rely on Stratus to prevent the failure from occurring in the first place – and they found it made their lives operationally simpler, too.

Here are just a few examples of what those same Stratus customers have been experiencing:


Want to learn more about availability options? Click Here

3 Tips for Ensuring Continuous Availability of Smart Building Infrastructure

9.8.2016 | Building Automation, High Availability

The state of the art in building automation and security is evolving with incredible speed. But one thing is certain: Construction companies and building owners will become increasingly reliant on digital systems to keep their buildings safe, secure, comfortable and energy efficient. Focusing on the issue of fault tolerance right from the blueprint stage of any new construction or major renovation project is becoming increasingly important.

But how do you design an approach that rationalizes the infrastructure and management of all this disparate technology coming from numerous vendors in a streamlined, consistent way? When developing an availability approach, consider the following key questions.

1. Is it simple?

Automated building systems may be expanding, but building management budgets are not. An availability solution should be easy to deploy without any specialized development skills. And it should be easy to manage and easy to service in the event of a failure. Avoiding large, multi-component systems in favor of an all-in-one “appliance-like” solution reduces complexity, as well as physical footprint.

In addition, the availability solution should provide a single, end-to-end view of the entire building automation and security infrastructure. This simplifies management and makes it easier for building staff to proactively identify potential issues before they become problems.

2. Does it leverage industry standards?

Sophisticated building automation and security systems may involve literally dozens of applications from an array of vendors in a virtualized environment. This requires an underlying availability infrastructure that is based on industry standards, with the flexibility to support a wide range of applications and vendors. Standards-based solutions also allow the use of lower-cost off-the-shelf servers, further reducing total cost of ownership.

3. Is it optimized for smart building deployment?

A solution with little or no track record in building automation and security may not deliver on its promises. Look for technologies from vendors that have experience in the field and deep relationships with building automation and security application vendors. That’s a good sign of an ecosystem that’s been proven in many different building deployments. Don’t be shy—ask about their experience and connections within the industry.

You can read more about the growing need for fault tolerance as buildings become more automated in my article published recently in Construction Executive.


READ MORE: Check out the Stratus Always-On Infrastructure for Smart Buildings

Security System Availability: Balancing Your Needs and Your Budget

8.26.2016 | Building Automation, High Availability

For organizations across the industry spectrum, heightened awareness of both physical and cyber threats is driving increased investment in automation systems for building security. They are deploying more access control, more cameras, more alarms, more backup power systems, more logs and databases.

Yet these and other building automation and security systems are only effective as long as the servers that support them are up and running.

Approaches to building automation and security availability generally fall into three categories:

  1. Data backups and restores
  2. High availability (HA)
  3. Continuous availability (CA)

Which of these three general approaches is needed for your building security applications will depend on a range of factors.

First, however, it’s important to determine the state of your current security automation infrastructure. While your system architecture may be billed as “high availability,” this term is often used to describe a wide range of failover strategies—some more fault tolerant than others. In the event of a server failure, will there be a lapse in security? Can critical data be lost? Is failover automatic, or does it require intervention?

Assessing the potential vulnerabilities of your infrastructure can help you avoid a false sense of security that could come back to haunt you. This insight will also help you define your needs, guiding you toward the most appropriate availability strategies for your security environment.

So how much availability do you need? Obviously, deploying the highest level of fault tolerance for all of your security applications across the enterprise would be ideal. But the cost of such a strategy could be prohibitive. Moreover, not all security applications require the highest level of uptime.

For example, some applications may be deployed in a multi-tiered approach. With this arrangement, there could be a “master server” in a centralized location controlling a network of site servers, which regularly cache data back to the master server. In this scenario, you might configure the master server as fault tolerant (FT) but decide that HA is adequate for the site servers, given their workloads. It all depends on the criticality of each server’s function within the security automation architecture.
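One way to make that decision explicit is to write the rule down and apply it to an inventory of servers. The sketch below is hypothetical throughout: the `Server` fields, the `life_safety` flag, the server names, and the rule itself are assumptions for illustration, not part of any Stratus product.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    role: str          # "master" or "site" in this sketch
    life_safety: bool  # assumed flag: an outage here would open a security or safety gap


def availability_tier(server: Server) -> str:
    """Pick FT for the master server and for anything life-safety related, HA otherwise."""
    if server.role == "master" or server.life_safety:
        return "FT"
    return "HA"


# Hypothetical inventory for a multi-tiered security deployment.
servers = [
    Server("central-master", role="master", life_safety=False),
    Server("lobby-site", role="site", life_safety=False),
    Server("clinic-access-site", role="site", life_safety=True),
]

plan = {s.name: availability_tier(s) for s in servers}
print(plan)
```

The rule is deliberately simple. In practice the criteria would come from your own assessment of each server's function and workload.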

Carefully assessing your requirements for each security application and planning your infrastructure to provide the appropriate level of availability is the key to balancing your real-world needs with the realities of your budget.

Are your building security and automation systems ready for a disaster? Check out this Infographic containing key statistics from the Stratus 2015 Building Security and Automation Survey.

View Infographic

Security System Availability: Understanding Your Options

8.12.2016 | Building Automation, Fault Tolerance, High Availability

As building automation and security systems become increasingly reliant on server technology, ensuring the availability—or uptime—of the applications running on those servers is absolutely critical. But how much availability is “good enough”? And what’s the best way to achieve that level of availability?

To answer those questions, it’s important to understand the three basic approaches to server availability:

1. Data backups and restores

Having basic backup, data-replication, and failover procedures in place is perhaps the most basic approach to server availability. This will help speed the restoration of an application and help preserve its data following a server failure. However, if backups are only occurring daily, significant amounts of data may be lost. At best, this approach delivers approximately 99 percent availability.

That sounds pretty good, but consider that it equates to an average of about 87.6 hours of downtime per year, or more than 90 minutes of unplanned downtime per week. That might be good enough for a business application that is not mission critical, but it clearly falls short of the uptime requirements for building security and life-safety applications.

2. High availability (HA)

HA includes both hardware-based and software-based approaches to reducing downtime. HA clusters are systems combining two or more servers running with an identical configuration, using software to keep application data synchronized on all servers. When one fails, another server in the cluster takes over, ideally with little or no disruption. However, HA clusters can be complex to deploy and manage. And you will need to license software on all cluster servers, increasing costs.

HA software, on the other hand, is designed to detect evolving problems proactively and prevent downtime. It uses predictive analytics to automatically identify, report and handle faults before they cause an outage. The continuous monitoring that this software offers is an advantage over the cluster approach, which only responds after a failure has occurred. Moreover, as a software-based solution, it runs on low-cost commodity hardware.

HA generally provides from 99.95 percent to 99.99 percent (or “four nines”) uptime. On average, that means from 52 minutes to 4.5 hours of downtime per year—significantly better than basic backup strategies.

3. Fault-tolerance (FT)

Also called an “always-on” solution, FT’s goal is to reduce downtime to its lowest practical level. Again, this may be achieved either through sophisticated software or through specialized servers.

With a software approach, each application lives on two virtual machines with all data mirrored in real time. If one machine fails, the applications continue to run on the other machine with no interruption or data loss. If a single component fails, a healthy component from the second system takes over automatically.

FT software can also facilitate disaster recovery with multi-site capabilities. If, for example, one server is destroyed by fire or sprinklers, the machine at the other location will take over seamlessly. This software-based approach prevents data loss, is simple to configure and manage, requires no special IT skills, and delivers upwards of 99.999 percent availability (roughly five minutes of downtime a year, or less), all on standard hardware.

FT server systems rely on specialized servers purpose-built to prevent failures from happening and integrate hardware, software and services for simplified management. They feature both redundant components and error-detection software running in a virtualized environment. This approach also delivers “five nines” availability, though the specialized hardware required does push up the capital cost.
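To turn the three approaches into a first-pass decision, you can start from the downtime you can tolerate and work backwards. The following sketch assumes the approximate best-case availability figures quoted in this post (99, 99.99 and 99.999 percent); the tier list and function names are illustrative only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored

# Approximate best-case availability for each approach, as quoted in this post.
TIERS = [
    ("Data backups and restores", 99.0),
    ("High availability (HA)", 99.99),
    ("Fault tolerance (FT)", 99.999),
]


def max_downtime_minutes(availability_pct: float) -> float:
    """Yearly downtime implied by an availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


def least_demanding_tier(budget_minutes_per_year: float):
    """Return the first tier whose implied downtime fits the budget, or None."""
    for name, pct in TIERS:
        if max_downtime_minutes(pct) <= budget_minutes_per_year:
            return name
    return None


print(least_demanding_tier(60))  # an application that tolerates an hour a year
print(least_demanding_tier(10))  # an application that tolerates ten minutes a year
```

A result of `None` means the budget is tighter than any of the quoted figures, which is a signal to revisit the requirement rather than the tooling.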

Making server availability a cornerstone of your building security automation strategy pays dividends both in terms of day-to-day management and when situations arise that test your security. With the right strategy up front, your building’s security systems will be there when it really counts today and in the future. In today’s constantly changing, “always-on” world, that’s all the time.

Smart Buildings – The “hot topic” of the Week at IFSEC London

6.30.2016 | Building Automation, High Availability, IIoT, Smart Buildings

Having spent three days at IFSEC with our partner Tyco and 33,000 plus global security professionals, it was clear that Smart Buildings was the topic of the week.

Are Smart Buildings just hype, a buzzword that everyone is jumping on the bandwagon for, or are they a reality?

Buildings today are varied, complex structures with equally varied and complex systems and technology.

Over time, technical advancements, driven by the needs of security managers and backed by additional investment, have led to significant improvements. These improvements have touched access, lighting, heating, ventilation, security and air conditioning.

Taking building security and access technologies as an example, it’s true that these systems are becoming more and more intelligent by the day. The access cards or dongles that would have traditionally served the single purpose of giving you access to a building are expanding their functionality to additional use cases – from starting up your PC, to ordering your coffee, to activating lighting in your office. With more and more functionality, interconnectivity and use cases being addressed, the need for a reliable technological foundation for building management and security environments becomes greater.

Security managers have started to look at what comes next and to understand the new and future needs of building owners and employees. This also extends to environmental elements, such as electricity. To meet these new requirements, it is not enough to simply offer comfort, light and safety. The multiple systems that are managed independently today will be managed centrally by IT command and control centers in the future.

To consolidate multiple technologies under IT and ensure compliance with new objectives, businesses today are virtualizing their applications. In security, putting all your eggs in one basket requires significant and robust systems that must not fail.

Stratus is addressing this need with the latest evolution of our software solution – the Stratus Always-On Infrastructure for Smart Buildings. According to Vernon Turner, Senior Vice President of Enterprise Systems and IDC Fellow for the Internet of Things, “This latest evolution of the Stratus software portfolio combines Stratus’ always-on infrastructure with enhanced monitoring capabilities. The strong underlying infrastructure and visibility delivered by everRun Enterprise with everRun Monitor is addressing the challenge of managing the requirements for the consistent operation of Smart Buildings’ disparate infrastructure while maximizing efficiency and reducing cost.”

The evolution of the Smart Building is in its infancy and will remain so until buildings can really think for themselves without human intervention. That said, it’s not merely a reality of the future; advancements are being made every day. And analyst firm Gartner stated that by 2020 there will be over 26 billion connected devices – that’s a lot of connections!

The Path to Smarter Buildings

6.24.2016 | Building Automation, High Availability, Smart Buildings, Virtualization

The buildings we sit in or public spaces we visit (like airports) today are getting smarter all the time. A simple case in point is the lights that automatically turn on when you enter your office. A more advanced example is when your badge reader is tied to your company’s HR database and provides secure access to a room. A future example is when you can access a room with your badge (or phone) and that room’s lighting and climate is automatically set to meet your preferences. This future is real and a lot of technology is beginning to converge to usher it in. These advancements are all very exciting, but for those directly involved in creating smarter buildings, we should not underestimate the complexity involved. Here are some key considerations when charting your course towards a smarter building.

  1. Plan to consolidate your building technology – Right now every different building control (heating, power monitoring, video, access control) is on a separate application which is likely to be deployed on separate servers. This leads to a heavy footprint that is hard to manage and is likely costing you too much money. So, often the first step towards a smarter building is to virtualize your building’s software infrastructure. Stratus and our partners can provide you with the reliable foundation required for this with our recently announced Stratus Always-On Infrastructure for Smart Buildings.
  2. Take a close look at your needs for availability and fault tolerance – Once you have consolidated your solutions, you’ll invariably be forced to decide how and where to virtualize these applications. The easy answer is to just add the VMs into your existing data center. That’s a pretty good idea if your needs for availability and compliance are pretty basic (say in an office campus). But if you have critical areas to serve (such as access controls into a clinical environment or runway lighting controls at an airport) where no amount of downtime is acceptable, you may need a specialized solution deployed on site that ensures that failures of service won’t happen. And remember: the more applications or building services you consolidate onto an infrastructure, the more likely it is to need fault tolerance. Learn what you can do to eliminate downtime with Application Availability Solutions from Stratus.
  3. Understand that the smart building infrastructure is pervasive and expanding – The internet of things is enabling the deployment of cheaper devices to help build smarter buildings. However, all of those devices need some degree of monitoring and visibility. This is why we have built everRun® Monitor powered by Sightline Assure® into our Always-On Infrastructure for Smart Buildings. It goes beyond the standard server-based infrastructure and can monitor the entire gamut of smart building technology, giving building managers the insights they need to secure and operate their buildings more effectively.
  4. Get ready for analytics and compliance – A big part of the business case for smart buildings is the fact that the new intelligence driven by the data produced by the endpoint devices (sensors, cameras, badge readers) will help reduce costs and/or make buildings more secure. The application of analytics to these new building services will deliver those efficiencies and improvements provided that the data produced is consistent and available.

The smart buildings of the future are both realistic and beneficial. There are a lot of cost efficiencies to be gained, as well as safer spaces for people to work and visit. However, like many things, it needs to start with a reliable technical foundation on which to build.

What does it take to eliminate unplanned downtime and future-proof automation system assets?

6.16.2016 | Cost of Downtime, Fault Tolerance, High Availability, IIoT, Industrial Automation, technical webinar, uptime

Craig Resnick of the ARC Advisory Group shared his insights on how to eliminate unplanned downtime and future-proof automation system assets in a recent webinar. The webinar reviewed the ever-present consequences that can occur from unplanned downtime and some of the leading causes. Strategies to reduce unplanned downtime through implementing updated SCADA systems and using technologies such as virtualization and fault-tolerant computers were discussed, as well as how organizations can leverage those strategies to prepare for the coming wave of IIoT.

Here’s a summary of the key take-aways:

  • Understanding the true impact of unplanned downtime can lead to a better understanding of where investments can be made in automation systems to reduce such events.
  • Unplanned downtime can occur from a variety of areas, including human errors, failure of assets that are not part of the direct supervisory and control chain, and failure of the SCADA systems themselves. The result is lowered OEE, decreased efficiency and reduced profitability.
  • Adopting standards-based platforms and implementing technologies such as virtualization can consolidate SCADA server infrastructure and deliver a range of benefits, such as simplified management, easy testing and upgrading of existing and new applications and preparation for the IIoT.
  • When virtualizing it is important to understand that you need to protect your server assets, as moving everything to a single virtualized platform means that everything fails if the platform fails. There are various strategies to prevent this, but it is important to ensure that you don’t swap the complexity of a single server per application for a complex failure recovery mechanism in a virtualized environment.
  • Fault-tolerant platforms are a key way to avoid this complexity, delivering simplicity and reliability in virtualized implementations, eliminating unplanned downtime and preventing data loss – a critical element in many automation environments, and essential for IIoT analytics. It is important to note that disaster recovery should not be confused with fault-tolerance. DR provides geographic redundancy in case of catastrophic failures, but will not prevent some downtime or data loss. In fact, fault-tolerance and DR are complementary and they are often implemented together.
  • IIoT is driving OT and IT together so it is important to understand the priorities of each organization. In fact, OT & IT share a lot of common ground when it comes to key issues and this is a good starting point to cooperate in the move towards IIoT. Common requirements include no unscheduled downtime, cyber-security, the need for scalable and upgradeable systems and applications, as well as measurable increases in ROI, ROA and KPIs. Last but not least is future-proofing systems and preparation for future IIoT applications.

This webinar is a good way to start the process of looking into what needs to be considered for upgrading and modernizing automation compute assets, using technologies such as virtualization and fault tolerance, as the industry evolves to increased levels of efficiency and moves towards implementing IIoT.

Watch the webinar

The ABCs of the Industrial Internet of Things

6.13.2016 | High Availability, IIoT, Industrial Automation, Virtualization

The Industrial Internet of Things (IIoT) holds huge rewards for manufacturing companies from consumer goods makers to petrochemical firms to utilities. Companies, large and small, already are crediting IIoT with hard cost savings and advances in operational efficiency and product quality. This blog will answer frequent questions about IIoT we get from our industrial customers that you also might have.

What Is IIoT Anyway?

Sensor data, machine-to-machine communication, and automation systems have existed in industrial environments for years. IIoT builds on these technologies and bakes smart devices, machine learning, big data, and analytics into the mix.

With additional data sources and better intelligence and analytics embedded into the supply chain, you can adjust your industrial processes in real time. From there, you can expect tangible progress toward improved operational efficiency, return on assets, and profitability. That is the heart and soul of IIoT.

My Production Line Is Working Fine. Why Would I Change Things?

One of the biggest drags on inventory and order flow is unplanned downtime. For example, downtime for a large turbine powering a production line can cost a company up to $10,000 an hour. To avoid outages, manufacturers take production systems offline for periodic maintenance, needed or not. Not only does this get costly, but even planned downtime is disruptive.

Alternatively, some manufacturers are using IIoT for predictive maintenance of factory line equipment. In these situations, a smart sensor attached to an assembly line motor monitors performance and reports on changes, such as temperature or vibration, which may signal failing parts. A proactive repair of the motor could avoid a complete failure and potentially weeks of downtime, costing millions of dollars in lost revenue. Or, it could shave seconds from the assembly line process and help the business fulfill orders and recognize revenue faster.
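As a rough illustration of what "reports on changes" can mean in practice, the sketch below flags sensor readings that drift far from a baseline learned while the motor was healthy. It is a minimal example under stated assumptions: the three-standard-deviation threshold, the function name, and the vibration values are all invented for this sketch, and production systems typically use more sophisticated models.

```python
from statistics import mean, stdev


def find_anomalies(readings, baseline_size=20, threshold=3.0):
    """Return indexes of readings that deviate from the baseline mean by more
    than `threshold` standard deviations. The baseline is the first
    `baseline_size` readings, assumed to come from a healthy motor."""
    baseline = readings[:baseline_size]
    mu = mean(baseline)
    sigma = stdev(baseline)  # needs at least two baseline readings
    return [
        i
        for i in range(baseline_size, len(readings))
        if abs(readings[i] - mu) > threshold * sigma
    ]


# Made-up vibration readings: ten healthy samples, then a drift upward.
vibration = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.1, 0.9, 1.0,
             1.02, 0.98, 1.6, 1.9]

print(find_anomalies(vibration, baseline_size=10))  # indexes of suspicious readings
```

Flagged indexes would then feed whatever alerting or work-order process the maintenance team already uses.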

Such improvements translate into a compelling competitive advantage since the firms embracing IIoT turn out products faster and at a lower cost. That alone is a viable reason to embrace IIoT.

I’m Ready. How Do I Get Started?

Before getting started you need to ask yourself if your infrastructure is ready for IIoT.

Our recommended first step is to look at virtualization technologies to reduce your infrastructure and maintenance costs. The work effort involved in securing virtualized environments is less intensive than existing approaches and they are far easier to update and scale.

The good news is that by virtualizing you can continue running your existing automation systems to minimize your upfront investment. To ensure uninterrupted uptime, a fault-tolerant server that will keep other connected virtual servers running in the presence of a hardware problem is essential. Unlike clustered solutions, fault-tolerant systems are easier to manage and not subject to downtime when failover occurs.

Once you have your IIoT infrastructure in place, you can begin to enjoy the rewards of manufacturing processes that run faster, more cost-efficiently, and reliably than ever before.

Preparing for new applications that will come with IIoT (Industrial Internet of Things)

6.1.2016 | High Availability, IIoT, Industrial Automation, Virtualization

There are several themes that we at Stratus hear repeatedly from our Industrial Automation customers and prospects. The current hot topic is IIoT, and although many companies have no immediate plans to implement it, everyone wants to know more about what it is and how they can prepare for its arrival. A perennial question we get is, “what can I do to prevent unplanned downtime?”, or the closely related “what can I do to prevent data loss when my server fails?”. This is often followed by questions such as “I’m hearing that virtualization can simplify my HMI/SCADA/MES…. systems but won’t that take down everything if the server fails?” and “doesn’t virtualization mean I need a new complex system to prevent unplanned downtime and data loss?”

With new initiatives like IIoT and with the increasing threats to cyber-security, there is also no doubt that operational technologists and information technologists need to collaborate more deeply than ever before. Perspectives and priorities can be quite different, so getting a productive conversation started is often a challenge.

Solving these types of problems and understanding how to approach these issues is, after all, why companies turn to Stratus. It’s what we do.

Not everyone is ready to engage in a direct discussion with Stratus, so we have asked Craig Resnick of the ARC Advisory Group to create a webinar to help companies work through what is involved in applying technologies such as virtualization to eliminate unplanned downtime and prepare for new applications that will come with IIoT. If you are interested, register here
