How to Create the Optimal Disaster Recovery Architecture

We have worked through designing and configuring a backup strategy. We have dispatched others on a quest to define their needs and roles in a disaster situation.

Now we need to focus on the critical aspect for the IT admin – to architect a solution that will carry the organization’s technology through a disaster. The most appropriate label for this portion is “business continuity”.

We want to enable our systems to maintain enough functionality to support business processes through adverse situations. This expands well beyond simple backup.

Using Secondary Sites for Business Continuity and Disaster Recovery

Geographically distant alternative operating sites are the most direct way to achieve business continuity in a disaster situation. Larger businesses may have additional locations that they can designate as secondary to others.

In order to qualify as a secondary or alternative site, the location must have sufficient computing capability and space for dislocated personnel to stand in for the primary site. Logistical constraints may force operations to run at diminished capacity but set your goal at reasonable continuation of business functions.

Recent technological advances have alleviated the pressure for secondary sites to match primaries.

Evaluating Secondary Site Viability

Merely owning another building does not automatically mean that you can use it as a disaster recovery location. Most importantly, the sites must have enough geographical distance between them that a single disaster won’t disable both.

A secondary site must also not require a great deal of effort to make into a functional workspace. An empty warehouse will not replace a datacenter and call center in short order, for example.

If you don’t already have business justification for a secondary site, then you might need the same employees to operate out of both locations. In that case, they must be close enough that traveling to the alternative site doesn’t constitute a hardship.

If the most probable types of concerns in your region are building fires and tornadoes, then a few miles should suffice. If hurricanes, tsunamis, wildfires, or earthquakes threaten you, then you might face a greater challenge.

Always consider the primary business function of a site. Sadly, a secondary site simply cannot save some of them. If you distribute widgets from your main warehouse and a fire eliminates all the inventory, would a backup site accomplish anything meaningful? If you can file insurance claims against the loss and redirect suppliers and carriers to another warehouse, then you can answer, “yes”. If you cannot find a way for a backup site to continue the business functions of its primary, then it adds overhead without value.

Handling Split Responsibilities

With the limitless variety of configurations, one article series cannot cover all possibilities. This section is written as if all sites perform all roles (operations, finance, computing, etc.). Reality ranges somewhere between that and locations dedicated to specific activities.

In your documentation, rather than pairing one site to another, you can match a function of a site to the location that can act as its secondary. For example, a site that is just a datacenter might fail over to a building that houses server hardware and sales staff.

The computing equipment at the alternative location could use the dedicated datacenter as its secondary, but the sales functions would need to be targeted somewhere else.

Planning Hot Secondary Sites

A hot site can take over for a primary site very quickly. It has sufficient hardware onsite, remains operational at all times, and receives regular data updates directly from the primary site. Enabling such a feat requires detailed planning, high-quality equipment, frequent maintenance, and constant monitoring.

You will need regular, perhaps permanent, onsite staff. That staff must know how to keep the inter-site replication operating and how to fail over from the primary and back.

As you might expect, this level of functionality carries a significant cost. It works best in companies that have enough resources and volume to justify multiple locations even before considering business continuity. To operate as a hot secondary site, the location must have:

  • Sufficient connectivity to the primary site during normal operations to support replication; measure speed and stability
  • Server hardware powerful enough to operate in the absence of the primary site
  • Physical space for personnel
  • Suitable connectivity for failover conditions; think of computer and voice networking

An upcoming article dives deeper into replication. For now, understand that a hot site needs nearly constant data updates from its main site. That means that you need a fast and sturdy data connection between them. At the high end, you can order direct fiber runs between locations.
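
To gauge whether a given link can sustain that replication, compare your daily data churn against the circuit’s usable bandwidth. The sketch below illustrates the arithmetic; every figure in it is an assumption to replace with your own measurements.

```python
# Rough feasibility check for hot-site replication bandwidth.
# Every figure below is an illustrative assumption; substitute measured values.

daily_change_gb = 120          # assumed data churn per day, in gigabytes
replication_window_hours = 20  # assumed hours per day available for replication
link_mbps = 200                # assumed usable point-to-point bandwidth (megabits/s)
overhead = 0.25                # assumed protocol and encryption overhead (25%)

# Convert one day's churn into megabits, including overhead.
payload_megabits = daily_change_gb * 8 * 1024 * (1 + overhead)

# Hours the link needs in order to move one day's changes.
hours_needed = payload_megabits / link_mbps / 3600

print(f"Replication needs {hours_needed:.1f} h of the {replication_window_hours} h window")
if hours_needed > replication_window_hours:
    print("The link cannot keep up; consider a faster circuit or reducing churn.")
```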

Research the available options for point-to-point services. If you cannot find or afford such services, you will need to use general Internet connectivity instead. If possible, utilize two different providers per site. For maximum value, separate providers should use different infrastructure.

It greatly reduces your redundancy if both follow the same circuits to your building or if they route through the same intermediate facilities. Rural installations have great susceptibility to outages from ditch-digging accidents. Identify such concerns and plan mitigations and workarounds.

Place a premium on data security implementation. If you can afford point-to-point technology, then you have a lower risk profile for data interception.

For the greatest protection, encrypt traffic as it traverses sites. Have devices under your control perform the encryption and decryption. Even lower-end equipment frequently supports site-to-site VPN technology. Forcing all traffic that crosses the line through an encrypted tunnel prevents the need to police all communications separately.

As a bonus, you can alleviate the CPU load on computing equipment by allowing your replication software to skip its own encryption functions.

Be mindful of the computing and data storage needs of a hot site. It will require at least as much as the primary, and perhaps more. It may become a “data dump” for archival purposes.

As a secondary site ages without handling a catastrophe, it might find some of its resources “temporarily” repurposed. You will probably not have any real power to stop that from happening, and these “temporary” activities tend to become permanent.

Make certain to maintain a minimum level of functionality and capacity at each secondary site.

Employee spaces need to be ready to accept personnel at any time. Prepare them like any other work site. They need:

  • Power
  • Water
  • Lighting
  • Seating
  • Desktop computing
  • Voice support
  • Air handling

You might face some struggles acquiring maintenance support to make this viable. While the data recovery portions of a plan obviously fall to IT, these types of business continuity responsibilities fall outside its purview.

Your business managers will be reluctant to devote resources like this to any building that does not have a continuous personnel presence.

Even if you get sign-off in the first year, that does not preclude someone from looking back in a few years and deciding that it was wasteful and that the resources should go somewhere else. In those cases, you might lose your alternative site entirely.

If you are uncertain that you can maintain a hot site into perpetuity, strongly consider implementing a warm or cold site instead.

Planning Warm Secondary Sites

Warm sites mainly differ from hot sites in the lack of continuous data updates. We only treat that as a convention, not an unbreakable definition. In practical usage, a warm site may simply mean the closest that an organization gets to having a hot site. A warm site has two major distinctions from a hot site:

  • A warm site needs more than a few minutes’ effort to resume operations from the primary
  • The inter-site network connection between a primary site and a warm site does not need to pass any special quality tests

Because a warm site does not receive continuous updates, you must have a plan in place to transfer data to the site when needed.

You can achieve that by having employees transport backup tapes or drives to the site and restoring them on the hardware there. You can relay data through a cloud provider. Since your plan cannot depend on the presence of any specific individual, use the most generic descriptions and instructions possible.

Anyone that the task might fall to must understand their responsibilities before needing to undertake them.

The site does need to meet all the other tests that apply to a hot site; if it can’t function as an alternative location, then it fails the test entirely.

However, you have more flexibility as the architecture and definition of a warm site include an expectation that it will take some time to spin up. To properly distinguish itself from a cold site, it must have adequate onsite computing abilities to resume business functions from the primary site.

Planning Cold Secondary Sites

Cold sites have the widest definition of the three alternative site types. Anything that could replace primary site functionality can qualify. Like warm sites, they lack an active replication scheme. They differ from warm sites in that they do not keep sufficient computing hardware onsite. Such a site requires significantly less cost and effort to maintain, especially at hardware refresh intervals.

These savings come with a risk trade-off. If you lose your primary site due to a localized building fire, then you can probably get replacement hardware quickly. If the calamity is widespread and affects a large number of businesses, you might face significant supply and delivery challenges.

At the same time, if both your primary and secondary sites exist within the danger zone, you might work from the odds that one of the buildings remains usable. In that situation, it might make sense to only gamble with the contents of one facility.

A cold site must pass most of the non-computing tests of a hot site without the always-on restrictions. The time waiting for computers to arrive and data restoration to be completed also gives you time for office furniture delivery.

The power and environmental systems must function before people start, so find out if your utility companies can make that happen quickly enough that you do not need to maintain them when not in use.

Cold sites require a meaningful amount of time to begin work. They reduce your ability to continually conduct business. Another upcoming article will explore the technologies that can greatly mitigate these shortcomings.

Ongoing Maintenance for Secondary Sites

If all goes well, you will never need to use a secondary site. Unfortunately, such good fortune can also cause a loss of interest and long-term unwillingness to sink further funds into it. You must include all secondary locations in the regular updates of your plan.

Ask:

  • Does the site still have sufficient hardware to take over for the primary?
  • Do we know that power, water, lighting, and environmental systems function?
  • Do current employees know how to get to the site?
  • Do current employees understand their role in transitioning to the alternative site?
  • Do we have monitoring in place that guarantees the quality of replication? (One possible automated check appears after this list.)
  • Are we replicating everything? Have we added any systems since we last answered this question?
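
As a sketch of what such monitoring might look like, the following compares replication lag against an RPO threshold. The get_last_replica_timestamp function is a hypothetical stand-in for whatever interface your replication software actually exposes.

```python
# Sketch of an automated replication-health check for plan reviews.
# get_last_replica_timestamp() is hypothetical; replace its body with a call
# to whatever your replication software exposes (API, log query, status file).
from datetime import datetime, timedelta, timezone

RPO = timedelta(minutes=15)  # assumed recovery point objective

def get_last_replica_timestamp() -> datetime:
    # Placeholder: query the replication tool here.
    return datetime.now(timezone.utc) - timedelta(minutes=3)

def replication_within_rpo() -> bool:
    lag = datetime.now(timezone.utc) - get_last_replica_timestamp()
    return lag <= RPO

print("Replication healthy" if replication_within_rpo() else "RPO exceeded; investigate")
```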

Site maintenance goes well beyond the functions of IT. Keep the relevant departments invested. When you perform reviews of the disaster recovery plan, invite them to provide updates.

Analyzing Disaster Recovery Hardware Needs

In a perfect world (perfect except for disasters, anyway), you would establish secondary sites as complete mirrors of their primaries. Budgets and managerial tolerances rarely make that possible. So, you’ll need to document hardware needs to enable disaster recovery.

If you can afford secondary sites, then determine acceptable hardware levels. Whether you have only one site or many, you must have access to the necessary hardware to make disaster recovery possible.

End-User Infrastructure and Systems

Some things have no room for reduction. Every knowledge worker will need a computing device. Every one of those devices will need some way to attach to the network. On the non-technical side, each person will need a chair and a work surface.

The business managers responsible for related operations will need to participate in planning. They can provide headcounts and need assessments relevant to replacement and secondary site concerns.

End-user networking will require a physical survey of any secondary sites. You can determine port counts easily, but even a cold site should have cabling in place before disaster strikes.

You might also uncover conditions that dictate a different deployment strategy, such as per-floor local hardware instead of home runs from each endpoint.

Inter-site and Internet connectivity need planning as well. If you want the secondary to act as a hot site, you will need enough bandwidth, reliability, and security to safely transmit data from the primary.

If the site has another use when not in business continuity mode, then its current Internet connection may not have sufficient bandwidth to accommodate overflow employees from the primary. Consult employees and have them think through what they need to conduct a normal day’s business. Plan for printing, faxing, and other needs.

Server Infrastructure and Systems

For single-site recovery planning, usually you only need the specifications of your hardware. If you buy using any sort of account with a vendor, they probably maintain a purchase history. However, they probably don’t know the purpose of any of it.

For the best results, include a hardware catalog in your disaster recovery planning. Specify the hardware’s purpose, then its specifications. For general purpose equipment that exists to extend coverage, such as end-user aggregator switches and printers, you can use locations instead.
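
As an illustration of that purpose-first ordering, a catalog entry might look like the following sketch. The field names and sample entries are illustrative assumptions, not a prescribed format.

```python
# Sketch of a hardware catalog entry: purpose first, then specifications.
# Field names and sample data are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class CatalogEntry:
    purpose: str        # why the hardware exists; listed first by design
    specification: str  # model, CPU, RAM, storage, firmware level, etc.
    location: str       # site and room; sufficient alone for generic gear

catalog = [
    CatalogEntry("Line-of-business database host",
                 "2U server, 2x 16-core CPU, 512 GB RAM", "Pinewood B, room 114"),
    CatalogEntry("End-user aggregator switch",
                 "48-port 1 GbE managed switch", "Main office, floor 2 closet"),
]

for entry in catalog:
    print(f"{entry.purpose}: {entry.specification} ({entry.location})")
```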

If you use hot or warm secondaries, they will need to have server systems onsite. Take care when configuring standby server hardware. There will be temptation to purchase lower-powered equipment than what you have in the primary site.

Since you may need to run at reduced functionality, that seems logical. However, you may fail to receive sufficient funding for the secondary site when it comes time to refresh the hardware at the primary site. If that’s a concern, then consider using somewhat higher-end equipment than you strictly need.

You might need to add switching, routing, firewall, and load-balancing equipment for any servers that will only operate in a failover condition. Having enough between sites to enable replication does not mean that it can also suddenly take on the load of dozens or hundreds of users performing their daily roles.

Inter-Site Hardware

Beginning the use of multiple sites for disaster recovery requires additional equipment. You have many architectural decisions to make, and some might challenge your networking teams’ current knowledge levels. Among the things to consider:

  • Will you use point-to-point networking to enable replication?
  • Will you maintain a constant direct Internet connection at the secondary, or will you have it physically connected but only have your provider turn it on when needed?
  • If you have a constant point-to-point connection, will you also have constant Internet connections at the secondary?
  • Will you require the remote sites to tunnel through the primary for their Internet access and only enable direct-connect in the event of an emergency?
  • Do you have the necessary hardware to perform the desired functions at each location?
  • Does your staff have the networking knowledge to configure this as desired?
  • Will you use temporary consultants to configure things and repair them on-demand or train your staff?

Much of your decision-making will depend on how much of a shift these secondary sites represent for your organization. If you already have multiple sites and deal with these problems today, then you likely also have the expertise on hand. If you have always used a single site with simple networking, adding even one connected site can greatly complicate everything.

While the staff that you have today can certainly learn the additional functionality, you have no guarantees that they will stay. If you cannot afford to hire that level of talent into perpetuity, then consider hiring a professional networking firm to architect and maintain the inter-site links.

Disaster Recovery Hardware

With all of the talk of remote sites, networking tends to dominate the discussion. Don’t neglect the systems that will truly make disaster recovery possible. If you use tapes, make certain that you have access to tape drives that can read them. For tapes recorded last week, that’s easy. For tapes recorded in 2002, you might have to work harder.

As things transition more to using commodity hardware and online services, this concern shrinks. Look through your backup systems for anything that might require special handling if you lose the entire facility where the backup was taken. Make sure that you can restore its contents on alternative hardware.

Maximizing Disaster Recovery Architecture

The hardest question in business continuity planning: “What are we missing?” Even comprehensive guides don’t prepare you for everything. Sometimes, after going over a prepared checklist or write-up, we have a hard time thinking beyond it.

For help, review your brainstorming sessions from earlier articles. Reach out to colleagues that have a stake but have not seen what you’ve already come up with. Take a physical walk through your primary site and look for anything that wasn’t brought up in meetings.

At no point should you claim that you have “finished” your planning. Always leave a few blank lines, at least metaphorically, for more information. Add disaster recovery tie-ins to any formalized processes for starting new or updating existing projects of any kind. Start up a system for employees to suggest items that didn’t make the initial plans.

Another recent “wrinkle” in this planning is the adoption of work-from-home or work-from-anywhere schemes. Depending on your industry vertical, not everyone may need to be in an office to perform their duties.

However, this presents other challenges to include in your planning. If your solution to a burnt-out office is “everyone just works from home”, do you have the security, networking, and systems infrastructure to facilitate it? And if you do, what happens if the fire was larger and many of those homes were also destroyed?

To properly protect your virtualization environment and all the data, use Hornetsecurity VM Backup to securely back up and replicate your virtual machine.

We ensure the security of your Microsoft 365 environment through our comprehensive 365 Total Protection Enterprise Backup and 365 Total Backup solutions.

For complete guidance, get our comprehensive Backup Bible, which serves as your indispensable resource containing invaluable information on backup and disaster recovery.

To keep up to date with the latest articles and practices, pay a visit to our Hornetsecurity blog now.

Conclusion

Establishing the optimal disaster recovery architecture involves a multifaceted approach. Having navigated through backup strategies and defined roles in disaster situations, the focus shifts to business continuity. Geographically distant secondary sites play a crucial role, requiring thorough evaluation for viability.

The distinction between hot, warm, and cold secondary sites necessitates careful planning, considering factors like hardware needs, inter-site connectivity, and ongoing maintenance. Analyzing end-user and server infrastructure, as well as maximizing disaster recovery architecture, ensures a comprehensive strategy.

The ever-evolving nature of business demands continuous review and adaptation, leaving room for improvement and innovation in disaster recovery planning.

FAQ

What are the steps in disaster recovery?

Steps in disaster recovery:

  • Risk Assessment: Identify potential risks and assess their impact.
  • Business Impact Analysis (BIA): Determine critical business functions and acceptable downtime.
  • Planning: Develop a comprehensive disaster recovery plan.
  • Data Backup: Regularly back up critical data offsite.
  • Redundancy: Implement redundant systems and infrastructure.
  • Testing: Regularly test the disaster recovery plan to ensure effectiveness.
  • Training: Educate employees on their roles during a disaster.
  • Documentation: Maintain up-to-date documentation of systems and procedures.

What is the best method for disaster recovery?

The best method for disaster recovery involves a combination of:

  • Data Backups: Regularly back up critical data.
  • Cloud Services: Leverage cloud platforms for data storage and application deployment.
  • Redundancy: Implement redundant systems and infrastructure.
  • Regular Testing: Regularly test the disaster recovery plan to identify and address potential issues.
  • Automation: Use automation tools for faster recovery processes.

How long should disaster recovery take?

The duration of disaster recovery varies based on factors like the extent of the disaster, IT infrastructure complexity, and the effectiveness of the recovery plan. Organizations set a Recovery Time Objective (RTO), ranging from minutes to hours, depending on business priorities and criticality. RTOs differ for each system or service, aiming for swift restoration to minimize downtime.

Essential Objectives your Disaster Recovery Strategy Must Achieve

When we talk about disaster recovery and business continuity planning, we devote most of our time and space to backup operations. For technologists and systems administrators, that makes sense. However, this is not the full story.

For almost everyone else in a business, other things matter more. We touched on some of these points briefly while discussing the planning phase.

In this article, we will expand on them. Because IT often drives continuity and recovery planning, it may fall to you to motivate the other business units to participate.

Disaster Recovery Planning Beyond the Datacenter

During risk analysis, you likely found many hazards that would damage much more than data and systems. Fire threatens everyone and everything.

Floods are nearly as ubiquitous; even if you build your business atop a mountain miles away from a river, you still have plumbing. Some sites need to worry about civil unrest and terrorism. Simple mistakes can have far-reaching implications.

Your disaster recovery strategy must consider all plausible (and perhaps some implausible) risks. Just as you plan to recover systems and data, you must also think of your people, buildings, equipment, and other property.

Use your findings as a basis to bring the non-technical groups into the process. You might need to convince executives in order to gain the leverage that you need. Remember that the best data protection and recovery schemes mean nothing if the organization has no plan to continue operations.

A question template to capture interest: “How do we satisfy customers/meet our contractual obligations/continue making money in the event of…?”

Establish plan scope

During the initial discovery phase, you involved the leaders of other departments and teams. In addition to their technical requirements, they also have knowledge of the personnel, locations, and items needed to carry out their group’s function.

All of that information is vital to understanding the business’ critical IT operations. Hopefully, those sorts of things were included in the early stages of planning. The sample checklists provided in the appendixes have a handful of questions related to people and things.

To fully encompass your organization, your disaster recovery plan must expand further.

Covering locations

We will revisit the subject of business sites a few times during this discussion. Right now, we need to think in terms of which locations to include in your plan and what to record about them. It’s unlikely that you would exclude any working site, but your company might own some empty or unused properties.

When you update your plan in the future, you may need to remove locations that the organization no longer owns or controls.

When you place sites into the scope of your disaster recovery plan, you should clearly define each site’s role within that context. Labels like “main campus” don’t mean much to someone trying to address a catastrophe.

For every place listed in your plan, include:

  • Where: Identify locations by address (e.g., 7904 West Front Road). If your organization has informal shorthand and anyone likely to read your disaster recovery document will understand it, you can use that as well (e.g., “Pinewood B” to represent the B building on your Pinewood Road campus).
  • Normal operations purpose: Succinctly and clearly note the typical purpose of the site (e.g., “primary accounting site”). Small companies may not list anything other than “main operations” or “main office” or the like.
  • Disaster recovery operations purpose: List the expected use of the site during or after a disaster if it survives. Be creative and get others involved, such as the head of the team or department responsible for building maintenance. Use entries such as “overflow site for accounts receivable”, “alternative shipping and receiving facility”, and “tornado shelter”.

This exercise will expose many things that staff should consider. Where will employees go if you lose a site? What about customers? Does every location have a disaster response plan? Where can we shelter people?

If a loss impacts the loading dock, where else can we ship and receive freight? Look out for other vital information to include.

Protecting equipment

The term “equipment” connotes different things, depending on your industry and business function.

Your disaster recovery plan needs to account for all kinds. If it’s tangible and isn’t land, a building, or inventory that you sell as part of your normal business operations, then it might also have a place here. The equipment inventory portion of a disaster recovery document addresses these concerns:

  • What was lost or damaged in a disaster, and what is still serviceable?
  • What qualifies for an insurance claim?
  • What leases, loans, and rentals were impacted?
  • What bulk small items would need replacements? (e.g., pens, paper stock)

Make your search broad. You need to cover vehicles, printers, desks, and anything else that your business would need in order to return to full functionality after a destructive event.

You will likely separate this portion from your major IT inventory. Personal computers, devices, and printers could logically appear in either place. You might also wish to create separate lists for different departments, sites, etc.

Use any organizational method that makes sense and would have value in the hectic aftermath of a calamity.

Handling business inventory

If your business works with inventory flow (retail, manufacturing, etc.), then you will need a section for it, separated from the equipment entry. Due to its fluid nature, precision gains you nothing but work. Instead, document its supporting structure. Examples:

  • Link to sites (warehouse, retail outlet, etc.)
  • Methods to account for damaged, destroyed, or stolen inventory
  • Disposal processes for damaged and destroyed inventory
  • Mitigations in place (fire suppressant systems, backup generators for freezers, etc.)
  • Contacts to repair, replace, and reset mitigations
  • Information on insurance coverage and processes

Remember to stay on target: you are not trying to duplicate your inventory system. You only want to incorporate inventory management into your disaster recovery plan.

Preparing and safeguarding personnel

Human life and safety concerns make this portion of your planning stand out. For the bulk of your physical items, you will have to wait to respond until after the disaster. People need immediate attention and care.

As this article primarily targets technology workers, these responsibilities may not fall to you. Either way, the organization’s disaster recovery plan must include people-related preparedness and response points. Elements to consider:

  • Building exit points
  • Evacuation routes
  • Emergency power and light sources
  • First aid kits
  • Fire suppression devices and systems
  • Extreme weather shelter
  • Staff drills
  • Notification/call trees
  • Check-in procedures

Some of these items might be excessive for your situation. For instance, if you have a one-room shop with two employees, then your evacuation route is “door”, your call “tree” is a flat line, and drills are not the best use of your time.

If something does not make sense to include in your plan, skip it or use some sort of placeholder to fill in as your company grows. No one should disregard this step entirely. All organizations of all sizes can find something of value here.

Do not consider this list comprehensive. Compare it to your identified risks. Gather input from others. If your business already has safety or emergency management teams, join forces. They have likely worked through all of this, and you only need to find logical connection points to your data and systems recovery plans.

Even if you do not have a designated team, someone might have done some of this work to comply with regulatory demands.

Prioritizing and documenting departmental tie-ins

Take exceptional care with organizational seams. In the example that mixed IT and sales, the sales staff will depend on other teams (depending on the disaster conditions). They will be limited in their response until those other components fall into place.

Without a predefined process, IT will likely receive a call every few minutes from a different salesperson asking, “Are we up yet?” To prevent that, specify how response teams will utilize the notification system.

Designate points of contact between the departments. When you take those spurious calls, tell the caller who in their department was designated to disseminate information. These small preparations reduce frustration and interruptions.

Delegating information collection and retention

One person will not bear the responsibility of all items covered in this chapter. Depending on the size and makeup of your organization, you may need to involve people from several departments. You will have the best luck with subject-matter experts, especially if they have already done some of this work.

For instance, most manufacturing companies that existed before computers have had equipment and inventory control systems in place for a long time.

Plan to encounter some resistance. Department heads might see this as an encroachment into their territory. Some people just will not want any more work to squeeze into their day. Have some points ready to show the benefits of cooperation.

Disaster recovery planning is a unified effort for the entire company, not a way to wrest control. In a disaster, every department will be expected to take responsibility for their part. A plan will help to establish the rules in advance.

The central plan only needs sufficient information to coordinate a recovery effort. If a department already has a detailed procedure, then you might only need to document its input and output junction points to other departments.

However, make it clear that you cannot simply use something like, “to recover the billing department, call Bill.” A disaster recovery plan cannot depend on any individual.

If all else fails, you should have gotten executive sign-off when beginning this project. Leverage that as a last resort.

Just as you have learned that business continuity planning is an ongoing process, not an event, you will need to make the leaders of the other business units aware. Set up a quarterly or biannual schedule for everyone to bring their updates and to review the document.

Keep as much of the documentation together as possible. Some departments may insist on maintaining their own. IT often takes on business continuity planning because it is the hub, but the plan belongs to the entire organization. Integrate as much of it as you can.

Restoring Non-data Services

Just as we need to document and prepare for disaster to strike our business units, we also need to have a path back to functionality.

We’ve touched on this in previous points of the text where it made sense. Recovery requires a substantial amount of time and effort to perform. It should have a matching level of representation in your plan.

If you worked straight down through this section, then you have already begun to think about the necessary components. Buildings need to be made usable, equipment needs to return to service, inventory needs to flow.

All those activities represent individual pieces. As they start up and come online, employees will move to the next tier: coordinate the pieces to enable your business’ functions and services.

With modern digitized infrastructure, IT often provides a backbone for everyone else. Unfortunately, the world will not stop just because the computers do not work. You need an alternative course of action.

Downtime procedures

Each department or team needs to develop downtime procedures. Define a minimum period of system unavailability before staff switch to those procedures. If IT can bring things back in a few minutes, the switchover probably consumes too much time.

The upcoming “Business Process” section will expand these concepts. Right now, we mostly want to present its tie-ins to the non-technology portions of the business. A few ideas to get you started on developing downtime procedures:

  • Prepared paper forms, receipts, and ledgers
  • Traditional telephone lines as backup to VoIP
  • Cell phones or handheld radios for internal communications

To reiterate, these procedures specifically apply to continuing business processes through a system failure. Do not mix them with response and recovery activities.

As an example, “periodically add gasoline to the backup generator” is fine to include. Something like, “notify carriers of alternative pick-up and drop-off sites” should go elsewhere.

Staff that do not need to devote their time to resumption efforts will follow downtime procedures.

Dependency hierarchies

Create visual aids that illustrate the differences between necessary and desired dependencies. For simple designs, text trees might suffice. However, you already have tools that can easily produce suitable graphics. As an example, consider the following diagram of a shipping/receiving department’s operation:

[Diagram: dependency hierarchy for a shipping/receiving function]

The structure of this graphic implies that the “Shipping and Receiving” function must have people to work the freight and a place to load and unload. The placement of the service atop two separate blocks shows that it can operate with either but must have at least one.

Inventory software appears first (in a left-right culture), indicating that it has preference over the paper-based solution. If some other service relies on the shipping and receiving function, then start another diagram for it with a single “Shipping/Receiving” block at its bottom.

An image like this works well for simple hierarchies and is easy to make. This particular graphic was built in Microsoft PowerPoint and exported as an image. PowerPoint also includes several standard flowcharting shapes for more complicated trees.

If you need even more detail, you can bump up to a dedicated diagramming product such as Microsoft Visio.
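
If you prefer something machine-checkable alongside the diagrams, the same logic can be expressed in a few lines of code. The sketch below encodes the example above: every required group must be satisfied, and any single option within a group suffices. The “temporary yard” entry is an invented placeholder for an alternative loading area.

```python
# The diagram's logic in code: a function is available when every required
# group is satisfied (AND), and a group is satisfied by any one of its
# interchangeable options (OR). "temporary yard" is an invented placeholder.

def available(required_groups: list[list[str]], up: set[str]) -> bool:
    return all(any(option in up for option in group) for group in required_groups)

shipping_receiving = [
    ["dock workers"],                       # people to work the freight: no substitute
    ["loading dock", "temporary yard"],     # a place to load and unload: either works
    ["inventory software", "paper forms"],  # software preferred, paper acceptable
]

print(available(shipping_receiving, {"dock workers", "temporary yard", "paper forms"}))  # True
print(available(shipping_receiving, {"loading dock", "inventory software"}))             # False: no people
```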

Organizing disparate documentation

With everything in this section added in with your IT documentation, your disaster recovery plan may become unwieldy. To minimize the problem:

  • Decide on a logical document organization strategy and apply it uniformly
  • Use your tools’ features to create a navigable table of contents
  • Split the document

If possible, build a template for departments to follow. If you find that needs are too diverse for a single form, you can create multiple forms from the same source, or you can have a generic starting point with custom content.

Maintain the core principle that the people who initially create and innately understand the document may not be the same people that put the document into action. Recovery efforts might be coordinated by project managers who don’t know what some of the words mean. Clarity, consistency, and uniformity matter.

Microsoft Word and most common PDF generation tools can establish intra-document links. Some can automatically create a table of contents from headers and sub-headers.

Although a major disaster could leave no one with access to a digital copy, the pervasiveness of small devices and highly redundant cloud storage heavily reduces those odds.

When connecting different parts of a document, ensure that the linkage exists as both a clickable item and with words (e.g., “voice engineering must have restored telephone services before inbound sales can resume”).

Consider splitting the document. This action breaks and sometimes removes links, so use it only as a last resort. You will need to maintain an absolute minimum of one complete digital and one complete physical document.

Each revision will require you to repeat splitting operations. Even if you only revise one section, a shift in page numbering could throw off all unsynchronized copies and splits. As a mitigation, you can use only monolithic digital copies but allow people to selectively print the pages that pertain to them.

You can also use a document management system or multi-document software, such as SharePoint, OneNote, or a wiki. Tools of that kind often have challenges with creating hard copies, though.

Wrapping up Non-technical Planning

Now you can gather the other people needed for the tasks in this section. If they were not included in previous steps, make sure that they understand the goal: surviving, working through, and recovering from major and minor disasters that affect their departments, their customers, and their interactions with other parts of the business.

As they start work gathering information and preparing their documentation, you can move on to the technical details of recovery.

FAQ

What is a disaster recovery strategy?

A disaster recovery strategy is a comprehensive plan outlining the steps and procedures an organization follows to resume operations after a disruptive event, such as a natural disaster, cyberattack, or system failure.

Why is a disaster recovery strategy important for businesses?

A disaster recovery strategy is crucial for businesses to minimize downtime, protect data, and ensure continuity of operations in the face of unforeseen events. It helps mitigate risks and safeguards critical business functions.

What components should be included in a disaster recovery strategy?

A robust disaster recovery strategy typically includes risk assessments, data backup procedures, communication plans, IT infrastructure recovery plans, and regular testing to ensure effectiveness.

How often should a disaster recovery strategy be updated?

Disaster recovery strategies should be reviewed and updated regularly to align with changes in technology, business processes, and potential risks. Many experts recommend annual reviews, but the frequency may vary based on organizational changes.

How Often Should I Be Taking Backups?

Now you understand your organization’s data protection needs and you have the means to implement them. To bring it to life, you need to design the schedules for your backups. Unless you have very little data or a high budget for backup, you will use more than one schedule.

You will use three metrics:

  1. Value
  2. Frequency of change
  3. Application features

Understanding How the Value of Data Affects Backup Scheduling

The frequency of your full backup schedule directly determines how many copies of your data you will have over time. The more copies you have of any given bits, the greater the odds that at least one copy will survive a catastrophe. So, if you have data that you cannot lose under any circumstances, then your schedule should reflect that.

Understanding How the Frequency of Change Affects Backup Scheduling

Data that changes frequently may need an equally frequent backup. As you recall from part one, recovery point objectives (RPO) set the maximum amount of time between backups, which establishes the boundaries of how much recent data you can lose. You must also consider how often that data changes, independently of the RPO.

If you have data that does not change often, then you might consider a longer RPO. If you only modify an item every few months, then it might not make sense to back it up every week. However, that might have unintended consequences.

As an example, you set a monthly-only schedule for your domain controller because you rarely have staff turnover and only replace a few computers per year. Then, you hire a new employee and supply them with a PC the day after a backup.

If anything happens to Active Directory during that month, then you will lose all that new information. Your schedule needs to consider such possibilities.

Understanding How Backup Application Features Affect Scheduling

You will find that modern commercial backup applications have more in common than not. They all have some way to schedule jobs. Each one uses some way to optimize backups. The exact features in the solutions that you use will influence how you schedule.

The following list provides a starting point for you to determine how to leverage the features in your selected program:

Virtual machine awareness

If your backup software understands how to back up virtual machines, then you can allow it to handle efficient ordering. If not, then you will need to schedule to back up the guest operating systems such that the jobs do not overwhelm your resources.

Space-saving features

If your backup tool can preserve storage space, that has obvious benefits. Everything involves trade-offs – ensure that you know what you give up for that extra space.

Some common considerations:

  • Traditional differential and incremental backups complete more quickly than the full backups they depend on. They mean nothing without their source full backup. Design your schedule to accommodate full backups as time and space allow (the sketch after this list compares typical storage footprints);
  • Newer delta and deduplication techniques save even more space than differential and incremental jobs but require calculation and tracking in addition to the requisite full backups. They should not use significant CPU time, but you need to test it. Also check to see if and how your application tracks changes. Some will use space on your active disks;
  • If you have extra space in your storage media, then do not depend overly on these technologies. Create more full backups if you can.
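
The following back-of-the-envelope sketch shows why these features matter. It compares one week of daily fulls against full-plus-incremental and full-plus-differential rotations; the sizes and change rates are assumptions, and it simplifies by pretending each day touches different data.

```python
# Back-of-the-envelope storage comparison for one week of backups.
# The sizes and change rates are assumptions; measure your own environment.

full_gb = 2000        # assumed size of one full backup
daily_change = 0.05   # assumed 5% of data changes each day

full_only = 7 * full_gb  # a full backup every day

# Incremental: one full, then each day captures only that day's changes.
incremental = full_gb + 6 * (full_gb * daily_change)

# Differential: one full, then each day captures all changes since the full,
# so each day's job grows (simplified: each day changes different data).
differential = full_gb + sum(full_gb * daily_change * day for day in range(1, 7))

print(f"Daily fulls:   {full_only:,.0f} GB")
print(f"Incrementals:  {incremental:,.0f} GB")
print(f"Differentials: {differential:,.0f} GB")
```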

Time-saving features

Many of the features in the previous bullet point save time as well as space. As with space, do not try to save time that you do not require.

Replication

Replication functions require bandwidth, which can cause severe bottlenecks when crossing Internet links. If a replication job is not completed before the next job begins, then you might end up with unusable backups.

Media types

Due to the wide variance in performance of the various backup media types, the option(s) that you choose will determine how you schedule backups and what space-saving features they use. For instance, if you need to back up several terabytes to tape and a full backup requires twelve hours to perform, then you can only run a full backup when you have twelve hours available.

Snapshot features

If your backup application integrates with VSS (the Volume Shadow Copy Service, a feature of the Windows operating system) or uses some other technique to take crash-consistent or application-consistent backups, then you have greater scheduling options.

Backup uses system resources and you do not want one job to conflict with another, but snapshotting allows you to run backups while systems are in use.

You should have become well-acquainted with your backup program during the deployment phase. Take the time to fully learn how it operates, and keep in mind the need for periodic full backups.

Putting It in Action

Since taking full backups every time would quickly exceed any rational quantity of time and media, you must make compromises. Remember that, if possible, you would take a complete backup of all your data at least once per day.

Guidelines for backup scheduling:

  • Full backups need time and resources, even with non-interrupting snapshot technologies. Try to schedule them during low activity periods.
  • Full backups do not depend on other backups. Therefore, they have the greatest value after major changes. As an example, some organizations have intricate month-end procedures. Taking a backup immediately afterward could save a lot of time in the event of a restore.
  • Incremental, differential, delta, and deduplicated backups require relatively little time and space compared to full backups, but they depend on other backups. Use them as fillers between full backups.
  • If your backup scheme primarily uses online storage, make certain to schedule backups to offline media. If that is a manual process, implement an accountability plan.

Just as administrators tend to perform backups at night, they also like to schedule system and software updates at night. Ensure that schedules do not collide.

Grandfather-father-son sample plan

“Grandfather-father-son” (GFS) schemes are very common. They work best with rotating media such as tapes. One typical example schedule:

  • “Grandfather”: full backup taken once monthly. Grandfather media is rotated annually (overwrite the January 2020 tape with the January 2021 backup, February 2020 with February 2021 data, etc.). One “grandfather” tape per year, typically the one that follows your organization’s fiscal year end, is never overwritten, in accordance with your data retention policy.
  • “Father”: full backup taken weekly. “Father” media is rotated monthly (i.e., you have a “Week 1” tape, a “Week 2” tape, etc.).
  • “Son”: incremental or differential backups are taken daily, and their media overwritten weekly (i.e., you have a “Monday” tape, a “Tuesday” tape, etc.).

The above example is not the only type of GFS scheme. The relationship between the different rotation tiers is what qualifies a scheme as GFS. You have one set of very long-term full media, one shorter-lived set of full media, and rapidly rotated media.

Some implementations do not keep the annual media. Others do not rotate the monthly full, instead keeping them for the full backup retention period. Some do not rotate the daily media every week. Your organization’s needs and budget dictate your practices.

With a GFS scheme, you are never more than a few pieces of media away from a complete restore. Remember that a “differential” style backup needs the latest “son” media and the “father” immediately preceding it, whereas an “incremental” style backup needs the latest “father” media and all of its “sons”.

The downside of a GFS scheme is that you quickly lose the granular level of daily backups. Once you rotate the daily, then anything overwritten will, at best, survive on the most recent monthly or perhaps an annual backup. The greatest risk is to data that is created and destroyed between full backup cycles.
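
To make the rotation concrete, the following sketch maps a calendar date to the media used under the sample schedule above. The calendar rules (grandfather on the first of the month, father on Sundays) are assumptions for illustration; your own rotation days may differ.

```python
# Sketch: which media does a given date use under the sample GFS schedule?
# The calendar rules (grandfather on the 1st, father on Sundays) are assumed.
from datetime import date

def gfs_media(d: date) -> str:
    if d.day == 1:                    # assumed monthly full on the first
        return f"Grandfather: {d:%B} tape (rotated annually)"
    if d.weekday() == 6:              # assumed weekly full every Sunday
        week_of_month = (d.day - 1) // 7 + 1
        return f"Father: Week {week_of_month} tape (rotated monthly)"
    return f"Son: {d:%A} tape (rotated weekly)"  # daily incremental/differential

for d in (date(2021, 3, 1), date(2021, 3, 7), date(2021, 3, 10)):
    print(d, "->", gfs_media(d))
```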

Online media sample plan

If your backup solution uses primarily online media, then the venerated GFS approach might not work well. Most always-online systems do not have the same concept of “rotation”. Instead, they age out old data once it reaches a configured retention policy expiration period.

For these, your configuration will depend on how your backup program stores data. If it uses a deduplication scheme and only keeps a single full backup, then you have little to do except configure backup frequency and retention policy.

Due to the risks posed by having only one complete copy of your data, you must enforce periodic full backups to offline media. You should also consider some form of replication, whether to the cloud or an alternative working site.

Continuous backup sample plan

Many applications have some form of “continuous” backup. They capture data in extremely small time increments. As an example, Hornetsecurity’s VM Backup has a “Continuous Data Protection” (CDP) feature that allows you to set a schedule as short as five minutes.

Scheduling these types of backups involves three considerations:

  1. How does the backup application store the “continuous” backup data?
  2. How quickly does the protected data change?
  3. How much does the protected data change within the target time frame?

If your backup program takes full, independent copies at each interval, then you could run out of media space very quickly. If it uses a deduplication-type storage mechanism, then it should use considerably less. Either way, your rate of data churn will determine how much space you need.

For systems with a very high rate of change, your backup system might not have sufficient time to make one backup before the next starts. That can lead to serious problems, not least of which is that it cannot provide the continuous backup that you want.

You can easily predict how some systems will behave; others need more effort. You may need to spend some time adjusting a setting, watching how it performs, and adjusting again.
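
A simple way to start that prediction is to compare how much data an interval produces against how much the backup target can ingest in the same time. All three inputs in this sketch are assumptions to refine through observation.

```python
# Rough check of whether a "continuous" backup interval can keep up with churn.
# All inputs are assumptions; observe your own system to refine them.

interval_min = 5          # assumed CDP interval, e.g. the shortest VM Backup allows
churn_mb_per_min = 80     # assumed data change rate
ingest_mb_per_min = 300   # assumed rate the backup target can ingest

# Data produced per interval vs. what one interval can transfer.
produced = churn_mb_per_min * interval_min
capacity = ingest_mb_per_min * interval_min

if produced > capacity:
    print("Backups will fall behind; lengthen the interval or speed up the target.")
else:
    slack = (1 - produced / capacity) * 100
    print(f"Interval is sustainable with {slack:.0f}% headroom.")
```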

Mixed backup plan example

You do not need to come up with a one-size-fits-all schedule. You can set different schedules. Use your RTOs, RPOs, retention policies, and capacity limits as guidance.

One possibility:

  • Domain controllers: standard GFS with one-year retention
  • Primary line-of-business application server (app only): monthly full, scheduled after operating system and software updates, with three-month retention
  • Primary line-of-business database server: continuous, six-month retention
  • Primary file server: standard GFS with five-year retention
  • E-mail server: uses a different backup program that specializes in Exchange, daily full, hourly differential, with five-year retention
  • All: replicated to remote site every day at midnight
  • All: monthly full offline, following retention policies

Sum Up

Remember to document everything!

FAQ

What is a backup schedule?

A backup schedule defines the frequency of data backups and the required backup media. Each hardware type offers various rotation schemes, including industry-standard strategies, and these schemes can be customized after creating the backup job.

What is a good schedule to run a backup?

Typically, performing incremental backups for user files during the daytime is recommended. However, it’s advisable to set maximum speed limits for backups to avoid bandwidth saturation. Full backups are best scheduled nightly or weekly, outside business hours.

What is the importance of backup schedule?

The importance of a backup schedule lies in its ability to mitigate data loss in the event of a computer or system failure. By scheduling nightly or weekly backups, you can minimize the potential loss of data. Having a scheduled backup provides peace of mind, ensuring that all your information is regularly backed up, thereby reducing the risk of substantial data loss.

The Pros and Cons of All Backup Storage Targets Explained

The days of tape-only solutions have come to an end. Other media have caught up to it in cost, capacity, convenience, and reliability. You now have a variety of storage options. Backup applications that can only operate with tape have little value in modern business continuity plans.

Unless you buy everything from a vendor or service provider that designs your solution, make certain to match your software with your hardware.

Use the software’s trial installation or carefully read through the manufacturer’s documentation to determine which media types it works with and how it uses them. Backup software targets the following media types:

  • Magnetic tape;
  • Optical disc;
  • Direct-attached hard drives and mass media devices;
  • Media-agnostic network targets;
  • Cloud storage accounts.

Magnetic Tape in Backup Solutions

IT departments have relied on tape for backup since the dawn of the concept of backup. This technology has matured well and mostly kept up with the pace of innovation in IT systems. However, the physical characteristics of magnetic tape place a severe speed limit on backup and restore operations.

Pros of magnetic tape

  • Most backup software can target it;
  • Tape media has a relatively low cost per gigabyte;
  • Reliable for long-term storage;
  • Lightweight media, easy to transport offsite;
  • Readily reusable.

Cons of magnetic tape

  • Extremely slow;
  • Tape drives have a relatively high cost;
  • Media susceptible to magnetic fields, heat, and sunlight.

For most organizations, the slow speed of tape presents its greatest drawback. You can find backup applications that support on-demand features such as operating directly from backup media. That will not happen from a tape.

Having said that, tape has a good track record of reliability. Tapes stored on their edges in cool locations away from magnetic fields can easily survive ten years or more. Sometimes, the biggest problem with restoring data from old tape is finding a suitable, functioning tape drive. I have seen many techniques for tape management through the years.

One of the worst involved a front desk worker who diligently took the previous night’s tape offsite each night – leaving it on their car dashboard. It would bake in the sunlight for a few hours each evening. So, even though the company and its staff meant well, and dutifully followed the recommendation to keep backups offsite, they wound up with warped tapes that had multiple dead spots.

At the opposite end, one customer used a padded, magnetically shielded carrying case to transport tapes to an alternative site. There, they placed the tapes into a fireproof safe in a concrete room.

I was called upon once to try to restore data from a tape that was ten years old. It took almost a week to find a functioning tape drive that could accommodate it. That was the only complication. The tape was still readable.

Optical Media in Backup Solutions

For a brief time, optical technology advances made it attractive. Optical equipment carries a low cost and interfaces well with operating systems. It even supports drag-and-drop interactivity with Windows Explorer. Optical drives were most popular in the home market, though some systems found their way into datacenters. However, magnetic media quickly regained the advantage as its capacities exponentially outgrew those of optical media.

Pros of optical media

  • Very durable media;
  • Shelf life of up to ten years;
  • Inexpensive, readily interchangeable equipment;
  • Drag-and-drop target in most operating systems;
  • Lightweight media, easy to transport offsite.

Cons of optical media

  • Very limited storage capacity;
  • Extremely slow;
  • Few enterprise backup applications will target optical drives;
  • Poor reusability;
  • Wide variance in data integrity after a few years.

When recordable optical media first appeared on the markets, people found its reliability attractive. CDs and DVDs do not care about magnetic fields at all and have a higher tolerance for heat and sunlight. Also, because the media itself has no mechanism, they survive rough handling better than tape.

However, they have few other advantages over other media types. Even though the ability to hold 700 megabytes on a plastic disc was impressive when recordable CDs first appeared, optical media capacities did not keep pace with magnetic storage.

By the time recordable DVDs showed up with nearly five gigabytes of capacity, hard drives and tapes were already moving well beyond that limit.

Furthermore, people discovered – often the hard way – that even though optical discs have little observable structural material, their data-retaining material has a much shorter life. Even though a disc may look fine, its contents may have become unreadable long ago.

Recordable optical media has a wide range of data life, from a few years to several decades. Predicting media life span has proven difficult.

Because of its speed, low capacity, and need for frequent testing, you should avoid optical media in your disaster recovery solution.

Direct-Attached Storage and Mass Media Devices in Backup Solutions

You do not need to limit your backup solutions to systems that distinguish between devices and media. You can also use external hard drives and multi-bay drive chassis. Some attach temporarily, usually via USB. Others, especially the larger units, use more permanent connections such as Fibre Channel.

These types of systems have become more popular as the cost of magnetic disks has declined. They have a somewhat limited scope of applications in a disaster recovery solution, but some organizations can put them to great use.

Pros of directly attached external devices

  • Fast;
  • Reliable for long-term storage;
  • Inexpensive when using mechanical drives;
  • Easily expandable;
  • High compatibility;
  • Usable as a standard file system target.

Cons of directly attached external devices

  • Difficult to transport;
  • Additional concerns when disconnecting;
  • Mechanical drives have many failure points;
  • Expensive when using solid-state drives;
  • Not a valid target in every backup application.

Portability represents the greatest concern when using directly attached external devices for backup. Unlike tapes and discs, the media does not simply eject once the backup concludes.

With USB devices, you should notify the operating system of pending removal so that it has a chance to wrap up any writes, which could include metadata operations and automatic maintenance.

Directly connected Fibre Channel devices usually do not have any sort of quick-detach mechanism. In an emergency, people should concern themselves more with evacuation than spending time going through a lengthy detach process. In normal situations, people tend to find excuses to avoid tedious processes. Expect these systems to remain stationary and onsite.

Once upon a time, such restrictions would have precluded these solutions from a proper business continuity solution. However, as you will see in upcoming sections, other advances have made them quite viable. With that said, you should not use a directly attached device alone. Any such equipment must be part of a larger solution.

You may run into some trouble using external devices with some backup applications. Fortunately, you should never run into a modern program that absolutely cannot back up to a disk target. However, some may only allow you to use disk for short-term storage.

Others may not operate correctly with removable disks. If you purchase your devices before your software, make certain to test interoperability thoroughly.

Even though mechanical hard drives have advanced significantly in terms of reliability, they still have a lot of moving parts. Furthermore, the designers of the typical 3.5-inch drive did not build them for portability. They can travel, but not as well as tapes or discs. Even if you don’t transport them, they still have more potential failure points than tapes. Do not overestimate this risk, but do not ignore it, either.

Networked Storage in Backup Solutions

Network-based solutions share several characteristics with directly attached storage. Where you find differences between the two, you also find trade-offs. You could use the same pro/con list for networked solutions as you saw above for directly attached systems. We emphasize different points, though.

In the “pros” column, networked storage gets even higher marks for expandability. Almost every storage unit built for the network provides multiple bays. You can start with a few drives and add more as needed. Some even allow you to connect multiple chassis, physically or logically. In short, you can extend your backup storage indefinitely with such solutions.

The network components result in a higher cost per gigabyte for network-attached storage. However, the infrastructure necessary to enable a storage device to participate on a network tends to have a side effect: more features.

Almost all these systems provide some level of security filtering. Less expensive devices, typically marketed simply as “Network-Attached Storage” (NAS), may not provide much more than that.

Higher-end equipment, commonly called “Storage Area Network” (SAN), boasts many more features. You can often make SAN storage show up in connected computers much like directly attached disks. All in all, the more you pay, the more you get. Unfortunately, though, cost increases more rapidly than features.

What you gain in capacity and features, you lose in portability. Many NAS and SAN systems are rack mounted, so you cannot transport them offsite without significant effort.

But, because these devices have a network presence, you can place them in remote locations. Using remote storage requires some sort of site-to-site network connection, which introduces higher costs, complexity, security concerns, possible reduction in speed, and more points of failure.

Even though placing networked storage offsite involves additional risks, it also presents opportunity. Most NAS and SAN devices include replication technology. You can back up to a local device and configure it to replicate to one or more remote sites automatically.

If your device cannot perform replication, or if you have different devices and they cannot replicate to each other, your backup software may have its own replication methods.

In the worst case, you can use readily available free tools such as XCOPY and RSYNC with your operating system’s built-in scheduler.
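
If you go the scripted route, a small wrapper gives you logging and a single place to adjust paths. The following minimal Python sketch shells out to rsync; the source path, user, and host names are placeholder examples, and it assumes rsync and SSH connectivity already exist between the sites.

    #!/usr/bin/env python3
    """Minimal replication sketch: mirror a local backup repository to a
    remote site with rsync. Paths and host names are illustrative."""
    import subprocess
    import sys

    SOURCE = "/backups/nightly/"                          # local repository (example)
    DESTINATION = "backupuser@dr-site:/backups/nightly/"  # remote target (example)

    def replicate() -> int:
        # -a preserves permissions and timestamps, -z compresses in transit,
        # --delete keeps the remote copy an exact mirror of the source.
        result = subprocess.run(
            ["rsync", "-az", "--delete", SOURCE, DESTINATION],
            capture_output=True, text=True,
        )
        if result.returncode != 0:
            print(f"Replication failed: {result.stderr}", file=sys.stderr)
        return result.returncode

    if __name__ == "__main__":
        sys.exit(replicate())

Schedule a script like this with cron or Task Scheduler to run after your backup window closes.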

Using commodity computing equipment as backup storage

Up to this point, we have talked about network-attached devices only in terms of dedicated appliances. SANs have earned a reputation for carrying price tags that exceed their feature sets. In the best case, that reduces your budget’s purchasing power. More commonly, an organization cannot afford to use a SAN to its fullest potential – if they can afford one at all.

As a result, you now have choices in software-based solutions that run on standard server-class computing systems. Some backup applications can target anything that presents a standard network file protocol, such as NFS or SMB.

Software vendors and open-source developers offer applications that provide network storage features on top of general-purpose operating systems. These solutions fill the price and feature space between NAS and SAN devices. They do require more administrative effort to deploy and maintain than dedicated appliances, however.

When I built my first backup solution with the intent of targeting a dedicated appliance, I quickly learned that hardware vendors emphasize the performance features of their systems. Since I only needed large capacity, I priced a low-end rack-mount server with many drive bays filled with large SATA drives. I saved quite a bit over the appliance options.

The role of hyper-converged infrastructure in backup

A comparatively new type of system, commonly known as “hyper-converged infrastructure” (HCI), has taken on a growing role in datacenter infrastructure. In the traditional model, server-class computers handle the compute work, SAN or NAS devices hold the data, and physical switches and routers connect them all together.

In HCI, the server-class computers take over all the roles, even much of the networking.

Few organizations will design an HCI just for backup. Instead, they will deploy HCI as their foundational datacenter solution.

Originally, datacenters used purpose-built hosts for specific roles, such as domain controllers and SQL servers. As technologies matured, vendors and administrators enhanced their resilience by clustering hosts.

These clusters stayed on the purpose-built path of their constituent hosts. In the second generation, server virtualization started breaking down the pattern of single-use physical hosts. However, for the sake of organization and permission scoping, most administrators continued to deploy hosts and storage around themes.

HCI supersedes that paradigm by enabling true “cloud” concepts. With HCI, we can still define logical boundaries for compute, storage, and networking groups, but the barriers only exist logically. We may not know which physical resource hosts a particular server or database file.

Even if we find out, it could move in response to an environmental event.

With files, the storage tier can scatter the bits across the datacenter – possibly even between well-connected datacenters. In short, HCI administrators only need to concern themselves with the organization’s overall capacity.

If some resource runs low, they purchase more equipment and extend their HCI footprint. When done well, hardware purchases and allocations occur in different cycles and levels than server provisioning and storage allocation.

All this gives you two considerations for backup with HCI:

  1. You could place your infrastructure for on-premises backup hosting and public cloud relays in HCI just like any other server role
  2. You may have concerns about mixing the things that you back up with the backup itself

The first viewpoint has the strongest supportable argument. You should have multiple independent copies of backup anyway, so pushing data to offsite locations reduces the impact of dependence on HCI.

Also, many administrators (and the non-technical people above them in the reporting chain) do not immediately grasp that coexistence does not automatically mean line-of-sight.

You can architect your HCI such that the production components have no effective visibility into backup. It works the same basic way that we have always set up datacenter backup, but the dividers exist in software instead of hardware. Ultimately, though, it matters little whether anyone can justify their fears.

If you encounter significant resistance to bundling backup in with the rest of your HCI deployment, then architect traditionally. That sacrifices some efficiency, but not to a crippling degree.

Cloud Storage in Backup Solutions

Several technological advances in the past few years have made Internet-based storage viable. Most organizations now have access to reliable, high-speed Internet connections at low cost. You can leverage that to solve one of the most challenging problems in backup: keeping backup data in a location safe from local disasters. Of course, these rewards do not come without risk and expense.

Pros of cloud backup

  • Future-proof;
  • Offsite from the beginning;
  • Wide geographical diversity;
  • Highly reliable;
  • Effectively infinite expandability;
  • Access from anywhere;
  • Security.

Cons of cloud backup

  • Dependencies outside your control;
  • Expensive to switch to another vendor;
  • Possibility of unrecoverable interruptions;
  • Speed.

To keep their promises to customers, cloud vendors replicate their storage across geographical regions as part of the service (cheaper plans may not offer this protection).

So, even though you do need to worry about failures in the chain of network connections between you and your provider and about outages within the cloud provider, you know that you will eventually regain access to your data. That gives cloud backup an essentially unrivaled level of reliability.

The major cloud providers all go to great lengths to assure their customers of security. They boast of their compliance with accepted, standardized security practices. Each has large teams of security experts with no other role than keeping customer data safe.

That means that you do not need to concern yourself much with breaches at the cloud provider’s level.

Yet, you will need to maintain the security of your account and access points. As with any other Internet-based resource, the provider must make your data available to you somehow.

Malicious attackers might target your entryway instead of the provider itself. So, you still accept some responsibility for the safety of your cloud-based data.

When using cloud storage for backup, two things have the highest probability of causing failure. Your Internet provider presents the first.

If you cannot maintain a reliable connection to your provider, then your backup operations may fail too often. Even if you have a solid connection, you might need more bandwidth to support your backup needs.

For the latter problem, you can choose a backup solution such as Hornetsecurity’s VM Backup that provides compression and deduplication features specifically to reduce the network load.
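
To see why compression helps, consider a minimal sketch using Python’s standard library. The file names are placeholders; commercial products perform this transparently and combine it with deduplication.

    """Sketch: compress a backup file before sending it across the wire."""
    import gzip
    import shutil

    def compress_for_upload(source_path: str, compressed_path: str) -> None:
        # Stream-copy so even very large backup files never load fully into memory.
        with open(source_path, "rb") as src, gzip.open(compressed_path, "wb") as dst:
            shutil.copyfileobj(src, dst)

    compress_for_upload("nightly-backup.img", "nightly-backup.img.gz")

Typical business data often compresses substantially, which directly reduces the bandwidth your nightly jobs consume.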

Your second major concern is interim providers. While you can trust your cloud provider to exercise continuous security diligence, many third-party providers follow less stringent practices.

If your backup system transmits encrypted data directly to a cloud account that you control, then you have little to worry about. Verify that your software uses encryption and keep up with updates; the remaining risk then lies mostly within the walls of your institution.

However, some providers ship your data to an account under their control that they resell to customers. If they fall short on security measures, then they place your data at great risk. Vet such providers very carefully.
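
As a minimal illustration of the client-side encryption pattern described above, the sketch below uses the third-party cryptography package (installable with pip); the file names are placeholders, and real deployments need far more careful key management than shown here.

    """Sketch: encrypt backup data before it leaves your network."""
    from cryptography.fernet import Fernet

    # Generate once and store securely offline - losing this key means
    # losing every backup encrypted with it.
    key = Fernet.generate_key()
    cipher = Fernet(key)

    # Fine for a sketch; a real tool would encrypt in streamed chunks
    # rather than reading the whole file into memory.
    with open("nightly-backup.img.gz", "rb") as f:
        ciphertext = cipher.encrypt(f.read())

    with open("nightly-backup.img.gz.enc", "wb") as f:
        f.write(ciphertext)

With this pattern, the cloud provider and any intermediary only ever see ciphertext.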

“Cost” did not appear on either the pro or con list. Cost will always be a concern, but how it compares to onsite storage will differ between organizations. Using cloud storage allows you to eliminate so-called “capital expenditures”: payments, usually substantial, made up-front for tangible goods.

If you have an Internet connection, you will not need to purchase any further equipment. You also wipe out some “operational expenses”: recurring costs to maintain goods and services.

You will need to pay your software licensing fees, and your cloud provider will regularly bill you for storage and possibly network usage.

However, you will not need to purchase storage hardware, nor will your employees need to devote their time to maintaining it. You transfer all the hassle and expense of hardware ownership to your provider in exchange for a recurring fee.

Unfortunately, you should not transfer your entire backup load to a cloud provider. Due to the risks and speed limits of relying on an internet connection, it still makes the most sense to keep at least some of your solution on site. So, you should still expect some capital expense and local maintenance activities.

Putting It in Action

The previous section helped you to work through your software options. If you have made a final selection, then that exerts at least some control over your hardware purchase. If not, then you can explore your hardware options and work backward to picking software.

The exact deployment style that you use, especially for the on-premises portion of your solution, only matters to the degree that it enables your backups to function flawlessly.

Prioritize satisfying your needs above aligning with any paradigm. You need space to store your backups, software to capture them, and networking and transport infrastructure to move them from live systems.

Four steps to performing hardware selection

Truthfully, your budget acts as the largest restrictor on your hardware options. So, start there. Work through the features that you want to arrive at your project scope. Your general process looks like this:

1. Determine budget

2. Establish other controlling parameters:

  • Non-cloud replication only works effectively if you have multiple, geographically distant sites;
  • Inter-site and cloud replication need sufficient bandwidth to carry backup data without impeding business operations (see the quick bandwidth check after this list);
  • Rack space.

3. Decide on preferred media type(s). The above explanations covered the pros and cons of the types. Now you need to decide what matters to your organization:

  • Cost per terabyte;
  • Device/media speed;
  • Media durability;
  • Media transportability.

4. Prioritize desired features:

  • Deduplication;
  • Internal redundancy (RAID, etc.);
  • External redundancy (hardware-based replication);
  • Security (hardware-based encryption, access control, etc.).
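
For the bandwidth parameter in step 2, a quick arithmetic check tells you whether a candidate link can carry your nightly data inside its window. The figures below are illustrative placeholders; substitute your own.

    """Quick feasibility check: does the nightly backup fit the window?"""
    nightly_backup_gb = 500      # data moved per night (example)
    backup_window_hours = 8      # allowed window (example)
    link_mbps = 200              # usable uplink in megabits/second (example)

    # Convert gigabytes to megabits, then divide by the window in seconds.
    required_mbps = (nightly_backup_gb * 8 * 1000) / (backup_window_hours * 3600)
    print(f"Required: {required_mbps:.0f} Mbps, available: {link_mbps} Mbps")
    # Output: Required: 139 Mbps, available: 200 Mbps - this link fits,
    # with headroom left for protocol overhead and retries.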

If you find that the cost of a specific hardware-based feature exceeds your budget, then your software might offer it. That can help you to achieve the coverage that you need at a palatable expense.

Once you have concluded your hardware selection, you could proceed to acquiring your software and equipment. However, it makes sense to work through the next article on security before making any final decision. You might decide on a particular course for securing data that influences your purchase.

To properly protect your virtualization environment and all the data, use Hornetsecurity VM Backup to securely back up and replicate your virtual machine.

We ensure the security of your Microsoft 365 environment through our comprehensive 365 Total Protection Enterprise Backup and 365 Total Backup solutions.

For complete guidance, get our comprehensive Backup Bible, which serves as your indispensable resource containing invaluable information on backup and disaster recovery.

To keep up to date with the latest articles and practices, pay a visit to our Hornetsecurity blog now.

Conclusion

In summary, exploring various backup storage options reveals both advantages and disadvantages. Each target, whether it’s cloud-based, on-premises, or a hybrid approach, offers unique benefits and challenges. The choice ultimately depends on specific business needs, budgets, and data security concerns. By carefully considering the factors mentioned in this article, organizations can make informed decisions to ensure data protection and accessibility align with their objectives.

FAQ

What is the main purpose of storage?

Storage serves as a system that empowers a computer to store data, whether temporarily or permanently. Devices like flash drives and hard disks constitute a foundational element in most digital devices, enabling users to safeguard a wide array of data, including videos, documents, images, and raw information.

What are the benefits of backup storage?

Backups provide the means to recover deleted files or retrieve data that may have been unintentionally overwritten. Moreover, backups often represent the most reliable choice for recuperating from incidents like ransomware attacks or significant data loss events, such as a data center fire.

What are the benefits of more storage?

Additional backup storage capacity delivers several notable benefits:

  • Longer retention periods for historical data;
  • More frequent recovery points between full backups;
  • Room to absorb data growth without emergency purchases;
  • Space for redundant copies of your most critical data.

These advantages underscore the importance of planning storage capacity with headroom to spare.

The Foolproof Method of Maintaining your Backup System

The Foolproof Method of Maintaining your Backup System

As you might expect, setting up backup is just the beginning. You will need to keep it running in perpetuity. Likewise, you cannot simply assume that everything will work: you need to keep constant vigilance over the backup system, its media, and everything that it protects.

Monitoring Your Backup System

Start with the easiest tools. Your backup program almost certainly has some sort of notification system. Configure it to send messages to multiple administrators. If it creates logs, use operating system or third-party monitoring software to track those as well. Where available, prefer programs that will repeatedly send notifications until someone manually acknowledges them or the software detects that the problem has been resolved.

Set up a schedule to manually check on backup status. Partially, you want to verify that its notification system has not failed. Mostly, you want to search through job history for things that didn’t trigger the monitoring system. Check for minor warnings and correct what you can. Watch for problems that recur frequently but work after a retry. These might serve as early indications of a more serious problem.
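
If your backup program’s own alerting falls short, a small external watchdog can fill the gap. This sketch scans the most recent job log for a failure marker and emails every administrator; run it from the OS scheduler so alerts repeat until someone fixes the underlying problem. The log path, marker string, addresses, and mail relay are all placeholder assumptions – adjust them to your environment.

    """Independent watchdog: mail admins whenever the backup log shows failures."""
    import smtplib
    from email.message import EmailMessage

    LOG_PATH = "/var/log/backup/latest.log"   # example path
    FAILURE_MARKER = "JOB FAILED"             # example marker string
    ADMINS = ["admin1@example.com", "admin2@example.com"]

    def check_and_alert() -> None:
        with open(LOG_PATH, "r", encoding="utf-8", errors="replace") as f:
            failures = [line.strip() for line in f if FAILURE_MARKER in line]
        if not failures:
            return
        msg = EmailMessage()
        msg["Subject"] = f"Backup failures detected ({len(failures)})"
        msg["From"] = "backup-watchdog@example.com"
        msg["To"] = ", ".join(ADMINS)
        msg.set_content("\n".join(failures))
        with smtplib.SMTP("mail.example.com") as smtp:   # internal relay (example)
            smtp.send_message(msg)

    if __name__ == "__main__":
        check_and_alert()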

Testing Backup Media and Data

You cannot depend on even the most careful monitoring practices to keep your backups safe. Data at rest can become corrupted. Thieves, including insiders with malicious intent, can steal media. You must implement and follow procedures that verify your backup data. After all, a backup system is only valuable if the data can be restored when needed.

Keep an inventory of all media. Set a schedule to check on each piece. When you retire media due to age or failure, destroy it. Strong magnets work for tapes and spinning drives. Alternatively, drill a hole through mechanical disks to render them unreadable. Physically destroy optical media and SSDs; for SSDs, make sure the destruction reaches the memory chips themselves.

Organizations that do not track personal or financial information may not need to keep such meticulous track of media. However, anyone with backup data must periodically check that it has not lost integrity. The only way you can ever be certain that your data is good is to restore it.

Establish a regular schedule to try restoring from older media. If successful, make spot checks through the retrieved information to make sure that it contains what you expect.
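
One way to make those spot checks systematic: capture a checksum manifest at backup time, then verify a sample of restored files against it. A minimal sketch follows, assuming a manifest format of one "sha256  relative/path" entry per line (an illustrative convention, not a standard):

    """Verify restored files against a checksum manifest."""
    import hashlib
    from pathlib import Path

    def sha256_of(path: Path) -> str:
        h = hashlib.sha256()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(1024 * 1024), b""):
                h.update(chunk)
        return h.hexdigest()

    def verify_restore(manifest: Path, restore_root: Path) -> bool:
        ok = True
        for line in manifest.read_text().splitlines():
            expected, rel_path = line.split(maxsplit=1)
            target = restore_root / rel_path
            if not target.exists() or sha256_of(target) != expected:
                print(f"MISMATCH: {rel_path}")
                ok = False
        return ok

    # Example: verify_restore(Path("backup.manifest"), Path("/mnt/restore-test"))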

Treat this article as a basic discussion of testing best practices. We will revisit the topic of testing in a dedicated post towards the end of this article series.

The activities in this article will take time to set up and perform. Do not allow fatigue to prevent you from following these items or tempt you into putting them off. You need to:

  • Configure your backup system to send alerts, at minimum, on failed jobs;
  • Establish accountability for manually verifying, on a regular basis, that the backup program is functioning;
  • Configure a monitoring system to notify you if your backup software ceases running;
  • Establish a regular schedule and accountability system to test that you can restore data from backup. Test a representative sampling of online and offline media.

Too many organizations do not realize until they’ve lost everything that their backup media did not successfully preserve anything. Some have had backup systems sit in a failed state for months without discovering it. A few minutes of occasional checking can prevent such catastrophes.

Monitoring backup, especially testing restores, is admittedly tedious work. However, it is vital. Many organizations have suffered irreparable damage because they found out too late that no one knew how to restore data properly.

Maintaining Your Systems

Intuitively, the scope of a business continuity plan includes only its related software and equipment. But when you consider that the primary goal of the plan is data protection, it makes sense to think beyond backup programs and hardware. Furthermore, all the components of your backup belong to your larger technological environment, so you must maintain them accordingly.

Fortunately, you can automate common maintenance. Microsoft Windows will update itself over the Internet. The package managers on Linux distributions have the same ability. Windows also allows you to set up an update server on-premises to relay patches from Microsoft. Similarly, you can maintain internal repositories to keep your Linux systems and programs current.

In addition to the convenience that such in-house systems provide, you can also leverage them as a security measure. You can automatically update systems without allowing them to connect directly to the Internet. In addition to software, keep your hardware in good working order.

Of course, you cannot simply repair modern computer boards and chips. Instead, most manufacturers will offer a replacement warranty of some kind.

If you purchase fully assembled systems from a major systems vendor, such as Dell or Hewlett Packard Enterprise, they offer warranties that cover the system as a whole. They also have options for rapid delivery or in-person service by a qualified technician. If at all possible, do not allow out-of-warranty equipment to remain in service.

Putting It into Action

Most operating systems and software have automated or semi-automated updating procedures. Hardware typically requires manual intervention. Keeping everything current falls to the system administrators.

  • Where available, configure automated updating. Ensure that it does not coincide with backup, or that your backup system can successfully navigate operating system outages.
  • Establish a pattern for checking for firmware and driver updates. These should not occur frequently, so you can schedule updates as one-off events.
  • Monitor the Internet for known attacks against the systems that you own. Larger manufacturers have entries on common vulnerabilities and exposures (CVE) lists. Sometimes they maintain their own, but you can also look them up at: https://cve.mitre.org/ (a lookup sketch follows this list). Vendors usually release fixes in standard patches, but some will issue “hotfixes”. Those might require manual installation and other steps.
  • If your hardware has a way to notify you of failure, configure it. If your monitoring system can check hardware, configure that as well. Establish a regular routine for visually verifying the health of all hardware components.
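
As one hedged example of automating the CVE check above, the sketch below queries the public NVD CVE API (version 2.0) for recent entries matching a keyword. It assumes the third-party requests package, and the endpoint and parameters reflect NVD’s documented REST interface at the time of writing; verify them before relying on this.

    """Sketch: list recent CVE IDs matching a vendor or product keyword."""
    import requests

    NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"

    def recent_cves(keyword: str, limit: int = 5) -> list[str]:
        resp = requests.get(
            NVD_URL,
            params={"keywordSearch": keyword, "resultsPerPage": limit},
            timeout=30,
        )
        resp.raise_for_status()
        return [item["cve"]["id"] for item in resp.json().get("vulnerabilities", [])]

    print(recent_cves("backup software"))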

To properly protect your virtualization environment and all the data, use Hornetsecurity VM Backup to securely back up and replicate your virtual machine.

We ensure the security of your Microsoft 365 environment through our comprehensive 365 Total Protection Enterprise Backup and 365 Total Backup solutions.

For complete guidance, get our comprehensive Backup Bible, which serves as your indispensable resource containing invaluable information on backup and disaster recovery.

To keep up to date with the latest articles and practices, pay a visit to our Hornetsecurity blog now.

Final Words

Maintenance activities consume a substantial portion of the typical administrator’s workload, so these procedures serve as a best practice for all systems, not just those related to backup. However, since your disaster recovery plan hinges on the health of your backup system, you cannot allow it to fall into disrepair.

FAQ

What is a data backup system?

A data backup system is a method or process designed to create and maintain duplicate copies of digital information to ensure its availability in the event of data loss, corruption, or system failures.

What is an example of a data backup?

An example of a data backup is storing copies of files, documents, or entire systems on external hard drives, cloud services, or other storage media. This safeguards against potential data loss and facilitates recovery if the original data is compromised.

How do companies backup their data?

Companies use a variety of methods to backup their data, including regular backups to external servers, cloud-based solutions, tape drives, or redundant storage systems. Automated backup software is often employed to streamline and schedule the backup process, ensuring data integrity and accessibility.

As cloud security experts, Hornetsecurity is here to assist global organizations and empower IT professionals with the necessary tools, all delivered with a positive and supportive attitude.

How to Get the Absolute Most Out of Your Backup Software

How to Get the Absolute Most Out of Your Backup Software

In the past, we could not capture a consistent backup. Operations would simply read files on disk in order as quickly as possible.

But, if a file changed after the backup copied it but before the job completed, then the backup’s contents were inconsistent. If another program had a file open, then the backup would usually skip it.

Microsoft addressed these problems with Volume Shadow Copy Services (VSS). A backup application can notify VSS when it starts a job. In response, VSS will pause disk I/O and create a “snapshot” of the system.

The snapshot isolates the state of all files as they were at that moment from any changes that occur while the backup job runs. The backup application signals VSS when it has finished, and VSS releases the snapshot and returns the system to normal operation.

With this technique, on-disk files are completely consistent.

However, it cannot capture memory contents. If you restore that backup, it will be exactly as though the host had crashed at the time of backup. For this reason, we call this type of backup “crash-consistent”. It only partially addresses the problem of open files.
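
The copy-on-write idea behind snapshots is easier to see in miniature. The toy Python class below preserves a block’s original contents the first time it changes after a snapshot, so “snapshot reads” stay frozen in time while the live volume keeps changing. This is a conceptual model only, not how VSS is actually implemented.

    """Toy copy-on-write snapshot: reads of the snapshot see original data."""

    class SnapshottedVolume:
        def __init__(self, blocks: list[bytes]):
            self.live = blocks                      # blocks applications write to
            self.preserved: dict[int, bytes] = {}   # originals of changed blocks
            self.frozen = False

        def snapshot(self) -> None:
            self.frozen = True                      # preserve originals from now on

        def write(self, index: int, data: bytes) -> None:
            if self.frozen and index not in self.preserved:
                self.preserved[index] = self.live[index]   # save original first
            self.live[index] = data

        def read_snapshot(self, index: int) -> bytes:
            # Backup reads come here: original data, untouched by later writes.
            return self.preserved.get(index, self.live[index])

    vol = SnapshottedVolume([b"A", b"B", b"C"])
    vol.snapshot()
    vol.write(1, b"X")                      # the live volume changes...
    assert vol.read_snapshot(1) == b"B"     # ...but the snapshot still sees B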

VSS-aware applications can ensure complete consistency of the files that they control. Their authors can write a component that registers with VSS (called a “VSS writer”). When VSS starts a snapshot operation, it will notify all registered VSS writers. In turn, they can write all pending operations to disk and prevent others from starting until the snapshot completes.

Because the application has no active (sometimes called “in-flight”) I/O at the moment the snapshot is taken, the backup will capture everything about the program’s data. We call this an “application-consistent” backup.

As you shop for backup programs, keep in mind that not everyone uses the terms “crash-consistent” and “application-consistent” in the same way. Also, Linux distributions do not have a native analog to VSS. Research the way that each candidate application deals with open files and running applications.

Hypervisor-Aware Backup Software

If you employ any hypervisors in your environment, you should strongly consider a backup solution that can work with them directly.

You can back up client operating systems using agents installed just like physical systems if you prefer. However, hypervisor-aware backup applications can appropriately time guest backups to not overlap and employ optimization strategies that greatly reduce time, bandwidth, and storage needs.

When it comes to your hypervisors, investigate applications with the same level of flexibility as Hornetsecurity VM Backup.

You can install it directly on a Hyper-V host and operate it from there, use a management console from your PC, or make use of Hornetsecurity’s Cloud Management Console to manage all of your backup systems from a web browser. Such options allow you to control your backup in a way that suits you.

Agent-Based Versus Agentless

Usually, backup solutions require you to install a software component on each system you want to protect. That software will gather the data from its system and send it directly to media or to a central system. You saw examples of both in the “The Golden Rules to Choosing a Backup Provider” article. The software piece that installs on the targets is called an “agent”.

Other products can back up a system without installing an agent. You won’t find much in that category for taking complete backups of physical servers. Some software will back up networked file storage.

These “agentless” products rule the world of virtualization. Hornetsecurity VM Backup serves as a prime example. You install the software in your Hyper-V or VMware environment, and it backs up virtual machines without modifying them.

While VM Backup and similar programs can interact with guest operating systems to give them an opportunity to prepare for a backup operation, they can also work on virtual machines without affecting them.

Without such an agentless solution, you would need to place some piece of software inside every virtual machine. That introduces more potential failure points, increases your attack surface, and burdens you with more overhead.

With agents, you need to schedule all backup jobs carefully so that they do not interfere with each other; agentless systems coordinate operations automatically. They also have greater visibility over your data, making it easier for them to perform operations such as deduplication for smaller, faster backups.
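
Deduplication itself is simple to sketch: split the data into chunks, hash each chunk, and store each unique chunk only once. The minimal example below uses fixed-size chunks; production systems typically use smarter variable-size chunking.

    """Block-level deduplication in miniature."""
    import hashlib

    CHUNK_SIZE = 4096

    def dedupe(data: bytes, store: dict[str, bytes]) -> list[str]:
        """Store unique chunks; return the hash list that reassembles data."""
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)      # identical chunks stored once
            recipe.append(digest)
        return recipe

    store: dict[str, bytes] = {}
    recipe = dedupe(bytes(4096) * 20, store)     # twenty identical 4 KiB blocks
    print(f"{len(recipe)} chunks referenced, {len(store)} actually stored")
    # Output: 20 chunks referenced, 1 actually stored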

Standard Physical Systems Backup Software

Few organizations have moved fully to virtualized deployments. So, you likely have physical systems to protect in addition to your virtual machines. Some vendors, such as Hornetsecurity, provide a separate solution to cover physical systems.

Others use customized agents or modules within a single application. However, some companies have chosen to focus on one type of system and cannot protect the other.

Single Vendor vs. Hybrid Application Solutions

In small environments, administrators rarely even consider using solutions that involve multiple vendors. Each separate product has its own expertise requirements and licensing costs. You cannot manage backup software from multiple vendors using a single control panel.

You may not be able to find an efficient way to store backup data from different manufacturers. Using a single vendor allows you to cover most systems with the least amount of effort.

On the other hand, organizations with more than a handful of servers almost invariably have some hybridization – in operating systems, third-party software, and hardware. Using different backup programs might not pose a major challenge in those situations. Using multiple programs allows you to find the best solution for all your problems instead of accepting one that does “enough”.

I once had a customer that was almost fully virtualized. They placed high priority on a granular backup of Microsoft Exchange with the ability to rapidly restore individual messages. Several vendors offer that level of coverage for Exchange in addition to virtual machine backup.

Unfortunately, no single software package could handle both to the customer’s satisfaction.

To solve this problem, we selected one application to handle Exchange and another to cover the virtual machines. The customer achieved all their goals and saved substantially on licensing.

Putting It in Action

Using the above guidance and the plan that you created in earlier articles in this series, you have enough information to start investigating programs that will satisfy your requirements.

Phase one: Candidate software selection

Begin by collecting a list of available software. You will need to find a way to quickly narrow down the list.

To that end, you can apply some quick criteria while you search, or you can build the list first and work through it later. Maintain this list and the reasons that you decided to include or exclude a product.

Create a table to use as a tracking system. As an example:
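
The layout below illustrates the idea; the product names and quick criteria are hypothetical examples, so substitute the checks that matter to your organization:

    Product    | Disk target | Hypervisor-aware | In budget | Include? | Reason
    -----------|-------------|------------------|-----------|----------|-----------------------------
    Product A  | Yes         | Yes              | Yes       | Yes      | Passes all quick checks
    Product B  | Yes         | No               | Yes       | No       | Cannot back up VMs directly
    Product C  | Yes         | Yes              | No        | No       | Exceeds budget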

It might seem like a bit much to create this level of documentation, but it has benefits:

  • Historical purposes: Someone might want to know why a program was tested or skipped
  • Reporting: You may need to provide an accounting of your selection process
  • Comparisons: Such a table forms a feature matrix

Because this activity only constitutes the first phase of selection, use criteria that you can quickly verify. To hasten the process, check for any deal-breaking problems first. You can skip any other checks for that product. While the table above shows simple yes/no options, you can use a more nuanced grading system where it makes sense.

Keep in mind that you want to shorten this list, not make a final decision.

Phase two: In-depth software testing

You will spend the most time in phase two. Phase one should have left you with a manageable list of programs to explore more completely. Now you need to spend the time to work through them to find the solution that works best for your organization.

Keep in mind that you can use multiple products if that works better than a single solution.

For this phase, you will need to acquire and install software trials. Some recommendations:

  • Install trialware on templated virtual machines that you can quickly rebuild;
  • Use test systems that run the same programs as your production systems;
  • Test backing up multiple systems;
  • Test encryption/decryption;
  • Test complete and partial restores.

Extend the table that you created in phase one. If you used spreadsheet software to create it, consider creating tabs for each program that you test. You could also use a form that you build in a word processor.

Make sure to thoroughly test each program. Never assume that any given program will behave like any other.

Phase three: Final selection

Hopefully, you will end phase two with an obvious choice. Either way, you will need to notify the key stakeholders from phase one of your selection status. If you need additional input or executive sign-off to complete the process, work through those steps.

Unless you choose a completely cloud-based disaster recovery approach, you will still need to acquire hardware. Remember that, due to threats of malware and malicious actors, all business continuity plans should include some sort of in-house solution that you can take offline and offsite.

To properly protect your virtualization environment and all the data, use Hornetsecurity VM Backup to securely back up and replicate your virtual machine.

We ensure the security of your Microsoft 365 environment through our comprehensive 365 Total Protection Enterprise Backup and 365 Total Backup solutions.

For complete guidance, get our comprehensive Backup Bible, which serves as your indispensable resource containing invaluable information on backup and disaster recovery.

To keep up to date with the latest articles and practices, pay a visit to our Hornetsecurity blog now.

Conclusion

Optimizing your backup software is crucial for ensuring the integrity and consistency of your data. When dealing with virtualization and hypervisors, consider solutions that are hypervisor-aware and agentless, as they can offer greater flexibility and efficiency.

For organizations with both physical and virtual systems, it’s essential to select a solution that can cover both adequately.

When deciding between a single-vendor and a hybrid approach, weigh the pros and cons carefully against your unique needs. The phased approach to selecting backup software – candidate selection, in-depth testing, and final selection – helps ensure you make the best choice for your organization’s data protection and recovery needs.

FAQ

What is the backup software?

Backup software is a type of computer program designed to create and manage copies of data, files, or entire systems for the purpose of data protection, disaster recovery, and data preservation. These software applications automate the process of backing up data to ensure that it can be restored in case of data loss, hardware failure, or other unforeseen events.

What is an example of backup software?

An example of backup software is Hornetsecurity VM Backup, a comprehensive virtual machine backup solution provided by Hornetsecurity.

It’s designed specifically for virtualized environments and focuses on creating backups of virtual machines. This type of backup software is essential for protecting and recovering data in virtualized server environments.

What is free backup software?

Free backup software refers to backup solutions that are available at no cost, typically with limited features compared to their paid counterparts. These free backup software options are suitable for individuals or small organizations with basic backup needs.