This article is part of a seven-part series on disaster recovery (DR) planning for institutions of any size, whatever systems they run in production. Each article includes an activity designed to help readers update their existing DR plan or create a new one.
The concept of a disaster recovery plan is simple: ensure business continuity in the wake of an emergency. If something goes wrong, how fast can order be restored?
The reality is slightly different. Even when a plan is created, it is less certain that it will be maintained, tested, and updated regularly. New systems and processes are introduced throughout the academic year, so it can be challenging to pause and ensure that every aspect of the business is secure.
Without painting a picture of doom and gloom, everyone knows that a multitude of risks exist. The good news is that getting started isn’t as time-consuming or difficult as the nature of the task might suggest. If you are interested in updating your disaster recovery plan, this series will guide you. Spend about 30 minutes over two business days and concentrate on identifying your institution's most important systems. We believe that in less than an hour, your team can chart a clear path to an updated disaster recovery plan.
To formulate a solid disaster recovery plan, begin by listing critical business functions and the systems that support them. From that list, determine the level of importance of each system in production. For example, if a system like Colleague or Banner were to go offline during a power outage and its data were corrupted, business functions would cease. If the website were to go offline, however, the impact would be more marketing-related: it would affect those trying to access certain materials, but it would not render the institution inoperable.
Once each system has been assigned a level of importance, consider whether it is mission critical, and then assess how complex it would be to rebuild or restore its services if it went offline unexpectedly. Force yourself to rank systems from most to least important in a single-file list, even when two systems feel equally important. (Your budget works linearly, and so should your planning.)
To take the plan one step further, consider starting a spreadsheet that contains specific information about each system identified in your exercise. With just a few key metrics, the institution can prioritize each system. This list will form the basis of the disaster recovery plan. (Don’t worry about adding every system. If you can’t think of a system right away, chances are that it belongs at the bottom of the list and can safely be added later.) Record each system in a matrix with the following columns:
- System Name
- Complexity to rebuild
- Mission Criticality
- Life and Death: Will your college exist tomorrow if the system is lost?
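The matrix above can be sketched in a few lines of Python for teams that prefer a script to a spreadsheet. This is only an illustration: the 1-to-5 scoring scales, the sample scores, and the tie-breaking order (life-and-death first, then criticality, then rebuild complexity, then name) are assumptions made here, not part of the exercise itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemEntry:
    """One row of the matrix. The 1-5 scales are an assumption for illustration."""
    name: str
    complexity_to_rebuild: int  # 1 = easy to rebuild, 5 = very hard
    mission_criticality: int    # 1 = low, 5 = high
    life_and_death: bool        # would the college exist tomorrow without it?


def rank_systems(entries):
    """Return the entries in a strict single-file order, most important first."""
    return sorted(
        entries,
        key=lambda e: (
            not e.life_and_death,      # life-and-death systems sort first
            -e.mission_criticality,    # then higher criticality
            -e.complexity_to_rebuild,  # then harder-to-rebuild systems
            e.name,                    # final tie-breaker keeps the order linear
        ),
    )


# Hypothetical scores; replace them with your institution's own assessment.
inventory = [
    SystemEntry("Website", 2, 2, False),
    SystemEntry("ERP", 5, 5, True),
    SystemEntry("Email", 3, 4, False),
    SystemEntry("Network infrastructure", 4, 5, True),
]

ranked = rank_systems(inventory)
for position, entry in enumerate(ranked, start=1):
    print(position, entry.name)
```

Because the sort key ends with the system name, no two systems can share a position, which mirrors the advice to force a single-file list even when two systems feel equally important.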
Every institution should place a core set of services on the list: its ERP (Colleague, Banner, etc.); email server; network infrastructure; document imaging system (e.g., Soft Docs); content management and file shares; learning management system and other online course materials; website server(s); and telephone systems.
In some instances, communications might fail, but consider each system’s importance in relation to what it might take to rebuild.
Not all disaster recovery plans come with a hefty price tag, although there are some costs to consider. As one can imagine, this is a discussion worth having before any systems are in the crosshairs and teams are placed under pressure to restore business operations. This exercise can also help build a business case with stakeholders by clarifying the importance of what is being backed up and the costs associated with prioritizing certain systems over others. With the advent of cloud-based services (e.g., Amazon Glacier, Amazon Web Services, Google Apps) and other infrastructure options, institutions can consider allocating funds in their operational budgets so that resources are available before an emergency.
As an example, for a mid-sized institution running Colleague, maintaining a secondary Colleague server capable of assuming basic college functions will cost between $20,000 and $60,000 every four years (if funded with capital improvement cash). This type of system should take approximately one month to set up and four hours per week of staff attention to maintain.
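To make those figures easier to compare against an annual budget, the arithmetic can be worked through as follows. This is a rough sketch that assumes the hardware cost is spread evenly over the four-year cycle and that staff attention is needed all 52 weeks of the year; neither assumption comes from the example itself.

```python
# Figures from the example above.
COST_LOW, COST_HIGH = 20_000, 60_000  # dollars per four-year cycle
REFRESH_YEARS = 4
STAFF_HOURS_PER_WEEK = 4

# Assumption: maintenance continues through all 52 weeks of the year.
WEEKS_PER_YEAR = 52

# Assumption: the cost is spread evenly across the refresh cycle.
annual_low = COST_LOW / REFRESH_YEARS
annual_high = COST_HIGH / REFRESH_YEARS
staff_hours_per_year = STAFF_HOURS_PER_WEEK * WEEKS_PER_YEAR

print(f"Annualized cost: ${annual_low:,.0f} to ${annual_high:,.0f}")
print(f"Staff time: {staff_hours_per_year} hours per year")
```

Under these assumptions, the secondary server works out to roughly $5,000 to $15,000 per year plus about 208 staff hours annually.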
Responding to an emergency is overrated compared with avoiding one. Setting priorities and thinking through how a team would respond if something went awry draws attention to the areas of the business that are essential to its sustainability. As critical functions and systems are placed atop the list, institutions should immediately begin a thorough review of all processes that those systems drive or affect.