Lessons from the IRS "Tax Day" Outage
The IRS outage has served to remind us of the need to back up our critical systems and data sources while ensuring they can handle peak demands.
- By Mike Schiff
- May 14, 2018
The April 17 Tax Day outage of the Internal Revenue Service's online e-filing system could not have come at a worse time. This was the last day for filing timely individual tax returns or automatically granted extension requests; many taxpayers were inconvenienced and the IRS had to extend the filing deadline by one day.
Although the specific causes of the IRS incident were not immediately revealed, the incident serves as a reminder of the importance -- if not the necessity -- of backing up our files and having failover systems for our critical operational and data warehouse systems.
In fact, one of the suspected causes of the IRS outage was the inability of some IRS systems to access one of the agency's master files, a taxpayer data hub. Furthermore, especially for systems that are subject to spikes in demand, stress testing and the ability to add capacity quickly are also critical to averting failure during usage peaks.
Unfortunately, some data warehouse practitioners (or their managers) still consider only operational systems to be mission-critical, but this is no longer true. Data warehouses, operational data stores, data hubs, and the applications that feed off them can be equally important to an organization's success. Consequently, we need to protect them from failure with data backups and with failover systems capable of taking over in the event of unrecoverable hardware crashes.
We must remember that even totally redundant, fully debugged (assuming this is ever truly possible), stress-tested systems are not immune to failure when the backups are co-located and a major catastrophic event (such as an earthquake, tornado, or burst water pipe) occurs that simultaneously wipes out the primary and failover systems. Even if the failover system is remotely located, it is imperative that it has access to up-to-date data. Although cloud storage can help achieve this, the data needs to be reachable by the failover system.
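The two conditions in this paragraph, physical separation and up-to-date data, can be written down as a simple readiness check. What follows is a minimal sketch rather than a real monitoring tool: the `Site` type, its fields, and the 300-second staleness threshold are hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical description of one deployment of a system."""
    name: str
    location: str             # e.g., a data center or cloud region label
    replication_lag_s: float  # seconds this site's data trails the primary


def failover_problems(primary: Site, standby: Site,
                      max_lag_s: float = 300.0) -> list[str]:
    """Return the reasons the standby could not safely take over.

    An empty list means neither of the two checks found a problem.
    The 300-second default is an assumed threshold, not a recommendation.
    """
    problems = []
    # A standby in the same place can be lost in the same event.
    if standby.location == primary.location:
        problems.append("standby is co-located with the primary")
    # A remote standby is of little use if its data is out of date.
    if standby.replication_lag_s > max_lag_s:
        problems.append("standby data is stale")
    return problems


if __name__ == "__main__":
    primary = Site("warehouse-a", "east", 0.0)
    standby = Site("warehouse-b", "east", 900.0)
    print(failover_problems(primary, standby))
```

In this example the standby fails both checks: it shares a location with the primary and its data is 15 minutes behind. A real check would also confirm that the standby can actually reach its data, which this sketch does not attempt.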
Fortunately, in addition to data storage, the cloud offers other advantages. If applications are cloud-resident, then they can likely be accessed from almost anywhere with Internet connectivity. Large companies can create their own private cloud implementations; smaller businesses can use a variety of commercially available public clouds as their application platform. With a public cloud, the vendor takes care of providing the necessary failover redundancy (and make sure it does!) and appropriate service-level guarantees while minimizing or even eliminating underlying infrastructure and operational costs. Furthermore, many application vendors provide both on-premises and software-as-a-service implementations of their offerings.
Another advantage of public cloud-based hosting is the vendor's ability to expand or contract the scale of the underlying computing power to satisfy peaks and valleys in usage demands. Large companies may have the resources to build data centers that can handle peak demands several times greater than average demand, but smaller companies may not be able to afford this.
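A little arithmetic shows why sizing for the peak is expensive. The sketch below uses made-up numbers (an average of 1,000 requests per second, a peak four times that, and 250 requests per second per server); none of these figures come from the IRS incident or from any particular vendor.

```python
def servers_needed(requests_per_second: int, capacity_per_server: int) -> int:
    """Smallest whole number of servers that can carry the given load."""
    if requests_per_second < 0 or capacity_per_server <= 0:
        raise ValueError("load must be >= 0 and capacity must be > 0")
    # Integer ceiling division: round up so no request goes unserved.
    return (requests_per_second + capacity_per_server - 1) // capacity_per_server


# Illustrative, assumed figures.
AVERAGE_LOAD = 1_000           # requests per second on a typical day
PEAK_LOAD = 4 * AVERAGE_LOAD   # a deadline-day spike
CAPACITY = 250                 # requests per second one server can handle

fixed_fleet = servers_needed(PEAK_LOAD, CAPACITY)      # built for the peak
typical_need = servers_needed(AVERAGE_LOAD, CAPACITY)  # used most of the time

if __name__ == "__main__":
    print(f"servers to own for the peak: {fixed_fleet}")
    print(f"servers needed on a typical day: {typical_need}")
```

Under these assumptions, a company that owns its hardware must buy 16 servers to survive the peak even though 4 suffice on a typical day. Elastic capacity lets it pay for the other 12 only while the spike lasts.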
It is also important to recognize that not every application requires 100% uptime. Organizations need to identify which applications are critical (e.g., air traffic control, hospital patient monitoring, online order entry, stock market trading) and which can be unavailable for limited times (e.g., monthly general ledger closings) without causing major problems.
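One way to make this distinction concrete is to translate an availability target into the downtime it permits. The helper below is an illustrative sketch that assumes a 365-day year; the function name and the sample targets are hypothetical and are not drawn from any service-level agreement.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def allowed_downtime_hours(availability_pct: float) -> float:
    """Hours per year a system may be down and still meet its target."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return HOURS_PER_YEAR * (1.0 - availability_pct / 100.0)


if __name__ == "__main__":
    # Sample targets chosen for illustration only.
    for target in (99.0, 99.9, 99.99):
        hours = allowed_downtime_hours(target)
        print(f"{target}% uptime allows about {hours:.2f} hours of downtime per year")
```

A 99.9% target, for example, leaves a budget of roughly 8.76 hours a year. That may be acceptable for a monthly ledger close but not for patient monitoring, which is why each application needs its own target.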
The IRS outage was a good reminder that some of our data warehouse systems, not only our operational systems, belong under our mission-critical umbrellas, and that cloud computing can facilitate our efforts to protect them.
The bottom line: an ounce of system redundancy is worth a pound of recovery!
About the Author
Michael A. Schiff is founder and principal analyst of MAS Strategies, which specializes in formulating effective data warehousing strategies. With more than four decades of industry experience as a developer, user, consultant, vendor, and industry analyst, Mike is an expert in developing, marketing, and implementing solutions that transform operational data into useful decision-enabling information.
His prior experience as an IT director and systems and programming manager provides him with a thorough understanding of the technical, business, and political issues that must be addressed for any successful implementation. With Bachelor and Master of Science degrees from MIT's Sloan School of Management and as a certified financial planner, Mike can address both the technical and financial aspects of data warehousing and business intelligence.