Category Archives: Resilience

The shift in Work Area Recovery

Last month, in advance of a seminar on Work Area Recovery for the BCI London Forum, I conducted a survey of London-area Business Continuity Institute members. The seminar was a sell-out. It seems that a lot of companies are re-evaluating their strategy for coping with denial of access to their building (usually through fire or flood) or some other meltdown that requires staff to work in alternate locations. Companies like SunGard, ICM and IBM must be seeing a decline in traditional outsourced work area recovery, where those companies’ sites are kept on standby for customers in the event of ….

Two factors seem to lie at the root of that decline.

  • Much closer scrutiny of costs in the past two years caused all budgets to be reviewed for value for money.
  • The increase in home working and hot desking has obviated the need for traditional office environments.

The biggest concern in the survey regarding home working was network load: network traffic at the last mile and in the server room. I can’t help but think that capacity will rise as demand increases, to the point where there will be sufficient bandwidth at both ends to cope with spikes if disaster strikes, provided the disaster doesn’t affect everyone (in which case our problems are bigger than our businesses) and provided the company has mirrored IT.

Filed under Business Continuity, Resilience, Virtual Teams

Black Swans over Europe

In all my conversations with business continuity professionals in London, I don’t recall anyone ever mentioning the risk to supply chains or personnel posed by Icelandic volcanoes. What we just experienced was a classic Black Swan – something unheard of in this part of the world, something so far off the radar that it didn’t appear on any European risk register outside Iceland. We had a relatively rare event – a steam-laden eruption coinciding with just the right weather conditions to allow an ash cloud to hang over Europe for several days – combined with zero tolerance for ash in aviation, owing to a lack of data.

On June 24, 1982, British Airways Flight 9, en route to Perth, flew through the ash cloud of Mount Galunggung, Indonesia, losing all four engines for 13 minutes before managing to restart one, then all four, then losing one again, before landing in Jakarta with a badly damaged plane, abraded leading-edge surfaces and an obscured windscreen. On July 13, another 747, a Singapore Airlines plane, was forced to shut down three engines in the same area. Since then authorities have developed strict guidelines that, in retrospect, were (hopefully) too harsh. We didn’t know what the safe levels of ash were; now that we have much more data, gathered over the past week, the aviation authorities have come up with a level that they feel represents a safe threshold.

But weather patterns are fickle, and volcanoes are even more so. Eyjafjallajökull is a mere pup compared to its next-door neighbour, Katla. The last three times Eyjafjallajökull erupted, Katla also erupted shortly after. Iceland’s President Olafur Grimsson this week referred to the recent event as a small rehearsal for what will happen when Katla comes alive. Suddenly risk managers and business continuity professionals have something new to consider – or rather something very old, but overdue. The past six days have also demonstrated that unless we have a balanced, reasonable approach to risk, we endanger the mechanisms of our society and commerce through excess caution.

Filed under Business Continuity, Resilience

Executive Luck and the Resilience Paradox

Ten times a year, I help organize an evening get-together, over a beer, of the top business continuity professionals in London. It is called BANG. We always have a controversial speaker from the business continuity industry. This week, crisis management consultant Gareth Jones spoke of how organizations learn (or don’t learn) from incidents. He said something about luck which I felt was very insightful. It explains why business continuity is such a hard sell in large organizations, why it goes through phases of importance, and why organizations so often get caught out with disastrous consequences. It may even explain why the consequences of the credit crunch cascaded so widely.

Business continuity pros help organizations mitigate and deal with incidents, things that happen when an organization’s luck runs out – a power failure, an explosion, a flood, a large systems outage. They put plans into place, they exercise and they educate. Frequently their budgets get cut. They may work for an organization for a couple of years, get frustrated, go into consultancy for a few years and then go back into another organization. Lifers are rare. It is a comparatively young profession, but an old ethos.

Resilience is a board level responsibility. The board has a duty to shareholders to protect as well as grow shareholders’ wealth. To understand organizational risks fully and stay prepared, they need to contemplate some very unpleasant possibilities and consequences. Many are completely incapable of doing so.

To understand why, one needs to consider how an organization’s senior executives got into their positions. It was through a combination of ability and luck – but mostly luck. The unlucky ones – the ones with equal abilities who experienced an organizational failure or a disaster that had serious ramifications – aren’t in those powerful positions any longer. Business continuity managers sell their wares to senior levels. They talk about bad luck to the lucky. What a tough sell that must be! They don’t get heard, they don’t get budgets, they get frustrated and leave, leaving the unheeding organization vulnerable. The luckier an organization is, the more likely it is to suffer badly from the consequences of a disaster when its luck runs out.

Filed under Business Continuity, Resilience

Elastic Communications in a Crisis

In an earthquake, well-designed buildings in Japan absorb shocks by separating the building from the base, by using deformable building materials or with internal counterbalances. Buildings are more elastic. Communications in a crisis also need to be elastic, to absorb shockwaves. In a crisis, services, technology, people and resources you take for granted may not be available where they are needed. Computer networks, email systems, phones and power – any of these may be degraded or lost through a crisis, or they may be the cause of a crisis. Obtaining intelligent awareness of a situation, working collaboratively towards decisions, conveying instructions and obtaining feedback all depend on timely, accurate and digestible communications. A crisis that limits the usual communications choices necessitates resourcefulness. It demands elasticity in the way individuals, teams and organisations think and act. It demands elasticity in our infrastructures, our processes and policies. Elasticity buys time and rigidity spends it.

Elastic thinking is the ability to recognize new priorities quickly, to broaden peripheral vision, to make fast decisions and to act appropriately. If appropriate, security gives way to expediency, best guess becomes good enough, and quick and dirty wins over detailed and thorough. Elastic communications is the ability to switch media to minimize the impact of losing your normal media choices. If your corporate email is down, can you easily switch to a public mail system such as Google Mail? If it is necessary to communicate rapidly in both directions with employees and their loved ones over public instant messaging or SMS, can that be easily done in a Starbucks? Do you even have those instant messaging addresses on your laptop? Are they in a Facebook or LinkedIn group? If your continuity plans are held on internal servers, are they also on third-party servers “in the cloud”, or on USB sticks or handhelds? How would you use Twitter to broadly disseminate information and solicit rapid feedback in a crisis?

Elasticity is the least expensive path to build resilience. It provides a higher return on investment than building infrastructure. Bespoke infrastructure contingencies are expensive. Using multiple, off-the-shelf public services is cheaper. Elasticity builds agility. An elastic organization is a more open one, one that listens to customers better, responds more quickly and works more collaboratively with suppliers. The benefits of elasticity go far beyond those of organizational resilience and the ability to withstand large seismic shocks.

Filed under Business Continuity, Coordination, Presence, Resilience, Security, Social Networking, Virtual Teams

Dual Disasters

Into the palace in Samarkand rushed a messenger telling the Emperor news of famine in a country far away.

“Famine has not touched my lands in all the time I have been Emperor. Thanks be to Allah,” declared the Emperor.

Said Nasrudin, “Allah does indeed work great mysteries, but even He would not visit two disasters upon Samarkand at the same time.”

Filed under Nasrudin, Resilience

Business continuity, risk and resilience

Business continuity is a discipline that ensures that an organization’s critical functions are available to stakeholders, not just during a crisis, but every day.  There is a higher view, that business continuity should concern itself with organizational resilience, both short and long term. In other words, it works to ensure the survivability of an organization.  That view projects the discipline well beyond the notion of protecting physical assets, data, communications, people and reputation.

  
Today most practitioners have allowed business continuity to be boxed in and side-lined. They may guide an organization through a BS 25999 audit and tick the boxes, but they haven’t grasped the resilience nettle. When one looks at business continuity budgets, it is obvious that data and communications resilience take the lion’s share. Practitioners do not penetrate to the core of organizational resilience. Resilience is an outcome of better practice across a number of disciplines. The International Consortium for Organizational Resilience, a non-profit education and credentialing body, lists ten disciplines that it covers: Business Continuity Management; Crisis Management and Communications; Technical Infrastructure; Emergency Management; Facility Management; Legal Compliance & Audit; Organizational Behaviour; Risk Management & Insurance; Social Resilience; and Supply Chain, Logistics & Transportation Management. But either that broad set does not go far enough, or each individual discipline lacks the power in the organization to build resilience. Enron might well have ticked all the above boxes, as might have Lehman Brothers, Bear Stearns, Northern Rock, Royal Bank of Scotland, Citigroup, GM, Ford, Chrysler, and even Iceland (the country, not the company). One could glibly say that you cannot protect against board-level stupidity, but that would plainly ignore the fact that all of the above entities employed or still employ highly intelligent people working hard to grow short-term and long-term value for their stakeholders.

Often the root cause of an organization’s demise is deemed to be short-sightedness, but that label is only useful in hindsight. One has to ask, what is at the root of short-sightedness? Is it simply letting board-level short-term greed get in the way of long-term greed, or is it a human’s innate belief that it won’t happen to them, that it is our very nature to discount risk? Look at the number of people who fail to save adequately for their retirement, or the high numbers of people without a will or with inadequate life insurance. It won’t happen to us. That is probably a survival mechanism so that we concentrate on the here and now. It worked well on the plains of Africa. It doesn’t work well in an organization that is spread over multiple time zones, or that is significantly affected by external consequences two or more steps removed from our normal daily perspective, as was the case with the drop in housing prices in the US following an extended, too liberal financial regulatory regime.

Perhaps the problem is how we scope risk. Is it too narrow? Are risk issues too easily pigeonholed into Financial Risk Management, which only looks at how to use finance instruments to manage exposure to risk? Or perhaps with other types of risks, we focus on those that are small, frequent and obvious – such as the risk that a server will fail. Servers fail all the time. The risk is measurable, the probability is high that every data centre will experience one or more failures during the course of a year, and we can take tried and tested steps to prevent or mitigate that risk. If our view of risk is broader, one cannot mitigate it with finance instruments or by conventional preventative measures or disaster recovery. And with a broader view of risk comes a realization that threats are more numerous than we had imagined, as does the realization that something out of the blue will hit us for sure.
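
To put a rough number on the server example, here is a minimal Python sketch. It assumes every server fails independently with the same annual probability, which is a simplification for illustration only; the function name, the 5% rate and the fleet size are hypothetical, not measured figures.

```python
def prob_at_least_one_failure(annual_failure_rate: float, server_count: int) -> float:
    """Chance that at least one of `server_count` servers fails within a year.

    Assumes independent failures with an identical annual probability,
    so the chance that no server fails is (1 - rate) ** count.
    """
    if not 0.0 <= annual_failure_rate <= 1.0:
        raise ValueError("annual_failure_rate must be between 0 and 1")
    if server_count < 0:
        raise ValueError("server_count must not be negative")
    return 1.0 - (1.0 - annual_failure_rate) ** server_count


if __name__ == "__main__":
    # Hypothetical data centre: 200 servers, each with a 5% chance of failing in a year.
    chance = prob_at_least_one_failure(0.05, 200)
    print(f"Chance of at least one server failure this year: {chance:.4%}")
```

Even a modest per-server rate makes a failure somewhere in the fleet close to certain, which is why this kind of risk is easy to measure and budget for. The broader risks discussed here do not lend themselves to such a calculation.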

We are bound to get caught out by the improbable, as Nassim Nicholas Taleb pointed out in his book “The Black Swan: The Impact of the Highly Improbable.” He rightly argues that the improbable is more likely than we prefer to imagine. This tendency to be overly sanguine lies at the heart of why business continuity lacks power, budget and scope, and why large organizations fail in spite of their collective intelligence. Making the unlikely more likely is our tendency to take larger risks in groups than we would as individuals.

We do not build sufficient protection in our portfolio from the improbable, and paradoxically we also do not take enough gambles to be open to highly improbable rewards. Taleb does not argue against risk taking and in fact is an advocate of high risk. But he argues that high risk should be confined to a smaller proportion of an organization’s or individual’s investments and activities, and the rest should be more conservative. By increasing risk but confining it to a smaller part of a portfolio, an individual or organization takes advantage of the highly improbable, impossible-to-predict upsides. By adopting a more conservative approach to the vast majority of our assets, we build protection against the highly improbable, high-impact downsides.

I would argue that there is a third issue that constrains our notion of resilience, and that is our definition and view of assets. An organization has a broad range of off-balance-sheet assets or risks that have nothing to do with derivatives or other financial instruments. They are soft intangibles. Though their relationship to tradable value is more indirect than, say, a deposit in a current account or a receivable, they are still measurable. Fluctuations in their values have direct impacts on shareholder value, though there are time lags. As such, they are useful leading indicators. Examples include:

  • Customer loyalty
  • Business partner loyalty
  • Supplier dependence on the firm
  • Customer dependence on the firm
  • Channel power
  • Customer permission
  • Employee loyalty
  • Employee drive
  • Organizational citizenship
  • Innovation behaviour
  • Brand attributes and reputation
  • Brand recognition
  • Virtual distance
  • Patents granted
  • Carbon footprint
  • Responsiveness

Many of these are inter-related and co-dependent. Many of their values are proprietary and thus difficult for outsiders to benchmark against other firms. Yet they all represent things that affect short-term and long-term shareholder value. If business continuity is to survive, it needs more than a re-branding exercise. There must be a new, separate discipline, one which is not an aspect of finance, IT, sales, marketing or strategy. Like all other disciplines, it assumes aspects of all the others, but it is distinct. It needs to be distinct to counteract our instinct. We need to call out our ostrich tendencies, and create a function on the board that does nothing but own organizational resilience, defined broadly. It needs a name.

Filed under Business Continuity, Resilience

Groove in emergencies

A post-Katrina article in Microsoft’s TechNet Magazine on Louisiana State University’s Emergency Operations Center highlighted some deficiencies in ordinary IT during crises. First, you can’t depend on vast bandwidth, or even any bandwidth at all, in an emergency. Second, your constituency can increase dramatically. One day you’re managing the IT needs of staff; the next, you have to deal with external agencies, contractors or volunteers: people who aren’t in your directory and who can’t be provisioned quickly and easily.

That’s where Groove comes in. Once files are downloaded into a Groove “workspace”, which holds the information to be shared with others, only the changes are exchanged. Work can continue off-line, and when one goes back on-line, documents are synchronized. For cross-agency or inter-company working, it’s like manna. No files attached to emails, with all the versioning problems those entail. No server to set up. Everyone has their own synchronized copy of the information on their laptop. For keeping business continuity plans up to date, ensuring they are distributed and always available, or for sharing project information in a distributed team, it has a lot going for it.

Filed under Business Continuity, Coordination, Resilience, Security, Virtual Teams