Friday 3 October 2014

Sound the alarm!

US Secret Service under scrutiny

On 19th September 2014, an intruder armed with a knife scaled a fence surrounding the White House in Washington, DC. He managed to run across the North Lawn, past a guard posted in the entrance hall and past the stairs to the living quarters, before being tackled in the East Room. A number of factors contributed to his success in accessing one of the most iconic buildings in the world, as detailed in this Washington Post article.

One suggested contributing factor was the muting of an intruder alarm which would have alerted the guard in the entrance hall that the perimeter had been breached. According to the Washington Post, "White House usher staff, whose office is near the front door, complained that they were noisy" and "were frequently malfunctioning and unnecessarily sounding off."

Alarms, or their equivalents, have a number of everyday uses such as waking us up in the morning, telling us we've left the car headlights on, or informing us that we've burned the toast again. In healthcare, alarms are meant to draw our attention to occurrences which need to be acknowledged or require action. However, just as happened with the Secret Service, alarms (and their misuse) can cause their own problems.

The 4 most common abuses of alarms

(This list is derived from personal experience and observations in the simulation centre; it is not definitive.)

1) Alarm not switched on
Some healthcare devices allow alarms to be set, but these are not switched on by default. For example, the "low anaesthetic gas" alarm is often switched off by default by the manufacturer of the anaesthetic machine. The manufacturer's reasoning may be that the alarm would sound inappropriately during the wash-in phase of anaesthesia, as the anaesthetic gas increases from 0% to the set concentration. The downside is that inattention by the anaesthetist may lead to patient awareness under anaesthesia if the anaesthetic gas falls below an appropriate level. A contributing factor to that inattention may be the (not unreasonable) assumption that the machine will warn the anaesthetist of a low anaesthetic level, since it does alarm for most other variables that fall below a safe level.

2) Inappropriate alarm limits
Most alarms have default limits set by the manufacturer of the device. A pump may alarm if a given pressure is exceeded, or an ECG machine may alarm if the heart rate falls below a given value. Some default limits are, however, outside safe levels. For example, some oxygen saturation monitors will not alarm until the saturation falls below 90%. With normal saturations of 99-100%, many healthcare personnel would prefer to be alerted at a higher level in order to begin countermeasures sooner.
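As a concrete (and entirely hypothetical) illustration, the sketch below compares a manufacturer's default alarm limits with limits a department might agree locally. Apart from the 90% saturation default mentioned above, the names and numbers are assumptions made for the sake of the example, and no real device API is involved.

# Hypothetical comparison of manufacturer alarm defaults with locally agreed
# limits. No real device API is used; all values except the 90% SpO2 default
# discussed above are illustrative assumptions.
manufacturer_defaults = {"spo2_low_percent": 90, "heart_rate_low_bpm": 40}
locally_agreed_minimums = {"spo2_low_percent": 94, "heart_rate_low_bpm": 50}

for alarm, local_limit in locally_agreed_minimums.items():
    default = manufacturer_defaults.get(alarm)
    if default is not None and default < local_limit:
        print(f"{alarm}: default {default} is below the locally agreed {local_limit} - review before clinical use")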

3) Alarm not muted
One consequence of not muting an alarm is that the noise may be "tuned out" and therefore ignored. Another is that some healthcare devices do not change tone as alarms stack up: if a second variable, such as heart rate, triggers another alarm while the first is still sounding, the original alarm masks the new one.

4) Alarm muted inappropriately
This was the case with the White House intruder, although the ushers did not feel the muting was inappropriate at the time. The judgement that an alarm was muted inappropriately is often made in hindsight. The consequences are obvious: an alarm does not sound when it is supposed to. In addition, a false sense of security may develop, especially if not everyone is aware that the alarm is muted. In the White House example, the guards pursuing the intruder across the North Lawn may have believed that he would not gain access to the entrance hall, as the guard inside is meant to lock the door when the alarm sounds.


Solutions

Intruder alarms tend to have high sensitivity and low specificity, which may lead to repeated false activations. In the White House case, with the wonderful retrospectoscope, the muting of the intruder alarm should have triggered an investigation and a search for alternative solutions. Perhaps the alarm could have been visual rather than auditory, or perhaps it could have been relayed to an earpiece worn by the guard.

As always, there is not one "solution" but many. Device users should be trained to know what the alarm settings are, how to alter them and the possible consequences of alarm (mis)use. Organisations should be aware of how their devices are being used, should set standards for critical alarm defaults, and should examine near misses and critical events in which alarms were contributory factors. Device manufacturers should involve end-users from the design stage of the equipment, should test their devices under realistic conditions (e.g. in a simulator) and should act on feedback from end-users to modify their devices.

Wednesday 1 October 2014

Book of the month: The Blame Machine: Why Human Error Causes Accidents by Barry Whittingham

About the author

According to the back cover, R.B. (Barry) Whittingham is "a safety consultant specialising in the human factors aspects of accident causation. He is a member of the Human Factors in Reliability Group, and a Fellow of the Safety and Reliability Society." He is also the author of "Preventing Corporate Accidents: The Ethical Approach".

 

Who should read this book?


Whittingham wrote this book for non-specialists, avoiding discussion of the complex psychological causes of human error and concentrating instead on system faults.

In summary


The book is split into two parts. The first looks at the theory and taxonomy of human error, as well as methods for calculating and displaying the probability of human error. The second is a series of case studies of mishaps and disasters in a variety of industries, organised by error type.
  • Part I: Understanding human error
    • Chapter 1: To err is human
      • Whittingham looks at definitions of human error. He explains that it is impossible to eliminate human error entirely, but that with system improvements errors can be reduced to a minimum acceptable level.
    • Chapter 2: Errors in practice
      • In this chapter, Whittingham details two error classification systems: Rasmussen’s Skill, Rule and Knowledge (SRK) and Reason’s Generic Error Modelling System (GEMS) taxonomies.
    • Chapter 3: Latent errors and violations
      • Whittingham places these two subjects together for convenience rather than because they are related. He explains the preponderance of latent errors in maintenance and management, as well as the difficulty of discovering latent errors, and looks at ways of classifying violations, their causes and their control.
    • Chapter 4: Human reliability analysis
      • Whittingham argues for a user-centred (rather than system-centred) approach to equipment design and, in this chapter, examines methods for determining human error probability (HEP). The two main methods are database methods and expert judgment methods.
    • Chapter 5: Human error modelling
      • In the most mathematics-intensive chapter, Whittingham looks at probability theory including how to combine probabilities and how to create event trees. This chapter also looks at error recovery (how errors are caught and why some are not).
    • Chapter 6: Human error in event sequences
      • Following on from chapter 5, Whittingham provides a detailed example of a human reliability analysis (HRA) event tree: a plant operator who has to switch on a pump to prevent the release of toxic gas in an industrial process.
  • Part II: Accident case studies
    • Chapter 7: Organizational and management errors
      • Flixborough chemical plant disaster, capsize of the Herald of Free Enterprise, privatisation of the railways
    • Chapter 8: Design errors
      • Fire and explosion at BP Grangemouth, sinking of the ferry "Estonia", Abbeystead water pumping station explosion
    • Chapter 9: Maintenance errors
      • Engine failure on the Royal Flight, Hatfield railway accident
    • Chapter 10: Active errors in railway operations
      • Clapham Junction, Purley, Southall, Ladbroke Grove
    • Chapter 11: Active errors in aviation
      • KAL007, Kegworth
    • Chapter 12: Violations
      • Chernobyl, Mulhouse Airbus A320 crash
    • Chapter 13: Incident response errors
      • Fire on Swissair flight SR111, Channel Tunnel fire
  • Conclusions
    • Whittingham concludes by drawing together his thoughts on human error and blame.

I haven't got time to read 265 pages!


This is a very easy-to-read book (a stark contrast with last month's book) and you may be surprised at how quickly you can get through it. However, those who are pressed for time should probably focus on Chapters 1 to 3 and then skip to the accident case studies that interest them most.

What's good about this book?

Whittingham's style is eminently readable and makes this book a real page-turner. He also simplifies concepts such as human reliability analysis. For example, having realised the health benefits of soya milk, one can create a human reliability analysis event tree for the coffee shop barista not using soya milk in your coffee (all errors are the blog author's, not Whittingham's).
 
The error probabilities are the blog author's own (with some reference to the HEART methodology data on p.54) and would suggest that about 1 in 200 coffees will result in the author walking away with dairy milk in his coffee.
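For readers who want to see the arithmetic, here is a minimal sketch of how such an event tree calculation can be set up. The task steps, human error probabilities (HEPs) and recovery probabilities below are this reader's illustrative assumptions, not Whittingham's figures and not the probabilities referred to above; they are simply chosen so that the combined result lands in the region of the 1 in 200 quoted here.

# A minimal sketch of the arithmetic behind a human reliability analysis (HRA)
# event tree for the soya milk example. All step descriptions, HEPs and
# recovery probabilities are illustrative assumptions, chosen only so the
# combined result lands near the "about 1 in 200" mentioned above.
from dataclasses import dataclass

@dataclass
class Step:
    description: str
    p_error: float     # probability the step is performed wrongly (HEP)
    p_recovery: float  # probability an error at this step is caught and corrected

steps = [
    Step("Customer asks for soya milk clearly", p_error=0.001, p_recovery=0.5),
    Step("Barista records the order correctly", p_error=0.002, p_recovery=0.3),
    Step("Barista picks up the soya (not dairy) carton", p_error=0.01, p_recovery=0.7),
]

# An unrecovered error at any step puts dairy milk in the cup.
p_all_steps_ok = 1.0
for step in steps:
    p_unrecovered_error = step.p_error * (1 - step.p_recovery)
    p_all_steps_ok *= 1 - p_unrecovered_error

p_dairy_milk = 1 - p_all_steps_ok
print(f"P(dairy milk in the coffee) = {p_dairy_milk:.4f}, about 1 in {round(1 / p_dairy_milk)}")

Changing any single probability (for example, adding a final step where the customer checks the cup) shows how sensitive the end result is to recovery opportunities, which is the point Whittingham makes about error recovery in Chapter 5.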

Whittingham does not shy away from pointing out the corporate, judicial and political factors which create the environment in which simple errors become disasters. The corporate blame culture that results in the cover-up of near misses, and political short-termism such as that seen in the privatisation of the UK railways, are particular targets of opprobrium.

Whittingham also delivers a fresh look at a number of events which have been covered in detail in other books such as the Chernobyl disaster and the sinking of the Herald of Free Enterprise.

What's bad about this book?


Very little. The mathematics required to calculate error probabilities may be complicated, but this should not prevent an understanding of the concepts. One small gripe is the subtitle of this book (Why Human Error Causes Accidents), which is (perhaps unwittingly) ironic. Whittingham does a fine job of explaining how the term "human error" can be abused to quash an investigation. He also argues that the individual at the sharp end does not "cause" the event, but that the causative factors may lie in the distant past. Lastly, in healthcare at least, we are moving away from the term "accident" (the British Medical Journal banned the word in 2001), as it implies that nothing could have been done to prevent the event. Perhaps the subtitle could be rephrased: "Why 'human error' 'causes' 'accidents'".

Final thoughts


This book deserves a place on the bookshelf of any simulation centre with an interest in human factors and human error. The concepts of human reliability analysis and human error probability would be welcome additions to the healthcare environment.


Further Reading


"It's all human error" Blogpost