The Efficiency Delusion
Hillary Sillitto - 13th January 2022
Why do we keep falling foul of the efficiency fallacy?
Last month Common Weal published a blog by Robin MacAlpine titled 'You are not (even) a number'. In the discussion under the article, Robin paraphrases a common complaint of health professionals he speaks to:
Scotland seems determined to run the NHS at 100 per cent capacity and then wonders why every period of slightly above average demand creates a mini-crisis. As one person said to me, medics on the continent are bemused at how anyone can think a 100 per cent capacity health service is supposed to cope.
This reminded me of a conversation at an engineering management meeting I was at some years ago, which went roughly as follows.
Senior colleague: My engineers are nearly at 100% planned utilisation.
Very senior colleague: Excellent! How is on-time delivery performance?
Senior colleague: Not very good actually, and it seems to be getting worse.
Silence round the table.
Me: That’s not surprising. We know from queuing theory[1] that if you are running at 100% capacity, delays are inevitable. In fact, in a system running at 100% capacity all the time, queue lengths will tend to infinity!
Senior colleague: Queuing theory - but that’s only a theory!
Me (silently, to myself): Give me strength! I don’t think I can be bothered with this any more. I wonder if I can afford to retire yet?
As it turned out, I could, so I did.
Let’s step back. ‘When a journalist [or manager] says theory, they mean a guess. But when a scientist or mathematician says theory, they mean the closest you or I will get to the truth[2].’ Something referred to as ‘X theory’ or ‘the theory of X’ is the closest we can get to an absolute scientific truth: tested, never disproved, and found correct in numerous practical situations.
As the rate people arrive in the queue gets closer and closer to the rate they can be dealt with – and of course they’re arriving at random intervals, and processing times vary randomly as well – the queue length really does head for infinity[3].
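To put rough numbers on this, here is a minimal Python sketch. It assumes the simplest textbook model, a single-server M/M/1 queue with random (Poisson) arrivals and exponentially distributed service times, for which the standard result is that the mean number of customers waiting is r²/(1 − r), where r is the ratio of arrival rate to service rate. The function name and the utilisation values are illustrative choices, not taken from any real organisation.

```python
def mm1_mean_waiting(utilisation: float) -> float:
    """Mean number of customers waiting (not yet being served) in an M/M/1 queue.

    `utilisation` is r = arrival rate / service rate. The formula only
    holds for a stable queue, i.e. 0 <= r < 1.
    """
    if not 0 <= utilisation < 1:
        raise ValueError("utilisation must be in [0, 1) for a stable queue")
    return utilisation ** 2 / (1 - utilisation)


if __name__ == "__main__":
    # Illustrative utilisation levels (assumed, not measured data).
    for r in (0.5, 0.8, 0.9, 0.95, 0.99):
        waiting = mm1_mean_waiting(r)
        print(f"{r:.0%} utilisation: about {waiting:.1f} waiting on average")
```

Under these assumptions the average queue is about half a customer at 50% utilisation, about 8 at 90%, and about 98 at 99%. Real services are messier than this model, but the shape of the curve is the point: the last few percentage points of 'efficiency' are paid for in waiting time.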
So what does this mean in practice? I guess they don’t teach queuing theory in British management schools. Or maybe they do, but managers forget it when they find themselves stuck in the middle of a management system that’s obsessed with efficiency above all else, and whose leaders are blissfully unaware of queuing theory or don’t care. The system reinforces this view with both carrot and stick. Bonuses reward ‘efficiency’. Spare capacity is punished. To protect your team in an organisation that is constantly on the lookout for ‘efficiency savings,’ you keep everyone in your team busy. This means running a backlog, which may be good for the organisation (or your bit of it), but maybe not for its customers. If you do the right thing by your customers, and keep your backlog small, the efficiency Nazis think you’ve got spare capacity – and in many organisations, they’ll weed it out in the name of efficiency.
‘But that’s just management, the way of the world,’ you say, ‘what’s the problem?’ Well, the problem is that an efficient organisation – efficient by its own internal metrics – is not necessarily an effective one as experienced by its customers. As Robin’s quote in the first paragraph demonstrates, this appears to be a particularly British disease. I’d argue that its root causes are a) a focus on internal costs rather than external value, and b) micro-management that slices and dices budgets and responsibilities so much that no-one inside the system can see, never mind improve, the whole picture.
A lot of enterprises tackle this by creating posts called ‘directors of user experience’ or similar. (User Experience is often abbreviated to ‘UX’.) Their role is to look at the whole business as experienced by a variety of typical (and, equally importantly, atypical) customers, and use that perspective to make sure that the variety of services the organisation can deliver matches the variety of customer needs they have to deal with. The tool often used for this is to examine the end-to-end customer journey, looking at the whole system from various customers’ points of view, and seeing how to make the customers’ experience as seamless as possible. Counterintuitively, there is abundant evidence that doing this not only improves the customer’s experience, but also reduces the organisation’s costs – through a reduction in what’s called ‘failure demand’, which is first doing the wrong thing, and then having to do more work to put it right[4].
An example of failure demand would be failing to detect a cancer early, at a point when it could be treated easily, and having to deal with it later at much more expense with a much worse outcome for the patient. Or not quickly fixing an oil leak in a car engine, so the engine gradually becomes damaged beyond economic repair. Or building Nightingale Hospitals when there are no nurses or doctors to work in them.
So how to do better?
First, add spare capacity where it’s cheap.
Second, minimise the number of process steps each customer experience involves, and keep unavoidable queues short.
Third, look at your output stage and see where the customer goes next. Bed blocking in hospitals occurs because social care packages aren’t ready for patients when they are ready to be discharged. Social care resources cost less than acute hospital beds, but are paid for with a different budget, so we are back to the slicing and dicing problem.
Fourth, put people on the right ‘customer journey’ from the start. A&E departments shouldn’t have to deal with cases that can be dealt with by GPs or minor injury clinics. (Maybe putting GP surgeries on a 9-5, 5-day-week basis with the new GP contracts fifteen years ago wasn’t a great idea!) Ambulances should not be queuing outside A&E departments. The NHS often seems to be set up on the assumption that consultants are the scarcest resource. How would you organise it if you decided paramedics and triage nurses were the most important resource?
And fifth, help your customers to help themselves. Humans can see a queue and make choices to avoid it. Give them the right information and incentives so they can make smart choices that help both them and you.
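The first point above, adding spare capacity where it is cheap, can be illustrated with a short sketch. It assumes the classic multi-server M/M/c model (the Erlang C formula), which is a simplification and not something the sources quoted here prescribe. The function name and the example numbers (9.5 arrivals per hour, each server handling one customer per hour) are hypothetical.

```python
from math import factorial


def erlang_c_mean_waiting(arrival_rate: float, service_rate: float, servers: int) -> float:
    """Mean number of customers waiting in an M/M/c queue (Erlang C model).

    arrival_rate: average arrivals per unit time
    service_rate: average customers one server can handle per unit time
    servers:      number of identical servers working in parallel
    Returns infinity when demand meets or exceeds total capacity.
    """
    offered_load = arrival_rate / service_rate   # work arriving, in "servers' worth"
    utilisation = offered_load / servers         # fraction of capacity in use
    if utilisation >= 1:
        return float("inf")
    # Erlang C: probability that an arriving customer has to wait at all.
    tail = offered_load ** servers / factorial(servers) / (1 - utilisation)
    head = sum(offered_load ** k / factorial(k) for k in range(servers))
    p_wait = tail / (head + tail)
    return p_wait * utilisation / (1 - utilisation)


if __name__ == "__main__":
    # Hypothetical service: 9.5 arrivals per hour, 1 customer per server per hour.
    for c in (10, 11, 12):
        waiting = erlang_c_mean_waiting(9.5, 1.0, c)
        print(f"{c} servers ({9.5 / c:.0%} utilised): about {waiting:.1f} waiting on average")
```

With a single server the function reduces to the single-queue formula in footnote [3]. Running it with one or two extra servers shows how a small amount of 'spare' capacity shortens the average queue in this model, which is the trade-off that efficiency metrics tend to hide.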
Lots of organisations are doing this really well.
Lots more are doing what they can to improve things within their control, but don’t have a wide enough span of control to do as much as they’d like.
We need to find good examples and explain how they discovered what works for them, so that others can embark on such discovery in their own context. And we need to educate managers to understand the whole system view, that less is sometimes more, that over-controlling increases costs and reduces efficiency, that inappropriate metrics are actively counterproductive, and that creating artificial internal markets in big organisations can create transaction costs that far outweigh the notional savings.
[1] Queuing theory is well-established maths, which has been the basis of capacity planning for over a century, ever since a gentleman called Erlang was exercised by the problem of how many operators were needed in the Copenhagen telephone exchange. See https://en.wikipedia.org/wiki/Queueing_theory
[2] Thanks to an anonymous account on Twitter for this explanation.
[3] https://packetpushers.net/average-network-delay/ Just after (equation 8): average number of customers waiting for service in an M/M/1 system = r²/(1 − r), where r is the ratio of arrival rate to service rate. Blows up as r tends to 1.
[4] https://locality.org.uk/wp-content/uploads/2018/03/Locality-Report-Diseconomies-updated-single-pages-Jan-2017.pdf