Imagine it’s April 2021 and you’re considering your lunch options: should you go to your favourite sandwich shop, or try somewhere new? You decide to play it safe and head to the usual spot. Upon arrival, a similar dilemma awaits: order the ham sandwich as always, or venture into uncharted menu territory?
When you think about it, we are constantly making similar decisions between the known old and the unknown new. Who should I socialise with this weekend: the usual suspects or a new colleague? Where should I go on holiday next year: can’t go wrong with Cornwall again, but Reykjavík might be fun?
Computer scientists call this the ‘explore-exploit dilemma’. In short, exploring means testing untried options and increasing your knowledge about them. You might estimate that the tuna mayo will score somewhere between 5/10 and 9/10 for taste, but you won’t know until you test it. Exploiting, on the other hand, means using your existing knowledge (about ham sandwiches) to achieve a good outcome (an 8/10 lunch).
Few of us would equate making such a decision with running an algorithm. But behind the scenes that’s exactly what we’re doing. For an algorithm is, in its simplest form, a series of rules: if A happens, do B; if I enjoyed the ham yesterday and it’s available again today, order it. The reason we don’t tend to think about our decision-making in these algorithmic terms is that we perform these underlying mental calculations so rapidly that they take place beneath our conscious awareness.
But once we recognise that we are unconsciously running algorithms every day, we can consciously choose which algorithms to run. In other words, we can create explicit rules for ourselves to follow in certain decision situations that are likely to enhance our wellbeing in the long run.
When faced with explore-exploit dilemmas, people exhibit a strong tendency to avoid the uncertainty associated with exploring. They seem to unconsciously follow an algorithm known as ‘Win-Stay, Lose-Shift’. This means staying with your previous choice if it worked for you, but shifting to a new option if it didn’t. It explains why, when Deliveroo recently polled 1,500 UK office workers, it found that a third of them ate exactly the same thing for lunch every day – and topping the sandwich pops was the humble ham sandwich.
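For readers who like to see a rule written down, Win-Stay, Lose-Shift can be sketched in a few lines of Python. This is only an illustration: the function name, the menu and the idea of shifting to a random alternative are assumptions made for the example, not part of any formal definition.

```python
import random

def win_stay_lose_shift(previous_choice, was_satisfying, options, rng=random):
    """Pick the next lunch under Win-Stay, Lose-Shift (illustrative sketch)."""
    # Win: the last choice worked, so stay with it.
    if previous_choice is not None and was_satisfying:
        return previous_choice
    # Lose (or no history yet): shift to something else.
    # Assumption: the replacement is drawn at random from the alternatives.
    alternatives = [o for o in options if o != previous_choice]
    if not alternatives:
        return previous_choice  # nothing else on the menu
    return rng.choice(alternatives)

menu = ["ham", "tuna mayo", "egg and cress"]  # hypothetical menu
print(win_stay_lose_shift("ham", True, menu))   # stays with ham
print(win_stay_lose_shift("ham", False, menu))  # shifts to another sandwich
```

Notice that as long as the ham keeps delivering, this rule never gives the tuna mayo a chance.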
Win-Stay, Lose-Shift minimises the risk of disappointment and makes sense in one-off decisions. But the limitations of following this rule become immediately clear when we face a longer series of repeated decisions. If asked to order your lunches for the next three years in advance, for example, it’s unlikely that you would immediately request 1,095 ham sandwiches. Or passionately argue that this would be an inspiring life choice, for that matter. And yet many of us fall prey to this ham sandwich fallacy in all areas of life.
The good news is that we can avoid this predictable unconscious pitfall with conscious planning. One method is to create our own simple algorithm to follow: if I’ve been to the same restaurant or holiday destination twice in a row, go somewhere different next time. And we might introduce additional variety by letting a throw of the dice decide our culinary or voyaging destiny from time to time.
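A home-made rule of this kind might look something like the sketch below. The details are assumptions chosen for illustration: the function name is invented, a roll of six is taken to mean "let chance decide", and otherwise we simply return to our most recent choice.

```python
import random

def choose_next(history, options, rng=random):
    """Pick the next destination using a 'twice in a row, then switch' rule."""
    last_two = history[-2:]
    repeated = len(last_two) == 2 and last_two[0] == last_two[1]
    if repeated:
        # Same place twice running: go somewhere different next time.
        others = [o for o in options if o != last_two[0]]
        if others:
            return rng.choice(others)
    # Assumption: roll one die, and on a six let chance pick the destination.
    if rng.randint(1, 6) == 6:
        return rng.choice(options)
    # Otherwise return to the most recent choice (or pick anything if new).
    return history[-1] if history else rng.choice(options)

places = ["Cornwall", "Reykjavík", "Lisbon"]  # hypothetical shortlist
print(choose_next(["Cornwall", "Cornwall"], places))  # never Cornwall
```

The threshold of two visits and the one-in-six chance are arbitrary; the point is that the rule is explicit, so it can be tuned to taste.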
Alternatively, we might apply an algorithm called ‘Upper Confidence Bound’. Also known as ‘optimism in the face of uncertainty’, this means selecting an option based on how good it might potentially be. In our lunch scenario, this would mean choosing the tuna mayo because it might be a 9/10, whereas the ham is an 8/10 at best. This approach similarly emphasises exploration, encouraging us to try new things and meet new people, which will likely prove more fulfilling in the long run.
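To make the idea concrete, here is a minimal sketch of one standard version of this approach, usually called UCB1. It is not the only variant, and the lunch ratings are invented for the example. Each option is scored as its average rating so far plus an "optimism bonus" that shrinks the more often we have tried it; scores are assumed to be scaled between 0 and 1, so an 8/10 sandwich becomes 0.8.

```python
import math

def ucb1_choice(ratings):
    """Choose an option by its optimistic score (UCB1 sketch).

    ratings maps each option to the list of past scores, scaled to 0..1.
    """
    # Anything never tried is maximally uncertain, so try it first.
    for option, scores in ratings.items():
        if not scores:
            return option

    total_tries = sum(len(scores) for scores in ratings.values())

    def upper_bound(scores):
        mean = sum(scores) / len(scores)
        # The bonus is large for rarely tried options, small for familiar ones.
        bonus = math.sqrt(2 * math.log(total_tries) / len(scores))
        return mean + bonus

    return max(ratings, key=lambda option: upper_bound(ratings[option]))

# Hypothetical history: twenty ham lunches, one tuna mayo.
lunches = {"ham": [0.8] * 20, "tuna mayo": [0.7]}
print(ucb1_choice(lunches))
```

With this history the sketch picks the tuna mayo: its single rating is lower than the ham's average, but so little is known about it that its optimistic score is higher. Once both options have been sampled equally often, the bonus cancels out and the better average wins.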
There is no perfect strategy for all scenarios. And, to a certain extent, the time horizon will always determine the value of exploring versus exploiting. If it’s our last day on Earth, we should stick with what we know. But if we expect to face the same decision repeatedly over several years, we would do well to anticipate our unconscious propensity to under-explore.
In these circumstances, we might benefit from consciously taking the time to build exploration into our routines – and in the process avoid becoming a victim of the ham sandwich fallacy.