
What can we learn from orcas?

A pod of orcas on the move (image credit: Andreina Schoeberlein, CC BY-ND 2.0)

Orcas? For those of you who haven't kept up with marine wildlife news, 2020-2023 saw a big uptick in the number of orca attacks on human vessels around the Iberian peninsula. Is this their attempt to even the odds? Are we heading towards full-on conflict? Is planet of the killer whales upon us?

I can hear you smirk. It's not so funny once you realise orcas have more brain surface relative to their weight than any species on record. If that wasn't bad enough, they are also one of only five known species (us included) whose females go through menopause (the other three being narwhals, pilot whales and the admittedly less formidable beluga whales). And given that they can live up to one hundred years, they can hold a pretty long grudge. Maybe these open acts of war off the Iberian coast are down to a rude stare by some sailor from Singapore in 1924?

All jokes aside, the whole thing got me interested in social learning. What started as behaviour in one pod in 2020 ended up getting adopted by most of the pods living off the Iberian coast. More than 500 incidents had been reported by mid-2023.

The funniest explanation I found is that it's a fashion thing–like the summer of 1987, when Puget Sound pods adopted the habit of wearing dead salmon on their heads, only to completely abandon the behaviour again in 1988.

Anyway, what does all of this have to do with us? Well, for one it shows us one way we humans react to the rapid build-up of AI applications in our environment. We disable their rudders and try to knock holes in them. To be fair to orcas, in my experience as an ML engineer this is not an uncommon response. There is even some wisdom to it. The sheer number of "magical" AI tools and the black-box nature of their decision-making processes would also make me wary.

From the "law" of averages to mean reversion

Then again, we humans are not that transparent in our decision-making either. And even when our decisions are transparent–as in explainable–they aren't necessarily better. Take for example the well-known case of 5-year-olds beating MBA students in a spaghetti-tower building exercise. So much for the value of textbook knowledge.

Whatever the reason for these boat attacks, I'd say our own track record in collective decision-making is also pretty poor. I used to buy into the whole "wisdom of the crowd" premise as a statistics student. I'm not so convinced anymore. Recent elections and pretty much all of known history have shown otherwise.

There is another argument to be made against the "wisdom of the crowd". Averages–i.e. aggregation functions–are what most machine learning systems eventually bank on to make their predictions. In essence, that means counting the number of times certain combinations of events or properties occur, and then using those tallies and their statistics to pull a number out of a hat.
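To make that concrete, here's a deliberately naive sketch (all names and data made up, no particular library in mind) of a "model" that is nothing more than a table of tallies:

```python
from collections import defaultdict

def fit(rows):
    # the entire "model": counts of how often each outcome co-occurred with a feature value
    counts = defaultdict(lambda: defaultdict(int))
    for feature_value, outcome in rows:
        counts[feature_value][outcome] += 1
    return counts

def predict(counts, feature_value):
    tallies = counts.get(feature_value)
    if not tallies:
        # unseen feature value: fall back to the overall tallies
        tallies = defaultdict(int)
        for outcome_counts in counts.values():
            for outcome, n in outcome_counts.items():
                tallies[outcome] += n
    # "pull a number out of a hat": return the most frequently seen outcome
    return max(tallies, key=tallies.get)

observations = [("calm_sea", "no_attack"), ("calm_sea", "attack"),
                ("rough_sea", "no_attack"), ("calm_sea", "attack")]
model = fit(observations)
print(predict(model, "calm_sea"))   # -> "attack", purely because of the tallies
```

Real systems dress this up with smoothing, weighting and gradient descent, but the underlying move is the same: aggregate what you've seen, then read the prediction off the aggregate.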

Counting like this is very useful if the data you have correctly represent the most important aspects of your environment. Not so much if they don't. Which in practice is almost always the case, at least to some degree. One of the most important tasks in building an ML system therefore lies in determining to what extent the numbers at your disposal reflect the environment in which your system will operate.

To use the increase in orca boat attacks as an example, we could extrapolate and say that this number indicates a ramp-up to full-scale interspecies war. We could also apply the fad theory, and say they've spiced up their toy selection from salmon to sailing vessels. Or perhaps this is tied to the increase in orchestrated attacks on blue whales by orca pods–is this whale-killing school? Who knows 🤷‍♂️ The same goes for most machine learning model predictions. Often, multiple human interpretations will fit the same data and patterns.

This annoying little fact doesn't apply just to science, it also applies to real-life decision making. We often come up with explanations after the fact, and use statistical averages to justify them. And that might not actually be the best way to go about things in business, economics or even as a society, as I'll try to explain below.

A small digression before we continue–perhaps one exception is the mean reversion rule. In financial markets, rational expectations and the way they reflect the aggregated "wisdom of the crowd" can be a useful basis for contrarian approaches. Then again, I'm not a professional investor, so you probably shouldn't listen to me here...
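For the curious, the idea fits in a few lines. This is a toy sketch only (made-up parameter names, no trading library, and certainly not investment advice): lean against the price when it strays far from its own recent average.

```python
import numpy as np

def reversion_signal(prices, window=20, threshold=2.0):
    """Contrarian signal: -1 when price looks stretched above its rolling mean,
    +1 when stretched below, 0 otherwise."""
    prices = np.asarray(prices, dtype=float)
    signals = np.zeros(len(prices))
    for t in range(window, len(prices)):
        recent = prices[t - window:t]
        mean, std = recent.mean(), recent.std()
        if std == 0:
            continue
        z = (prices[t] - mean) / std   # how unusual is today's price?
        if z > threshold:
            signals[t] = -1            # far above the mean: lean short
        elif z < -threshold:
            signals[t] = 1             # far below the mean: lean long
    return signals
```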

The case for aquatic intelligence

As most of you will realise, none of us operate in a vacuum. Whether we are part of a commercial organisation, a research lab, a family or a soccer team, our decisions are almost always embedded in the contexts of one or multiple social groups.

Even the most (metaphorical) lone wolf out there has some basis for decision making rooted in language, family, tradition or formal education. We love our agency and how it makes us feel–I am no exception here–but in reality we are always entangled in webs of social, emotional, economic and intellectual connections.

These webs don't just consist of human agents. We've always had technologies visibly and invisibly shape our environments–from agriculture and automobiles determining how we organise our living spaces to language, writing and computing shaping how we transmit information across these webs.

Technology might have been a decisive forcing function throughout history, but its role was that of an intermediary. It needed human actors. Recent improvements in LLM capabilities have opened the door to a new role for technology in our human world. The rise in popularity of AI companion apps is but one of the instances where human-to-human relationships are being replaced with human-to-technology relationships.

For our human-to-human relations there is a clear organising principle. They are mostly shaped by our need to amass things–security, fortunes, fame, reputation, knowledge, food, manpower, blessings etc. This aggregational need has been a major driving force throughout history, and continues to play a pivotal role in how economies and societies are shaped today.

It is unclear what forces are driving these new human-to-technology relationships. The effects of, for example, smartphones on us are mixed at best. If the behaviour of orcas around human vessels is anything to go by, we will have the same mixed relationship with AI. Some orca pods have learnt to feed off fishing lines, while others attack the boats in their territory.

Anyhow, back to us humans. As of this writing we haven't found a carrot that works better at balancing individual motivations with common interests than economic growth–the promise of riches for all, of a share of the spoils. The success market mechanisms have had since at least ancient Sumer, however, doesn't–as some liberal thinkers would have us believe–make us rational actors.

So what then are we, if not rational or aggregational actors? I'd like to suggest that maybe we're not so different from orcas–we exist within a pod (our circles, peers, friends, family, colleagues, etc.) and take our cues from them as much as from our internal compass to determine our bearing and station in life.

If you accept this as a premise, it would mean a lot of our behaviour is emergent. It'd be determined by our environments and our pods rather than by individual deliberations or the constant need for aggregation. It would definitely explain why our predominantly aggregative mental, social and statistical models have done such a poor job explaining and predicting our collective behaviour.

Decades of behavioural science research and failed liberal policies show us the "homo economicus" construct should never have been allowed to walk out of the economics classroom.

We can take useful measurements using statistics, no doubt there, but–as we saw in the case of orca boat attacks–they don't really provide us with an explanation. A lot of our current ML applications are nothing but trend detection systems. Unless you reach a scale at which you can start to shape behaviour at the aggregate level, individual users will be incorporated as either another datapoint to be aggregated, or as an outlier to be detected.
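That "aggregate or flag" pattern is easy to caricature in a few lines. A toy sketch (hypothetical numbers and names, standing in for no particular product): every new observation either nudges the running average or gets marked as an outlier–nothing in between, and no explanation either way.

```python
import numpy as np

def ingest(history, new_value, z_cutoff=3.0):
    history = np.asarray(history, dtype=float)
    mean, std = history.mean(), history.std()
    if std > 0 and abs(new_value - mean) / std > z_cutoff:
        return history, "outlier"            # detected, then typically ignored
    return np.append(history, new_value), "aggregated"   # just another datapoint

daily_incident_counts = [0, 1, 0, 2, 1, 1, 0, 2]
print(ingest(daily_incident_counts, 1))      # absorbed into the average
print(ingest(daily_incident_counts, 25))     # flagged, but never explained
```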

There are two active research areas in ML that I know of that try to deal with the limitations in our current approaches. The first is the work on causality and causal inference pioneered by Judea Pearl. The second is research applying findings from swarm intelligence to human settings, as in the example below.

Particle swarm optimisation "convergence" (image source: the pyswarm package).
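For readers who haven't met particle swarm optimisation before, here is a minimal sketch in plain NumPy (no library API assumed, constants picked purely for illustration). Each particle is pulled towards its own best position so far and towards the best position found by the swarm–cues from the self and cues from the pod:

```python
import numpy as np

def pso(objective, dim=2, n_particles=30, iters=100, w=0.7, c1=1.5, c2=1.5):
    rng = np.random.default_rng(0)
    pos = rng.uniform(-5, 5, (n_particles, dim))
    vel = np.zeros_like(pos)
    personal_best = pos.copy()
    personal_best_val = np.array([objective(p) for p in pos])
    global_best = personal_best[personal_best_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        vel = (w * vel
               + c1 * r1 * (personal_best - pos)    # pull towards own best
               + c2 * r2 * (global_best - pos))     # pull towards swarm best
        pos = pos + vel
        vals = np.array([objective(p) for p in pos])
        improved = vals < personal_best_val
        personal_best[improved] = pos[improved]
        personal_best_val[improved] = vals[improved]
        global_best = personal_best[personal_best_val.argmin()].copy()
    return global_best

print(pso(lambda x: np.sum(x ** 2)))   # converges towards the origin
```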

Another example is a conversational swarm intelligence system developed by Unanimous AI. Its goal is to let groups of humans make better predictions using deliberative processes running in real-time closed-loop systems. The stated goal of their founder is to give groups better tools for collective decision-making, and improve on aggregative techniques such as polls, surveys and elections. This video by their founder explains their latest system in more detail.

Which leads me to the final point I want to make before we move to the future. I think we've been going about human ML interactions wrong–at least from a human-in-the-loop perspective. To answer the question in the title, the main thing we can learn from orcas is that we need to improve our aquatic intelligence.

In practical terms, that means we should look at our own "pod" behaviour and sync or decouple accordingly–both are valid as deliberate choices. There is a good reason why some of the most highly rated soccer players in the world spend more time scanning than their peers. Or why picking up on implicit signals is a must-have for successful executives.

It is also the possibility of decoupling that distinguishes us from insects, and why I believe swarm intelligence algorithms won't be all that effective at explaining or modelling human behaviour. We humans–along with our nautical brothers and sisters–can choose to change pods if we want to. It will sometimes make more evolutionary sense for us to do so than to remain part of the same pod.

What does this mean for the future of AI?

Looking at the next decade, I don't think the AI frontier lies in better language models. LLMs are useful for building reasoning agents*, but I expect that the true innovations in AI will come from three adjacent fields: multi-agent systems, aquatic (okay, swarm) intelligence, and human-AI collaboration.

The developments in these three fields will need to be underpinned by something else: advancements in continual learning, goal evolution and goal progression algorithms. Multi-agent systems cannot be trained entirely offline, and standard supervised evaluation methods are less than ideal in multi-agent settings.

This is because I believe building models of the environments in which these systems are deployed will often prove to be a self-defeating effort. Either we can successfully model the environment to such a degree of realism we no longer need the multi-agent system, or the model of the environment is so poor there is no way of telling how the system will perform in the real world.**

To make sure these systems are effective, incorporating feedback on a continuous basis won't be enough (continual & online learning). They will also need to be able to track how far away they are from their goal (goal progression), and whether or not their goal(s) still satisfies their purpose (goal evolution). These are all areas of active research, and I don't expect major progress in 2024. Maybe we'll have working algorithms by 2026, maybe later.
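To give a feel for how those three ingredients might fit together, here is a deliberately tiny, hedged sketch of one agent loop. Everything in it–the goal being a single target number, the halving rule, the names–is made up for illustration, not a description of any existing algorithm:

```python
def run_agent(environment_feedback, goal=25.0, purpose=lambda g: g <= 20.0):
    estimate = 0.0
    for observation in environment_feedback:
        # continual / online learning: fold each new observation into the estimate
        estimate = 0.9 * estimate + 0.1 * observation
        # goal progression: how far are we from the current goal?
        gap = abs(goal - estimate)
        # goal evolution: if the goal no longer satisfies its purpose, revise it
        if not purpose(goal):
            goal = goal / 2
        print(f"estimate={estimate:.2f} goal={goal} gap={gap:.2f}")
    return estimate, goal

run_agent([12, 14, 9, 30, 11])
```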

The upside of having conversational agents is that human-AI collaborations will become a lot easier to facilitate. That means our aquatic intelligence will need to be primed to work not just with human pod-mates, but with hybrid human-AI pods.

Because of this, successful organisations will be those that leverage the aquatic intelligence of hybrid pods and maximise both their pod goal and their goal-setting behaviours. Since we're talking biology, one field that AI researchers could draw inspiration from is the field of epigenetics.

In all this, I think Galton's contributions to statistics will play a much reduced role. I expect that causality and emergence will be the main actors in the AI systems of the next decade, and I'm curious to see what kind of insights they will bring to our understanding of our own species.

Rethinking collective decision making

It does raise the question of whether the systems discussed above will also work at billion-people scale. Some ant colonies consist of up to 300 million ants, and they've somehow managed to make it work using pheromones and swarm intelligence.

On the algorithms side the short answer is no, not right now. As far as I know, both approaches struggle to scale up. The current generation of causal inference methods becomes intractable beyond a certain number of features, and the conversational swarm intelligence approach has been tested with at most 48 people divided into groups of 5.

Aggregative statistics, by contrast, will have no issues at all with a billion observations. So in the short run they will probably remain the default option in our machine learning toolkits and AI applications.

Which might make you wonder, why bother thinking about this at all? The main issue I have with aggregative methods is the information loss that happens when you throw all your observations onto one big heap. Our statistical methods are designed to mitigate this, but they will only take us so far.
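A toy example of that information loss (invented numbers, purely for illustration): two "pods" with opposite reactions look perfectly neutral once pooled into one average.

```python
import numpy as np

pod_a = np.array([+2, +3, +2, +3])   # strongly in favour
pod_b = np.array([-2, -3, -2, -3])   # strongly against
pooled = np.concatenate([pod_a, pod_b])

print(pod_a.mean(), pod_b.mean())    # 2.5 and -2.5: two clear, opposite signals
print(pooled.mean())                 # 0.0: the big heap says "no opinion"
```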

And especially when it comes to high-stakes political decisions such as national elections or laws passed through representative or popular vote, it is probably better to get all the nuance and insights you can from the population–without getting bogged down in details or endless discussions.

The same applies to the kind of systematic strategic decision making processes that happen at Bridgewater. I'm not sure corporate strategy will ever be purely a numbers game. However, collecting the "wisdom of the crowd" through surveys and polls is not going to bring out the best possible insights. It is just scratching the surface of our collective pool of knowledge and decision-making skills.


*) As noted earlier, rational man hasn't proven to be a great model for explaining or predicting either individual or collective behaviour. Language has its limits.

**) This line of reasoning has also, to some extent, played a part in OpenAI's decision to make the GPT series of models available to the general public. They had no idea how the general public would respond, and the only way to learn was by throwing these LLMs out there.