It was the French philosopher Voltaire who famously said that you should judge a person by the questions they ask rather than the answers they give. These may be very wise words when applied to humans, but for artificial intelligence the situation is even simpler: the machine doesn’t need to know what the question is in the first place in order to give a respectable answer.
Much of AI relies on the concept of clustering: present a large volume of data to a suitable algorithm and it will find clusters of similar data points. These clusters may depend on many different features: not just a couple of things like salary and propensity to buy a certain product but, in some cases, many hundreds of different features. The AI provides the mathematical muscle, beyond the capability of a human brain, to find these clusters.
But these clusters are not (or, more accurately, don’t have to be) based on any predetermined ideas or questions – this is usually called ‘unsupervised learning’ in the AI world. The algorithm will just treat the information as lots of numbers to crunch, without a care whether it represents data about cars, houses, animals or people. But, whilst this naivety about the data is one of AI’s strengths, it can also be considered a flaw.
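To make this concrete, here is a minimal sketch of one common clustering algorithm (k-means, implemented from scratch with NumPy). The data and feature names are invented for illustration; the point is that the algorithm groups rows purely by numeric distance, with no idea what the columns represent.

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Naive k-means: group the rows of `points` into k clusters purely by
    numeric distance. The algorithm has no idea what the columns mean."""
    rng = np.random.default_rng(seed)
    # Start with k randomly chosen points as the initial cluster centres.
    centres = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centre (Euclidean distance).
        dists = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centre to the mean of the points assigned to it.
        for j in range(k):
            if np.any(labels == j):
                centres[j] = points[labels == j].mean(axis=0)
    return labels, centres

# Two obvious blobs in a two-feature space. The features could be salary
# and spend -- or anything else: the maths is indifferent to the meaning.
data = np.vstack([
    np.random.default_rng(1).normal(0, 0.5, (50, 2)),
    np.random.default_rng(2).normal(5, 0.5, (50, 2)),
])
labels, centres = kmeans(data, k=2)
```

Real systems would use a library implementation (and often many more than two features), but the indifference to meaning is the same: swap in columns about cars, houses or people and the code runs unchanged.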
For big data clustering solutions, the algorithm may find patterns in data that correlate but are not causal. Take a rather whimsical example: if an AI system found a correlation between eye colour and propensity to buy yoghurt, it would take a human to work out that this is very unlikely to be a meaningful relationship; the machine is naive to that level of insight.
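The mechanics of such a spurious correlation are easy to reproduce. In this entirely made-up sketch, a hidden third factor drives both variables, so the numbers correlate strongly even though neither causes the other:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: a hidden confounder (say, which town a shopper lives
# in) happens to drive both variables at once.
town = rng.integers(0, 2, size=500)            # 0 or 1: the hidden factor
eye_colour = town + rng.normal(0, 0.1, 500)    # tracks the town closely
yoghurt = 3 * town + rng.normal(0, 0.3, 500)   # so does yoghurt-buying

# The algorithm dutifully reports a strong correlation; only a human can
# ask whether eye colour plausibly *causes* yoghurt purchases.
r = np.corrcoef(eye_colour, yoghurt)[0, 1]
print(f"correlation: {r:.2f}")
```

The correlation coefficient comes out close to 1, yet the causal story lives entirely in the confounder, which the algorithm never sees.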
The AI may also find patterns that do not align with social norms or expectations – these usually centre on issues such as race and gender. There is plenty written already on the challenges of unintended bias (including in our own blogs), but in this case an awkward correlation of purely factual data may naively be exposed by the algorithm. The challenge for those responsible for that algorithm is whether this is a coincidence or whether there is a genuine causal link that has to be faced up to. How that is handled will have to be judged on a case-by-case basis, and with plenty of sensitivity.
There is also the infamous example from a few years ago of the Microsoft tweetbot (automated Twitter account) that turned into a pornography-loving racist. It was originally intended that Tay, as they called the bot, would behave as a ‘carefree teenager’ learning how to act through interactions with other Twitter users. But it quickly turned nasty as the human users fed it racist and pornographic lines, which it then learned from and duly repeated back to other users. Tay, as a naive AI, simply assumed that this was ‘normal’ behaviour. It only took a few hours of interaction before Microsoft were forced to take the embarrassing tweetbot offline.
One useful way of thinking about the naivety of AI is to consider how dogs learn. All dogs love going for a walk, and their owners generally know this because the dog gets excited at the first signs that a walk might be imminent. These include things like locking the back door and putting walking shoes on. Now, the dog has no idea what the concepts of ‘locking the back door’ or ‘putting walking shoes on’ are, but it does know that when these two events happen in close succession there is a high probability of being taken for a walk. In other words, the dog is completely naive to what the preceding events mean – they are just data points to it – but they can be correlated into a probable outcome.
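The dog’s reasoning is essentially a conditional probability estimated from counts of past observations. A minimal sketch, with an invented observation log, might look like this:

```python
from collections import Counter

# Hypothetical log of what the dog has observed: each entry records
# (door_locked, shoes_on, walk_happened). The dog doesn't know what the
# events *mean* -- they are just data points that co-occur.
observations = [
    (True,  True,  True),
    (True,  True,  True),
    (True,  True,  False),
    (True,  False, False),
    (False, True,  False),
    (False, False, False),
    (True,  True,  True),
]

# The dog's "model": count how often a walk follows both cues together.
cue = Counter()
for door, shoes, walk in observations:
    if door and shoes:
        cue["seen"] += 1
        if walk:
            cue["walk"] += 1

# P(walk | door locked AND shoes on): 3 walks out of 4 sightings of both cues.
p_walk = cue["walk"] / cue["seen"]
print(f"P(walk | both cues) = {p_walk:.2f}")  # 0.75
```

A high estimate is all the dog needs to start getting excited; it never needs, or forms, any concept of doors or shoes.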
This dog/AI analogy is quite useful and can be extended further: some dogs can be quite lazy, so if they see the owner lock the back door but then put running shoes on, they might go and hide to make sure they don’t have to go for a tiring run. In this scenario, the dog is using increased granularity to calculate the outcome this time – it’s not just ‘shoes’ but ‘type of shoes’. Of course, the dog doesn’t know that running shoes are specially designed for running, just that they are different enough from walking shoes. It may be a different colour or shade, a different smell, the different place where they are kept, and so on. This demonstrates the opaqueness issue of AI: no-one would have a real idea (unless they do some pretty thorough controlled testing) what aspect of the shoes switches the outcome from ‘Excellent, I’m going for a walk’ to ‘Hide, he’s going for a run’, but it clearly does have a binary impact.
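The effect of that extra granularity can be sketched in a few lines. In this toy example (all names invented), refining the ‘shoes’ feature into ‘type of shoes’ is what flips the prediction, and any combination the dog has never seen yields no prediction at all:

```python
# Hypothetical learned associations: the same door cue, but "shoes" now
# records *which* shoes. The dog never knows why running shoes differ,
# only that the finer-grained feature flips the outcome.
history = {
    ("locked", "walking_shoes"): "walk",
    ("locked", "running_shoes"): "run",
}

def dog_prediction(door, shoes):
    # Unseen combinations give no prediction -- the dog, like a narrow AI,
    # can only replay patterns it has been exposed to.
    return history.get((door, shoes), "unknown")

print(dog_prediction("locked", "walking_shoes"))  # walk -> get excited
print(dog_prediction("locked", "running_shoes"))  # run  -> go and hide
```

And just as with the dog, an outside observer sees only the flipped outcome, not which attribute of the shoes (colour, smell, location) actually drives it.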
The dog/AI analogy does have its limitations, though: dogs have lots of other basic cognitive skills, such as knowing when it is time for their dinner without being able to tell the time, but because AIs are currently very specialised in their capabilities, an AI that predicted walks would not be able to predict dinner time (this is the ‘narrow AI’ versus ‘general AI’ debate).
So, the naivety of AI systems can be a real headache for their users. Suffice it to say that the outcomes from clustering must be used carefully and wisely if they are to yield their full value. Data scientists and AI developers must be aware of the consequences of their creations and must apply plenty of common sense to the outputs to make sure that they make sense in the context for which they were intended.