Policy makers across the world are, to varying degrees, contemplating the regulation of AI. Whilst acknowledging the productive potential of AI, they perceive several risks in its deployment, and some call for regulatory intervention. This article aims to present some essentials of AI to contribute to a rational discussion about the need for regulation and its potential scope, which is developed further in a second article.
AI comprises clever algorithms, or methods, for exploiting the massive increase in computing power (Moore's Law) and data. Computer scientists laid the foundations of AI as early as the 1950s. Many algorithms in use today have been around for decades. The public has only paid attention to them since they began producing astonishing performances, mimicking human cognitive capabilities such as object identification (computer vision), talking (natural language processing) or playing games (reinforcement learning), which we associate with 'intelligence'. People also perceive a 'deep neural network' as a kind of human brain, which it is not.
The term 'intelligence' is misleading. It contributed to the popularity of AI, but it also gave rise to rather exaggerated views about what these algorithms can do, in both a negative and a positive sense. There is considerable hype around AI. We should see it as an evolution of digital technologies rather than a revolution. Whether we will ever achieve human-like intelligence with computers is an open question; if we do, it will take a long time. The notion of 'human-centric AI' that appears in many policy papers seems to indicate such a ‘futuristic’ understanding of AI.
Thus, algorithms do not learn in the human sense of the word. Think of machine learning as the transformation of raw data (say, a few million pixels) into more valuable data (say, grouping these pixels together and labelling them). Developers need to train algorithms on very specific tasks, such as those mentioned above, often using vast amounts of data and massive computational power. The algorithms do not understand what it is they are ‘learning’ or ‘performing’. They cannot draw any conclusions outside their training regime, and it is still challenging to apply a trained model to tasks in another context.
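To make the 'pixels in, labels out' picture concrete, here is a minimal sketch using scikit-learn's small built-in digits dataset; the library, the model and the train/test split are my own illustrative choices, not something prescribed by the article.

```python
# A minimal sketch: 'training' means fitting parameters so that raw pixel
# values map to digit labels for this narrow task only.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

digits = load_digits()  # 8x8 pixel images, each flattened to 64 numbers
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=0)

model = LogisticRegression(max_iter=2000).fit(X_train, y_train)
print(f"accuracy on unseen digits: {model.score(X_test, y_test):.2f}")

# The model maps pixels to labels; it has no notion of what a digit 'is'
# and cannot be reused, as trained, for an unrelated task.
```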
This may sound boring, but it leads to astonishing performance levels in well-defined tasks without hard-coded programmes, fine-tuning by a developer or added expert knowledge. I once read of someone in the industry remarking ironically that every time he fires a linguist, his translation tool gets better; in other words, data is all we need. AI allows data analysis to be automated, which has one major advantage: speed.
As a result, a human often cannot replicate tasks performed by an algorithm. Take photography as an example. You can focus your camera yourself, but when the object is moving fast and erratically, only the algorithm in your camera can keep it in focus. Speed makes it tempting to rely on machine decisions without human intervention. Most of the time this is simply helpful, but in certain cases it can cause problems. One reason is that AI does not perform without flaws. It can fail even if it worked fine before. Another reason is the lack of common sense. AI-powered applications can reach decisions that are plainly stupid or, worse, unacceptable to society, for instance because they are discriminatory. This is often a problem caused by the data used for training. Detecting these biases is a laborious task for AI developers, although for statisticians they are not new at all.
Biases have always plagued statisticians trying to build models or analyse data that are representative of the real world. A paper published in 2019 summarised the variety of possible distortions when dealing with data. For instance, analysts may use non-representative data, confuse cross-sectional data with longitudinal data, or measure input in a way that pre-determines the result.
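The first of these distortions is easy to illustrate. The following sketch uses entirely synthetic numbers of my own invention to show how a non-representative sample systematically skews an estimate.

```python
# A minimal sketch of sampling bias on made-up data: a convenience sample
# drawn mostly from one region underestimates the population average.
import numpy as np

rng = np.random.default_rng(42)
# true population: 70% from region A (average spend ~20), 30% from region B (~50)
spend_a = rng.normal(20, 5, 7000)
spend_b = rng.normal(50, 5, 3000)
population = np.concatenate([spend_a, spend_b])

# a convenience sample collected almost entirely in region A (95% A, 5% B)
sample = np.concatenate([rng.choice(spend_a, 950), rng.choice(spend_b, 50)])

print(f"population mean:    {population.mean():.1f}")  # roughly 29
print(f"biased sample mean: {sample.mean():.1f}")       # roughly 21.5, too low
```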
In machine learning these biases can be particularly difficult to identify because of the opacity of the models. We also refer to opacity as a lack of explainability or interpretability; the definitions vary. Models, in particular deep neural networks, have become complex, with millions of parameters optimised through thousands of iterations, typically by applying ‘gradient descent’ and ‘backpropagation’. It is therefore challenging, and sometimes practically impossible, to follow the operations executed within a model in order to answer the questions of why it has come to a certain result and why it failed or succeeded. Thus, AI is still a mixture of science, engineering and handcraft. Yann LeCun, a key figure in the development of deep learning techniques, joked about ‘machine learning as the science of sloppiness’ during a YouTube interview with Lex Fridman (which I recommend watching): https://www.youtube.com/watch?v=SGSOCuByo24
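To show what 'parameters optimised through iterations' means in practice, here is a minimal sketch of gradient descent with backpropagation on a tiny, synthetic one-hidden-layer network; real deep networks repeat exactly this loop over millions of parameters and many layers, which is where the opacity comes from.

```python
# A toy network trained by gradient descent and backpropagation on
# synthetic data; every number and shape here is invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                                  # 200 samples, 3 features
y = (X[:, 0] - 2 * X[:, 1] > 0).astype(float).reshape(-1, 1)   # toy binary labels

# randomly initialised parameters of a one-hidden-layer network
W1 = rng.normal(scale=0.1, size=(3, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.1, size=(8, 1)); b2 = np.zeros(1)
lr = 0.5                                                       # learning rate

for step in range(1000):
    # forward pass: compute predictions from the current parameters
    h = np.tanh(X @ W1 + b1)
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))
    p = np.clip(p, 1e-7, 1 - 1e-7)
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

    # backward pass (backpropagation): gradient of the loss w.r.t. each parameter
    dlogits = (p - y) / len(X)
    dW2 = h.T @ dlogits
    db2 = dlogits.sum(axis=0)
    dh = dlogits @ W2.T * (1 - h ** 2)
    dW1 = X.T @ dh
    db1 = dh.sum(axis=0)

    # gradient descent: nudge every parameter a small step against its gradient
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

    if step % 200 == 0:
        print(f"step {step}: loss {loss:.3f}")
```

Even in this toy case, the trained behaviour lives in dozens of numbers with no individual meaning; scaled up by several orders of magnitude, tracing why a particular output was produced becomes the practical impossibility described above.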
Thus, opacity exacerbates the data problem, as it becomes difficult to identify the characteristics in the data sets that cause certain biased results. For instance, Compas, a risk assessment algorithm applied in the US judicial system to help decide whether to keep someone in prison while awaiting trial, does not use race or other information considered discriminatory as an input; yet researchers found that its recommendations were racially discriminatory because of hidden correlations. For instance, belonging to a group can be correlated with place of birth, educational background or age.
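This proxy effect is easy to reproduce on synthetic data. The sketch below is not the Compas model; it only shows, with invented variables ('postcode' standing in for any correlated feature), how a model that never sees a protected attribute can still produce disparate outcomes.

```python
# A minimal sketch of proxy discrimination on entirely synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5000
group = rng.integers(0, 2, n)                    # protected attribute, never given to the model
postcode = group * 0.8 + rng.normal(0, 0.3, n)   # proxy feature correlated with group
income = rng.normal(0, 1, n)
# synthetic outcome that reflects a historical inequality tied to group
y = (0.5 * income - 0.7 * group + rng.normal(0, 0.5, n) > 0).astype(int)

X = np.column_stack([postcode, income])          # inputs deliberately exclude 'group'
model = LogisticRegression().fit(X, y)
pred = model.predict(X)

for g in (0, 1):
    print(f"group {g}: positive prediction rate {pred[group == g].mean():.2f}")
# The rates differ across groups even though 'group' was never an input,
# because the model picks up the correlation through the proxy feature.
```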
The case of Compas also reveals the difficulty of encapsulating the notion of non-discrimination or fairness without contradiction. It is impossible to apply one definition of discrimination without coming into conflict with another, equally acceptable definition. Karen Hao, in an article for MIT Technology Review of 4 February 2019, asks: ‘Does fairness mean, for example, that the same proportion of black and white individuals should get high risk assessment scores? Or that the same level of risk should result in the same score regardless of race?’
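The two readings of fairness in that quote correspond to two different statistics, and the sketch below computes both on made-up confusion counts of my own; with these invented numbers the groups differ on both measures, and when the groups' underlying reoffence rates differ, equalising one measure generally pushes the other apart.

```python
# A minimal sketch of two conflicting fairness notions on invented counts:
# equal rates of high-risk scores across groups versus equal false positive
# rates at the same risk level.
def rates(c):
    flagged = c["tp"] + c["fp"]
    total = sum(c.values())
    return {
        "high_risk_rate": flagged / total,                      # share flagged high risk
        "false_positive_rate": c["fp"] / (c["fp"] + c["tn"]),   # flagged among those who did not reoffend
    }

# hypothetical counts per group: tp = flagged & reoffended, fp = flagged & did not,
# fn = not flagged & reoffended, tn = not flagged & did not
group_a = dict(tp=300, fp=200, fn=100, tn=400)
group_b = dict(tp=150, fp=50, fn=150, tn=650)

for name, counts in [("A", group_a), ("B", group_b)]:
    print(name, rates(counts))
```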
There are various ways to deal with discrimination, but none of them is perfect. See, for instance, https://arxiv.org/abs/1908.09635. We should face the reality that unbiased decision-making is not realistic. Even if an AI introduces a bias, it may be less damaging than existing biases. AI models can also, almost as a side effect, reveal societal inequalities. Therefore, we should deepen our understanding of how the technology works in a specific context, learn how to mitigate negative implications, and use AI to highlight societal problems.
Discrimination is not the only issue being discussed by policy makers. Another problem is the use of AI for deception. Our media consumption has become digital to an enormous extent: newspapers, letters, movies and photos have moved from paper and celluloid to zeros and ones, coming into our homes through the Internet, available everywhere and shared with everyone. We have long known that digital information can be manipulated much more easily than analogue material. Think of the digital effects in movies, which make you wonder why Hollywood still needs actors and actresses. Yet, until recently, humans had to generate these effects by programming each step and feature. AI has taken this to a higher level, namely the ability to create a digital reality without a designer touching a computer by hand.
This offers unprecedented opportunities, not only in the movie industry but across many industries, for instance in simulating traffic flows or designing products. Unfortunately, this capability of AI also turns social media even more into a weapon for people and organisations with ill intentions. We have already learned about this the hard way: fake news, fake videos, fake articles, fake people; now imagine the speed at which AI can produce them and you will see what is at stake. We need to prevent malicious people from distributing a fake digital reality to millions of people in a split second.
To summarise, AI does not mean intelligence in the human sense. AI is data-driven and programmes itself from data. Algorithms perform certain tasks with a speed and at a level surpassing the cognitive abilities of people. In doing so, they do not understand what they are performing. Yet the potential of AI for business and for solving societal problems is significant.
The reason: the pervasive digital infrastructures around us, such as the cloud, social media and smartphones, amplify the use of data. We are witnessing a positive feedback loop between digitalisation and AI which will have the transformative effect several studies predict.
AI, as with all technologies, will have positive and negative effects. Algorithms can reinforce societal injustices and, at the same time, help draw attention to discrimination that exists in society.
We need to understand how malicious individuals and organisations can exploit the opacity of certain models and the deception potential of algorithms, and collectively find answers to fight these threats.