Rebellious Artificial Intelligence algorithm

Global media outlets broadcast a news report at the beginning of 2024 about an incident concerning Artificial Intelligence (AI) algorithms and their peculiar behaviour.

It reported that a chatbot system, powered by AI and used by one of the major delivery companies to communicate with its customers and respond to their inquiries, deviated from its designated role and violated marketing norms and diplomatic rules.

It went as far as insulting the company's customers and criticised the company operating it by highlighting its flaws and recommending alternative competitor companies. This necessitated the company's intervention to stop this rebellious smart system.

This incident somewhat resembles an event that occurred with Microsoft in 2016, which also had to disable its AI chatbot, known as ‘Tay,’ on the ‘Twitter’ platform (now known as X platform) after it went rogue and used inappropriate language with the public interacting with it on the social media platform within 24 hours of its launch. Such incidents raise critical questions about the unpredictable nature of AI algorithms and their potential to act in ways that breach ethical and conventional boundaries.

This article delves into the underlying causes of such rebellious behaviour exhibited by AI systems, examining the role of training data, algorithmic bias, contextual misunderstanding, and the impact of adversarial interactions.

When an AI algorithm, like those in chat systems, starts exhibiting undesirable behaviour, such as using inappropriate language or acting in a racist manner, these phenomena can be explained by several factors, including issues with training data.

AI algorithms often train on large datasets containing human linguistic interactions, which naturally may include inappropriate or negative linguistic content that subsequently becomes part of the training data. The algorithm is then expected to mimic this ‘linguistic’ data, both good and bad.

Another factor is algorithmic bias; smart algorithms can develop biases based on the data they are trained on. These biases can lead to undesirable behaviour, including generating responses that are not suitable for the intended use of the smart model. Algorithmic bias can be a challenge as it often reflects complex and subtle patterns in training data.

Another factor is the lack of contextual understanding, which often appears with generative AI models related to GPT (Generative Pre-trained Transformer) systems. These models generate their outputs based on patterns learned from the data. However, they might not clearly understand the context or the nuances of human language related to social and cultural norms, leading to inappropriate linguistic interactions in some cases.

Additionally, the phenomenon of adversarial attacks through distracting interrogative interactions can contribute to the algorithm's rebellion.

Some users deliberately or inadvertently present inputs to the smart model in the form of questions that exploit weaknesses in the AI model, leading to linguistic behaviour that does not align with the required context and undermines its conversational abilities.

From a mathematical principle perspective on the operation of AI algorithms, smart models rely on text generation based on the mathematical probability principle in distributing appropriate words to form the suitable text.

For example, generative models use ‘self-attention’ mechanisms, to weigh different words according to their importance and priority and generate statistically likely text outputs based on training data. The model selects each word in its response according to the conditional probability of that word given the previous words and the input question through interrogative interactions.

This predictive process is difficult to control, even if inputs can be controlled, explaining the multiple challenges in curbing the rebellion of the algorithm, which cannot rely solely on one factor such as controlling and selecting data but extends to other factors such as controlling the algorithm's operation itself, adding another challenge alongside data control.

The mathematical complexity of the algorithm and its deep networks explains the presence of this challenge; the mathematical model complexity of AI algorithms, especially newer models based on deep learning algorithms requiring many digital neural networks and large data, makes them highly complex.

They are often described as ‘black boxes’ because their decision-making processes are not easily interpretable, complicating the ability to predict and control how the model responds to all inputs, leading to unwanted rebellious traits in some of its outputs, like those mentioned at the beginning of the article.

When artificial intelligence ‘loses control,’ it results from the model generating high-probability outputs that, in some cases, may be inappropriate or unexpected for the reasons mentioned earlier.

Measures can be taken to address such problems of algorithmic rebellion. This includes developing data-cleaning mechanisms through conditional automatic text selection, reducing bias, improving context handling, designing safe interactions to prevent adversarial attacks that work on targeted interrogative interactions. This also activates continuous monitoring of the algorithm, and being part of the digital elements that the algorithm operates according to mathematical details entered by programmers during the construction of the algorithm model and defining its nature and operational rules.

Oman Observer is now on the WhatsApp channel. Click here