How ethical is machine learning?

Authors

Lee Griffiths

We all want tech to help us build a better world: the uses of artificial intelligence (AI) in healthcare, fighting human trafficking and achieving gender equity are great examples of where this is already happening. But there are always going to be broader ethical considerations – and as AI gets more invisibly woven into our lives, these are going to become harder to untangle.

What’s often forgotten is that AI doesn’t just impact our future – it’s fuelled by our past. Machine learning, one variety of AI, learns from previous data to make autonomous decisions in the present. However, which parts of our existing data we use, and how and when we apply them, are highly contentious questions – and they are likely to stay that way.

A new frontier – or the old Wild West?

For much of human history, decisions were made that did not reflect current ideals or even norms. Far from changing the future for the better, AI runs the risk of mirroring the past. A computer program used by a US court for risk assessment proved to be highly racially biased, probably because minority ethnic groups are overrepresented in US prisons and therefore also in the data it was drawing conclusions from.
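The mechanism behind this kind of bias can be shown with a toy sketch. It is not the court system's actual software, whose internals are not described here: the group labels, the records and the function name below are all hypothetical, and a real risk tool would be far more complex. The point is only that a model which learns from skewed records hands the skew straight back.

```python
from collections import Counter

# Hypothetical historical records: (group, was_flagged_high_risk).
# Group "B" is over-represented among flagged records, standing in for a skewed past.
HISTORY = (
    [("A", False)] * 80 + [("A", True)] * 20
    + [("B", False)] * 50 + [("B", True)] * 50
)


def learn_flag_rates(records):
    """Return the share of flagged records per group, the only pattern in this toy data."""
    totals = Counter(group for group, _ in records)
    flagged = Counter(group for group, is_flagged in records if is_flagged)
    return {group: flagged[group] / totals[group] for group in totals}


# The "model" is nothing more than the historical rates it was given.
rates = learn_flag_rates(HISTORY)
print(rates)  # {'A': 0.2, 'B': 0.5}
```

Two otherwise identical people would receive different scores here purely because of the group recorded against them, which is the past repeating itself rather than any new insight.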

This demonstrates two dangers: repeating our biases without question, and using technology inappropriately in the first place. Supposedly improved systems are still being developed and used in this area, with ramifications for real human freedom and safety. Whatever efficiencies automation offers, human judgement is always going to have its place.

The ethics of language modelling, a specific form of machine learning, are increasingly up for debate. At its most basic it provides the predictive texting on your phone, using past data to guess what’s needed after your prompt. On a larger scale, complex language models are used in natural language processing (NLP) applications, applying algorithms to create text that reads like real human writing. We already see these in chatbots – with results that can range from the useful to the irritating to the outright dangerous.
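To make the predictive-texting idea concrete, here is a minimal sketch of the simplest possible approach: counting which word has followed which in past text and suggesting the most frequent followers. It is an illustration only, not how any particular phone keyboard or large language model is built; the sample text and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical "past data": a tiny stand-in for a user's message history.
PAST_TEXT = "see you soon . see you later . see you soon . thank you so much"


def build_bigram_counts(text):
    """Count, for every word, which words have followed it in the past text."""
    followers = defaultdict(Counter)
    words = text.lower().split()
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers


def suggest_next(followers, word, k=2):
    """Return up to k words that most often followed `word`, most frequent first."""
    counts = followers.get(word.lower(), Counter())
    return [candidate for candidate, _ in counts.most_common(k)]


model = build_bigram_counts(PAST_TEXT)
print(suggest_next(model, "you"))
print(suggest_next(model, "see"))
```

Everything the sketch suggests comes from the text it was given: change the past data and the suggestions change with it, which is why the source of that data matters so much.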

At the moment, when we’re interacting with a chatbot we probably know it – in most instances the language is still a little too stilted to pass as a real human. But as language modelling technology improves and its output becomes harder to distinguish from human writing, both the opportunities and the issues are only going to grow.

Where does the data come from?

GPT-3, created by OpenAI, is the most powerful language model yet: from just a small amount of input, it can generate a vast range and volume of highly realistic text – from code to news reports to apparent dialogue. According to its developers, ‘Over 300 applications are delivering GPT-3–powered search, conversation, text completion and other advanced AI features’.

And yet MIT Technology Review described it as based on ‘the cesspits of the internet’. Drawing indiscriminately on online publications, including social media, it has frequently been shown to spout racism and sexism as soon as it’s prompted to do so. Ironically, with no moral code or filter of its own, it is perhaps the most accurate reflection we have of our society’s state of mind. It, and models like it, are increasingly fuelling what we read and interact with online.

Human language published on the internet, fuelled by algorithms that encourage extremes of opinion and reward anger, has already created enormous divisions in society, spreading misinformation that literally claims lives. Language models that generate new text indiscriminately and parrot back our worst instincts could well be an accelerant.

The words we use

Language is more than a reflection of our past; it shapes our perception of reality. For instance, the Native American Hopi language doesn’t treat time in terms of ‘chunks’ like minutes or hours. Instead its speakers talk about time, and indeed think of it, as an unbroken stream that cannot be wasted. Other examples span every difference in language, grammar and sentence structure – both influencing and being influenced by our modes of thinking.

The language we use has enormous value. If it’s being automatically generated and propagated everywhere, shaping our world view and how we respond to it, this needs to be done responsibly, fairly and honestly. Different perspectives, cultures, languages and dialects must be included to ensure that the world we’re building is as inclusive, open and truthful as possible. Otherwise the alternative perspectives and cultural variety they offer could become a thing of the past.

What are the risks? And what can we do about them?

Language and tech are already hard to regulate, and the massive financial investment required to create language models means that only a few large businesses are currently building them, which gives those businesses access to even more power. Without relying on human writers, they could potentially operate thousands of sites that flood the internet with automatically written content. Language models could then learn which characteristics make content go viral, reproduce them, and learn again from the results, at massive volume and speed.

Individual use can also lead to difficult questions. A developer used GPT-3 to create a ‘deadbot’ – a chatbot based on his deceased fiancée that mimicked her. The idea of chatbots that can masquerade as real, live people might be thrilling to some and terrifying to others, but it’s hard not to feel squeamish about a case like that.

Ultimately, it is the responsibility of developers and businesses everywhere to consider their actions and the future impact of what they create. Encouragingly, positive steps are being taken. Meta – previously known as Facebook – has taken the unprecedented step of making its new language model completely accessible to any developer, along with details about how it was trained and built. According to Meta AI’s managing director, ‘We strongly believe that the ability for others to scrutinize your work is an important part of research. We really invite that collaboration.’

The opportunities for AI are vast, especially where it complements and augments human progress toward a better, more equal and opportunity-filled world. But the horror stories are not to be dismissed. As with every technological development, it’s about whose hands it’s put in – and who they intend to benefit.

To find out more about our capabilities in this area, check out our DevSecOps page.
