Legal Futures Groundbreaking test finds AI judges "too persuadable"

Judges: Research says replacing humans puts justice at risk

A groundbreaking test of how persuadable an AI judge can be has raised serious concerns about access to justice and miscarriages of justice.

The research – carried out by two academics in Ireland – revealed that AI judges were too open to powers of persuasion, to the extent that they could be talked into a verdict by skilled advocates.

Dr Oisin Suttle, Associate Professor of Law at Maynooth University, told Legal Futures: “The main lesson from our results is that we cannot treat AI models in the legal domain as objective analysts of law or facts. The legal conclusions they reach are substantially shaped by the arguments presented to them.

“It is important that a judge should be open to persuasion. Otherwise there would be no point in letting the parties argue their case. However, we don’t want a judge to be too persuadable.

“They should decide on the law and the facts of the case, rather than being swayed by a skilled advocate. Our results show that AI models in legal settings can be very persuadable, and that raises worries about reliability and fairness.”

Dr Suttle, working with Dr David Lillis, Associate Professor at the School of Computer Science at University College Dublin, created a “simulated legal contest” using genuine evidence from court cases held in England and Wales, Ireland and the US.

The contest pitted an AI prosecutor against an AI defence before an AI judge.

By varying how much one side argued their case, and by measuring how often the prosecution or the defence won, the academics saw how the quality of the legal arguments put forward to the judge could shape the outcome in a case.

The results showed every AI model tested was “measurably persuadable”. Across 20 systems, including models from Anthropic, Google and OpenAI, stronger advocate models won between 58% and 71% of cases on average. In the more extreme contests, the stronger advocate won 90% of the time.

The conclusion – outlined in the research paper, Persuadability and LLMs as Legal Decision Tools ^[2] – said: “AI systems in the justice domain are open to manipulation, depending on the arguments presented to them.”

Dr Suttle said: “An overly persuadable judge is a bad judge, and that brings two significant risks to the justice system. The first is accuracy and stability. A model that is highly persuadable will give different answers given different arguments.

“The second is fairness and access to justice. The more persuadable a decision-maker is – whether human or artificial – the greater the advantage to those with the resources to engage the best advocates.

“Our research shows the extent of the persuasion is striking. It is undesirable that the quality of the advocate should affect the outcome to this extent.”

The academics carried out their research in response to the rise of AI in courts and tribunals in the UK and around the world.

Last week, Lord Chancellor David Lammy announced that AI legal assistants would be used to assist Crown Court judges ^[3].

Dr Suttle said that, despite the flaws in legal AI, the problems facing the judiciary – from shortfalls in funding to a lack of judges – would lead to it being used more and more in courtrooms.

He said: “In a world where courts are overloaded, with cases backed up and constant delays – which seems to be the permanent state of most court systems – the attraction of AI tools is obvious. The UK government’s announcement highlights this.

“That move is being proposed as a support for human decision-makers, but those lines can get blurred: a model that is doing ‘research and case analysis’ may not be making decisions, but it is shaping the decisions that human decision-makers make.

“Where those human decision-makers are under significant time and work pressure, there is a tension.”

The research raises concerns about how AI judges could mislead a jury when summing up a case, or directing a jury during a trial.

Dr Suttle said: “There are certainly risks of AI systems leading to miscarriages of justice. Cases such as the Post Office scandal in the UK, Robodebt in Australia, and how AI was used in the Netherlands to identify welfare fraud all highlight the dangers of overreliance on automated systems where these directly affect individuals’ rights.”

The academics stopped short of calling for AI not to be used in legal settings, but they said its weakness around persuasion needed to be measured and disclosed whenever an AI legal decision-making tool was evaluated.