A ground-breaking battle between lawyers and artificial intelligence (AI) has ended in a comfortable victory for AI.
Lawyers had a week to predict whether real PPI complaints were upheld or rejected by the Financial Ombudsman, using their own resources and unlimited time, before AI, in the form of CaseCrunch, got to work.
The CaseCrunch system, developed by the team behind LawBot, achieved an accuracy rate of 86.6%, compared to 62.3% for the lawyers. Over 100 lawyers, mainly solicitors from commercial law firms, contributed 775 predictions.
Ludwig Bull, scientific director of CaseCrunch, said: “Evaluating these results is tricky. These results do not mean that machines are generally better at predicting outcomes than human lawyers.
“These results show that if the question is defined precisely, machines are able to compete with and sometimes outperform human lawyers.
“The use case for these systems is clear. Legal decision prediction systems like ours can solve legal bottlenecks within organisations permanently and reliably.”
CaseCrunch said the main reason for the machine’s winning margin seemed to be that the network had “a better grasp of the importance of non-legal factors”, such as age, than the lawyers.
Mr Bull is working on a research paper explaining the approach taken and broader significance of the challenge, to be published early next month.
The legal judge for the challenge was Felix Steffek, lecturer in law at Cambridge University and co-director of the Centre for Corporate and Commercial Law.
He said: “The factual descriptions of the problems set by the Financial Ombudsman Service are a reasonable basis for a prediction about PPI mis-selling complaints being upheld or rejected by the ombudsman at an early stage in the advisory process.
“Trained lawyers from commercial London law firms, using all the tools and resources they usually work with, are able to make reasonable predictions about these problems at this point even though the information given per claim varies and further information might be revealed at later stages.”
The technical judge was Ian Dodd, UK director of American case prediction pioneers Premonition.
Mr Dodd said it would be interesting to put a value, in pounds sterling, on the processing costs of the challenge: “The real number of ‘Human: 62.3% at £300 per hour and X hours’ compared to ‘AI: 86.6% at £17 per hour and X hours’ is the true bottom line.”
Rebecca Agliolo, marketing director of CaseCrunch, told Legal Futures: “We realised that there are two main obstacles to the substantive advancement and adoption of legal AI.
“Firstly, that the capabilities of AI are shrouded in misconceptions, as lawyers and journalists are fundamentally asking the wrong questions. Secondly, there has been too much talking and too little ‘doing’.
“Ultimately, the challenge wasn’t about ‘winning or losing’; it was about showcasing the potential of artificial intelligence and changing the current paradigm not by talking, but by doing.”
The results were announced at a reception at City firm Kennedys, and the other sponsors were Pinsent Masons and consultants Cosmonauts.
Last year, AI was used to predict the outcomes of cases in the European Court of Human Rights with 79% accuracy.