Legal tech in 2025: Data, data and more data management

3 February 2025

Posted by Andrew Lindsay, general manager at Legal Futures Associate LexisNexis Enterprise Solutions

Lindsay: Law firms need to use their own data to train their LLMs

Last year, the adoption of generative artificial intelligence (AI) in the legal world became a certainty. Even the staunchest sceptics now recognise that it is here to stay. But was it also the year that the AI ‘hype bubble’ burst?

The initial excitement and DIY approach to AI have been trumped by the need to demonstrate a tangible return on investment. The Boston Consulting Group recently reported that 74% of companies struggle to achieve and scale value when adopting AI, and legal providers are taking note, only considering AI solutions that deliver demonstrable business value to their firm.

Data strategy and management should be the number one priority

Consequently, it’s safe to say that a robust data strategy for continuous and thorough data management should be the number one focus for law firms in 2025.

Today, it is widely recognised that the promise of AI, and more specifically generative AI, relies almost entirely on the quality and integrity of the data that the large language models (LLMs) are fed.

Research predicts that technology companies will exhaust publicly available data for LLM training as early as 2026. So, here’s the rub, especially for those firms that are tempted to buy ‘sexy’ systems that say ‘AI’ on the tin as a short-cut to AI adoption – law firms need to start thinking about their own data sources, and how they can glean value from them, before they even attempt to gain efficiencies from AI exploiting internal knowledge.

Let’s face it, no law firm can deliver better advice and ‘lawyering’ simply by using tools trained on publicly available data (e.g. ChatGPT) – real value can be derived only when firms use their own data to train their LLMs.

A legal practice’s invaluable knowledge and expertise resides in the data held within its systems, so it makes sense that, to truly extract value from AI technology, the practice must use its own data warehouse to power it – in addition, of course, to external private and proprietary sources of data, whose quality, reliability, and integrity are proven.

Untangling data – a messy affair

However, if we are honest, many lawyers’ data houses are not necessarily in ‘order’, and the task of solving the data management problem can undoubtedly be difficult and messy.

While such projects are unlikely to be exciting, they are nevertheless essential – not only because of the promise that AI holds for the future, but also because of the increasing client and legislative demands weighing on modern legal practitioners. Law firms are therefore better off having a well-developed data strategy and being ‘well on the way’ to implementing AI, rather than treating data cleansing and management as ‘tomorrow’s job’.

The garage is a fitting analogy. Picture a garage that, for the best part of its existence, has been filled with ‘stuff’, with the door lowered to hide the clutter. Sorting through it, deciding what to discard or retain, and then organising the useful items is a daunting, drawn-out and potentially painful exercise.

What should be thrown away? What are the decision-making criteria? Should the whole garage be organised in one go, or should the process be staggered? However arduous the task, it is better to consider these factors before decluttering to make sure nothing of value is lost.

With these questions answered, a plan of action can be determined, and the garage will finally be in a fit state to house the shiny new car and qualify for the lower insurance premium that comes with keeping it in a locked garage – a win-win!

Considerations for data management

Today, firms have disparate, disorganised and duplicated (even triplicated) data across various formats – Word, Excel, Outlook, PMS/CMS/CRM systems, and more. A key reason for this is that, in most firms, digital transformation has occurred gradually over the years, often by converting hard copies into digital files.

A careful process of identifying the best, most representative documents to use for training, rather than just feeding all available documents into the model, is crucial.

Given the colossal volume of data residing in law firms, they must conceptualise and build a data framework to collect, store, curate and manage that data so that every piece is held only once. This data normalisation is important to ensure data quality and integrity.
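To make the ‘held only once’ idea concrete, here is a minimal sketch in Python of one way to spot duplicated documents: fingerprint the normalised text of each file and group the locations that share a fingerprint. The file paths, sample text and function names are all hypothetical, and a real project would also have to cope with near-duplicates and different file formats, which this sketch does not attempt.

```python
import hashlib
from collections import defaultdict


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so trivially different copies match."""
    return " ".join(text.lower().split())


def group_duplicates(documents: dict) -> dict:
    """Map each content fingerprint to the sorted locations holding that content."""
    groups = defaultdict(list)
    for location, text in documents.items():
        fingerprint = hashlib.sha256(normalise(text).encode("utf-8")).hexdigest()
        groups[fingerprint].append(location)
    return {fp: sorted(locations) for fp, locations in groups.items()}


def canonical_copies(documents: dict) -> list:
    """Pick one location per distinct content (the first in sorted order)."""
    return sorted(locations[0] for locations in group_duplicates(documents).values())


# Hypothetical sample: the same lease template saved in two systems.
sample = {
    "dms/lease_template.docx": "Standard  Lease Agreement\nClause 1 ...",
    "email/attachment_lease.docx": "standard lease agreement clause 1 ...",
    "cms/nda.docx": "Non-disclosure agreement",
}

if __name__ == "__main__":
    for location in canonical_copies(sample):
        print(location)
```

Choosing the first location in sorted order is an arbitrary rule used only for illustration; in practice the firm would decide which system holds the master copy.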

Inevitably, some files will be drafts, outdated versions, or not representative of the firm’s best practices. If these lower-quality documents are used to train the LLM, the model could produce biased or inaccurate outputs. Law firms must therefore determine which data is trustworthy and appropriate for training AI models.
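As an illustration only, the selection step might look something like the Python sketch below, which drops drafts and keeps the latest final version of each document. The `DocRecord` fields and the ‘draft’/‘final’ status values are assumptions made for the example, not a description of any particular document management system, and judging whether a document reflects best practice would still need human review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocRecord:
    doc_id: str   # hypothetical identifier shared by all versions of a document
    version: int  # hypothetical version counter, higher means newer
    status: str   # hypothetical status flag: "draft" or "final"


def select_training_candidates(records):
    """Keep only the highest-numbered final version of each document."""
    latest = {}
    for record in records:
        if record.status != "final":
            continue  # drafts are excluded outright
        current = latest.get(record.doc_id)
        if current is None or record.version > current.version:
            latest[record.doc_id] = record
    return sorted(latest.values(), key=lambda r: r.doc_id)


# Hypothetical records: one document with three versions, one draft-only document.
records = [
    DocRecord("advice-001", 1, "final"),
    DocRecord("advice-001", 2, "final"),
    DocRecord("advice-001", 3, "draft"),
    DocRecord("contract-007", 1, "draft"),
]

if __name__ == "__main__":
    for record in select_training_candidates(records):
        print(record.doc_id, record.version)
```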

The firm’s data strategy must be driven by the business need and its timeline for AI adoption. Of course, the ultimate vision has to be a carefully cleansed and automatically managed data environment, but realistically this goal cannot be realised in one day.

Data strategy and management projects can take anywhere from two to five years to deliver the full return on investment. In the interim, firms must decide what data they need immediately for training the AI models so that the AI adoption vision can progress.

Part of the data strategy must also be determining, or even taking a stand on, who actually owns the data that will be used to train the LLMs – the law firm or its clients?

Some food for thought: A client instructs the firm to act on its behalf. The firm uses its knowledge, expertise, and experience to process the legal case, delivering an outcome and result for the client. Thereafter, the firm anonymises the client files/data, removing the client-specific information to the extent that it is not possible to identify the client.

This experience, collated and accumulated through thousands of such cases and files (in the form of data), is invaluable and, if fed to the LLM, will categorically improve the law firm’s future legal service delivery.

Nonetheless, it is important to define the intent of this client data usage from the get-go and to ensure the client is informed, even if the data is anonymised, so that no compliance issues arise.

In summary, the law firms that prioritise the development and execution of a realistic and comprehensive data strategy are the ones that will be best placed to support and derive value from their AI initiatives in 2025.

As the AI market matures, partners and business owners will be expecting to see real, tangible return for their continued commitment, creating ‘AI hope’ as opposed to ‘AI hype’, proving the value that AI holds for the future of law… and your business.

Tags: LexisNexis Enterprise Solutions
