- Legal Futures - https://www.legalfutures.co.uk -

The best legal AI doesn’t replace rules-based engines – it completes them

Posted by Greg Ingino, chief technology officer at Legal Futures Associate Litera


There is a belief circulating in legal tech that AI can solve everything – that large language models (LLMs) are universally superior to what came before.

I understand the appeal of that argument. The capabilities of modern AI are genuinely remarkable. But after testing these assumptions directly at Litera, I can tell you with confidence: it is not always true. In legal work, being wrong about this has real consequences.

The question legal technology companies need to be asking is not ‘Can AI do this?’ It is ‘Should AI do this and, if so, how much of it?’ The answer, more often than people expect, is a hybrid approach.

When we put LLMs to the test

Document comparison [1] is one of the most fundamental tasks in legal work – reviewing contracts, redlining changes and tracking amendments for compliance. Many assume that an LLM can compare two documents and produce a redline just as effectively as traditional methods. We tested that assumption. It proved not to be true.

At Litera, our rules-based comparison engines have been refined over 20 to 30 years. They are built to deliver the precision and accuracy that lawyers depend on for contract review, redlining and compliance work.

When we explored replacing these engines with AI, we found the AI could not match the reliability and consistency of our rules-based systems. Critically, it also could not handle image and table-based comparison – capabilities that matter enormously in real legal documents.

The data backs this up. Internal benchmark research [2], released by Litera at Legalweek 2026, compared its rules-based comparison engine against leading general-purpose LLMs – including Gemini, Claude and ChatGPT – on long-form legal documents containing tables, images, embedded objects, and headers and footers.

The results were unambiguous. General-purpose models were unable to produce usable redlines for non-text elements at all. Even on short documents, LLM text accuracy topped out at around 90% – a threshold that sounds reasonable until you consider that a single missed change in a contract can carry significant consequences.

On a 200-page document, one model’s accuracy dropped to roughly 40%.

The conclusion: LLMs can describe what changed in a document. They cannot reliably produce the legal artefact lawyers actually need.

In legal tech, you cannot afford to be ‘mostly right’ or ‘directionally accurate’. Lawyers need certainty. Compliance demands it. Clients expect it. That is not a standard general-purpose AI models are built to meet – at least not yet, and not for this.

The case for hybrid

Rather than abandoning our proven rules-based engines, we enhanced them with AI. We applied AI where it genuinely excels – natural language understanding, contextual suggestions, intelligent orchestration – while preserving the rules-based foundation that guarantees accuracy.

The result is a platform where AI and traditional precision engineering work together, each doing what it does best.

This is the thinking behind Litera One and Lito, our AI legal agent. Rather than treating AI as a wholesale replacement for existing workflows, we have turned our products into what we call ‘skills’ – discrete capabilities that can be invoked within an agentic architecture.

The AI orchestrates. The rules-based engines execute where precision is required. Lawyers get a single, intelligent interface that understands their workflow and brings the right tool at the right moment.
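To make that division of labour concrete, here is a minimal, hypothetical sketch of the routing pattern described above: an agentic dispatcher that sends precision-critical work to a deterministic engine and language work to a model. All names are illustrative, and Python's `difflib` stands in for a proprietary rules-based comparison engine; this is a sketch of the pattern, not Litera's implementation.

```python
import difflib

def deterministic_redline(old: str, new: str) -> list[str]:
    """Stand-in for a rules-based comparison engine: a deterministic,
    repeatable line diff (here, Python's difflib unified diff)."""
    return list(difflib.unified_diff(
        old.splitlines(), new.splitlines(), lineterm=""))

def llm_describe_changes(old: str, new: str) -> str:
    """Stand-in for an LLM call: useful for narrative description,
    but not guaranteed exhaustive. Stubbed out for this sketch."""
    return "Summary: the termination notice period was changed."

def handle_request(task: str, old: str, new: str):
    """Agentic dispatcher: route each task to the 'skill' suited to it.
    Precision-critical work goes to the deterministic engine;
    natural-language work goes to the model."""
    if task == "redline":    # accuracy required -> rules-based engine
        return deterministic_redline(old, new)
    if task == "summarise":  # contextual language -> LLM
        return llm_describe_changes(old, new)
    raise ValueError(f"unknown task: {task}")

old = "Either party may terminate on 30 days' notice."
new = "Either party may terminate on 60 days' notice."
for line in handle_request("redline", old, new):
    print(line)
```

The design point is that the router, not the model, decides which component executes: the redline is always produced by the same deterministic code path, so the output is identical on every run.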

Not every problem needs an AI solution

We have seen this pattern repeat itself across our operations. When we applied AI to quality engineering, it was genuinely transformative. We have now used AI to write 22,000 test cases – close to 70% of our total test suite.

The result is better product quality and freed-up engineering capacity for higher-value work. AI was the right tool for that problem.

But the lesson from document comparison is equally important: evaluating where the technology can be best utilised is not optional – it is the work. The firms and vendors who are getting this right are not the ones who deployed AI most aggressively. They are the ones who are most honest about where it belongs.

Ultimately, the measure of success for law firms is not how much AI they have adopted but the outcomes they are achieving for their clients. Every technology decision, including how and where AI is deployed, should trace back to that question: what outcome are we trying to achieve, and is this the right tool to get there?

Firms that stay anchored to that standard will make better decisions about AI than firms chasing the technology for its own sake.

There is also a supervision dimension that cannot be overlooked. You need people overseeing AI who truly understand the domain.

We do not allow AI output to go directly to production without expert review. Our engineers supervise AI-generated code. Our legal technology experts validate AI-driven workflows. Speed without expertise creates risk, and in legal technology, that risk is unacceptable.

Don’t trust the black box

For legal teams evaluating AI tools, the right question to ask vendors is not ‘Do you use AI?’ but: ‘Where exactly does the AI sit in your workflow, and what does it hand off to something more reliable?’

Any vendor that cannot answer this question clearly is asking you to trust a black box with your clients’ most sensitive work.

The hybrid approach is not a compromise. It is the architecture that the complexity of legal work demands.

The tools that will earn lasting trust in this industry are the ones built on that understanding – decades of legal-specific precision, made smarter and more accessible by AI, without sacrificing the one thing lawyers cannot work without: certainty.