Why Local AI Is the Future for Confidential Work
Local AI is becoming the more practical choice for confidential legal work not because cloud models are getting worse, but because two trends are compounding: open models are getting smaller and more capable at the same time, and the consumer hardware they run on — Apple Silicon chips, on-device neural processors, more unified memory — keeps getting stronger. The result is a gap between "runs on your laptop" and "runs on a data center" that's closing steadily for focused, well-defined tasks, even though it remains wide for open-ended general reasoning. This isn't a hype argument. It's a trajectory argument, and it's worth being precise about what's already true versus what's still catching up.
The two trends, separately
Models are getting more efficient. A model released two years ago needed far more parameters — and far more compute — to hit a given quality bar than a model trained today aiming for the same bar. Better training techniques, better data curation, and architectural refinements mean smaller models increasingly do what only much larger models could do before. This is a genuinely fast-moving area of open research, not a marketing claim, and the direction has been consistent: more capability per parameter, year over year.
Hardware is getting stronger, in exactly the way local AI needs. The bottleneck for running a language model on a laptop isn't just raw processing speed; it's memory, specifically how much of it the chip can address quickly enough to hold a model's weights and run inference at a usable pace. Apple Silicon's unified memory architecture, together with the larger RAM configurations now available on consumer Macs, has steadily raised the ceiling on "what model can comfortably run on a lawyer's laptop." Dedicated on-device neural processing hardware can compound this further, handling some inference workloads more efficiently than general-purpose CPU cores alone.
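To make the memory point concrete, here is a rough back-of-the-envelope sketch. It relies on one piece of plain arithmetic: the weights of a model take roughly (number of parameters) x (bits per weight) / 8 bytes. Everything else is an assumption made for illustration: the function names are hypothetical, the example model sizes are arbitrary, and the `headroom_fraction` of 0.5 is a placeholder rather than a measured figure. Real memory use is higher than the weights alone, since the runtime also needs working memory.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory needed just to hold the weights, in GB (10^9 bytes)."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


def fits_in_memory(params_billions: float, bits_per_weight: float,
                   ram_gb: float, headroom_fraction: float = 0.5) -> bool:
    """Rough check: do the weights fit in the share of RAM given to the model?

    headroom_fraction is an illustrative assumption, not a measurement. The
    remaining RAM is left for the OS, other apps, and the runtime's own
    working memory.
    """
    budget_gb = ram_gb * headroom_fraction
    return weight_memory_gb(params_billions, bits_per_weight) <= budget_gb


if __name__ == "__main__":
    # Arbitrary example sizes: (billions of parameters, bits per weight).
    for params, bits in [(8, 16), (8, 4), (70, 4)]:
        need = weight_memory_gb(params, bits)
        ok = fits_in_memory(params, bits, ram_gb=16)
        print(f"{params}B parameters at {bits}-bit: ~{need:.1f} GB of weights; "
              f"fits the budget on a 16 GB laptop: {ok}")
```

The same model stored at fewer bits per weight needs proportionally less memory, which is why both smaller models and more RAM push the ceiling in the same direction.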
Neither trend alone would matter much. Together, they mean the size of model that fits comfortably on consumer hardware keeps growing, while the capability of a given model size keeps improving — so the practical capability available locally is improving on two axes at once.
Where the gap has already closed
For narrow, well-defined professional tasks, local models are already good enough today, and have been for a while:
- Summarization of a document into key points — a task with a clear, checkable output — is a strength of even mid-sized local models.
- Extraction of specific fields (parties, dates, dollar amounts, defined terms) from a document is pattern-matching against text that's already in front of the model, not open-ended reasoning, and local models handle it reliably.
- Flagging likely-relevant clauses or sections for a human to review — surfacing candidates, not making legal determinations — plays to what a local model does well: narrowing a large document down to the parts that need attention.
These are, not coincidentally, the tasks that make up most of day-to-day contract review. They're bounded, checkable, and don't require the model to reason broadly across unrelated domains — which is exactly where a smaller, focused model performs closest to a much larger general one.
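One way to see what "checkable" means for extraction: every value the model returns can be compared against the document itself. The sketch below is a minimal illustration, not how any particular tool works. The function name, the field names, and the sample text are all hypothetical, and the check is deliberately simple (a case-insensitive, whitespace-tolerant verbatim match).

```python
def verify_extracted_fields(document_text: str,
                            extracted: dict[str, str]) -> dict[str, bool]:
    """Return, per field, whether the extracted value appears verbatim in the document.

    Matching ignores case and collapses runs of whitespace. An empty value
    never counts as found.
    """
    normalized_doc = " ".join(document_text.split()).lower()
    results = {}
    for field, value in extracted.items():
        normalized_value = " ".join(value.split()).lower()
        results[field] = bool(normalized_value) and normalized_value in normalized_doc
    return results


if __name__ == "__main__":
    # Hypothetical sample text and hypothetical model output.
    contract = ("This Agreement is made on March 3, 2024 between Acme Corp. "
                "and Jane Doe. The fee is $12,500.")
    model_output = {
        "party_a": "Acme Corp.",
        "effective_date": "March 3, 2024",
        "fee": "$15,000",  # deliberately wrong, to show a failed check
    }
    for field, found in verify_extracted_fields(contract, model_output).items():
        print(f"{field}: {'found in document' if found else 'NOT found, review by hand'}")
```

A check like this does not prove the value was assigned to the right field, so it supplements human review rather than replacing it. It does catch the case where a value was never in the document at all.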
Where the gap is still real
It would be dishonest to claim local models have closed the gap everywhere. For genuinely open-ended reasoning — synthesizing an answer across a wide, unfamiliar domain, handling a question that requires broad world knowledge with no document in front of it, or producing highly polished long-form writing on a novel topic — cloud frontier models still have a real edge, and likely will for some time. They're trained at a scale, and run on hardware, that no laptop will match.
The honest way to frame this: local AI isn't winning a race to replace frontier cloud models at everything. It's winning the specific race that matters most for confidential legal work: being good enough, reliably, for the bounded tasks a solo practice actually needs, while the document never has to leave the building. That's a different, narrower, and more achievable bar than "match ChatGPT at everything," and it's the bar local AI is already clearing for contract review, summarization, and extraction.
The trend isn't only technical
The push toward local processing isn't driven by hardware and model improvements alone; pressure is building from the client and regulatory side too. Clients, especially institutional ones, are asking more pointed questions about where their data goes and who can access it. Data-protection expectations continue to tighten rather than loosen. You don't need to cite a specific rule to see the direction: scrutiny of where confidential material travels is increasing, not decreasing. A tool that never transmits the document in the first place is the version of this story that ages well regardless of how that scrutiny evolves.
How to watch this trend yourself, without taking anyone's word for it
You don't have to trust a vendor's roadmap claims to see this trajectory — you can observe it directly, using signals available to anyone:
- Track RAM on new consumer Macs year over year. Across Apple Silicon generations, the memory configurations on offer have trended upward, though not in lockstep with every release, and more memory directly raises the ceiling on what size model runs comfortably on a laptop.
- Watch which on-device AI features major platform vendors ship without a cloud round-trip. When a phone or laptop OS adds a genuinely useful on-device model feature, that's a signal the underlying tooling has matured enough for production use, not just research demos.
- Try a local tool on your own hardware, on your own documents, with the network off. The most reliable signal is direct: does it produce a usable, accurate result on the actual task you need it for, today, on the machine you already own?
That last one is the test that matters most for a buying decision — trajectory arguments are useful context, but a tool should earn adoption on what it does right now, not just on the promise of what it'll do later.
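For readers comfortable with a few lines of Python, the network-off test can be run directly against a local model runtime. The sketch below assumes Ollama is installed and running at its default local address, and that a model has already been downloaded. The model name you pass in is a placeholder for whichever model you have pulled, and the function names and prompt wording are hypothetical. It uses only the Python standard library.

```python
import json
import urllib.request

# Ollama's default local endpoint. The request goes to this machine only.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, document_text: str) -> urllib.request.Request:
    """Build a non-streaming summarization request for a locally running Ollama."""
    payload = {
        "model": model,  # placeholder: use the name of a model you have pulled
        "prompt": "Summarize the key points of this document:\n\n" + document_text,
        "stream": False,  # ask for one complete JSON reply instead of a stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize_locally(model: str, document_text: str, timeout_s: float = 300.0) -> str:
    """Send the document to the local model and return the generated text."""
    request = build_request(model, document_text)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read())["response"]
```

With Wi-Fi turned off, calling `summarize_locally` with your model's name and the text of a real document should still return a summary, because the request never leaves localhost. If it fails only when the network is off, something in the chain depends on a remote service.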
Why this trajectory matters more than the current snapshot
A lawyer evaluating AI tools today is often implicitly comparing a snapshot: "how good is the local tool right now, compared to the cloud tool right now." That comparison understates the case for local AI, because the trend line matters as much as the current point. Every year, the model-efficiency curve and the consumer-hardware curve both move in local AI's favor, and neither shows signs of flattening. A local tool that handles today's contract-review workload well is positioned to handle more next year, not because the vendor promises it will, but because the underlying models and hardware it runs on keep improving independently.
Compare that to the alternative bet — sending confidential documents to cloud infrastructure because "the cloud model is currently smarter" — which asks a practice to accept an ongoing disclosure risk in exchange for a capability gap that's actively shrinking, on the tasks that matter most for legal work specifically.
What this means for a solo practice deciding now
You don't need to wait for local AI to be perfect to adopt it for confidential work — you need it to be good enough for the specific, bounded tasks that make up your caseload today, which for contract review, document summarization, and similar work, it already is. Privileged is built on exactly this bet: on-device processing via Ollama, purpose-built around the workflows a solo practice actually runs (contract review, document summary, filing review, time entry), rather than trying to be a general-purpose assistant that also happens to run locally. As the underlying models and hardware keep improving, a tool built this way improves with them, without ever changing what happens to your data.
Frequently asked questions
- Will local AI ever fully catch up to cloud frontier models?
- For broad, open-ended reasoning across every domain, cloud frontier models are likely to stay ahead for the foreseeable future, since they run at a scale no consumer device can match. For narrow, well-defined tasks like contract review or document summarization, local models are already good enough for much of the work, because those tasks don't require frontier-scale general reasoning to do well.
- Why are local AI models getting better so fast?
- Two trends compound each other: open models are getting more capable per parameter through better training techniques, and consumer hardware (Apple Silicon, dedicated NPUs, more unified memory) keeps getting stronger. More capable models plus stronger hardware mean the ceiling for "what fits and runs well on a laptop" keeps rising.
- Is local AI a niche option or a mainstream direction?
- It's moving from niche toward mainstream for privacy-sensitive professional work specifically. The direction of travel — smaller, more efficient models; more on-device compute; growing regulatory and client pressure around data handling — points toward local processing becoming the default for confidential tasks, not a workaround for them.
- Does "local AI" mean giving up capability compared to ChatGPT?
- For general, open-ended questions, yes, current local models are usually less capable than top cloud models. For a focused, repeatable task the tool is purpose-built for, a well-designed local tool can match or exceed a general cloud tool, because it's optimized for that one job rather than trying to do everything.