How far is too far left?

By Tom Palladino, President, eDiscovery AI

Why advanced AI delivers more value after collection than inside live enterprise systems

Abstract

Many in-house teams and Legal Service Providers (LSPs) are aiming to push discovery workflows, along with advanced AI solutions, closer to where enterprise data lives (e.g., Microsoft 365, Google Workspace, and other data vaults). This article argues for a pragmatic boundary: use in‑place capabilities for legal holds and preservation, then apply advanced analytics, including generative AI solutions, after defensible collection and normalization. Today’s corporate data infrastructure, with its platform limitations and complexity, combined with audit expectations, still favors post‑collection AI for complex, high‑stakes matters. Meanwhile, a clear readiness checklist can guide organizations on when and how to responsibly move more work to the left in the future.

EDRM, “Left vs. Right,” and What’s Actually Changing

The Electronic Discovery Reference Model (EDRM) describes how organizations handle electronically stored information (ESI) from information governance through presentation. Crucially, the model is a conceptual, iterative map that teams can revisit as they learn more and may not follow the stages in a strict sequence.

In industry shorthand, the “left” side comprises Information Governance (IG), Identification, Preservation, and Collection; the “right” side covers Processing, Review, Analysis, Production, and Presentation. The industry’s current momentum is toward optimizing the left side: tightening governance, improving scoping, and bringing earlier insight to downstream decision‑making. That momentum is healthy, but it has limits.

Why LSPs Want to Move Left

LSP growth over the last several years reflects clients’ desire for cost control, process rigor, and technology‑enabled services. Moving “left” lets LSPs shape matters earlier, reducing downstream volume, capturing longer lifecycle value, and embedding managed services that blend technology, project management, and skilled review.

As legal departments and law firms pilot generative AI solutions, LSPs are also positioning themselves as integration partners and operational backbones. Those incentives are real; they just shouldn’t be confused with an argument to run full discovery‑grade AI in place on live enterprise systems before identification and collection. The better play, today, is a calibrated shift left that preserves defensibility and speed while avoiding brittle, platform‑specific constraints.

The Temptation of “AI at the Source” (M365, Google, and Data Vaults)

On paper, moving analytics to where data already lives promises faster insights, fewer copies, and a stronger security posture, while also reducing processing and other downstream costs. In practice, native eDiscovery modules in enterprise platforms remain fundamentally built for governance, legal hold, scoped search, and export, not for end‑to‑end analytics, model training, and complex multi‑platform review on litigation timelines.

Microsoft Purview’s eDiscovery features have matured, including review sets and some analytics, but they come with documented limits (e.g., locations per search, export sizes, job volumes, display caps, and more) and service throttling when tenants exceed search/export thresholds.

Google Vault takes a partner‑first approach for heavy lifting; it is explicitly designed to retain, hold, search, and export, with product quotas for simultaneous exports and guidance to break up large exports. It relies on third‑party eDiscovery tools for processing, review, and analysis. These are sensible designs for governance and legal response at scale; they are not yet a replacement for full downstream discovery analytics in complex matters.

What Purview/Vault Do Well vs. What They Still Need to Improve

Great at (today):

  • Legal holds across core workloads
  • Targeted, in‑place search and collections (“collections” in Purview Premium)
  • Review sets and basic analytics (Purview)
  • Audit logging, defensible exports (both)
  • API‑driven automation for holds/search/export (both)

Not (yet) a full substitute for downstream discovery platforms:

  • Analytics at scale with advanced modeling and reviewer workflows
  • Robust throttling‑resilient performance for large, parallel matters
  • Normalization pipelines (e.g., dedupe, threading, entity/linking) at eDiscovery speeds
  • Deep TAR/AI workflows with portable productions and rich QC controls
  • Litigation‑grade parity across all modalities (e.g., chat reactions, edits, embedded objects)
  • Seamless integration of cross-platform data sets

The Defensibility Lens: Proportionality, Validation, Auditability

Under Rule 26(b)(1) of the U.S. Federal Rules of Civil Procedure (FRCP), discovery must be relevant and proportional, considering the needs of the case, amount in controversy, parties’ resources, and the importance of the discovery in resolving issues. Ambitious in‑place AI solutions deployed in enterprise systems can undercut these goals if indexing lacks parity across data types, if search and analytics aren’t transparent and reproducible, and/or if audit trails and export fidelity don’t meet litigation norms.

By contrast, running advanced analytics on collected, normalized datasets supports rigorous QC, sampling plans, and repeatability, exactly what courts expect when discovery decisions are challenged.

What Post‑Collection AI Gets You (Today)

Once data is defensibly collected and normalized, established eDiscovery workflows combined with advanced AI solutions deliver compounding benefits:

  • Normalization and enrichment: deNISTing, deduplication, email threading, near‑dup detection, entity extraction, communication maps.
  • Transparent QC: defensible sampling, validation against holdouts, reproducible logs.
  • Portable productions: stable metadata, Bates numbering/redactions, and packaging that moves cleanly through meet‑and‑confer to production.
  • Right‑sized scale: specialized platforms can flex resources without tripping service throttles or UI limits inherent to governance portals.

These strengths show up as fewer re‑collections, reduced motion practice risk, and faster time‑to‑insight for case strategy.
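As a concrete illustration of the “transparent QC” point above, the sketch below estimates an upper confidence bound on the elusion rate from a random sample of the null set (documents coded non‑responsive by the model). The sample sizes and the choice of the Wilson score interval are illustrative assumptions, not a prescribed validation protocol.

```python
import math

def wilson_upper(positives: int, n: int, z: float = 1.96) -> float:
    """Upper bound (Wilson score) on a proportion.

    positives: responsive docs reviewers found in the sample
    n: sample size drawn from the null (predicted non-responsive) set
    z: z-score for the confidence level (1.96 ~ 95%)
    """
    if n == 0:
        return 1.0  # no sample, no information
    p = positives / n
    denom = 1 + z * z / n
    center = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (center + margin) / denom

# Hypothetical elusion test: reviewers found 3 responsive documents
# in a random sample of 500 drawn from the null set.
bound = wilson_upper(3, 500)
print(f"Elusion point estimate: {3/500:.3%}, upper bound: {bound:.3%}")
```

A bound like this, recorded alongside the sampling log, is the kind of reproducible artifact that survives a challenge to the discovery process.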

A Pragmatic Middle Path (Left, but Not Too Far)

Organizations can (and should) push thoughtfully left today without jumping into full in‑place AI analysis and review. Practical steps include:

  • Use in‑place legal holds early and efficiently (custodian, date range, worksites, etc.).
  • Perform targeted in‑place collections with clear queries and permission filters.
  • Run AI-powered early case assessment solutions against export samples and datasets to set smart downstream parameters and gain insights about case fact patterns.
  • Utilize a combination of generative AI review solutions with expert human validation to provide transparent defensibility metrics and scope discovery production parameters.
  • Automate bridges: orchestrate holds → search → export → downstream processing/review so the hand‑offs are consistent and logged.

Avoid committing to in‑place modeling/triage for core responsiveness and privilege until you can meet the readiness criteria below.
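The “automate bridges” step above can be sketched as a small pipeline object that records every hand‑off in a hash‑chained audit log. All class, method, and endpoint names here are hypothetical placeholders; a real implementation would call the platform APIs (e.g., Purview or Vault) behind each step.

```python
import hashlib
import json
from datetime import datetime, timezone

class DiscoveryPipeline:
    """Minimal sketch of hold -> search -> export -> processing hand-offs.
    Each step appends a hash-chained entry so later edits are detectable.
    Step bodies are placeholders, not real platform API calls."""

    def __init__(self):
        self.audit_log = []

    def _log(self, step: str, detail: dict):
        prev = self.audit_log[-1]["hash"] if self.audit_log else ""
        entry = {
            "step": step,
            "detail": detail,
            "at": datetime.now(timezone.utc).isoformat(),
            "prev": prev,
        }
        # Chain each entry to the previous one's hash.
        entry["hash"] = hashlib.sha256(
            (prev + step + json.dumps(detail, sort_keys=True)).encode()
        ).hexdigest()
        self.audit_log.append(entry)

    def place_hold(self, custodians):
        self._log("hold", {"custodians": custodians})

    def run_search(self, query):
        self._log("search", {"query": query})

    def export(self, target):
        self._log("export", {"target": target})

    def hand_off(self, platform):
        self._log("processing", {"platform": platform})

pipe = DiscoveryPipeline()
pipe.place_hold(["a@example.com", "b@example.com"])
pipe.run_search('subject:"project falcon" AND sent>=2024-01-01')
pipe.export("sftp://review-staging/matter-123")
pipe.hand_off("downstream-review-platform")
print([e["step"] for e in pipe.audit_log])
```

The point of the design is that the hand‑offs are consistent and logged by construction, rather than reconstructed from emails after the fact.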

When Will “AI at the Source” Be Ready? A Readiness Checklist

Before shifting critical genAI solutions further left, validate these capabilities in your tenant/platform stack:

  • Indexing coverage parity across modalities (mail, files, chats, reactions, edits, embedded objects).
  • Performance guarantees without throttles that break litigation timelines.
  • Forensic auditability with immutable logs, chain‑of‑custody, and consistent export fidelity.
  • Validation harnesses demonstrating recall and precision parity with downstream analytics on representative data corpora.
  • Partner ecosystem maturity so end‑to‑end workflows work without brittle, manual exports.

Even advocates of in‑place analysis urge careful integration and partner‑driven workflows today; treat native analytics as accelerators, not replacements, for downstream review until the checklist is satisfied.
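One way to exercise the “validation harnesses” criterion is a simple parity check: score an in‑place hit set and a downstream hit set against a reviewed holdout. The document IDs and hit sets below are invented for illustration only.

```python
def precision_recall(predicted: set, relevant: set) -> tuple:
    """Precision and recall of a predicted-responsive set against
    ground-truth responsive labels from a reviewed holdout."""
    tp = len(predicted & relevant)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical holdout: doc IDs reviewers coded responsive, plus the
# hit sets returned by an in-place search and a downstream platform.
relevant = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
in_place = {1, 2, 3, 4, 5, 11, 12}            # misses 5 relevant docs
downstream = {1, 2, 3, 4, 5, 6, 7, 8, 9, 13}  # misses 1 relevant doc

for name, hits in [("in-place", in_place), ("downstream", downstream)]:
    p, r = precision_recall(hits, relevant)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

Until a harness like this shows parity on representative data, native analytics are best treated as accelerators rather than replacements.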

Common Objections and Practical Responses

“But moving left reduces cost.”

True… up to a point. Prematurely pushing robust review/analytics in place can increase rework and re‑collections and create defensibility gaps that cost more later. Anchor leftward moves in proportionality and validation.

“Purview has analytics; why not just use those?”

Purview’s analytics are valuable within limited review sets and for scoping, but they don’t yet replace cross‑source discovery analytics and specialized reviewer workflows common to complex matters.

“AI can triage in place as a first pass.”

Only if you can demonstrate search parity, auditable decisions, and export‑ready fidelity. Most organizations aren’t there today across all sources and workloads.

Conclusion: How Far Is Too Far Left?

Push left with purpose. Use in‑place tools for legal holds and precise scoping; then rely on post‑collection advanced AI solutions for the heavy lifting, where you can validate results, scale reliably, and produce defensibly. Reassess this boundary as enterprise platforms close gaps in indexing, performance, auditability, and partner integrations.

The goal isn’t to stay on the right. It’s to keep your most consequential modeling and review on defensible ground while the left side matures.

References

[1] EDRM Diagram Elements (Iterative nature of EDRM) – https://edrm.net/resources/frameworks-and-standards/edrm-model/edrm-diagram-elements/

[2] Information Governance Reference Model (IGRM) – https://edrm.net/resources/frameworks-and-standards/information-governance-reference-model/

[3] Thomson Reuters Institute, Alternative Legal Service Providers Report 2025 – https://www.thomsonreuters.com/en-us/posts/wp-content/uploads/sites/20/2025/01/LSP-Report-2025.pdf

[4] Reuters: Alternative legal services market reaches $28.5 bln (2025) – https://www.reuters.com/legal/legalindustry/alternative-legal-services-market-reaches-285-bln-report-says-2025-01-28/

[5] Microsoft Learn: eDiscovery (Premium) limits – https://learn.microsoft.com/en-us/purview/ediscovery-premium-limits

[6] Microsoft Learn: Limits for Content search and eDiscovery (Standard) – https://learn.microsoft.com/en-us/purview/ediscovery-limits-for-content-search

[7] Microsoft Learn: Service advisories for eDiscovery throttling in Exchange Online – https://learn.microsoft.com/en-us/microsoft-365/enterprise/microsoft-365-ediscovery-throttling-service-advisory

[8] Microsoft Learn: Analyze data in a review set in eDiscovery (Premium) – https://learn.microsoft.com/en-us/purview/ediscovery-analyzing-data-in-review-set

[9] Microsoft Learn: Export case data in eDiscovery (Premium) – https://learn.microsoft.com/en-us/purview/ediscovery-exporting-data

[10] Microsoft Learn: Investigating partially indexed items in eDiscovery – https://learn.microsoft.com/en-us/purview/ediscovery-investigating-partially-indexed-items

[11] Google Workspace blog: Using Google Vault to simplify eDiscovery exports and litigation holds – https://workspace.google.com/blog/ai-and-machine-learning/three-best-practices-simplify-ediscovery-exports-and-litigation-holds-google-vault

[12] Google Developers: Vault usage limits – https://developers.google.com/workspace/vault/limits

[13] Google Vault Help: Export data from Vault – https://support.google.com/vault/answer/2473458?hl=en

[14] Google Vault Help: What is Google Vault? – https://support.google.com/vault/answer/2462365?hl=en

[15] Cornell LII: FRCP Rule 26 – https://www.law.cornell.edu/rules/frcp/rule_26

[16] The Sedona Conference: Commentary on Proportionality in Electronic Discovery (2017) – https://www.thesedonaconference.org/sites/default/files/publications/Commentary%20on%20Proportionality%20in%20Electronic%20Discovery.18TSCJ141.pdf

[17] Exterro: Early Case Assessment (Basics of e-discovery, Chapter 5) – https://www.exterro.com/basics-of-e-discovery/chapter-5-early-case-assessment

[18] EDRM: Early Case Assessment—An eDiscovery Primer – https://edrm.net/2022/09/early-case-assessment-an-ediscovery-primer/

[19] Veritas: Six Trends Affecting eDiscovery in 2024 (Shift Left) – https://www.veritas.com/content/dam/www/en_us/documents/white-papers/WP_ebook_ediscovery_trends_for_2024_V2064.pdf

[20] EDRM: Moving eDiscovery Upstream (context) – https://edrm.net/2024/02/moving-ediscovery-upstream/
