Introduction:
In our previous post, we provided a comprehensive introduction to the concept of using Language Model AI (eDiscovery AI) for predictive coding. We explored how eDiscovery AI can revolutionize the way legal professionals handle vast amounts of data, enhancing efficiency and accuracy in the review process. Building upon that foundation, this blog post will delve into the workflow of using eDiscovery AI, encompassing the testing and refining of instructions, as well as evaluating performance through recall and precision metrics. Let’s unlock the potential of this game-changing tool!
Designing Effective Instructions:
The first step in the workflow is to develop instructions that effectively guide the eDiscovery AI model in understanding and identifying relevant documents. Consider these key aspects:
a. Targeted Terminology: Carefully select and craft the language used in instructions to capture the nuances of the case, utilizing legal terminology, relevant keywords, and industry-specific jargon.
b. Include Examples: Consider including specific examples of the types of documents that are relevant and not relevant. These examples are short, plain-English descriptions of those document types, not the actual documents that earlier predictive coding software applications required. For instance, you might direct eDiscovery AI to treat certain content as relevant, but to consider it not relevant when the only reference to that information comes from outside parties.
c. Iterative Process: Start with a set of initial instructions and run them through the model on a small set of documents, ideally including both relevant and not-relevant examples. Review the results, refine your instructions based on the model’s responses, and gradually increase the size of the document set you use for testing. Reviewing the model’s explanations during this part of the process is very helpful for understanding why the system considered a document relevant or not, so you can make the necessary updates to the instructions. Continuous iteration and improvement are crucial for achieving optimal performance.
Based on the model’s performance on the test set, analyze the results and refine the instructions to improve accuracy. Consider the following actions:
a. Analyzing Errors: Identify any false negatives (relevant documents the model missed) or false positives (not-relevant documents predicted as relevant). Analyze the patterns and characteristics of these errors to fine-tune instructions accordingly.
b. Expert Feedback: Collaborate with legal experts to gain insights into the model’s performance and seek their expertise in refining instructions.
c. Iterative Cycle: Repeat the training, testing, and refining process iteratively, incorporating feedback and gradually enhancing the eDiscovery AI model’s ability to accurately identify relevant documents.
Classifying and Testing:
Once the instructions have been developed, tested, and are performing effectively, it’s time to run the entire data set through the eDiscovery AI model.
a. Classify the Documents: Run the entire document set through the eDiscovery AI model for all of the necessary issues and categories. The system will automatically populate the classifications into your Relativity workspace.
b. Create a Control Set: Create a random sample that’s representative of the entire set, based on appropriate statistical parameters for your case.
c. Review the Control Set: A subject matter expert then manually reviews the documents in the control set, coding each document for every category being analyzed by eDiscovery AI.
d. Calculate Performance Metrics: Compare the manual coding decisions in the control set with the classifications made by eDiscovery AI to generate statistical metrics that measure the performance of the process.
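As a rough illustration of the control-set sizing in step b, the sample size is often estimated with the standard normal-approximation formula plus a finite population correction. The confidence level (95%) and margin of error (5%) below are example parameters, not recommendations; choose values appropriate to your case:

```python
import math

def control_set_size(population, z=1.96, margin=0.05, p=0.5):
    """Estimate a simple random sample size via the normal approximation
    (z = 1.96 for 95% confidence, p = 0.5 is the most conservative choice),
    then shrink it with the finite population correction."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

# Example: a 100,000-document data set at 95% confidence, +/-5% margin
print(control_set_size(100_000))  # 383
```

Note that larger document populations barely change the sample size once the population is big, which is why control sets stay manageable even on very large matters.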
Evaluating eDiscovery AI Performance:
The evaluation of the eDiscovery AI model’s performance is crucial to measure its effectiveness and ensure its alignment with project goals. Key performance metrics include:
a. Recall: Assess the model’s ability to identify relevant documents by calculating the ratio of correctly identified relevant documents to the total number of relevant documents in the dataset.
b. Precision: Determine the accuracy of the model’s predictions by measuring the ratio of correctly identified relevant documents to the total number of documents predicted as relevant.
c. Balancing Recall and Precision: Strive for a balance between recall and precision metrics, as optimizing one may impact the other. Fine-tune instructions and model parameters accordingly to achieve the desired balance.
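To make these metrics concrete, here is a minimal sketch of computing recall and precision from a coded control set. The document IDs are invented for illustration; in practice the two sets come from the reviewer’s coding and eDiscovery AI’s classifications for a single category:

```python
def recall_precision(manual_relevant, predicted_relevant):
    """Compute recall and precision given the set of control-set documents
    a reviewer coded relevant and the set the model predicted relevant."""
    true_positives = len(manual_relevant & predicted_relevant)
    recall = true_positives / len(manual_relevant) if manual_relevant else 0.0
    precision = true_positives / len(predicted_relevant) if predicted_relevant else 0.0
    return recall, precision

# Hypothetical control-set results (document IDs are invented)
manual = {"DOC-001", "DOC-002", "DOC-003", "DOC-004"}  # reviewer coded relevant
predicted = {"DOC-001", "DOC-002", "DOC-005"}          # model predicted relevant

recall, precision = recall_precision(manual, predicted)
print(f"recall={recall:.2f} precision={precision:.2f}")  # recall=0.50 precision=0.67
```

Here the model found two of the four truly relevant documents (recall 0.50), and two of its three relevant predictions were correct (precision 0.67), illustrating the trade-off described above.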
Conclusion:
Mastering the workflow of using eDiscovery AI for predictive coding is essential to harness its full potential. By designing effective instructions, training and testing the model, and iteratively refining instructions and model performance, legal professionals can enhance the accuracy and efficiency of document review. Evaluating performance using recall and precision metrics ensures the model’s effectiveness and guides further improvements. Stay tuned for our next blog post, where we will explore advanced techniques and best practices in leveraging eDiscovery AI.
Request a Demo to Learn More