Changing Lanes from TAR to CAL: The Fastest and Most Direct Paths to Uncovering Facts

Article by Stephen O’Malley & Robert Bird

Over the past several years, the legal technology industry has touted the benefits of technology-assisted review (TAR) and a variety of document analytics technologies, ranging from concept clustering to near-duplicate identification. When paired with trusted technology advisors and established processes, these technologies have helped legal teams reduce the cost of eDiscovery and identify documents that standard keyword searches would not have found. Despite these benefits, adoption rates remain low.

Traditional TAR, often referred to as “predictive coding,” is a complex, rigid and involved process. In contrast, continuous active learning (CAL) is relatively simple, with few requirements to get started. Results are generated quickly, and new feedback is continuously incorporated into the model. This significantly improves the legal team’s ability to access critical documents early in a case and gives the team the flexibility to adapt the model as new information surfaces.

Traditional TAR requires a computer to analyze a statistical sample of reviewed and coded documents and predict the likelihood that the remaining documents would receive similar coding. While TAR is more accurate and efficient than a linear review of a large document population, it has significant weaknesses. It requires an upfront (and sometimes significant) time commitment to identify valid statistical samples, review the samples, and train the TAR engine until the predicted results (recall and precision) meet the needs of the legal team.[1] This training effort often needs to be repeated, which makes the process time-consuming for time-conscious attorneys.
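For readers less familiar with the two metrics named above, here is a minimal Python sketch of how recall and precision could be computed from a reviewed sample. The eight-document sample and the function name `precision_recall` are hypothetical and used only for illustration; real TAR platforms report these figures through their own validation workflows.

```python
def precision_recall(predicted, actual):
    """Compare the engine's predictions with the reviewers' coding.

    Both arguments are lists of booleans, where True means "responsive".
    """
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)        # predicted and truly responsive
    false_pos = sum(1 for p, a in pairs if p and not a)   # predicted but not responsive
    false_neg = sum(1 for p, a in pairs if not p and a)   # responsive but missed

    # Precision: share of predicted-responsive documents that really are responsive.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: share of truly responsive documents that the engine found.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Hypothetical sample of eight documents coded by attorneys.
actual = [True, True, True, False, False, True, False, False]
# Hypothetical predictions from the engine for the same documents.
predicted = [True, True, False, True, False, True, False, False]

precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up sample the engine finds three of the four responsive documents and flags one nonresponsive document, so both figures come out to 0.75.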

Once trained, the TAR engine assigns each unreviewed document a probability that indicates how likely it is to be coded the same way. The time this process takes can be problematic in certain types of matters (such as investigations or complex litigation) because it prevents attorneys from accessing key case documents at the critical early stages of the review. That delay can affect a range of legal decisions, from which witnesses to depose to whether to move straight to settlement when a damning document comes to light. Suffice it to say, legal teams would likely weigh the impediments of traditional TAR against simply running keywords. While TAR could be faster than manual review, both would be slow to discover key documents when compared to CAL.

CAL is a major upgrade to traditional TAR because it removes the need for the upfront sample review. As a document review proceeds, the system constantly analyzes the documents that have been coded and compares the unreviewed documents to them. The system can therefore present documents early in the review effort based on the probability that they will be responsive. This gives the legal team a tremendous opportunity to access key documents early in the discovery process and make sound decisions based on the most relevant items available. It also improves the efficiency of the document-review team by presenting fewer nonresponsive documents within a review batch. In short, CAL gives clients timely access to critical documents without the time-intensive upfront training that traditional TAR applications require.
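The review loop described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any vendor's implementation: a toy word-overlap scorer stands in for whatever classifier a real platform uses, and the names `train`, `score`, `cal_review`, the sample documents, and the keyword-based `reviewer` function (a stand-in for an attorney's coding decision) are all hypothetical.

```python
from collections import Counter


def train(coded_docs):
    """Toy model: +1 per term seen in a responsive doc, -1 per term in a nonresponsive one."""
    weights = Counter()
    for text, responsive in coded_docs:
        for term in set(text.lower().split()):
            weights[term] += 1 if responsive else -1
    return weights


def score(weights, text):
    """Higher score means more similar to documents already coded responsive."""
    return sum(weights[term] for term in set(text.lower().split()))


def cal_review(documents, reviewer, seed_ids, batch_size=2):
    """Review in batches, retraining on all coding so far before each batch."""
    coded = {i: reviewer(documents[i]) for i in seed_ids}
    order = list(seed_ids)
    while len(coded) < len(documents):
        weights = train([(documents[i], label) for i, label in coded.items()])
        unreviewed = [i for i in range(len(documents)) if i not in coded]
        # Present the documents most likely to be responsive first.
        unreviewed.sort(key=lambda i: score(weights, documents[i]), reverse=True)
        for i in unreviewed[:batch_size]:
            coded[i] = reviewer(documents[i])  # the new coding feeds the next retrain
            order.append(i)
    return order, coded


# Hypothetical document set; the "reviewer" is a stand-in for an attorney's call.
documents = [
    "merger pricing memo for board",
    "lunch menu for friday",
    "board review of merger pricing",
    "parking garage closed friday",
    "pricing memo draft merger terms",
    "holiday party lunch menu",
]
reviewer = lambda text: "merger" in text

order, coded = cal_review(documents, reviewer, seed_ids=[0, 1])
print(order)
```

With only two seed documents coded, the first batch the loop serves up consists of the two remaining merger documents, and the nonresponsive items fall to the end of the queue. No separate control-set review happens before prioritized documents start to appear.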

Whereas traditional TAR struggles to deliver results quickly or to adapt to changes in the review, CAL can quickly elevate key documents based on new information. Traditional TAR’s slower workflow (first coding a control set, then validating the results and iterating toward a stable data set) limits the software’s ability to create a model for multiple decisions. A CAL workflow is simple, which means models can be created for multiple decisions, including hot, responsive and subject-specific issues. CAL makes workflows more flexible while expanding the volume of data that can be accessed, which helps attorneys perform their duties in the courtroom. Because CAL does not place the same emphasis on the control set, it can be turned on during initial review without the risks seen with a traditional TAR review protocol. In addition to allowing more frequent updates to refine the model, CAL surfaces documents earlier in the project.
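To make the idea of separate models for separate decisions concrete, the sketch below keeps one toy term-weight table per coding tag and folds in each reviewer decision as soon as it is made. The class name `MultiIssueModel`, the tag names, and the sample texts are hypothetical, and the simple word-count scoring is an assumption standing in for a real classifier.

```python
from collections import Counter


class MultiIssueModel:
    """One toy term-weight table per coding decision (tag)."""

    def __init__(self, tags):
        self.weights = {tag: Counter() for tag in tags}

    def update(self, text, decisions):
        """Fold a single coded document into every tag's table right away."""
        terms = set(text.lower().split())
        for tag, value in decisions.items():
            for term in terms:
                self.weights[tag][term] += 1 if value else -1

    def score(self, tag, text):
        """Score a document against one tag's table."""
        return sum(self.weights[tag][t] for t in set(text.lower().split()))

    def top(self, tag, texts):
        """Return the unreviewed text that scores highest for the given tag."""
        return max(texts, key=lambda t: self.score(tag, t))


model = MultiIssueModel(["responsive", "hot"])

# Each call represents one reviewer coding one document on both decisions.
model.update("merger pricing memo", {"responsive": True, "hot": False})
model.update("delete the merger emails now", {"responsive": True, "hot": True})
model.update("lunch menu friday", {"responsive": False, "hot": False})

unreviewed = [
    "merger pricing memo draft",
    "please delete these emails",
    "friday lunch order",
]
print(model.top("responsive", unreviewed))
print(model.top("hot", unreviewed))
```

The same three coding decisions produce different rankings for each tag: the pricing memo draft leads the responsive queue, while the deletion request leads the hot queue.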

Legal work is painstaking and time-sensitive; it requires both accuracy and efficient execution. CAL offers major advantages to law firms seeking a system that provides these attributes along with the ability to access critical documents early in the legal process. It represents a significant technological upgrade for attorneys who work many cases at once or need faster access to potentially critical documents.

The authors would like to thank Paul Wilson for providing research, writing, and editing assistance on this article.


