Federal Court Endorses Use of Predictive Coding Software in E-Discovery

Chris Hanslik

March 26, 2012

A New York Magistrate Judge recently endorsed predictive coding technology as an acceptable way to satisfy a producing party’s review obligations in appropriate cases.  In Da Silva Moore v. Publicis Groupe & MSL Group, 11 Civ. 1279 (S.D.N.Y. Feb. 24, 2012), Magistrate Judge Peck issued the first judicial opinion formally recognizing the use of computer-assisted review to review a large volume of documents.  Specifically, Judge Peck wrote:  “What the Bar should take away from this Opinion is that computer-assisted review is an available tool and should be seriously considered for use in large-data-volume cases where it may save the producing party (or both parties) significant amounts of legal fees in document review.”

Computer-assisted review refers to tools that use sophisticated algorithms to enable a computer to determine the relevance of a document based on interaction with a human reviewer.  The process begins with a senior attorney who has extensive knowledge of the case reviewing and coding a “seed set” of documents drawn from the collected data.  This document-by-document review “trains” the computer: it identifies properties of the coded documents and uses them to “predict” how the reviewer would code the rest of the dataset.  As the reviewer continues to review and code additional sample documents, the computer refines its predictions of the reviewer’s coding.  When the system’s predictions and the reviewer’s coding become sufficiently consistent, the system has “learned” enough to make accurate predictions for the remaining documents.
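The training loop described above can be sketched roughly as follows.  This is a toy illustration only and not the software at issue in Da Silva Moore: the word-count scoring, the names SeedSetModel and train_until_consistent, the 0.9 agreement threshold, the stand-in reviewer function, and the sample documents are all assumptions made for the example.

```python
from collections import Counter


def tokenize(text):
    """Split a document into lowercase words."""
    return text.lower().split()


class SeedSetModel:
    """Toy relevance model (hypothetical): counts how often each word
    appeared in documents the reviewer coded relevant vs. irrelevant."""

    def __init__(self):
        self.relevant = Counter()
        self.irrelevant = Counter()

    def train(self, text, is_relevant):
        # Record the reviewer's coding decision for this document.
        target = self.relevant if is_relevant else self.irrelevant
        target.update(tokenize(text))

    def predict(self, text):
        # Predict "relevant" when relevant-word evidence outweighs the rest.
        score = sum(self.relevant[w] - self.irrelevant[w] for w in tokenize(text))
        return score > 0


def train_until_consistent(model, batches, reviewer, threshold=0.9):
    """Feed reviewer-coded batches to the model until its predictions on a
    new batch agree with the reviewer at least `threshold` of the time.
    Returns the number of batches reviewed, or None if never reached."""
    for round_number, batch in enumerate(batches, start=1):
        # Compare the model's predictions with the reviewer's coding
        # before the model has learned from this batch.
        agreed = sum(model.predict(doc) == reviewer(doc) for doc in batch)
        agreement = agreed / len(batch)
        # Then let the model learn from the reviewer's coding.
        for doc in batch:
            model.train(doc, reviewer(doc))
        if agreement >= threshold:
            return round_number
    return None


def reviewer(doc):
    """Stand-in for the senior attorney's coding (assumed rule)."""
    return any(word in tokenize(doc) for word in ("promotion", "pay", "leave"))


seed_set = [
    "promotion denied after leave",
    "lunch menu for friday",
    "pay raise and promotion review",
    "parking garage closed",
]
second_sample = [
    "promotion criteria memo",
    "friday lunch order",
    "leave request and pay",
    "garage parking pass",
]

model = SeedSetModel()
rounds_needed = train_until_consistent(model, [seed_set, second_sample], reviewer)
```

In this sketch the untrained model agrees with the reviewer on only half of the seed set, then matches the reviewer on every document in the second sample, so training stops after two rounds.  Commercial tools rely on far more sophisticated statistical methods, but the cycle of coding, predicting, comparing, and refining is the same one the paragraph above describes.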

The plaintiffs in Da Silva Moore brought a class action discrimination case under Title VII and the Family and Medical Leave Act.  In response to the plaintiffs’ initial document requests, the defendant asserted that it had approximately three million documents that it needed to review.  As a result, the defendant requested that it be allowed to use computer-assisted review technology.

It is important to note that both parties in Da Silva Moore agreed to use some form of predictive coding, but they disagreed on the details.  Judge Peck emphasized how essential transparency is when using predictive coding and praised the defendants for being open about their methods and coding.  Specifically, the defendants agreed to turn over all non-privileged documents in the “seed set” and in each of the “judgment based sample” sets, including both relevant and irrelevant documents.  The defendants also incorporated any coding changes proposed by the plaintiffs based on their review of these documents.  As a guide for other litigants, the court annexed the parties’ stipulated predictive coding protocol to its opinion.

The court also addressed how computer-assisted review compares with other known alternatives, noting that:

computer-assisted review works better than most of the alternatives, if not all the [present] alternatives.  So the idea is not to make this perfect, it’s not going to be perfect.  The idea is to make it significantly better than the alternatives without nearly as much cost.

The court referenced studies showing that manual review is not only expensive and slow, but also not necessarily as accurate as computer-assisted review.  The court also noted that the Federal Rules do not require a party to certify that its production is complete or perfect; rather, courts apply the Rule 26(b)(2)(C) proportionality doctrine.

Ultimately, Judge Peck found that the use of predictive coding software was appropriate in Da Silva Moore after considering the following five factors:

  1. The parties’ agreement;
  2. The vast amount of electronically stored information to be reviewed;
  3. The superiority of the computer-assisted review over the available alternatives (such as linear manual review or keyword searches);
  4. The need for cost effectiveness and proportionality under Rule 26(b)(2)(C); and
  5. The transparent process proposed by the defendant.

Companies today create and store ever-larger volumes of electronic data.  If and when those companies become involved in a lawsuit, it is increasingly likely that a significant portion of the relevant data will be electronically stored.  The Da Silva Moore opinion will now serve as a focal point to encourage parties and courts to allow computer-assisted review to help reduce the costs associated with manually reviewing such large amounts of data.  Computer-assisted review does carry higher vendor costs for loading and processing the documents, but in situations like the Da Silva Moore case, those costs should be much lower than the fees associated with a manual review.  It should also be noted that computer-assisted technology can be used for internal purposes as well, whether to review a large volume of data produced by another party or to conduct an internal investigation.