Effective Multilingual E-discovery

In summary: Best practices for law firms.

Multilingual e-discovery challenges: Firms must navigate diverse languages, formats, and jurisdictions with advanced e-discovery technology.
Data collection and processing: Legal teams must address data laws and technology limitations to handle multilingual documents properly.
Technological expertise and planning: Law firms require international e-discovery plans and a balance of machine translation and human expertise for effective multilingual case management.

Nowhere is the language obstacle more pervasive than in the legal profession—specifically in the area of discovery—where documents created in any language could be relevant to a lawsuit, investigation, or regulatory matter.

Fortunately, law firms facing material in multiple languages have various technologies at their disposal to help manage this challenge.

Sifting through and organizing documents in more than one language is called multilingual e-discovery.

It’s an increasingly common situation in a globalized era, and many lawyers find themselves working on cases covering material in multiple languages. Luckily, this new age of legal globalization coincides with a new generation of e-discovery technology. Companies such as Kroll Ontrack are leading the way in developing e-discovery systems that support multiple languages.

Multilingual e-discovery is a real asset for lawyers working on multi-language projects, but law firms must understand the technology they can access. Although the technology is increasingly sophisticated, people working with it need to understand the limitations and capabilities of what they are using.

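One of the key fundamentals to grasp is the difference between code pages and Unicode. A code page maps byte values to the characters of one language or group of languages, so the same bytes can stand for different characters under different code pages. Unicode assigns a single, unique code point to every character in every script it supports. Most electronic documents will be saved in one of these formats, which has implications for the accuracy of e-discovery processing. The optimum system may vary from one project to the next, depending on the languages being used.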
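
To make the difference concrete, here is a minimal Python sketch. It assumes a Russian word saved under the Windows Cyrillic code page (cp1251) and then opened by a tool that assumes the Western European code page (cp1252). The sample word and variable names are illustrative only and do not describe any particular e-discovery product.

```python
# Russian for "hello", saved under the Windows Cyrillic code page.
original = "Привет"
raw_bytes = original.encode("cp1251")

# Read back with the correct code page: the text is intact.
as_cyrillic = raw_bytes.decode("cp1251")

# Read back with the Western European code page: every byte is still
# "valid", so no error is raised, but the text is silently wrong.
as_western = raw_bytes.decode("cp1252")

# Unicode (here stored as UTF-8) needs no per-language code page choice.
utf8_round_trip = original.encode("utf-8").decode("utf-8")

print(as_cyrillic)      # Привет
print(as_western)       # Ïðèâåò
print(utf8_round_trip)  # Привет
```

The wrong code page raises no error, which is why this kind of damage can go unnoticed until a document is searched or produced.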

Document e-discovery faces additional technical challenges when documents use different alphabets or are written from right to left (as Arabic is). Languages also mark word boundaries differently. For instance, Chinese and Japanese are normally written without spaces between words, so a space cannot be relied on to indicate where a word starts and ends. Systems need an additional layer of programming to index and search multilingual documents in these kinds of languages.
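
The word-boundary problem can be sketched in a few lines of Python. The example below compares naive whitespace splitting with overlapping two-character chunks (bigrams), a simple technique sometimes used to index Chinese and Japanese text. The function names and sample sentences are assumptions for illustration. Production systems often use dictionary-based or statistical segmenters instead.

```python
def whitespace_tokens(text: str) -> list[str]:
    """Split on whitespace, which works for English-style text."""
    return text.split()


def char_bigrams(text: str) -> list[str]:
    """Return overlapping two-character chunks as crude search terms."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


english = "the contract was signed"
japanese = "契約書に署名しました"  # roughly "signed the contract"

english_terms = whitespace_tokens(english)    # four separate words
japanese_terms = whitespace_tokens(japanese)  # one unsplit "word"
japanese_bigrams = char_bigrams(japanese)

# A search for "contract" (契約) fails against the whitespace tokens
# but succeeds against the bigram index.
found_naive = "契約" in japanese_terms
found_bigram = "契約" in japanese_bigrams
```

Here `found_naive` is `False` and `found_bigram` is `True`: a keyword filter built only on whitespace would miss the Japanese document entirely.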

Additional challenges in a globalized world

Language issues may also affect the data-gathering process. A legal team gathering documentation from multiple jurisdictions has to deal with more than texts in numerous languages. Data collection laws are also likely to apply to these activities, and they may vary from one jurisdiction to another. Some of these laws may prevent data from being transferred outside the territory, which complicates or frustrates data-gathering efforts. This has implications for the technology used to store or transfer data across international borders.

Tools that lawyers commonly use for collection may only work with English and the other languages in the Western European character set.

The processing system must support the relevant character set to ensure that documents appear as they should. If the system does not support a particular character, that character will usually be rendered as a question mark, square box, or similar substitute when the document is retrieved. This can have severe consequences when those documents are later presented as evidence. Law firms must understand the capabilities of the systems they are working with to avoid problems such as this.
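
As a rough illustration of how such damage arises and how it might be caught early, the Python sketch below checks whether text survives a given encoding and counts Unicode replacement characters (U+FFFD) left behind by an earlier bad conversion. The helper names are hypothetical and are not taken from any e-discovery tool.

```python
def survives_encoding(text: str, encoding: str) -> bool:
    """Return True if every character can be stored in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def count_substitutes(text: str) -> int:
    """Count U+FFFD replacement characters, a sign of earlier damage."""
    return text.count("\ufffd")


contract_term = "契約書"  # Japanese for "written contract"

# Forcing the text into the Western European code page loses it:
# each unsupported character becomes a question mark.
lossy = contract_term.encode("cp1252", errors="replace")

# Reading UTF-8 bytes with the wrong decoder leaves U+FFFD markers.
damaged = contract_term.encode("utf-8").decode("ascii", errors="replace")

print(lossy)                                        # b'???'
print(survives_encoding(contract_term, "cp1252"))   # False
print(survives_encoding(contract_term, "utf-8"))    # True
print(count_substitutes(damaged) > 0)               # True
```

Running a check like `survives_encoding` before data is loaded into a processing platform is one way to find unsupported characters while the original files are still available.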

Plan ahead

With lawyers working in an increasingly globalized environment, law firms need to have an international e-discovery plan in place before they need one. When a project arrives that requires them to handle multilingual documentation, there won’t be time to put one together.

It’s important that law firms have the technological expertise in place to understand the requirements of processing and filtering multilingual data.

Law firms need to be able to ask the right questions before they invest in new technologies. This isn’t always as straightforward as it sounds. Although e-discovery service vendors will frequently advertise their system as Unicode compliant, this only means it recognizes Unicode characters. It does not necessarily mean the system can filter or process Unicode data or enable comprehensive search and sort capabilities.
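
One way to see the gap between recognizing Unicode and searching it is Unicode normalization. The same visible word can be stored as different sequences of code points, and a system that stores both forms faithfully can still fail to match them in a search. The Python sketch below is an illustration under that assumption, using a French word. It is not a description of how any vendor's search works.

```python
import unicodedata

# "café" stored with a separate combining accent (e + U+0301).
document = "Rendez-vous au cafe\u0301 demain"

# The reviewer types "café" with a single precomposed é (U+00E9).
query = "caf\u00e9"

# Both strings are valid Unicode and look identical on screen,
# yet a plain substring search does not match them.
naive_hit = query in document


def normalized_contains(document: str, query: str) -> bool:
    """Compare both strings after converting them to NFC form."""
    doc_nfc = unicodedata.normalize("NFC", document)
    query_nfc = unicodedata.normalize("NFC", query)
    return query_nfc in doc_nfc


normalized_hit = normalized_contains(document, query)

print(naive_hit)       # False
print(normalized_hit)  # True
```

A useful question for a vendor is therefore not only whether the system accepts Unicode, but whether it normalizes text before indexing and searching.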

The legal industry doesn’t have a strong track record on technological aptitude, but that aptitude is becoming an increasingly important part of working life in the industry. Being able to adopt new technologies is becoming an element of competitive differentiation. And it isn’t enough to leave technology to the IT team. Those working in law need to understand the technology available to them, as it affects how they perform as lawyers. This involves everything from asking the right questions of software vendors to understanding the capabilities of the software they are working with.

Taking the right approach

When a legal team works with multilingual documentation, it has two options. The first is to use lawyers with the relevant language capabilities to work on the documents. The second is to get the documents translated before working on them. In reality, many law firms approach projects using a mixture of the two methods. The balance usually depends on how well the firm can use the language capabilities within its existing team and on the cost of bringing in legal talent who offer the relevant language skills and also understand the subject matter.

Some law firms also use elements of machine translation when working with multilingual documentation.

Many law firms seek a compromise between using the cheaper, less reliable machine translation for bulk document review and human translation on a smaller number of key documents. There are many pitfalls associated with this approach. One of these is that machine translation systems can’t always accommodate documents written in more than one language.
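
As a hedged sketch of how mixed-language documents might be flagged before they reach a machine translation step, the Python example below counts the writing systems (scripts) used in a text and marks documents where a second script makes up a sizeable share. Script is only a rough proxy for language, since English and French, for example, share the Latin script. The function names and the 20% threshold are assumptions chosen for illustration.

```python
import unicodedata
from collections import Counter


def scripts_in(text: str) -> Counter:
    """Count letters per script, using the first word of each
    character's Unicode name (LATIN, CYRILLIC, CJK, ARABIC, ...)."""
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    return counts


def needs_human_routing(text: str, threshold: float = 0.2) -> bool:
    """Flag text where letters outside the main script reach the
    (assumed) threshold share of all letters."""
    counts = scripts_in(text)
    total = sum(counts.values())
    if total == 0:
        return False
    minority = total - max(counts.values())
    return minority / total >= threshold


mixed = "Payment due Оплата"       # English plus Russian
single = "Payment due on signing"  # English only

mixed_counts = scripts_in(mixed)
print(sorted(mixed_counts))         # ['CYRILLIC', 'LATIN']
print(needs_human_routing(mixed))   # True
print(needs_human_routing(single))  # False
```

Documents flagged this way could be set aside for a bilingual reviewer or split by language before translation, rather than being passed through a system that expects a single source language.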

Another is that legal language can be both ambiguous and vague. Some legal terms are ambiguous, meaning they have two or more distinct meanings. Some legal texts are vague, meaning they use concepts whose application to particular cases is indefinite. Machine translation software is not very good at making sense of either.

Whichever approach is used, law firms need to have a plan in place long before multilingual projects arise.

In many firms, not enough is being done to educate employees about the capabilities of different technological options. This causes difficulties when new multilingual work comes into the business. It also puts firms at a competitive disadvantage if they can’t offer multilingual capabilities in an increasingly globalized working environment.