A mere 15 years ago, ediscovery itself represented a sea change within the field of discovery, formerly grounded in physical files and objects. Email posed an enormous challenge to discovery professionals who had to learn how to collect, preserve, categorize, process, review, and produce it in litigation.

Today there’s a new transformation happening, but this time, it’s not about a single category of evidence like email. New data sources crop up every day, with unique formats, locations, and properties. And the current ediscovery transformation is about more than just these new sources of data: even within preexisting data channels, there are entirely new modes and forms of communication now, requiring a new approach to all of ediscovery. The key to success with these new data sources will be creativity — a willingness to look for alternative data sources and combine them in novel ways. Let’s review some of these new data sources, considering the problems they can create and solutions ediscovery professionals can use.

New Data and New Communications

Yes, email continues to provide the bulk of ediscovery information, though text messages and social media posts are gaining ground. But beyond intentional communication, a whole world of data is continually being generated and stored. You may, with ingenuity and legwork, be able to find data that businesses and individuals rarely realize they’re creating. Consider what you could do with:

  • Collaboration messages, images, or videos from Slack, Symphony, FactSet, ICE instant messaging, WhatsApp, FaceTime, or Skype, among dozens of others.
  • Location information from mobile phones, vehicles with GPS capability, computer connections to Wi-Fi networks, Uber account records, or Google Maps search histories. Even if a smartphone’s GPS setting is off, making calls produces a record of the phone’s location, which can be used against the caller in court.
  • Vehicle operation information from driver-assist cars or — before too long — entirely autonomous self-driving cars, including details about travel speed, lane departures, and collision warnings, as well as the vehicle’s self-corrective actions.
  • Artificial intelligence applications and queries. While anyone can ask IBM Watson a question, those queries may be available as evidence later. (And Watson’s cousin ROSS is an attorney, so you know it knows about ediscovery.)
  • Background listening records from voice assistants like Siri, Alexa, or Google Assistant, which constantly monitor sounds to detect their wake words.
  • Internet of Things sensors, including motion-activated lights, home alarms, motion-activated surveillance cameras, keycard access points, or even appliance sensors. (Could a property line dispute be resolved by determining where an automated lawn mower’s boundary has been defined?)
  • Fitness tracker data, including steps taken, GPS locations, stairs climbed, and the user’s heart rate, much of which has already been used to build a murder case.
  • Other wearables such as the Smart Cap, designed to detect driver fatigue among professional drivers.

Also look at the new forms of communication being used within existing channels. Words are falling out of favor today, replaced by emojis, GIFs, memes, photos and other images, and video, not to mention animated emojis and enhanced pictures on platforms like Snapchat.

More Data, More Problems

But all these sources of data bring with them real problems.

Legal’s knowledge of data streams. All too often, legal departments aren’t aware of every data source that exists within their company. Employees may install apps or IT might deploy sensors without consulting legal, which prevents any sort of preservation or collection effort for potentially discoverable data. Develop strict policies that prohibit the use of unauthorized apps and clearly establish company ownership of all data. Work with your IT department to prevent installation or use of unauthorized apps. And enforce these policies when they’re circumvented, ensuring that employees take their generation of potentially discoverable data seriously. Once you’ve considered data access and retention (below), consider whether you should declare specific apps off-limits because of the problems they pose.

Access to data. Who owns the data you’re after? Recently, Arkansas police tried to obtain Amazon Echo voice records related not to specific commands but to the device’s background listening function. Amazon fought the subpoena until the suspect allowed the records to be released. While your business should own all data generated within the scope of business activities, ownership doesn’t guarantee access. Research the collaboration and messaging apps in use at your company. Learn how to access, preserve, and interpret any data that your employees are generating with those or other apps. In the process, determine whether data is encrypted and, if so, how it can be decrypted.

Retention of data. Closely related to data access is data retention. Legal must be aware of all data streams to ensure that data is preserved from every source or channel used by employees. Preserving data demands that you locate it, whether it’s on company servers or stored within the app’s own cloud. Be aware, too, that some apps do not store messages at all; Uber, for example, sent encrypted self-destructing messages using Wickr. Even when data is stored, it may not be retained long enough to be useful.
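As a rough illustration of the retention concern above, the following Python sketch keeps a simple inventory of data sources and flags those whose data from a given date may already have been purged. The `DataSource` class, the `at_risk` function, the source names, and the retention periods are all hypothetical, invented for this example; real retention behavior has to be confirmed with IT or with each vendor.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DataSource:
    """One entry in a hypothetical data-source inventory."""
    name: str
    location: str                  # e.g. "company server" or "vendor cloud"
    retention_days: Optional[int]  # None = kept indefinitely, 0 = never stored


def at_risk(sources, event_date, today):
    """Return names of sources whose data from event_date may already be gone."""
    age_in_days = (today - event_date).days
    return [
        s.name
        for s in sources
        if s.retention_days is not None and age_in_days >= s.retention_days
    ]


# Illustrative values only; none of these figures come from a real product.
sources = [
    DataSource("email archive", "company server", None),
    DataSource("chat app", "vendor cloud", 90),
    DataSource("ephemeral messenger", "not stored", 0),
    DataSource("keycard logs", "company server", 365),
]

flagged = at_risk(sources, event_date=date(2024, 1, 15), today=date(2024, 6, 1))
print(flagged)
```

Even an inventory this simple makes it easier to see which sources need a preservation step first once litigation is anticipated.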

Interpretation of data. For non-text communication, identifying, contextualizing, and interpreting data pose tremendous challenges. What does a “puzzled face” emoji mean in a text message related to litigation? Is it sexual harassment if an employee texts another an emoji of an eggplant? Work with IT and your ediscovery vendors or software providers to collect, preserve, and make sense of non-text forms of communication.

Solutions to New Data Problems

The primary solution to all of these new data problems is creative thinking. Be ready to look for unusual types of data in novel or even bizarre locations. Continually ask how you could prove your case or any small element of it, or how you could disprove your opponent’s case. Think particularly about what you could learn by layering multiple forms of data together, looking for patterns or anomalies.

Equally important for ediscovery, how can you collect these new sources and types of data? How will you preserve, interpret, process, and produce them? How can you standardize an approach to data retrieval and processing when the forms of data are themselves so varied?

Keep an open mind and be flexible enough to adapt to whatever data sources you encounter. Be ready not only to preserve information from novel sources, preventing its deletion or overwriting, but also to request unique forms of data from your opponent. Establish solid communications with your IT department so that you understand what sensors or devices surround you, generating data about the day-to-day operations of your company. The future is now. Are you ready?
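To make the idea of layering multiple forms of data more concrete, here is a minimal Python sketch that merges records from several sources into one timeline and flags moments where two sources disagree about location. The record layout, the sample events, and the `layer` and `conflicts` functions are assumptions made up for illustration; real exports from keycard systems, phones, or fitness trackers would need their own parsing and normalization first.

```python
from datetime import datetime, timedelta

# Hypothetical records in the form (timestamp, source, place).
keycard = [(datetime(2024, 3, 4, 9, 0), "keycard", "HQ lobby")]
phone = [
    (datetime(2024, 3, 4, 9, 5), "phone GPS", "airport"),
    (datetime(2024, 3, 4, 13, 0), "phone GPS", "HQ lobby"),
]
fitness = [(datetime(2024, 3, 4, 9, 2), "fitness tracker", "airport")]


def layer(*streams):
    """Merge all streams into one chronological timeline."""
    return sorted(event for stream in streams for event in stream)


def conflicts(timeline, window=timedelta(minutes=15)):
    """Find pairs of events from different sources that fall within
    `window` of each other but report different places."""
    found = []
    for i, (t1, src1, place1) in enumerate(timeline):
        for t2, src2, place2 in timeline[i + 1:]:
            if t2 - t1 > window:
                break  # timeline is sorted, so later events are even further away
            if src1 != src2 and place1 != place2:
                found.append((src1, src2, t1, t2))
    return found


timeline = layer(keycard, phone, fitness)
for src1, src2, t1, t2 in conflicts(timeline):
    print(f"{src1} at {t1:%H:%M} disagrees with {src2} at {t2:%H:%M}")
```

A flagged disagreement is only a lead, not a conclusion: a borrowed keycard, a phone left behind, or a clock that is set wrong could each explain it, so any anomaly still needs human review.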