Understanding and Evaluating Data in Context

12 December 2020 10:00 by John de Villiers

By John de Villiers, Editor Lexis Digest

Conducting legal research can be tedious, monotonous and time-consuming, but performing timely and comprehensive legal research is critically important for lawyers. AI systems certainly aid lawyers by performing legal research on relevant case law and applicable statutes, faster and more thoroughly than most lawyers may be able to do on their own.

Such systems are proving powerful enough to use data to predict the outcome of litigation and enable lawyers to provide more impactful advice to their clients in connection with dispute resolution issues, freeing up lawyers to do what they do best.¹

With one big caveat: everything depends on data, which means collecting it, handling it, analysing it, understanding its nature and applying the results to maximum effect. In short, understanding data “structure”, that is, the way data is organised and stored in a database so that it can be accessed and analysed, is key to unlocking its value. Big data (vast amounts of data) is commonly divided into three types, each of which is considered briefly below:

Structured data;
Semi-structured data; and
Unstructured data.

Structured data
Structured data is data whose elements are individually addressable, which makes analysis efficient. It is organised into a formatted repository, typically a relational database made up of tables with rows and columns. The data has relational keys and maps easily into pre-designed fields, so it can be searched using Structured Query Language (SQL). Such data is the most heavily processed of the three types and, as a result, the simplest to manage. Examples of relational database applications include airline reservation systems, inventory control and sales transactions.
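
As a rough illustration of the point about relational keys and SQL, the sketch below builds a tiny in-memory database with Python's built-in sqlite3 module and totals sales per customer. The table names, column names and figures are invented for this example and do not come from any real system.

```python
import sqlite3

# Hypothetical schema and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE sales (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount REAL NOT NULL
    );
    INSERT INTO customers (id, name) VALUES (1, 'Acme Ltd'), (2, 'Baker Inc');
    INSERT INTO sales (customer_id, amount) VALUES (1, 1200.0), (2, 450.0), (1, 300.0);
""")


def total_by_customer(connection):
    """Return (customer name, total sales) rows, highest total first."""
    cursor = connection.execute(
        """
        SELECT c.name, SUM(s.amount) AS total
        FROM sales AS s
        JOIN customers AS c ON c.id = s.customer_id  -- the relational key
        GROUP BY c.name
        ORDER BY total DESC
        """
    )
    return cursor.fetchall()


totals = total_by_customer(conn)
print(totals)
```

Because every value sits in a named column, the question "how much did each customer spend?" can be answered with a single query and no guesswork about where the amounts are.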

Semi-structured data
Semi-structured data is information that does not reside in a relational database but has some organisational properties, such as metadata, that make it easier to catalogue, search and analyse. It can nevertheless be stored in a relational database. An Extensible Markup Language (XML) file is an example: tags give structure to the text.
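
To make the role of tags concrete, here is a minimal sketch that reads a small XML record with Python's standard xml.etree.ElementTree module. The record, its tag names and the case it describes are hypothetical and serve only to show the idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: the tag names and values are invented for illustration.
CASE_XML = """
<case>
  <title>Example v Sample</title>
  <court>High Court</court>
  <year>2019</year>
  <summary>Free text describing the dispute and its outcome.</summary>
</case>
"""


def read_case(xml_text):
    """Pull the tagged fields out of a single <case> record."""
    root = ET.fromstring(xml_text.strip())
    return {
        "title": root.findtext("title"),
        "court": root.findtext("court"),
        "year": int(root.findtext("year")),
    }


record = read_case(CASE_XML)
print(record)
```

The tags act as lightweight metadata: the title, court and year can be located by name even though no database schema exists, while the summary remains free text.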

Unstructured data
Unstructured data is essentially everything else. It is data that is not organised in a predefined manner or has no predefined data model, and is therefore not a good fit for a relational database. Alternative platforms are used to store and manage it. Unstructured data is becoming increasingly prevalent in IT systems, and organisations use it in a variety of business intelligence and analytics applications. Word, PDF and text documents, data from Facebook, Twitter and LinkedIn, and media such as digital photos are all examples of unstructured data.²
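
By way of contrast, the sketch below shows what working with unstructured text can look like. With no fields or tags to rely on, the code falls back on pattern matching to guess which tokens are years. The note and the helper function are invented for this example, and the pattern is a heuristic rather than a reliable extraction method.

```python
import re

# Hypothetical free-text note, invented for illustration.
NOTE = (
    "Met the client on 3 March 2020 about the lease dispute. "
    "She mentioned an earlier hearing in 2018 and a letter sent in 2019."
)


def find_years(text):
    """Return four-digit numbers from 1900 to 2099 found in the text, in order."""
    return [int(match) for match in re.findall(r"\b(?:19|20)\d{2}\b", text)]


years = find_years(NOTE)
print(years)
```

Nothing in the text says which number is a hearing date and which is a meeting date, and the same pattern would also pick up any other four-digit number in that range, such as a street address. That ambiguity is the practical cost of having no predefined data model.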

Using Google for legal research³
With the above in mind, let us briefly consider the nature, value and reliability of Google as a legal search tool, to illustrate the strengths and limitations of basing legal research on essentially unstructured data rather than on the semi-structured and structured data sources exemplified by software from a legal technology company. Why Google? Because Google is where most of us (92%) begin when we start searching publicly accessible documents on web servers. With hundreds of terabytes of information, it is big, no, huge data.⁴

Google strength: Google is very good at providing preliminary background information, such as Wikipedia entries, and at giving some context from a myriad of sources as researchers start exploring their subject. However, researchers need to distinguish between resources that are authoritative and those that are not.

Google weakness: Google allows users to search court cases, but it has a limited ability to weed out non-legal materials. By contrast, a search on a reputable and reliable legal technology platform, with its editorially curated control over the nature, quality and authoritativeness of the information, will be quicker than having to sift through masses of irrelevant information.

Google strength: Google automatically searches for related terms, which removes some of the frustration when searching unfamiliar subjects.

Google weakness: No guarantee of currency. Google has no control over the information it locates, or over whether that information is up to date, authentic or accurate. For example, Google does not evaluate citations or indicate the current status of legal precedents and primary law resources such as national and provincial legislation.

Google strength: Google easily finds legal research information across many jurisdictions and links to electronic resources as a starting point.

Google weakness: No consistent annotations or enhanced content. Lexis Library, by contrast, contains editor-generated explanations, headnotes, citations to related material, and links from within the text to other resources on the database, such as legislation and practical guidance.

Accessing data is the essence of legal research, and understanding its nature and how to evaluate it in its correct context is key to unlocking its value for the practical application of the law. Having access to tools that simplify this process, and that you can trust to provide up-to-date, relevant information, should be a priority for legal professionals.

Need access to a tool that gives you accurate and structured data and helps you take your law practice to the next level? Click here.

¹ The Rise of the RoboLawyers
² Difference between structured, semi-structured and unstructured data
³ Googling the Law: Apprising Students of the Benefits and Flaws of Google as a Legal Research Tool
⁴ Wikipedia, Google Search