Annotating English Judgments for AI: Structural and Practical Challenges



The fact that Chinese legal artificial intelligence researchers are able to annotate court judgments faster than their English peers sheds light on structural and practical challenges faced by legal technology developers in the common law world.

Natural language processing (NLP) is a major method for legal artificial intelligence (AI) development. Nevertheless, manual annotation has been the bottleneck for many NLP projects. This is especially the case for annotating court judgments. Manual annotation of judgments often means that researchers need to tag a court judgment with different labels such as facts, reasoning, and verdict, so that machines can gradually be trained to extract key elements from judgments.
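To make the annotation task concrete, the following is a minimal sketch of what a labelled judgment might look like as data. The label names come from the paragraph above; the `Span` record, the `annotate` helper and the sample judgment text are all hypothetical and are not taken from any team's actual tooling.

```python
from dataclasses import dataclass

# Hypothetical label set, taken from the three labels mentioned above.
LABELS = ("facts", "reasoning", "verdict")


@dataclass
class Span:
    start: int  # character offset where the labelled passage begins
    end: int    # character offset just past the end of the passage
    label: str


def annotate(text: str, passage: str, label: str) -> Span:
    """Record that `passage` (found verbatim in `text`) carries `label`."""
    if label not in LABELS:
        raise ValueError(f"unknown label: {label}")
    start = text.find(passage)
    if start == -1:
        raise ValueError("passage not found in judgment text")
    return Span(start, start + len(passage), label)


# Invented three-sentence "judgment", for illustration only.
judgment = (
    "The claimant was injured in a road accident. "
    "The defendant owed a duty of care and breached it. "
    "Judgment for the claimant."
)

annotations = [
    annotate(judgment, "The claimant was injured in a road accident.", "facts"),
    annotate(judgment, "The defendant owed a duty of care and breached it.", "reasoning"),
    annotate(judgment, "Judgment for the claimant.", "verdict"),
]
```

Each annotation is simply a character range plus a label. The human effort lies in deciding where each range begins and ends, which is why annotation tends to become the bottleneck.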

In a recent workshop, researchers from Tsinghua University and the University of Oxford discussed NLP research in legal domains, in particular sharing their experience in manually annotating court judgments. It turns out that the Tsinghua team managed to annotate many more judgments than their peers in Oxford in a relatively short period.

Two reasons may explain this. The first is that writing style varies widely among English judgments from different courts and judges. One well-known example is Lord Denning’s ‘bluebell time in Kent’ in Hinz v Berry [1970] 2 QB 40. Such an idiosyncratic judicial style forces annotators to spend considerable time working out a judge’s approach to the factual and legal issues before they can label anything.

By contrast, Chinese judgments are much more formalised and homogeneous (see a translated example). The Supreme People’s Court issued a protocol on how to structure a judgment, and this approach is followed by the key actors in the whole judicial system. Normally, a Chinese judgment adopts the following structure: (1) a brief procedural history; (2) parties’ argument and evidence; (3) court’s finding of facts; (4) court’s reasoning regarding disputed issues; and (5) final ruling. The formalised style saves annotators a lot of time when labelling texts.
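A toy sketch can illustrate why a fixed structure saves time: when sections always appear in the same order, a label can be assigned by position alone. The section names below paraphrase the five-part structure described above. The function `label_by_position` is hypothetical, and the assumption of exactly one paragraph per section is a deliberate simplification that real judgments will not satisfy.

```python
# The five sections listed above, in the order a judgment normally presents them.
SECTIONS = (
    "procedural_history",
    "arguments_and_evidence",
    "findings_of_fact",
    "reasoning",
    "ruling",
)


def label_by_position(paragraphs: list[str]) -> list[tuple[str, str]]:
    """Pair each paragraph with a section label purely by its position.

    Simplifying assumption: exactly one paragraph per section.
    """
    if len(paragraphs) != len(SECTIONS):
        raise ValueError("expected one paragraph per section")
    return list(zip(SECTIONS, paragraphs))


# Placeholder paragraphs standing in for the text of a judgment.
labelled = label_by_position(["p1", "p2", "p3", "p4", "p5"])
```

No comparable shortcut exists for a judgment written in a free-form style, where an annotator must read closely to find where the facts end and the reasoning begins.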

The second reason is a practical one. Copyright concerns over English judicial materials are keeping researchers away. The Copyright Policy of the British and Irish Legal Information Institute (BAILII) prohibits users from incorporating data ‘into the output of a computer program’. For the time being, the University of Oxford seems to be the only institution that is permitted by BAILII to use its mass dataset for AI purposes. The shutdown of ROSS, a legal AI pioneer, amid a copyright infringement lawsuit suggests that legal publishers such as Westlaw and LexisNexis are even less likely to allow users to conduct AI analysis based on their data. This is discouraging for AI researchers, since legal publishers’ products (eg headnotes of reported cases) can be ideal data for manual annotation, given their formalised structure and informative content.

Whereas English researchers’ hands are tied by copyright concerns, the Chinese legislature and judiciary take a more flexible approach. Article 5 of the Chinese Copyright Law provides that ‘This law shall not be applicable to … documents of legislative, administrative or judicial nature’. Further, all Chinese judgments are published on one single platform, where the User Agreement does not prohibit users from conducting NLP studies (unless the website operation is affected or attacked by data mining activities). This open access to judicial data has accelerated the development of legal AI in China. For example, a judicial reading comprehension dataset has been developed to help researchers extract elements from mass judicial texts.

To conclude, one must remember that open-source practice is what boosts innovation in AI. Assuming that (1) the idiosyncratic drafting style will continue to be one of the key features of common law judgments, and (2) it is unlikely that the common law courts will issue a practice direction to standardise the structure of judgments, it is more practical to tackle the copyright challenge first. With regard to the publicly available ‘plain judgments’ published by judicial authorities and institutions such as BAILII, it would be less controversial if they took a more open approach to copyright policy, since these primary legal materials arguably fall within the scope of the public domain (on copyright in judgments under common law, see, for example, Brendan Clift, ‘Rights in Primary Legal Materials in Hong Kong: Proprietary or Prerogative; Private or Public’ (2010) 40 Hong Kong LJ 659). Granting unrestricted licences for data mining of such public sector information would be a good starting point. With open access to mass data, more researchers would be able to collaborate and develop NLP datasets of common law judgments. As a next step, with mature NLP technology that can comprehend judicial texts, researchers would be in a better position to negotiate with legal publishers to collaborate on commercialisation and lead the legal industry into the age of AI.


(I would like to express my appreciation to the organisers and participants of the webinar ‘Legal NLP in China and England’ for sharing their impressive work, which inspired me to write this piece. My special thanks to Jacob Sin of Herbert Smith Freehills and the editorial team of the Oxford Business Law Blog for their comments. The views expressed herein are my own and should not be attributed to any organisation.)

Zhaoyi Song (Troy) is a Trainee Solicitor at Herbert Smith Freehills.

