BUILD is a sequential sentence classification dataset that provides structure to Indian court judgements by labelling each sentence with its rhetorical role. Automatic structuring of court judgements is a foundational building block for other applications such as summarization and automatic charge identification. BUILD was created as part of the OpenNyAI mission by EkStep Foundation, Thoughtworks, Agami, National Law School's law and technology society (Bangalore) and Rohini Nilekani Philanthropies.

For more details about BUILD, please refer to our paper:

For details about data download, preprocessing, baseline model training, and evaluation, please refer to the GitHub repository. To try summarization by rhetorical role on custom judgement text using the baseline model, please refer to the Colab notebook.

BUILD is distributed under a CC BY-SA 4.0 License. The training and development sets can be downloaded below.

Once you have built your model, you can use the evaluation script we provide below to evaluate model performance by running python <path_to_prediction> <path_to_gold>
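The leaderboard ranks models by Weighted-F1. As a rough illustration of that metric, and not a substitute for the official evaluation script, here is a minimal Python sketch. It assumes the gold and predicted rhetorical roles have already been loaded as two equal-length lists with one label per sentence; the function name weighted_f1 and the label names are hypothetical and do not come from the BUILD files.

```python
from collections import Counter


def weighted_f1(gold, pred):
    """Return the support-weighted mean of per-label F1 scores.

    gold and pred are equal-length lists of labels, one per sentence.
    Each label's F1 is weighted by how often it occurs in gold.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("no labels to score")

    support = Counter(gold)      # occurrences of each label in gold
    predicted = Counter(pred)    # occurrences of each label in pred
    correct = Counter(g for g, p in zip(gold, pred) if g == p)

    total = 0.0
    for label, n in support.items():
        tp = correct[label]
        precision = tp / predicted[label] if predicted[label] else 0.0
        recall = tp / n
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        total += f1 * n
    return total / len(gold)


# Hypothetical role labels for the sentences of one judgement.
gold = ["FACTS", "FACTS", "ARGUMENT", "RULING"]
pred = ["FACTS", "ARGUMENT", "ARGUMENT", "RULING"]
print(f"Weighted-F1: {weighted_f1(gold, pred):.4f}")
```

Because each label is weighted by its frequency in the gold data, frequent roles influence the score more than rare ones. Scores from this sketch may differ from the official script if that script handles file parsing or label sets differently.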

To submit your models and evaluate them on the official test sets, please read our submission guide hosted on CodaLab.

Rank Model Code Weighted-F1

Bert-base HSLN (Baseline model)
EkStep, Thoughtworks