MEng | July 9, 2024

M.Eng. Projects June 2024

Once again this semester, LII hosted a team of M.Eng. students who explored the application of human language technologies to legal information retrieval tasks. Under the guidance of LII’s Language and Data Scientist Dr. Sylvia Kwakye and recent M.Eng. graduate Adrian Hilton, the students started from work in progress and refined the data, techniques, and visualizations to get the data ready for the public to use on the LII website.

The students met weekly, sharing the results of their work, comparing notes on techniques and challenges, and reaching consensus on what looked most promising to try next. As they worked, they encountered (and became adept at untangling) difficulties with source datasets and pretrained models, alongside the usual software engineering challenges more familiar to them from their other coursework.

There was broad agreement that this experience would be particularly relevant to the work they are about to take on at their first jobs in the tech world. These days it seems as though every week there’s a new language model, approach, or tool to try out, and nearly as often, a new flurry of headlines about less-than-credible results from a big technology company’s latest product or feature. The ability to reality-check, and to do so efficiently, has never been more important, and this semester’s project gave the students opportunities to apply their considerable creativity to answering the question “are we there yet?”

As a bonus, the students explored in depth the alignments between a general legal ontology, corpus-specific topical indexes, and a topic model derived directly from the language of legal texts. That work provides a strong foundation on the data side for systematizing topical organization and evolving search for LII’s original content, including the Wex legal dictionary and encyclopedia, which was the focus of LII’s first hackathon.