JT/DL: Audit Justice Tech
Plus new jobs
The JT/DL is a twice-monthly newsletter about justice technology news, events, and opportunities. My opinions do not reflect those of my employers or professional partners.
Dear Reader,
I’m doing something new this week: a guest post. This newsletter started in 2017 when I was teaching at Georgetown Law with Keith Porcaro. At the time, the newsletter was a way to talk about the syllabus and the projects our students worked on. The newsletter is a lot different today, and Keith has similarly evolved. Since then, he’s gone off to Harvard and Duke, and he’s cut a unique path through justice technology that leans on his rare mix of legal and technical expertise. He recently released an audit of a chatbot used by the Nevada Courts, and I asked him to write about it here. I hope you enjoy it.
This is my last post before I take a break for the summer. I’ll be back in the fall with more essays, news, events, and jobs in justice technology. Thank you to everyone who contributed this past year. I look forward to reconnecting with you in a few months.
Best,
Jason
Audit Justice Tech
By Keith Porcaro, assistant clinical professor at Duke Law School.
If my LinkedIn feed is accurate, law is in its AI experimentation era. Law firms are incinerating cash on tokens and models; big platforms are on the rise; and seemingly every day brings a flurry of new startups, research preprints, chatbots, agents, and AI-generated think pieces.
But what’s missing are useful, sober, public results. We have no idea what works until we study what happens, and we aren’t devoting enough attention to studying what happens. That’s where we come in, and where I hope we won’t be alone for long.
For the last two springs I’ve run an AI Audits practicum at Duke Law School, where (as of July 1) I’m an assistant clinical professor. In the practicum, a group of students work with me to audit a live AI tool from a partner court or legal aid organization. Our inaugural project looked at Legal Aid of North Carolina’s LIA chatbot (report here). This year, we worked with the Nevada Courts to audit a self-help chatbot for family law issues (report here), and with Scale Justice to help them develop a testing plan for evaluating a potential AI project (report coming next month). This fall, we plan to work with the Alaska Court System to audit AVA, a chatbot developed to assist Alaskans with probate issues.
The audits themselves are intense deep-dives into a project: we mix structured evaluations with “snowball” style investigations of emergent product issues. Our goal is not to decide whether to stop a project, continue it, or change course. In fact, our reports are deliberately opinion-light. Instead, we aim to give partners as complete a picture as possible about what is happening with their project, along with enough context to explain why our findings are meaningful or important. How partners interpret that picture and move forward is up to them.
We’ve been fortunate to have great partners who have not only made data and staff available to us, but also supported us in releasing our audit reports publicly. It can be intensely uncomfortable to release a report that highlights a project’s imperfections. But these uncomfortable results really do push the field forward, and hopefully drive better investments in future technology projects.
This work—audits, uncomfortable results, introspection—is something that every “justice tech” project needs, especially ones that use generative AI. And for students—for future lawyers—it is far more important to learn how to critically analyze and audit an AI system than to learn how to write a good prompt. As clients deploy more algorithmic and AI systems, audits like these will inevitably become part of a lawyer’s investigative toolkit. But nearly a year after our first audit, this is still the only program of its kind that I know of.
Instead, I worry that we’re seeing an explosion of AI benchmarks: (largely secret) tests to measure an AI model’s performance on a task. Benchmarks are helpful for developing a product, but they can’t replace results grounded in real data. In law and elsewhere, we’ve yet to see good evidence that AI benchmarks are generalizable to real-world settings. This is a real problem for justice tech projects that face unpredictable clients with unpredictable legal issues. Clients tell precisely the sort of stories that large language models struggle with: missing key facts, seasoned with irrelevant details and framing assumptions that may or may not be accurate. An “access to justice” benchmark might not capture this well, and could ultimately be more harmful than helpful when influencing product decisions.
To put it plainly, without hard looks at hard data, all of this “justice tech” experimentation is little more than a marketing exercise.
News
Audit report for self-help chatbot in the Nevada Courts. (Duke)
Judge learns lawyers on both sides of case used AI, cancels the trial, and kicks everyone off the case. (404 Media) (h/t Caleb Bushner)
Digital IDs or democracy? (Collaborative Research Center for Resilience) (h/t Cynthia Conti-Cook)
A Peter Thiel-backed tribunal is putting journalists on trial. (Hollywood Reporter)
The Illinois Supreme Court has a new policy of data transparency. (ISC) (h/t William Raftery)
AI in the court room. (Planet Money)
Events
Wikimania will be in Paris July 23-25. (WM)
[New] Works-in-Progress Presentation at the AI for Law Scholars Conference is August 13-14, and they’re accepting submissions. (NW)
The A2J Network Conference will be in Cincinnati October 21-22. (A2JN)
Jobs & Opportunities
Arnold Ventures is looking to fill multiple roles. (AV)
[New] The Bellagio Center has an open call for applications. (RF)
The Brennan Center for Justice has multiple openings. (BCJ)
The Center for Democracy and Technology has academic externships. (CDT)
The Chan Zuckerberg Initiative needs a counsel for AI and tech. (CZI)
The University of Chicago Crime Lab has multiple openings, including internships. (UCCL)
Code for America has multiple openings. (CfA) (h/t Russ Finkelstein)
[New] Draper Richards Kaplan Foundation and NextLadder Ventures launched the navigation tech initiative. (DRK) (h/t Keith Porcaro)
The Kapor Foundation needs research fellows. (KF)
OpenMinded has multiple openings. (OM)
The Pew Charitable Trusts needs a data and policy officer for their courts work. (Pew)
Recidiviz has multiple openings. (R)
Renaissance Philanthropy is hiring for multiple roles. (RP)
TechCongress opened applications for its fellowship program. (TC)
[New] TechTonic Justice needs a California State Organizing Director. (TTJ)
[New] Utah Courts’ Self-Help Center needs a content coordinator. (UC)



