The JT/DL is a twice-monthly newsletter about justice technology news, events, and opportunities. Subscribe for free to receive new posts and support my work.
News
ICE’s surveillance state isn’t tracking only immigrants. (New York Times)
ICE’s use of AI will lead to big mistakes, maybe that’s the point. (Rolling Stone)
Local police aid ICE by tapping school cameras amid Trump’s immigration crackdown. (The Guardian)
ICE observer says her Global Entry was revoked after agent scanned her face. (Ars Technica)
Police told to be “as vague as permissible” about why they use Flock. (404 Media)
Time and again, justice technology developers do the justice system dirty. They’ve sold biased risk assessment tools, software packages that lead to false arrests and habeas petitions, and “AI” that wasn’t. This embarrassing trend persists because the justice system and its advocates don’t build the tools and practices needed to interrogate and validate technologies old and new.
It’s in this ecosystem that state and local courts expect to double their use of genAI this year. To state the obvious: the courts are not ready to vet AI, that is, to know if an AI product does what the developers claim. This means untested technology will do everything from banal administrative tasks to reviewing filings for judgment. Done incorrectly, genAI can wrongly take away people’s freedom, limit their access to justice, and demolish their economic well-being. Without standardization, evaluation, and training around genAI, courts run the risk of perpetuating existing harms and creating new ones at the scale and speed of AI.
As part of the development of the Court Innovation Fund at Renaissance Philanthropy, we are exploring the development and use of AI benchmarks for court software. We believe that building benchmarks will create standards for court AI, incentivize a new generation of technologists to work on court innovation, and provide court officials actionable information when considering the purchase and adoption of AI systems.
First, a brief primer on benchmarks. Benchmarking is the process of figuring out if you can trust an AI system by gauging factors like accuracy or bias.
To accomplish this, a benchmark dataset has model inputs and model outputs. The inputs are the data that are fed into the model and the outputs are the ideal conclusions, like an answer sheet. To put this into context, say you wanted to test an AI model on how well it categorizes pictures of blueberry muffins and chihuahuas. You would put together a dataset of blueberry muffin and chihuahua images (model inputs) and feed them into the AI. Then you would compare the AI’s output (its attempt at classification) with the benchmark’s model outputs. The share of images where the AI’s output matches the benchmark’s model output is your accuracy number.
Not as easy as you thought, right?
Once the developers get their accuracy numbers back, they can make improvements on their model and then test again. It’s through this iterative process that AI models become better and more trustworthy. When applying benchmarks to genAI, this form of validation is only more important as we are not just asking “can the software classify this?” but “does the software do what its developers claim?”
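To make the muffin-versus-chihuahua example concrete, here is a minimal sketch of how an accuracy number could be computed. Everything in it is hypothetical: the `BenchmarkItem` class, the `score` function, the image IDs, and the model's predictions are invented for illustration, and a real benchmark would hold far more items and likely track more than accuracy.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkItem:
    image_id: str        # stand-in for the image fed to the model (model input)
    expected_label: str  # the benchmark's "answer sheet" entry (model output)


def score(benchmark, predictions):
    """Return the share of benchmark items the model labeled correctly."""
    if not benchmark:
        raise ValueError("benchmark is empty")
    correct = sum(
        1
        for item in benchmark
        if predictions.get(item.image_id) == item.expected_label
    )
    return correct / len(benchmark)


# Hypothetical four-image benchmark.
benchmark = [
    BenchmarkItem("img_001", "muffin"),
    BenchmarkItem("img_002", "chihuahua"),
    BenchmarkItem("img_003", "muffin"),
    BenchmarkItem("img_004", "chihuahua"),
]

# Hypothetical output from the AI being tested.
predictions = {
    "img_001": "muffin",
    "img_002": "chihuahua",
    "img_003": "chihuahua",  # the model mistakes a muffin for a dog
    "img_004": "chihuahua",
}

accuracy = score(benchmark, predictions)
print(f"accuracy: {accuracy:.0%}")  # prints "accuracy: 75%"
```

After developers change their model, they would rerun the same `score` call against the same benchmark, which is what makes results comparable from one round of testing to the next.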
Benchmarks are standard across fields like education, medicine, and science. They are often made public, so anyone can test against them and the results can be viewed like a nerdy leaderboard. This is an incredibly powerful feedback loop that creates a level of standardization and transparency that would otherwise be missing. Yet, there is no concerted effort to create benchmarks for AI deployed in the courts.
This is problematic, because AI in the courts isn’t trouble ahead; it’s here now. AI tools are helping litigants fill out forms, drafting police reports, and assisting in courtroom interpretation. They’re also being developed and tested to review eviction and debt filings, helping a judge determine how she should rule.
I find filing review tools illustrative of the potential of genAI in the courtroom—for both court and public—and the dire need for benchmarks.
Debt cases, which already accounted for 1 in 4 civil cases in state courts before the pandemic, are surging across the country. Usually suing for $10,000 or less, debt-buying companies file millions of cases across the U.S. with the hope that a defendant doesn’t appear in court. This allows the judge to rule against the defendant without considering the facts of the case. With that judgment, the company takes the defendant to collections, perpetuating a punishing cycle of debt. The problem is that many of these claims, drafted with the help of AI, do not pass basic scrutiny, like whether the defendant actually holds the debt.
To fight back against this predatory practice, states like Arizona and New York now require a judge to ensure that the claim meets statutory standards before entering a default judgment. This is a smart policy change to curb predatory debt suits. It also creates a massive administrative challenge for judges, who must sift through the ever-growing pile of claims. Luckily, automating the review process is a perfect job for AI.
However, we’re left wondering: do these new review tools actually work? Without independent verification, we are left with two main mechanisms to vet court technology: asking other courts if they liked the tool and vendor ad copy. This status quo is woefully insufficient.
Still, it’s easy to understand how we got here. Due to the significant costs, technical expertise needed, and logistical challenges of collecting and publishing court data, there is a dearth of datasets available for research and development in the courts. Adding complication, benchmarks need to evolve over time, either because the foundational models improve or the application of the tool changes. Collectively, these factors put benchmark development out of reach for most courts. This creates a need to do this work at scale and to build a commons infrastructure that all courts can benefit from.
That is why the Court Innovation Fund is focused on building a strategy to develop benchmark datasets for courts. The first step will be to figure out the economics and practicalities of doing this work at scale and over time. With that information, we can bring foundational benchmarks to court AI. Doing so creates standardized, reliable data for building and evaluating the performance of new tools that support just outcomes. It also creates an easy on-ramp for technologists to explore how their technology could help court issues, expanding the coalition of people developing solutions for these problems. Last, it gives courts more information when considering AI adoption.
Courts are places for fact-finding. They provide a process to gather and assess evidence to determine the truth behind a legal claim. We can’t let AI's accuracy and dependability—literally its ability to be factual—fall short of the standards required in court. That is the path we are currently on. Luckily, we can do better.
How Bad Courts Eat Good Policy
Before the holiday break, I laid out what I find to be the 10 most wild facts about the American court system. Today, in this ongoing series capturing my thinking about courts and court innovation, I want to expand the scope of what it means to talk about courts, why they matter, and why they require immediate attention.
Nationally, court budgets were slashed during the ’08 housing crisis, leading to nearly two decades of underfunding and understaffing. Today, only about two percent of state court budgets nationally go to capital improvements, technology, and innovation, putting much-needed upgrades out of reach. This decades-long reality has left courts struggling to improve operations, meet their legal requirements, and keep up with the ever-changing technological times. But what’s most important to remember is that the courts are not a closed environment where these outdated systems merely impact people facing a legal issue. A poorly functioning court can unintentionally eat good public policy set by other branches of government, including affordable housing, public safety, national security, and more.
Take the Washington, D.C. housing court for example.
Due to old policy and procedure, landlord-tenant cases take over a year to resolve. As the backlog grows beyond 6,000 cases, back rent goes unpaid and evictions go unresolved, which negatively impacts those facing eviction and the community at large.
For individuals, resolving an eviction—usually in the landlord’s favor—kicks off a cycle of poverty as public eviction records cost people employment and housing opportunities, kids have to abruptly change schools, and the evicted family can even lose their belongings in the process.
But the individual’s experience is only half the story. This court backlog also exacerbates the District’s affordable housing crisis. By leaving cases unresolved, rent goes unpaid and units can’t be re-rented, putting as many as 20,000 affordable units on the brink of foreclosure. The knock-on effects are clear: a court struggling to keep up can hurt people and devour local policy goals, like affordable housing.
Similarly, court malfunction can damage public safety. In California, hundreds of people convicted of vehicular manslaughter kept and even renewed their driver’s licenses because courts failed to inform the DMV of their convictions. This failure was due to human and technical errors. Even though courts know how to collect and share data, as required by state law, these solvable issues went unnoticed and people who should have had their licenses revoked were still driving recklessly, getting into accidents, and creating new victims.
Then there’s our national security.
Just last week, it became public that the U.S. Supreme Court was hacked multiple times, allegedly by a man in Tennessee. But this news is only the most recent in a long string of cybersecurity failures from the federal courts. Last year, it was disclosed that Russia hacked the federal courts’ data system. This hack imperiled ongoing criminal investigations, classified intelligence, and prized trade secrets at issue in civil cases. Like the two examples above, the federal courts’ inability to secure their vulnerabilities hurts more than the courts themselves. As U.S. Senator Ron Wyden bluntly put it, “The federal judiciary’s current approach to information technology is a severe threat to our national security.”
What’s confounding is that the vulnerabilities exploited during the Russian hack were known back in 2020, when the federal courts experienced another major breach. (Yes, this happens that often.) The federal courts’ collective mismanagement of their data systems has led to hundreds of threatened witnesses and dozens more murdered. Still, only last spring did the federal courts institute multi-factor authentication for all users, a cybersecurity practice widely adopted in the mid-2000s and established as the standard for the federal government in 2015.
These are just three examples of a national problem well known in the court reform community. But executive and legislative branch leaders from around the country still watch their policy agendas walk into court-shaped buzz saws of outdated process and inoperable technology. More examples abound from criminal justice to economic mobility and poverty alleviation policy efforts. If you’re skeptical, I ask that you go watch your local debt docket for an hour to see how courts are too often a platform for wealth redistribution going the wrong direction.
This national problem will likely compound our trouble ahead. With economic insecurity growing, the collapse in federal support to state and local justice systems, and the passage of H.R. 1—putting onerous, new requirements on Medicaid—we can expect a new wave of economic hardship cases to crash into courts barely treading water. Eviction, debt, and family violence cases all spike during times of economic decline. And demand on Medicaid will increase at a time when denials will skyrocket. Simultaneously, courts will grapple with a flood of AI-generated cases. All of which will break brittle, outdated systems already struggling to keep up. And when that dam breaks, there’s no part of our society that won’t feel the impact.
It’s this pressing reality that’s on our minds as we develop a philanthropic fund to innovate court capacity. Novel approaches to AI, improved data systems, software upgrades, staff training, procurement, and other interventions can help bolster these critical institutions. We aim to move courts from a forgotten periphery to an integral and co-equal branch of government when contemplating policy implementation. By doing so, we will increase support for the courts and improve their ability to serve the public, while still maintaining their independence. In the coming dispatches of this series, I’ll talk about ideas we have to set the courts moving in that direction.
Regardless of the intervention, however, the takeaway is this: court innovation isn’t just for the courts, it’s for the benefit of us all.
Editor’s note: The version of the newsletter from this morning had a bad link for the Q & A. That’s been fixed.
A Conversation About the Court Innovation Fund
At the end of last year, I had the chance to do a Q+A with Kumar Garg, the President of Renaissance Philanthropy, where I’m the Court Innovation Fellow. In this role, I’m leading the development of a philanthropic fund focused on building the field of court innovation. Below is a selection of the conversation setting the stage of what the Court Innovation Fund is, what it can accomplish, and the AI of it all. Please visit the RenPhil website to read the whole interview.
KG: What sparked the idea of a potential Court Innovation Fund, and why is this moment different for courts and for justice innovation?
JT: For the past three and a half years, I co-designed and co-led the Judicial Innovation Fellowship at Georgetown Law. This program placed technologists and designers in state and local courts to improve the courts’ responsiveness to public needs. In our final report, we identified multiple systemic chokepoints inhibiting court innovation in the U.S., and by extension people’s ability to access justice, economic mobility, and personal safety. Informed by this on-the-ground experience, we felt a dedicated fund for court innovation was overdue to build the field and overcome these chokepoints.
The timing for this work could not be more acute. The pandemic helped courts understand the cost of having incompatible technical systems, and AI is a wake-up call for courts trying to keep up with the rapid pace of change. If this one-two punch wasn’t enough, criminal justice reform and federal justice efforts are faltering and federal funding changes risk lost revenue and exploding dockets. Taken together, these pressures require innovative state and local courts to meet the moment and ensure equal justice for all. A focused philanthropic fund can be the catalyst.
KG: Courts are processing millions of cases a year and facing an “AI tsunami.” How do you think these technologies will reshape the front door of American democracy, and what’s at stake if courts don’t get ahead of it?
JT: This isn’t trouble ahead; the tsunami is here now. Late last year, I wrote about a bill in Wisconsin that would replace human court translators with AI software. As I outlined in the Milwaukee Journal Sentinel, the state will regret this bill becoming law, because the technology isn’t ready for high-stakes venues like the courts. This is for two reasons. First, most languages are “low-resource,” meaning there isn’t a lot of content online in a language like Hmong, Wisconsin’s third most spoken language, which limits the ability to train an AI system to make accurate translations. Second, even with a high-resource language like Spanish, which has lots of online content to train AI on, AI systems mistranslate legal terms. “Due date” becomes “date to give birth”, and the pronoun “su” (either “your”, “his”, “her”, or “their”) can be mistranslated, sowing confusion over property ownership or legal responsibility. These are not harmless errors, and if legislators in Wisconsin have their way, the courts will have to figure out how to get to the facts of the case without a trustworthy translation.
This is just one example of the tsunami that’s already here. In other instances, court watchers are seeing an uptick of debt cases in state courts, which they believe is aided by AI. It’s already well documented that debt claims are often low quality and depend on the defendant not showing up to court to win. AI has the potential to supercharge this predatory practice. Courts are reacting by building automated AI review tools to ensure that debt claims are credible before ruling. While this is a welcome innovation, we lack the tools, like benchmarks, to know if these review tools are accurate or contain biases.
These two examples illustrate what is at stake for courts in getting this moment right. Courts provide a process for fact finding and getting at the truth of a legal matter. To adopt AI translation services too early or to rely on unvalidated software to determine case outcomes risks the public’s trust. With nearly 70 million cases a year, courts are one of the most common touch points between the public and our government. Getting it wrong there undercuts trust in our entire democratic system.
KG: Who needs to be at the decision-making table for responsible court modernization, particularly around AI governance and standards?
JT: Courts are how the public experiences the law. This makes courts a critical service provider in need of public feedback to improve their services. But public feedback alone is insufficient. A goal of the Court Innovation Fund is to grow the coalition of people supporting court innovation. Obviously, we need judges, court administrators, IT staff, and other public servants involved in defining the needs and challenges of their demanding work. But we should also be thinking more broadly: there are unexplored issues impacting court modernization, including the economics of reform, the consumer harms created by private vendors, and how the increasing digitization of courts is creating cybersecurity vulnerabilities. Yet, we don’t see economists, consumer protection attorneys, or cyber experts coming to aid the courts.
As much as this potential Fund will organize the buyers and end users of court technology, we also need to incentivize on-ramps for other professions to expand the universe of potential solutions needed by the courts. This approach increases the internal capacity of courts to vet, develop, and deploy responsible, user-centered projects while growing the coalition of professionals in this space. That in itself would be an incredible innovation.
To read the entire interview, including what success of this fund looks like, please visit the RenPhil website.
10 Wild Facts about America’s Courts Pt. 2
Welcome back for part two! If you are just joining us, you can read the first five wild facts about America’s courts here. For those who just need a quick refresher, the first half of this list discussed how state and local courts account for nearly 99% of court cases in the U.S., that the wrong people go to jail when court IT goes sideways, and that litigants are not entitled to an attorney in the vast majority of court cases.
With that, let’s jump back in:
Most people go to court without an attorney.
With no right to an attorney in civil cases and attorneys being expensive, about 75% of civil cases have one person who represents themselves. It’s worse when we drill down. For example, only an average of 4% of those facing eviction have an attorney, versus 83% of landlords. How does this discrepancy impact so-called “self-represented litigants”? One study found a significant bias against people representing themselves in court, including that the monetary value of their claim was too high. By contrast, a randomized controlled trial found that people represented by an attorney in eviction court were 4.4 times more likely to retain possession of their apartments than similar tenants who didn’t have an attorney.
Judges don’t have to be lawyers.
You would think a self-represented person would have less legal training than the judge presiding over their case, but you’d be surprised. The majority of states allow for “lay judges,” meaning judges who didn’t go to law school or pass the bar exam. In fact, they can come from any background, from mechanic to manicurist. This has some justification in rural and tribal communities, where access to justice is scarce. However, critical training and oversight differ from state to state, and some states do not require training before judges start deciding people’s fates. Luckily, the judicial system has checks on abuse and mistakes, the biggest being that a ruling is appealable to a judge with legal training.
Almost nothing gets appealed.
When any judge gets a legal issue wrong, the losing party can appeal to a higher court to consider the issue. But appeals cost money, they are confusing, and, as we mentioned, people don’t usually have an attorney. So, appeals don’t happen very often. For insight, we can look at states like Florida, Ohio, and Michigan, which all maintain appeals data. They see appeal rates of civil and criminal cases at 0.5%, 0.4%, and 0.2%, respectively. These states aren’t cherry-picked to make a point: they were three of the top five states in the country by number of appeals in 2023. This means a ruling from a lay judge has the force of law and is the end of the legal road for most.
There’s only one national philanthropy dedicated to supporting the courts.
Whatever the actual number of courts in the U.S., having one philanthropy focused on the judiciary is a bad ratio. Philanthropy matters because it is a thought partner to government and non-profits and can catalyze experimentation with new, promising ideas, something that courts are hungry for and the public needs. The Pew Charitable Trusts is the sole philanthropy in America with a dedicated, national approach to court reform, and they do great work. However, there’s an endless amount of opportunity for further progress not being addressed. There are, of course, one-off grants here and there from philanthropies serving criminal justice or economic mobility issues, for example. But that’s distinct from a dedicated court fund with a cohesive theory of change regarding the courts themselves. And while there are federal funding streams, like the State Justice Institute, the Bureau of Justice Assistance, and the Legal Services Corporation, it’s an understatement to say that the future of those funds is uncertain.
There’s no national group that represents the public’s access to the courts.
There are plenty of organizations that help courts do better. Major national organizations train judges, improve court administration, and support ancillary groups, like legal aid, to support their clients in court. However, all of these groups are optimizing for different constituencies, like judges, court staff, technology vendors, or those that qualify for certain services. Not a single one considers the court-going public their chief constituency. Unfortunately, this makes sense, because courts were never designed for the public in the first place.
While each fact is presented separately, they don’t exist in a vacuum: every challenge compounds another. The person who doesn’t have a lawyer and can’t figure out the court’s arcane rules also has their case heard by a lay judge and doesn’t understand their right to appeal. Making matters worse, there’s often no transcript of a trial, which means no one, including the court and the research community, knows what even happened! No matter how you define “justice”, this isn’t it.
While this two-part series focused on the most eye-widening facts about our courts, I want to be clear that it’s not a hit piece; it’s a call to action. Thousands of court employees across the U.S. are doing the best with the resources they’re given. But after decades of slashed budgets and courts being understaffed, cracks show no matter the dedication and creativity of our public servants. It’s a testament to them that the situation isn’t worse, and they need resources and broad-spectrum support if we want things to be better.
As much as it might be hard to hear, this is the reality of our state and local courts and how the public experiences them. As we’ll see in future installments of this series, these challenges don’t stop at the courthouse steps. When outdated court processes and IT are allowed to fester, they can have society-wide ramifications. In short, we’re going to see how bad courts eat good policy.
We’re off for the rest of the year and will be back in your inbox January 6. Happy Holidays!
An AI model trained on prison phone calls now looks for planned crimes in those calls. (MIT Tech Review) (h/t Keith Porcaro)
ICE is using smartwatches to track pregnant women, even during labor. (The Guardian)
Understanding and improving the experience of pro se litigants in the trial court of Massachusetts. (Mass. Courts)
A federal court imposed sanctions for using generative AI without checking the output. (Eastern Dist. Michigan) (h/t Zach Zarnow) Something similar happened in a court in Oregon. (The Oregonian) (h/t John Grant)
A national collection of court standing orders and local rules on the use of AI. (Ropes & Gray) (h/t Natalie Roisman)