Algorithms Are Everywhere. How Are We Fighting Their Bias?

The whos, whats, and hows of algorithmic auditing

The Basics

What it is: Algorithmic auditing is a new but quickly developing field that aims to suss out bias in code before it causes harm.

Why it matters: Because when algorithms are used, "real people's lives [are] in the balance," says Kush Varshney, distinguished research staff member at IBM Research.

How it works: Algorithmic audits can be internal or external, voluntary or imposed, but the "most impactful, in terms of creating systemic change" have been conducted by research and advocacy groups, says Amba Kak, director of global policy at the AI Now Institute.

What it looks like: Though there are overlapping areas, there isn’t one agreed-upon way to audit an algorithm. As the space matures and regulators dig in, it’ll probably become more standardized.

One day in 2019, a Belgian public-sector agency decided it needed to look harder at an algorithm it was gearing up to deploy.

The subject of scrutiny: a prioritization model to help distribute unemployment services to the country’s 11.5 million citizens. Using employment type and length of time at a job, it predicted an individual’s likelihood of employment in the next six weeks—the lower the likelihood, the higher their priority in the system.
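
To make the mechanics concrete, here's a minimal, hypothetical sketch of that kind of setup. The features, model choice, and numbers are illustrative assumptions, not details of the agency's actual system: a classifier estimates the probability of re-employment within six weeks, and the service queue is sorted so the lowest-probability cases come first.

```python
# Illustrative only: not the Belgian agency's actual model or data.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features: [employment_type_code, months_at_last_job]
X_train = np.array([[0, 3], [1, 24], [0, 6], [1, 48], [0, 12], [1, 36]])
y_train = np.array([0, 1, 0, 1, 0, 1])  # 1 = re-employed within six weeks

model = LogisticRegression().fit(X_train, y_train)

# Score new applicants and put the people least likely to find work first in line
applicants = np.array([[0, 2], [1, 40], [0, 18]])
p_reemployed = model.predict_proba(applicants)[:, 1]
priority_order = np.argsort(p_reemployed)  # lowest probability = highest priority

for rank, idx in enumerate(priority_order, start=1):
    print(f"Priority {rank}: applicant {idx} (p(re-employed) = {p_reemployed[idx]:.2f})")
```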

The agency brought in Accenture to help audit the model for potential biases. The two teams’ data scientists worked together to identify both unsurprising biases, like racism and sexism, and unexpected ones (e.g., age-based bias against younger people with shorter work histories).

In this case, an algorithm was set up to make potentially life-altering determinations: how soon a person gets unemployment assistance. Without external oversight and in-depth troubleshooting (and even with them, for that matter), the system could lock people out of resources they needed to survive.

[Image: an example of an internal audit process, pulled from a Google paper]

The world of algorithmic audits

Algorithmic auditing is so new that Google doesn’t even have enough data to display the term’s search popularity. At this point, it’s an “active field of research and activism rather than a standardized practice both in industry and in government,” says Kak.

But now, interest seems to be surging. AlgorithmWatch created a compilation of more than 160 AI ethics guidelines last year, then revised it in April. And Coursera recently launched a free ethical assessor certification course designed to help non-technical workers ask the right questions when assessing algorithms.

Audits come in two flavors: internal and external.

Internal audits are run by the same organization that created or commissioned the algorithm. This gives visibility into the inner workings of a system that are typically off-limits to external auditors, like training data and intermediate models.

By nature, internal audits introduce conflicts of interest, including lack of public accountability, potential incentives to proceed with a flawed system, and a limited diversity of perspectives, says Kak.

  • Think of it this way: If a company committed to an annual diversity, equity, and inclusion audit, you'd probably trust the results of an independent investigation more than an internal one.

Then there are external auditors. There’s no clear leader in the space, but there are two main types of providers:

  • Large strategy/consulting firms with dedicated algorithmic auditing arms

  • Smaller specialty companies founded by data scientists or engineers

Those kinds of external audits are opt-in. But sometimes, independent watchdogs impose their own reviews in order to expose how opaque systems operate (for instance, an insurance company’s pricing algorithm).

In practice: One of the best-known external, watchdog-style audits was published in 2018 by researchers at the MIT Media Lab and Microsoft Research. They found that gender classification products from three companies—IBM, Microsoft, and Face++—performed better on male and lighter-skinned faces.

(A) typical audit process

Although there’s no agreed-upon method, approaches overlap. And before an audit happens, many algorithms are effectively opaque to anyone who didn’t build them.

Step 1: Align around purpose

The first step in an algorithmic audit is typically a high-level look at the algorithm’s intended use, why it was created, and its potential for misuse. It’s also a time to get aligned on the company’s “north star,” such as its fairness goals and ethics principles, says Rumman Chowdhury, the global responsible AI lead at Accenture.

Step 2: Examine the tech specs

Here auditors dig into the technical specs, including the data the algorithm was trained on, the best tools for assessing the model, and how it reacts to changing conditions.

  • Exhibit A: “Machine learning models that were trained in pre-COVID times are no longer functioning that well because the data of the world itself has changed,” says Varshney. A rough way to check for that kind of drift is sketched below.
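
As an illustration of what “the data of the world has changed” can look like during an audit, here's a simple distribution-drift check. The feature, thresholds, and synthetic data are assumptions, not a standard audit procedure.

```python
# Illustrative drift check: compare a feature's distribution in the training
# data vs. fresh production data with a two-sample Kolmogorov-Smirnov test.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(seed=42)
train_feature = rng.normal(loc=50, scale=10, size=5_000)  # stand-in for pre-2020 data
live_feature = rng.normal(loc=58, scale=12, size=5_000)   # stand-in for data after conditions changed

stat, p_value = ks_2samp(train_feature, live_feature)
if p_value < 0.01:
    print(f"Shift detected (KS statistic = {stat:.3f}) -- flag the model for retraining or re-audit")
else:
    print("No significant shift detected in this feature")
```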

Step 3: Test fairness metrics for groups and individuals

Say you’re auditing an algorithm that analyzes resumes and scores each job applicant on hiring potential. You’d generate semi-realistic test cases (plus some edge-case scenarios), says Varshney, then feed them into the system and analyze the outputs with questions like the ones below (a minimal version of one such check is sketched after the list)...

  • How do the results compare for protected groups?

  • Are people with similar experience receiving similar outcomes, or are variables like race or gender skewing results?

  • Which trade-offs have to be made if we deploy the algorithm?
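
To make the first two questions concrete, here's a minimal group-fairness check on the resume-scoring example. The scores, groups, threshold, and the 0.8 rule of thumb are illustrative assumptions; a real audit would use far more data and more than one metric.

```python
# Illustrative group-fairness check on hypothetical resume-screening outputs.
import numpy as np

scores = np.array([0.90, 0.40, 0.80, 0.30, 0.70, 0.20, 0.45, 0.35])  # model scores per test case
group  = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])           # protected attribute
recommended = scores >= 0.5                                           # model's "interview" cutoff

rate_a = recommended[group == "A"].mean()
rate_b = recommended[group == "B"].mean()

print(f"Selection rate, group A: {rate_a:.2f}")
print(f"Selection rate, group B: {rate_b:.2f}")
print(f"Demographic parity difference: {abs(rate_a - rate_b):.2f}")
print(f"Disparate impact ratio: {min(rate_a, rate_b) / max(rate_a, rate_b):.2f}")
# A common (but not universal) rule of thumb flags disparate impact ratios below 0.8 for review.
```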

At this stage, algorithms are also sometimes tested for their robustness to outside threats from hackers.

Step 4: Transparency and explainability

At the end of the audit, the algorithm’s creators should be able to articulate the purpose and limitations of the system, in plain language. How exactly they communicate will change depending on the intended audience.

  • For instance, the gold standard for communicating with the public might be a nutrition label-style fact sheet (a rough sketch of one follows), whereas regulators might want a more technical and comprehensive primer.
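
As a rough idea of what a fact sheet could capture, here's a hypothetical sketch. The fields and values are assumptions, not a published standard; the same structure could be rendered as a plain-language label for the public or expanded into a technical appendix for regulators.

```python
# Hypothetical fact-sheet contents for the resume-screening example above.
fact_sheet = {
    "name": "Resume screening model",
    "purpose": "Rank applicants for recruiter review; not an automated hiring decision",
    "intended_users": ["Internal recruiting team"],
    "training_data": "Historical applications, 2015-2019",
    "known_limitations": [
        "Not validated for job families outside the original training set",
        "Accuracy drops for resumes with non-traditional formats",
    ],
    "fairness_checks": {"disparate_impact_ratio": 0.92, "last_run": "2020-06-01"},
}

for field, value in fact_sheet.items():
    print(f"{field}: {value}")
```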

Big picture: “There’s no such thing as a perfect audit,” says Chowdhury. The process will change according to the intended industry and use case, as well as the algorithm itself.

Practice, practice, practice

If Chowdhury were to choose an industry theme this year, she said it’d be “From principles to practice.”

“We spent the last few years understanding the space, trying to create principles, trying to evangelize those principles,” she says. “Now, everyone’s like, ‘Okay, great, we’re on board with your idea, now tell us how to put this idea into practice.’”

One key point: Continuous algorithmic audits (proactive processes that begin at an algorithm’s inception and continue after it’s deployed) beat reactive ones every time.

Not only is the continuous approach more thorough, it also just makes more sense: it’s much more complicated and time-consuming to root out bias from an already-trained model after the fact than to swap out biased data early on. (A simple example of an early data check is sketched below.)

  • According to 24 interviews analyzed in a July paper, many AI practitioners feel their companies’ own fairness work is reactive rather than proactive.
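
As one example of what an early, proactive check might look like, here's a sketch that inspects label base rates by group in the training data before any model is trained. The data and the flagging threshold are assumptions for illustration.

```python
# Illustrative pre-training check: compare positive-label base rates by group.
import numpy as np

labels = np.array([1, 0, 1, 1, 0, 1, 0, 0, 0, 0])  # historical outcomes used as training labels
group  = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

for g in np.unique(group):
    rate = labels[group == g].mean()
    print(f"Group {g}: positive-label rate = {rate:.2f}")

gap = abs(labels[group == "A"].mean() - labels[group == "B"].mean())
if gap > 0.2:  # arbitrary illustrative threshold
    print(f"Large base-rate gap ({gap:.2f}) -- investigate the historical data before training")
```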

Another priority for practitioners: making sure this field doesn’t become a feel-good write-off, a box to check, or a form of corporate absolution.

No audit is a cure-all. But unless the entity requesting one has genuinely chosen to “embed responsible use of technology into the DNA of [the] organization,” says Chowdhury, the audit likely won’t cure anything.

Zoom out: As these priorities and practical recommendations take shape, Chowdhury says the space is likely to get more standardized, following comparable sectors like risk management.

Good friction

In many contexts, an algorithm is too high-stakes a tool to be wielded without outside supervision. That’s a view shared by ethics experts and regulators.

Policy conversations are growing more and more frequent, says Kak—ranging from when audits should be mandated to the role of regulatory agencies in doing the mandating.

  • One early signal: The Algorithmic Accountability Act, introduced by a group of U.S. senators in April 2019, would require corporations to analyze and fix algorithms that result in biased or discriminatory practices against Americans.

“There need to be legal structures around how these are conducted and what standards they need to meet, rather than leave it up to the discretion of companies,” says Kak. “This includes creating pathways that allow external actors - researchers, journalists, or regulatory agencies - to audit these systems that are typically shielded by corporate secrecy laws.”

Those processes ultimately add “good” friction into algorithmic systems, she says, and they’re helped along by a wide cast. Auditing is not just a task for those who write the code; the process should include engineers, data scientists, social scientists, lawyers, designers, psychologists, and more, all engaging on “equal footing,” says Chowdhury.

"It’s about recognizing the different forms of expertise that are required to evaluate algorithms,” says Kak. “Lawyers, policy people, and researchers, but also so many of the community advocates that have made the largest strides...in the last few years, in creating a noise when they have been directly impacted by these systems."

And just like Europe’s data protection laws, says Kak, it’s vital that the next wave of algorithmic accountability policy “create that space” to stop moving forward with a problematic system altogether, no matter the incentives or potential losses.

Already, some cities and companies have deemed certain algorithms too racist, sexist, or otherwise dangerous to use.

For example: the wave of public and private bans on certain uses of facial recognition tech that swept the country this year.

Algorithmic auditing is just one tool in this broader societal push to ensure that the emerging technologies we interact with are safe, fair, and used responsibly. In some cases, no amount of auditing can offset the dangers.