“We have been sort of stuck with outdated notions of what fairness and bias means for a long time,” says Divya Siddarth, founder and executive director of the Collective Intelligence Project, who did not work on the new benchmarks. “We have to be aware of differences, even if that becomes somewhat uncomfortable.”
The work by Wang and her colleagues is a step in that direction. “AI is used in so many contexts that it needs to understand the real complexities of society, and that’s what this paper shows,” says Miranda Bogen, director of the AI Governance Lab at the Center for Democracy and Technology, who wasn’t part of the research team. “Just taking a hammer to the problem is going to miss those important nuances and [fall short of] addressing the harms that people are worried about.”
Benchmarks like the ones proposed in the Stanford paper could help teams better judge fairness in AI models, but actually fixing those models could require other techniques. One may be to invest in more diverse datasets, though developing them can be costly and time-consuming. “It is really fantastic for people to contribute to more interesting and diverse datasets,” says Siddarth. Feedback from people saying “Hey, I don’t feel represented by this. This was a really weird response,” as she puts it, can be used to train and improve later versions of models.
Another exciting avenue to pursue is mechanistic interpretability, or studying the internal workings of an AI model. “People have looked at identifying certain neurons that are responsible for bias and then zeroing them out,” says Augenstein. (“Neurons” is the term researchers use for small parts of the AI model’s “brain.”)
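To make the "zeroing out" idea concrete, here is a toy sketch in Python. It is not the method used by any researcher quoted here: the two-neuron network, its weights, and the function names are all invented for illustration. It also assumes the unwanted behavior lives in a single neuron, which is rarely true of real models.

```python
# Toy illustration only: a hand-built network with two hidden neurons.
# Inputs are [qualification_score, group_flag]; all weights are made up.
W_HIDDEN = [
    [1.0, 0.0],  # neuron 0 responds to the qualification score
    [0.0, 1.0],  # neuron 1 responds only to the group flag
]
W_OUT = [1.0, -0.5]  # neuron 1 lowers the score when the flag is set


def hidden_activations(x, ablated=()):
    """Return ReLU activations, forcing any neuron index in `ablated` to zero."""
    acts = []
    for j, weights in enumerate(W_HIDDEN):
        pre = sum(w * xi for w, xi in zip(weights, x))
        acts.append(0.0 if j in ablated else max(0.0, pre))
    return acts


def forward(x, ablated=()):
    """Compute the network's output score for input x."""
    acts = hidden_activations(x, ablated)
    return sum(w * a for w, a in zip(W_OUT, acts))


def most_group_sensitive_neuron(x_a, x_b):
    """Pick the neuron whose activation differs most between two inputs."""
    diffs = [abs(a - b) for a, b in
             zip(hidden_activations(x_a), hidden_activations(x_b))]
    return diffs.index(max(diffs))


# Two inputs that are identical except for the group flag.
x_a = [1.0, 0.0]
x_b = [1.0, 1.0]

suspect = most_group_sensitive_neuron(x_a, x_b)
gap_before = forward(x_a) - forward(x_b)
gap_after = forward(x_a, ablated={suspect}) - forward(x_b, ablated={suspect})

print("suspect neuron:", suspect)   # 1
print("gap before:", gap_before)    # 0.5
print("gap after:", gap_after)      # 0.0
```

In this contrived case, silencing one neuron removes the score gap between the two otherwise identical inputs. In a real model the same behavior is typically spread across many neurons, so ablating one can miss the problem or damage unrelated abilities.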
Another camp of computer scientists, though, believes that AI can never really be fair or unbiased without a human in the loop. “The idea that tech can be fair by itself is a fairy tale. An algorithmic system will never be able, nor should it be able, to make ethical assessments in the questions of ‘Is this a desirable case of discrimination?’” says Sandra Wachter, a professor at the University of Oxford, who was not part of the research. “Law is a living system, reflecting what we currently believe is ethical, and that should move with us.”
Deciding when a model should or shouldn’t account for differences between groups can quickly get divisive, however. Since different cultures have different and even conflicting values, it’s hard to know exactly which values an AI model should reflect. One proposed solution is “a sort of a federated model, something like what we already do for human rights,” says Siddarth—that is, a system where every country or group has its own sovereign model.
Addressing bias in AI is going to be complicated, no matter which approach people take. But giving researchers, ethicists, and developers a better starting place seems worthwhile, especially to Wang and her colleagues. “Existing fairness benchmarks are extremely useful, but we shouldn’t blindly optimize for them,” she says. “The biggest takeaway is that we need to move beyond one-size-fits-all definitions and think about how we can have these models incorporate context more.”