
Tools for Addressing Fairness and Bias in Multimodal AI


The recent boom in artificial intelligence offers a fascinating glimpse of what's ahead, as agentic AI and powerful multimodal AI systems move into the mainstream.

But even as AI development blazes ahead, lingering questions remain about algorithmic bias: the way AI systems can inadvertently absorb and amplify prejudices from their creators or from skewed training data, producing unfair outcomes based on gender, race or age and perpetuating, or even deepening, existing social inequalities.

As AI's influence grows, that bias could increasingly determine which job applicant gets hired, which health insurance claim gets approved, who gets paroled and who qualifies for a mortgage.

“AI has the potential to unintentionally reinforce unfairness, bias and discrimination,” explained Channarong Intahchomphoo, an adjunct professor at the University of Ottawa’s School of Engineering Design and Teaching Innovation, in a recent interview about AI bias. “Engineers may include these concepts unknowingly in AI systems while rushing to develop AI products quickly to be first to market, often without thorough consideration and rigorous testing before deployment.”

Some experts caution that, without mitigation, AI could bring on a grim, systematic algorithmic reordering of society, opening the way to a tyrannical form of algocratic governance in which algorithms have the final word.

Recent work investigating fairness and bias in AI points to a variety of strategies for ensuring fairness and rooting out bias. Multimodal AI may be a harder nut to crack, however: research has shown that biases compound in multimodal systems compared to unimodal approaches.

Fairness and Bias: What They Mean

In machine learning, fairness refers to the principle that an AI model should make decisions impartially and equitably to prevent discrimination based on “sensitive attributes” like race, gender, age or socioeconomic status.
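
To make this concrete, one widely used group-fairness check is demographic parity, which compares how often a model hands out the favorable outcome across groups defined by a sensitive attribute. The following is a minimal sketch of that check; the group names and decisions are invented for illustration and don't come from any real system.

```python
# Minimal sketch of a demographic parity check. The groups and
# decisions below are hypothetical, invented purely for illustration.

def positive_rate(decisions):
    """Fraction of cases that received the favorable outcome (1)."""
    return sum(decisions) / len(decisions)

# Hypothetical model decisions (1 = approved, 0 = denied), split by a
# sensitive attribute such as gender or race.
decisions_by_group = {
    "group_a": [1, 1, 0, 1, 1, 0, 1, 1],
    "group_b": [1, 0, 0, 0, 1, 0, 0, 1],
}

rates = {group: positive_rate(d) for group, d in decisions_by_group.items()}
gap = max(rates.values()) - min(rates.values())

print(rates)                                  # {'group_a': 0.75, 'group_b': 0.375}
print(f"demographic parity gap: {gap:.3f}")   # 0.375
# A large gap means favorable outcomes are not distributed equitably
# across the sensitive attribute; one signal of potentially unfair behavior.
```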

Bias refers to the phenomenon of decision-making algorithms outputting systematic errors, sometimes leading to unfair outcomes for certain groups. Bias can inadvertently appear in an AI model at multiple stages, such as during data collection, training and deployment. There are many different types of bias in AI models, including historical bias, selection bias, sampling bias, group attribution bias, in-group bias, confirmation bias, experimenter’s bias, implicit bias and explicit bias (longer list and definitions here). As we can see in the table below, extensive research has been done to pinpoint what kinds of biases are present in different models.

Via “Fairness and Bias in Multimodal AI: A Survey” by Adewumi and Alkhaled et al.

Tools for Measuring Fairness and Bias in AI

To help audit, measure and evaluate fairness and bias in AI, here are some tools that AI engineers could use for their models:

  • WEAT (Word Embedding Association Test): A popular method that assesses biases using semantic similarities between word embeddings (paper); however, it may overestimate bias in a model.
  • iEAT (Image Embedding Association Test): A tool consisting of around 200 images that can be used to test for biased associations between social concepts and attributes.
  • ML-EAT (Multilevel Embedding Association Test): A more recent method that is designed to be interpretable and transparent. It uses multilevel embedding techniques to uncover complex relationships between variables.
  • Cosine similarity: A metric used to audit fairness and bias by measuring the similarity between two or more non-zero vectors projected in a multidimensional space, such as image and text embeddings. The closer one word vector is to another, the more closely those words are associated. Potential biases in data can be detected with cosine similarity; for example, if the word “doctor” is associated more strongly with “man” than with “woman” (see the sketch after this list).
  • Bipol: This is a recent metric (with explainability) for estimating bias in data. It features a two-step process: corpus-level evaluation based on model classification and sentence-level evaluation based on term frequency. The code and the data set of almost 2 million labeled samples are available on GitHub.
  • SCoFT (Self-Contrastive Fine-Tuning for Equitable Image Generation): This fine-tuning method leverages a pre-trained model’s own cultural misrepresentations to improve it, encoding high-level information from the Cross-Cultural Understanding Benchmark (CCUB) data set into the model with the aim of shifting away from misrepresentative cultural depictions.
  • FairDistillation: A cross-lingual method based on knowledge distillation. It tackles bias across languages by distilling smaller language models that control for biases.
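
The cosine similarity check described above, which also underpins the WEAT family of association tests, boils down to a few lines of code. Here is a rough sketch: the tiny three-dimensional vectors stand in for real embeddings and are made up for illustration, so the numbers show only the mechanics of the test, not any actual model's bias.

```python
# WEAT-style association check built on cosine similarity. The 3-D
# "embeddings" below are invented for illustration; a real audit would
# use vectors from an actual embedding model (word2vec, GloVe, CLIP, etc.).
import numpy as np

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

embeddings = {
    "doctor": np.array([0.9, 0.1, 0.3]),
    "nurse":  np.array([0.2, 0.8, 0.3]),
    "man":    np.array([0.8, 0.2, 0.1]),
    "woman":  np.array([0.1, 0.9, 0.2]),
}

def association(target, attr_a, attr_b):
    """How much closer `target` sits to attr_a than to attr_b."""
    return (cosine(embeddings[target], embeddings[attr_a])
            - cosine(embeddings[target], embeddings[attr_b]))

for word in ("doctor", "nurse"):
    score = association(word, "man", "woman")
    # Positive scores lean toward "man", negative toward "woman".
    print(f"{word}: {score:+.3f}")
```

A full WEAT aggregates these per-word differential associations over sets of target and attribute words into a single effect size, but the comparison above is the basic building block.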

Data Sets

  • MultiBench: A systematic and unified large-scale multimodal benchmark data set for bias evaluation that spans 15 data sets, 10 modalities, 20 prediction tasks and six research areas (paper).
  • Fair Diffusion: Built on LAION-5B, this recent data set is designed to instruct generative image models on fairness, attenuating gender biases after deployment based on human instructions, without the need for data filtering or extra training. (You can find the demo on Hugging Face and the repository on GitHub.)
  • Cross-Cultural Understanding Benchmark (CCUB): This data set leverages surveys from human participants from different countries to help generative models like Stable Diffusion generate images that are more culturally representative. It can be paired with the previously mentioned SCoFT method.
  • FairFace: Described as a “face attribute data set with balanced race, gender and age.” It includes race, age, gender and skin tone annotations for over 100,000 face images derived from Yahoo’s popular YFCC100M data set (paper).
  • A longer list of data sets used in algorithmic fairness research in computer vision can be found via the Montreal AI Ethics Institute.

Summary of selected data sets and their weaknesses in fairness and bias challenges. The bottom part of the table features data sets usually used in downstream tasks (via “Fairness and Bias in Multimodal AI: A Survey” by Adewumi and Alkhaled et al.)

Other Strategies

Besides the tools and data sets above, researchers continue to explore other strategies for mitigating bias in AI models.

Despite this wealth of tools for detecting and mitigating bias, experts say that evaluation data sets and metrics are fragmented, and there is still no unified framework for measuring biases. Ultimately, removing bias from AI models is a human undertaking that will likely require efforts from not only the tech industry, but also other sectors — and potentially, collaboration on an international level.

“It is important to promptly address and mitigate the risks and harms associated with AI,” emphasized Intahchomphoo. “I believe that engineers, policymakers and business leaders themselves need to have a sense of ethics to see fairness, bias and discrimination at every stage of AI development to deployment.”
