Submission to Twitter's algorithmic bias bounty challenge

Overview

Twitter Ethics Challenge: Pixel Perfect

Submission to Twitter's algorithmic bias bounty challenge, by Travis Hoppe (@metasemantic).

Abstract

We build on the work presented by Yee et al. and show that a trivial image modification can dramatically change the most salient point of a pair of images, which can result in different crops for the same images. Specifically, we find that adding padding to the left of an image can move the point around which the Twitter algorithm crops. At least 16% of all image pairs are exploitable in this way (and possibly many more). The exploit can easily be triggered intentionally, but it also happens naturally, and it is 22% more likely to occur when comparing white women to women of color.

Example

The following images are almost identical, with one small exception: the second image has 13 pixels of padding on the left. This is enough to change which image is cropped!

Rashida Tlaib is cropped

Kyrsten Sinema is cropped

To replicate this, you can use the provided code or the Jupyter notebook.

Methods

To ensure a dataset that is (1) representative of gender and ethnicity, (2) publicly available, (3) uniform in framing and pose, and (4) consensual, we use images from the 117th US Congress. Images and demographic data are provided by Civil Service USA and can be found at the following locations:

Each congressional representative and senator was put in competition with every other member. As in the Twitter paper, we placed a black buffer between the two images and asked the cropping algorithm for the most salient point, using the aspect ratio of the original images (1:1).

The cropping algorithm computes a set of saliency rankings across 140 evenly spaced points along the composite image (1536x512). For this experiment, this is about 40 pixels per point. We evaluate the "winner" for the original composite image, then examine whether the winner changes when we add a buffer of fixed size to the left of the image. We used buffers of size [0, 6, 13, 19, 26, 31], where 0 is the unbuffered baseline. A pair of images is considered "exploitable" if at least one buffer size selects a different photo than the unbuffered composite does.
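The exploit check described above can be sketched in a few lines. This is a minimal illustration, not the submission's code: images are plain nested lists of grayscale values, the 512-pixel gap is an assumption inferred from the 1536x512 composite, and `grid_saliency_point` is a hypothetical toy stand-in for Twitter's saliency model that only looks at pixels on a coarse grid.

```python
OFFSETS = [0, 6, 13, 19, 26, 31]  # left-padding widths listed in the text


def make_composite(left, right, gap=512):
    """Place two images side by side with a black buffer between them."""
    return [l_row + [0] * gap + r_row for l_row, r_row in zip(left, right)]


def pad_left(image, width):
    """Prepend `width` black columns to every row."""
    return [[0] * width + row for row in image]


def grid_saliency_point(image, stride=40):
    """Toy stand-in (assumption): brightest pixel on a coarse sampling grid."""
    best = None
    for y in range(stride // 2, len(image), stride):
        for x in range(stride // 2, len(image[0]), stride):
            if best is None or image[y][x] > best[0]:
                best = (image[y][x], x, y)
    return best[1], best[2]


def winner(left, right, pad, saliency_point, gap=512):
    """Return which photo contains the salient point after padding by `pad`."""
    composite = pad_left(make_composite(left, right, gap), pad)
    x, _ = saliency_point(composite)
    unpadded_width = len(composite[0]) - pad
    # Map x back to unpadded coordinates and split at the midpoint.
    return "left" if (x - pad) < unpadded_width / 2 else "right"


def is_exploitable(left, right, saliency_point, offsets=OFFSETS, gap=512):
    """True if any non-zero padding changes the winner from the baseline."""
    baseline = winner(left, right, 0, saliency_point, gap)
    return any(
        winner(left, right, pad, saliency_point, gap) != baseline
        for pad in offsets
        if pad != 0
    )


def blank(size=512):
    return [[0] * size for _ in range(size)]


left, right = blank(), blank()
left[20][20] = 200   # sits on the sampling grid when there is no padding
right[20][30] = 255  # lands on the sampling grid only after a 6-pixel shift
print(is_exploitable(left, right, grid_saliency_point))  # True
```

The toy model exaggerates the effect, but it shows the mechanism: because only a sparse set of points is scored, a small horizontal shift changes which pixels are sampled and can flip the winner.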

Anecdotally, we found a much larger effect when we tried more buffer sizes than those listed, but computational constraints prevented a full analysis. We could also increase the attack surface by inserting the buffer between the images, but this would modify one image independently of the other. Since the left buffer shifts both images by the same amount, we consider it "fair". It also applies to images in the wild, as it is trivial to add extra space to any image (see the examples in the self-score section).

For demographics, we split the population into the two categories of gender provided (all members identified as either male or female) and two categories of ethnicity: white and other. "Other", referred to here as people of color, was chosen as a category for statistical power, as the various subgroups (African American, Hispanic, Pacific Islander, ...) were not large enough to draw meaningful conclusions. Future work should examine this bias using more nuanced subgroups with larger datasets.

Results

Of the image pairs considered, 16.4% were exploitable by our method. Furthermore, the attack was disproportionately likely to occur when comparing white women to women of color: for this subgroup the rate was 20.04%, a relative increase of about 22.2% over the 16.4% baseline (p<<0.001).

Full tables of statistics are provided at the end of the README. Self-pairs were not considered, so the actual number of image pairs was (536^2 - 536), and each of those was evaluated at the 6 offsets. Additionally, we find slight differences between the orderings A-B and B-A, so we list them as separate cases in the data tables (the differences were not statistically significant).
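The pair count can be illustrated by enumerating ordered pairs without self-pairs. This is only a counting sketch: the member IDs are placeholder integers, and the member count and offsets are taken from the text above.

```python
from itertools import permutations

N_MEMBERS = 536                    # member count as stated in the text
OFFSETS = [0, 6, 13, 19, 26, 31]   # buffer sizes from the Methods section


def ordered_pairs(members):
    """Yield every (A, B) with A != B; A-B and B-A are distinct cases."""
    return permutations(members, 2)


n_pairs = N_MEMBERS**2 - N_MEMBERS
n_evaluations = n_pairs * len(OFFSETS)

print(n_pairs)        # 286760
print(n_evaluations)  # 1720560
```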

Self-score

  • Type of Harm (20 base points)

Unintentional underrepresentation.

  • Damage or impact ((1.4 + 1.2)/2 = 1.3)

We show that the harm falls along multiple axes of identity (gender and ethnicity) and is magnified for women of color (x1.4). Being cropped out of an image, or having high variability in being cropped out of an image (UI gaslighting), can have a moderate impact on a person's well-being (x1.2).

  • Affected users (1.2)

We estimate that at least one million users have seen or been exposed to images of multiple people where one of them is a woman of color (x1.2). If each of 187 million active users posts or reposts a single such image once a year and it receives 10 views, this underestimate still gives about 5 million views per day.

  • Likelihood only graded for unintentional harms (1.3)
  • Exploitability only graded for intentional harms: (not scored)

While this is not an intentional harm, it can be! The provided code can easily duplicate this effect (x1.3 if scored), or it can be done with minimal photo editing. Strangely, this algorithm isn't exactly the same as the one live on Twitter now (8/2/21). The live algorithm is worse! It currently chooses to do nothing and instead picks a black square in the middle (Roll Safe meme: can't have bias if we remove the humans).

We note that it is difficult to tell whether harm has occurred in the wild, as the effect may go unnoticed since the user isn't presented with alternatives. That said, we show in the examples below that it works on real-world images as well. Considering that images with multiple people are shown on Twitter daily, this exploit is very likely (1.3).

Here the most salient point is marked by a green dot. The images are offset by only a single pixel.

In this example, over the span of three pixels there are three different cropping points:

  • Justification: (1.0)

We hope that the reader can see that the current cropping algorithm is brittle and easily exploited. This effect happens naturally with different images and is dependent on unimportant information at the edge of the image. The harm here is subtle, but it is important to note that it isn't uniform across all demographics. If some users are arbitrarily and inconsistently cropped, this creates an experience that their presence (and self!) in a photo is also arbitrary. We score this section at (x1.00) as the methodology is sound but could use a larger dataset, more cropping points, and the effect isn't as harmful as a racially biased crop that was shown in the first paper.

  • Clarity of contribution: (1.5)

This submission is fully documented with workable examples, a reproducible dataset, and evidence of both intentional and unintentional harm (x1.5). We note that the exploit can be fixed in numerous ways at the cost of more computational resources. Instead of a single point, multiple points can be evaluated heuristically (as the original authors suggest). Also, kernel densities can be estimated to provide a smoother representation of the most salient area, rather than a single point.
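One way to read the kernel-density suggestion is to smooth the saliency scores before taking the maximum, so that a broad salient region outweighs an isolated spike. The sketch below applies Gaussian kernel smoothing to a 1D profile of scores; the profile, the bandwidth, and the function names are illustrative assumptions, not part of Twitter's algorithm or of this submission's code.

```python
import math


def gaussian_smooth(scores, bandwidth=1.0):
    """Replace each score with a Gaussian-weighted average of its neighbors."""
    smoothed = []
    for i in range(len(scores)):
        weights = [
            math.exp(-0.5 * ((i - j) / bandwidth) ** 2)
            for j in range(len(scores))
        ]
        total = sum(weights)
        smoothed.append(sum(w * s for w, s in zip(weights, scores)) / total)
    return smoothed


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical saliency profile: one sharp spike and one broad region.
profile = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.7, 0.8, 0.7, 0.0, 0.0]

print(argmax(profile))                   # 2, the isolated spike
print(argmax(gaussian_smooth(profile)))  # 7, the center of the broad region
```

The smoothed maximum depends on neighboring scores, so it should move less when a small shift changes any single sampled value. The cost is one weighted sum per point.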

  • Final Score: (60.84)

20 x (1.3 x 1.2 x 1.3 x 1.0 x 1.5) = 60.84
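The total of 60.84 is reproduced when the multipliers are combined by multiplication rather than addition. A quick check, using the values from the sections above (the dictionary keys are just illustrative labels):

```python
import math

BASE_POINTS = 20  # type of harm: unintentional underrepresentation

MULTIPLIERS = {
    "damage_or_impact": (1.4 + 1.2) / 2,  # = 1.3
    "affected_users": 1.2,
    "likelihood": 1.3,
    "justification": 1.0,
    "clarity_of_contribution": 1.5,
}

final_score = BASE_POINTS * math.prod(MULTIPLIERS.values())
print(round(final_score, 2))  # 60.84
```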

Appendix and data tables

We first report results that reflect those in Yee et al.: gender plays a strong role (favoring females), while ethnicity matters in a more subtle way. p-values are constructed from a two-sided binomial test using the sample mean as the expected value. Significance (when shown) is set at p<0.01 and is provided for visual convenience.

            key        n       k       pct         pvalue
0    white_male  2095416  909151  0.433876   0.000000e+00
1    other_male   429336  224617  0.523173  1.305973e-202
2  white_female   602352  382057  0.634275   0.000000e+00
3  other_female   301176  198315  0.658469   0.000000e+00

Reflecting the gender and ethnicity makeup of the parties, we see the same result when grouping by party:

           key        n       k       pct  pvalue
0  independent    12816    3807  0.297051     0.0
1   republican  1672488  764219  0.456935     0.0
2     democrat  1742976  946114  0.542815     0.0

Considering the interaction between gender and ethnicity, the largest difference is between white males and females. For non-white males the bias still exists, but it is smaller. Here, n counts every pairwise comparison at every offset level.

        left_key     right_key       n       k       pct         pvalue    sig
0     white_male  other_female   92214   25963  0.281552   0.000000e+00   True
1     white_male  white_female  184428   56958  0.308836   0.000000e+00   True
2     other_male  other_female   18894    7092  0.375357  2.401844e-287   True
3     other_male  white_female   37788   15129  0.400365   0.000000e+00   True
4     white_male    other_male  131454   54628  0.415567   0.000000e+00   True
5   white_female  other_female   26508   13085  0.493625   2.722065e-05   True
6     other_male    other_male   26532   13336  0.502638   2.081554e-01  False
7     white_male    white_male  639612  322983  0.504967   1.319497e-02  False
8   other_female  other_female   12972    6627  0.510870   3.253872e-01  False
9   white_female  white_female   52452   26799  0.510924   4.365079e-02  False
10  other_female  white_female   26508   14083  0.531274   7.589998e-16   True
11    other_male    white_male  131454   77808  0.591903   0.000000e+00   True
12  white_female    other_male   37788   23337  0.617577   0.000000e+00   True
13  other_female    other_male   18894   12115  0.641209  1.623488e-304   True
14  white_female    white_male  184428  130629  0.708293   0.000000e+00   True
15  other_female    white_male   92214   67669  0.733826   0.000000e+00   True

Next we consider the effects of the exploit. The raw breakdown by demographic group shows that non-white females and white females both differ from the expected 16.4%. Here, n reflects only the pairwise comparisons:

            key       n      k       pct        pvalue    sig
0  white_female  100392  15912  0.158499  7.780860e-07   True
1    other_male   71556  11551  0.161426  4.103492e-02  False
2    white_male  349236  57507  0.164665  5.122208e-01  False
3  other_female   50196   8882  0.176946  3.039690e-14   True
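As a rough cross-check of the table above, the two-sided test can be approximated with a normal approximation to the binomial using only the standard library. This is a sketch, not the exact binomial test the tables were built with. It assumes the expected value is the pooled rate over the four rows (about 0.164), so tail values will differ somewhat from the table.

```python
import math

# (n, k) per group, copied from the table above.
ROWS = {
    "white_female": (100392, 15912),
    "other_male": (71556, 11551),
    "white_male": (349236, 57507),
    "other_female": (50196, 8882),
}
ALPHA = 0.01  # significance threshold used in the tables


def two_sided_p(k, n, p0):
    """Two-sided p-value for k successes in n trials (normal approximation)."""
    mean = n * p0
    sd = math.sqrt(n * p0 * (1 - p0))
    z = (k - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))


# Assumption: the expected value is the pooled exploit rate.
baseline = sum(k for _, k in ROWS.values()) / sum(n for n, _ in ROWS.values())

for key, (n, k) in ROWS.items():
    p = two_sided_p(k, n, baseline)
    print(f"{key:>12}  pct={k / n:.6f}  p={p:.3e}  sig={p < ALPHA}")
```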

Finally, we show the main results using both subgroups of ethnicity and gender.

        left_key     right_key       n      k       pct        pvalue    sig
0   white_female    white_male   30738   4503  0.146496  1.549290e-17   True
1     white_male  white_female   30738   4608  0.149912  6.904292e-12   True
2     other_male    other_male    4422    677  0.153098  4.668935e-02  False
3     other_male  white_female    6298    981  0.155764  6.884341e-02  False
4     other_male    white_male   21909   3538  0.161486  2.739183e-01  False
5     white_male    other_male   21909   3545  0.161806  3.338248e-01  False
6   white_female    other_male    6298   1021  0.162115  6.584081e-01  False
7   other_female    white_male   15369   2528  0.164487  9.392602e-01  False
8     white_male  other_female   15369   2571  0.167285  3.113621e-01  False
9     white_male    white_male  106602  18107  0.169856  8.897830e-07   True
10  white_female  white_female    8742   1506  0.172272  4.481319e-02  False
11  other_female    other_male    3149    555  0.176246  7.125701e-02  False
12    other_male  other_female    3149    557  0.176882  5.743075e-02  False
13  other_female  white_female    4418    885  0.200317  2.955497e-10   True
14  white_female  other_female    4418    902  0.204165  3.565989e-12   True
15  other_female  other_female    2162    442  0.204440  9.085229e-07   True

Useful links for the submission:
