The World of an Octopus: How Reporting Bias Influences a Language Model's Perception of Color

Last update: Nov 13, 2021

Overview

Code and dataset for The World of an Octopus: How Reporting Bias Influences a Language Model's Perception of Color.

This repository is roughly split into 2 parts:

probing: The probing implementations, including code for generating CoDa.
mturk-survey: Instruction pages used for crowdsourcing annotations.

How to use

Using CoDa

If you'd like to use CoDa, we highly recommend using the version hosted on the Huggingface Hub as it requires no additional dependencies.

from datasets import load_dataset

ds = load_dataset('corypaik/coda')

More details on loading and working with datasets are in the Hugging Face Datasets documentation.
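
Once the dataset is loaded, each example carries a color distribution in its label column. The sketch below pairs that distribution with color names and sorts it. The names and their order in ASSUMED_COLORS are an assumption (this README does not list them), and rank_colors and load_coda are hypothetical helper names, so check the dataset card before relying on the mapping.

```python
from typing import List, Sequence, Tuple

# ASSUMPTION: these 11 names and this order are not stated in the README.
# Verify them against the dataset card before trusting the mapping.
ASSUMED_COLORS = [
    "black", "blue", "brown", "gray", "green", "orange",
    "pink", "purple", "red", "white", "yellow",
]


def rank_colors(
    label: Sequence[float],
    names: Sequence[str] = ASSUMED_COLORS,
) -> List[Tuple[str, float]]:
    """Pair each probability with a color name, most probable first."""
    if len(label) != len(names):
        raise ValueError(
            f"expected {len(names)} probabilities, got {len(label)}"
        )
    return sorted(zip(names, label), key=lambda pair: pair[1], reverse=True)


def load_coda(split: str = "validation"):
    """Load one CoDa split from the Hugging Face Hub (needs network access)."""
    # Imported lazily so rank_colors can be used without the datasets package.
    from datasets import load_dataset

    return load_dataset("corypaik/coda", split=split)
```

With network access, something like rank_colors(load_coda()[0]["label"])[:3] would list the three most probable colors for the first validation example, under the naming assumption above.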

Running experiments

This repository is developed and tested on Linux and uses Bazel. On other platforms, consider running Bazel in a Docker container. If you'd like more guidance on this, please open an issue on GitHub.

First, clone the project:

# clone project
git clone https://github.com/nala-cub/coda

# go to the project directory
cd coda

You can run specific tasks with:

# run zeroshot
bazel run //projects/coda/probing/zeroshot
# representation probing
bazel run //projects/coda/probing/representations
# ngrams
bazel run //projects/coda/probing/ngram_stats
# generate dataset from annotations (relative to workspace root)
bazel run //projects/coda/probing/dataset:create_dataset -- \
  --coda_ds_export_dir=<export_dir>

To see help for any of the commands, use:

bazel run <target> -- --help
# for example:
# bazel run //projects/coda/probing/zeroshot -- --help
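
If you want to launch several of these tasks from a script, the invocations above can be assembled programmatically. This is a minimal sketch, not part of the repository: TARGETS, bazel_command and run_task are hypothetical names, and the target labels are copied from the commands listed above.

```python
import shlex
import subprocess
from typing import List, Sequence

# Target labels copied from the commands in this README.
TARGETS = {
    "zeroshot": "//projects/coda/probing/zeroshot",
    "representations": "//projects/coda/probing/representations",
    "ngram_stats": "//projects/coda/probing/ngram_stats",
}


def bazel_command(task: str, extra_args: Sequence[str] = ()) -> List[str]:
    """Build the argv list for `bazel run <target> -- <extra_args>`."""
    cmd = ["bazel", "run", TARGETS[task]]
    if extra_args:
        # Everything after `--` is passed to the program, not to Bazel.
        cmd += ["--", *extra_args]
    return cmd


def run_task(task: str, extra_args: Sequence[str] = (), dry_run: bool = True) -> int:
    """Print the command and, unless dry_run is set, execute it."""
    cmd = bazel_command(task, extra_args)
    print(shlex.join(cmd))
    if dry_run:
        return 0
    return subprocess.run(cmd, check=False).returncode
```

Calling run_task("zeroshot", ["--help"]) prints the command without executing it; pass dry_run=False from the workspace root to actually run it.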

Annotation Instructions

Annotations were collected using an Angular app on Firebase. The included files contain all instructions, but not the app itself. If you're interested in the latter, please open an issue on GitHub.

Citation

If this code was useful, please cite the paper:

@misc{paik2021world,
      title={The World of an Octopus: How Reporting Bias Influences a Language Model's Perception of Color},
      author={Cory Paik and Stéphane Aroca-Ouellette and Alessandro Roncone and Katharina Kann},
      year={2021},
      eprint={2110.08182},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

License

CoDa is licensed under the Apache 2.0 license. The text of the license is available in the repository.

Comments

Color names associated with the distribution?

Hey, I have downloaded the CoDa dataset from HF.

I want to know the color label names associated with the color distribution in the label column.

Thanks!

opened by Axe-- 1

NonMatchingChecksumError when loading dataset in HuggingFace

Hello!

I'm getting the following error when trying to load the dataset in huggingface datasets

NonMatchingChecksumError: Checksums didn't match for dataset source files:
['https://huggingface.co/datasets/corypaik/coda/resolve/main/data/default_train.jsonl', 'https://huggingface.co/datasets/corypaik/coda/resolve/main/data/default_validation.jsonl', 'https://huggingface.co/datasets/corypaik/coda/resolve/main/data/default_test.jsonl']

Colab to reproduce: https://colab.research.google.com/drive/1FRxDKyW4E6XUxYTCzGKKbvsCTHGK2KAO?usp=sharing

Is this expected?

Thanks!

opened by cfierro94 1
