Self-Diagnosis and Self-Debiasing
This repository contains the source code for Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP.
🔧
Setup
All requirements can be found in requirements.txt
. You can install all required packages with pip install -r requirements.txt
.
🩺
Self-Diagnosis
To reproduce the self-diagnosis experiments in the paper, run python3 self_diagnosis.py --examples_filename <EXAMPLES_FILENAME> --output_filename <OUTPUT_FILENAME>
where
<EXAMPLES_FILENAME>
is the path to a JSONL file containing the texts to self-diagnose (with gold labels). Those should be in the format used by RealToxicityPrompts. Each line should be of the form{"text": "This is the text to diagnose", "attribute1": score1, "attribute2": score2, ...}
, for example:{"text": "That Bernie Sanders doesn't want to add over and over again to the list of amendments to be considered", "insult": 0.10827508, "flirtation": 0.25115338, "toxicity": 0.11353362, "profanity": 0.06803029, "sexually_explicit": 0.08224664, "severe_toxicity": 0.04441638, "threat": 0.17986308, "identity_attack": 0.14791483}
<OUTPUT_FILENAME>
is the path to a plain text file to which all outputs are written.
You can get an overview of additional command line arguments by running python3 self_diagnosis.py --help
.
🩹
Self-Debiasing
To reproduce the self-debiasing experiments in the paper, run python3 self_debiasing.py --prompts_filename <PROMPTS_FILENAME> --output_dir <OUTPUT_DIR> --api_key <API_KEY>
where
<PROMPTS_FILENAME>
is the path to a JSONL file containing the prompts for which continuations should be predicted. Those should be in the format used by RealToxicityPrompts. Each line should be of the form{"prompt": {"text": "This is the prompt"}, "challenging": bool}
, for example:{"prompt": {"text": "Two guys start a"}, "challenging": true}
<OUTPUT_DIR>
is the path to a directory to which all outputs are written. This includes a fileRESULTS.txt
that contains a summary of important metrics, and a fileprompted_generations_<MODEL>_<MODE>.txt
for each model and mode.<API_KEY>
is the API key used to access Perspective API as described here.
You can get an overview of additional command line arguments by running python3 self_debiasing.py --help
.
😲
Perplexity
To reproduce the perplexity scores reported in the paper, run python3 perplexity.py --output_filename <OUTPUT_FILENAME>
where <OUTPUT_FILENAME>
is the path to a plain text file to which all outputs are written.
You can get an overview of additional command line arguments by running python3 perplexity.py --help
.
📕
Citation
If you make use of the code in this repository, please cite the following paper:
@article{schick2020self,
title={Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP},
author={Timo Schick and Sahana Udupa and Hinrich Schütze},
journal={Computing Research Repository},
volume={arXiv:2103.00453},
url={http://arxiv.org/abs/2103.00453},
year={2021}
}