A library for finding knowledge neurons in pretrained transformer models.

Overview

knowledge-neurons

An open-source repository replicating the 2021 paper Knowledge Neurons in Pretrained Transformers by Dai et al. and extending the technique to autoregressive models as well as masked language models (MLMs).

The Hugging Face Transformers library is used as the backend, so any model you want to probe must be implemented there.

Currently integrated models:

BERT_MODELS = ["bert-base-uncased", "bert-base-multilingual-uncased"]
GPT2_MODELS = ["gpt2"]
GPT_NEO_MODELS = [
    "EleutherAI/gpt-neo-125M",
    "EleutherAI/gpt-neo-1.3B",
    "EleutherAI/gpt-neo-2.7B",
]
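
A small lookup like the one below can map a model name from these lists to the type string the wrapper expects ('bert' / 'gpt2' / 'gpt_neo'). This is only a sketch: `infer_model_type` is a hypothetical helper written for illustration, not the library's own `model_type` function, whose implementation may differ.

```python
# Hypothetical helper, for illustration only; the library ships its own
# `model_type` function, which may be implemented differently.
BERT_MODELS = ["bert-base-uncased", "bert-base-multilingual-uncased"]
GPT2_MODELS = ["gpt2"]
GPT_NEO_MODELS = [
    "EleutherAI/gpt-neo-125M",
    "EleutherAI/gpt-neo-1.3B",
    "EleutherAI/gpt-neo-2.7B",
]


def infer_model_type(model_name: str) -> str:
    """Return 'bert', 'gpt2' or 'gpt_neo' for an integrated model name."""
    if model_name in BERT_MODELS:
        return "bert"
    if model_name in GPT2_MODELS:
        return "gpt2"
    if model_name in GPT_NEO_MODELS:
        return "gpt_neo"
    raise ValueError(f"Model {model_name!r} is not in the integrated model lists")


print(infer_model_type("EleutherAI/gpt-neo-1.3B"))  # gpt_neo
```

Raising on unknown names, rather than guessing a default, makes a mismatch between model and type visible early.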

The technique from Dai et al. has been used to locate knowledge neurons in the Hugging Face bert-base-uncased model for all the head/relation/tail entities in the PARAREL dataset. Both the neurons and more detailed results of the experiment are published at bert_base_uncased_neurons/*.json and can be replicated by running pararel_evaluate.py. More details are given in the Evaluations on the PARAREL dataset section.

Setup

Either clone the GitHub repository and run the scripts from there:

git clone knowledge-neurons
cd knowledge-neurons

Or install as a pip package:

pip install knowledge-neurons

Usage & Examples

An example using bert-base-uncased:

from knowledge_neurons import KnowledgeNeurons, initialize_model_and_tokenizer, model_type

# first initialize some hyperparameters
MODEL_NAME = "bert-base-uncased"

# to find the knowledge neurons, we need the same 'facts' expressed in multiple different ways, and a ground truth
TEXTS = [
    "Sarah was visiting [MASK], the capital of france",
    "The capital of france is [MASK]",
    "[MASK] is the capital of france",
    "France's capital [MASK] is a hotspot for romantic vacations",
    "The eiffel tower is situated in [MASK]",
    "[MASK] is the most populous city in france",
    "[MASK], france's capital, is one of the most popular tourist destinations in the world",
]
TEXT = TEXTS[0]
GROUND_TRUTH = "paris"

# these are some hyperparameters for the integrated gradients step
BATCH_SIZE = 20
STEPS = 20 # number of steps in the integrated grad calculation
ADAPTIVE_THRESHOLD = 0.3 # in the paper, they find the threshold value `t` by multiplying the max attribution score by some float - this is that float.
P = 0.5 # the threshold for the sharing percentage

# setup model & tokenizer
model, tokenizer = initialize_model_and_tokenizer(MODEL_NAME)

# initialize the knowledge neuron wrapper with your model, tokenizer and a string expressing the type of your model ('gpt2' / 'gpt_neo' / 'bert')
kn = KnowledgeNeurons(model, tokenizer, model_type=model_type(MODEL_NAME))

# use the integrated gradients technique to find some refined neurons for your set of prompts
refined_neurons = kn.get_refined_neurons(
    TEXTS,
    GROUND_TRUTH,
    p=P,
    batch_size=BATCH_SIZE,
    steps=STEPS,
    coarse_adaptive_threshold=ADAPTIVE_THRESHOLD,
)

# suppress the activations at the refined neurons + test the effect on a relevant prompt
# 'results_dict' is a dictionary containing the probability of the ground truth being generated before + after modification, as well as other info
# 'unpatch_fn' is a function you can use to undo the activation suppression in the model. 
# By default, the suppression is removed at the end of any function that applies a patch, but you can set 'undo_modification=False', 
# run your own experiments with the activations / weights still modified, then run 'unpatch_fn' to undo the modifications
results_dict, unpatch_fn = kn.suppress_knowledge(
    TEXT, GROUND_TRUTH, refined_neurons
)

# suppress the activations at the refined neurons + test the effect on an unrelated prompt
results_dict, unpatch_fn = kn.suppress_knowledge(
    "[MASK] is the official language of the solomon islands",
    "english",
    refined_neurons,
)

# enhance the activations at the refined neurons + test the effect on a relevant prompt
results_dict, unpatch_fn = kn.enhance_knowledge(TEXT, GROUND_TRUTH, refined_neurons)

# erase the weights of the output ff layer at the refined neurons (replacing them with zeros) + test the effect
results_dict, unpatch_fn = kn.erase_knowledge(
    TEXT, refined_neurons, target=GROUND_TRUTH, erase_value="zero"
)

# erase the weights of the output ff layer at the refined neurons (replacing them with an unk token) + test the effect
results_dict, unpatch_fn = kn.erase_knowledge(
    TEXT, refined_neurons, target=GROUND_TRUTH, erase_value="unk"
)

# edit the weights of the output ff layer at the refined neurons (replacing them with the word embedding of 'target') + test the effect
# we can make the model think the capital of france is London!
results_dict, unpatch_fn = kn.edit_knowledge(
    TEXT, target="london", neurons=refined_neurons
)
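
The two thresholds used above can be illustrated on toy numbers. The sketch below assumes that "coarse" neurons are those whose attribution score exceeds `adaptive_threshold * max_score` for a single prompt, and that "refined" neurons are those shared by more than a fraction `p` of the prompts. The function names, the strict `>` comparisons and the made-up scores are assumptions for illustration; this is not the library's implementation.

```python
from collections import Counter


def coarse_neurons(scores, adaptive_threshold):
    """Keep (layer, index) positions whose score exceeds adaptive_threshold * max score.

    `scores` is a list of per-layer lists of attribution scores for one prompt.
    """
    max_score = max(max(layer) for layer in scores)
    t = adaptive_threshold * max_score
    return [
        (layer_idx, neuron_idx)
        for layer_idx, layer in enumerate(scores)
        for neuron_idx, score in enumerate(layer)
        if score > t
    ]


def refine_neurons(coarse_per_prompt, p):
    """Keep neurons that appear in more than a fraction p of the prompts."""
    counts = Counter(n for neurons in coarse_per_prompt for n in set(neurons))
    min_count = p * len(coarse_per_prompt)
    return sorted(n for n, c in counts.items() if c > min_count)


# Toy attribution scores: 3 prompts, 2 layers, 3 neurons per layer (made-up numbers).
toy_scores = [
    [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1]],
    [[1.0, 0.0, 0.1], [0.1, 0.2, 0.9]],
    [[0.7, 0.1, 0.0], [0.0, 0.6, 0.1]],
]
coarse = [coarse_neurons(s, adaptive_threshold=0.3) for s in toy_scores]
refined = refine_neurons(coarse, p=0.5)
print(refined)  # [(0, 0), (1, 1)]
```

Neuron (1, 2) passes the coarse threshold for one prompt only, so it is dropped in the refinement step; this is the sense in which multiple phrasings of the same fact filter out prompt-specific neurons.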

For BERT models, the position of the "[MASK]" token is used to evaluate the knowledge neurons, and the ground truth should be the token expected at the mask. For GPT models, which are autoregressive, the last position in the prompt is used by default, and the ground truth is expected to immediately follow the prompt.
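
As a rough illustration of this rule, the sketch below picks the scored position from a list of tokens. It uses hand-split toy tokens rather than a real tokenizer, and `target_position` is a hypothetical name, not part of the library.

```python
def target_position(tokens, model_type):
    """Return the index of the position whose prediction is scored."""
    if model_type == "bert":
        # Masked LMs: score the prediction at the [MASK] token.
        return tokens.index("[MASK]")
    # Autoregressive LMs: score the prediction made at the last prompt position,
    # so the ground truth must be the continuation of the prompt.
    return len(tokens) - 1


bert_tokens = ["sarah", "was", "visiting", "[MASK]", ",", "the", "capital", "of", "france"]
gpt_tokens = ["The", "capital", "of", "France", "is"]

print(target_position(bert_tokens, "bert"))  # 3
print(target_position(gpt_tokens, "gpt2"))   # 4
```

In practice this means a GPT prompt should stop right before the ground truth (for example "The capital of France is") instead of containing a "[MASK]" placeholder.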

In GPT models, because of subword tokenization, the ground truth may span several tokens. The integrated gradients are therefore computed n times, where n is the length of the expected ground truth in tokens, and the mean of the integrated gradients over these n steps is taken.
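
The averaging can be sketched with a scalar toy example. Here integrated gradients are approximated with a Riemann sum over `steps` points between a zero baseline and the value `w`, once per target token, and the results are averaged. The function names and the toy gradient functions are assumptions for illustration; in the library the gradients are taken through the model, not from closed-form functions.

```python
def integrated_gradient(grad_fn, w, steps=20):
    """Riemann-sum approximation of w * integral_0^1 grad_fn(alpha * w) d(alpha)."""
    total = sum(grad_fn(k / steps * w) for k in range(1, steps + 1))
    return w * total / steps


def mean_attribution(grad_fns, w, steps=20):
    """Average the attribution over one gradient function per target token."""
    scores = [integrated_gradient(g, w, steps) for g in grad_fns]
    return sum(scores) / len(scores)


# Toy "probabilities" of two target tokens as functions of one activation w:
# P1(w) = w**2 has gradient 2*w, and P2(w) = 3*w has gradient 3.
grad_token_1 = lambda w: 2.0 * w
grad_token_2 = lambda w: 3.0

w = 2.0
print(integrated_gradient(grad_token_1, w))  # close to P1(2) - P1(0) = 4
print(integrated_gradient(grad_token_2, w))  # close to P2(2) - P2(0) = 6
print(mean_attribution([grad_token_1, grad_token_2], w))
```

With more steps the Riemann sum approaches the exact integral, at the cost of one extra gradient evaluation per step and per target token.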

For BERT models, the ground truth is currently expected to be a single token. Multi-token ground truths are on the TODO list.

Evaluations on the PARAREL dataset

To check that the repo works correctly, figures 3 and 4 from the knowledge neurons paper are reproduced below. In general the results appear similar, except that suppressing unrelated facts appears to have a little more of an effect in this repo than in the paper's original results.*

Below are Dai et al.'s results and ours, respectively, for suppressing the activations of the refined knowledge neurons in PARAREL:

[Figure: knowledge neuron suppression, Dai et al.]
[Figure: knowledge neuron suppression, ours]

And Dai et al.'s results and ours, respectively, for enhancing the activations of the knowledge neurons:

[Figure: knowledge neuron enhancement, Dai et al.]
[Figure: knowledge neuron enhancement, ours]

To find the knowledge neurons in bert-base-uncased for the PARAREL dataset and replicate figures 3 and 4 from the paper, you can run:

# find knowledge neurons + test suppression / enhancement (this will take a day or so on a decent gpu) 
# you can skip this step since the results are provided in `bert_base_uncased_neurons`
python -m torch.distributed.launch --nproc_per_node=NUM_GPUS_YOU_HAVE pararel_evaluate.py
# plot results 
python plot_pararel_results.py

*It's unclear where the difference comes from, but my suspicion is that they made sure to only select facts with different relations, whereas in the plots above, only a different PARAREL UUID was selected. In retrospect, a different UUID could actually express the same fact, so I'll rerun these experiments soon.

TODO:

  • Better documentation
  • Publish PARAREL results for bert-base-multilingual-uncased
  • Publish PARAREL results for bert-large-uncased
  • Publish PARAREL results for bert-large-multilingual-uncased
  • Multiple masked tokens for bert models
  • Find good dataset for GPT-like models to evaluate knowledge neurons (PARAREL isn't applicable since the tail entities aren't always at the end of the sentence)
  • Add negative examples for getting refined neurons (i.e. expressing a different fact in the same way)
  • Look into different attribution methods (cf. https://arxiv.org/pdf/2010.02695.pdf)

Citations

@article{Dai2021KnowledgeNI,
  title={Knowledge Neurons in Pretrained Transformers},
  author={Damai Dai and Li Dong and Y. Hao and Zhifang Sui and Furu Wei},
  journal={ArXiv},
  year={2021},
  volume={abs/2104.08696}
}

Comments
  • Error when using KnowledgeNeurons with model_name = "gpt2"

    When I initialize KnowledgeNeurons with model_name = 'gpt2', I get an AttributeError when trying to run get_refined_neurons()

    The following snippet reproduces the error in a Colab notebook:

    from knowledge_neurons import (
        KnowledgeNeurons,
        initialize_model_and_tokenizer,
    )
    
    # setup model, tokenizer + kn class
    MODEL_NAME = "gpt2"  ## setting this to "bert-base-uncased" worked, but not on "gpt2"
    model, tokenizer = initialize_model_and_tokenizer(MODEL_NAME)
    kn = KnowledgeNeurons(model, tokenizer)
    
    TEXT = "Sarah was visiting [MASK], the capital of france"
    GROUND_TRUTH = "paris"
    BATCH_SIZE = 10
    STEPS = 20
    
    ENG_TEXTS = [
        "Sarah was visiting [MASK], the capital of france",
        "The capital of france is [MASK]",
        "[MASK] is the capital of france",
        "France's capital [MASK] is a hotspot for romantic vacations",
        "The eiffel tower is situated in [MASK]",
        "[MASK] is the most populous city in france",
        "[MASK], france's capital, is one of the most popular tourist destinations in the world",
    ]
    FRENCH_TEXTS = [
        "Sarah visitait [MASK], la capitale de la france",
        "La capitale de la france est [MASK]",
        "[MASK] est la capitale de la france",
        "La capitale de la France [MASK] est un haut lieu des vacances romantiques",
        "La tour eiffel est située à [MASK]",
        "[MASK] est la ville la plus peuplée de france",
        "[MASK], la capitale de la france, est l'une des destinations touristiques les plus prisées au monde",
    ]
    TEXTS = ENG_TEXTS + FRENCH_TEXTS
    
    P = 0.5 # sharing percentage
    
    refined_neurons_eng = kn.get_refined_neurons(
        ENG_TEXTS,
        GROUND_TRUTH,
        p=P,
        batch_size=BATCH_SIZE,
        steps=STEPS,
    )
    

    Given below is the full traceback:

    Getting coarse neurons for each prompt...:   0%|          | 0/7 [00:00<?, ?it/s]
    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-8-1164d97fbee5> in <module>()
          4     p=P,
          5     batch_size=BATCH_SIZE,
    ----> 6     steps=STEPS,
          7 )
    
    6 frames
    /usr/local/lib/python3.7/dist-packages/knowledge_neurons/knowledge_neurons.py in get_refined_neurons(self, prompts, ground_truth, p, batch_size, steps, coarse_adaptive_threshold, coarse_threshold, coarse_percentile, quiet)
        340                     threshold=coarse_threshold,
        341                     percentile=coarse_percentile,
    --> 342                     pbar=False,
        343                 )
        344             )
    
    /usr/local/lib/python3.7/dist-packages/knowledge_neurons/knowledge_neurons.py in get_coarse_neurons(self, prompt, ground_truth, batch_size, steps, threshold, adaptive_threshold, percentile, pbar)
        270         """
        271         attribution_scores = self.get_scores(
    --> 272             prompt, ground_truth, batch_size=batch_size, steps=steps, pbar=pbar
        273         )
        274         assert sum(e is not None for e in [threshold, adaptive_threshold, percentile]) == 1, f"Provide one and only one of threshold / adaptive_threshold / percentile"
    
    /usr/local/lib/python3.7/dist-packages/knowledge_neurons/knowledge_neurons.py in get_scores(self, prompt, ground_truth, batch_size, steps, pbar)
        223         encoded_input = self.tokenizer(prompt, return_tensors="pt").to(self.device)
        224         for layer_idx in tqdm(
    --> 225             range(self.n_layers()),
        226             desc="Getting attribution scores for each layer...",
        227             disable=not pbar,
    
    /usr/local/lib/python3.7/dist-packages/knowledge_neurons/knowledge_neurons.py in n_layers(self)
        136 
        137     def n_layers(self):
    --> 138         return len(self._get_transformer_layers())
        139 
        140     def intermediate_size(self):
    
    /usr/local/lib/python3.7/dist-packages/knowledge_neurons/knowledge_neurons.py in _get_transformer_layers(self)
         69 
         70     def _get_transformer_layers(self):
    ---> 71         return get_attributes(self.model, self.transformer_layers_attr)
         72 
         73     def _prepare_inputs(self, prompt, target=None, encoded_input=None):
    
    /usr/local/lib/python3.7/dist-packages/knowledge_neurons/patch.py in get_attributes(x, attributes)
         16     """
         17     for attr in attributes.split("."):
    ---> 18         x = getattr(x, attr)
         19     return x
         20 
    
    /usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in __getattr__(self, name)
       1176                 return modules[name]
       1177         raise AttributeError("'{}' object has no attribute '{}'".format(
    -> 1178             type(self).__name__, name))
       1179 
       1180     def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:
    
    AttributeError: 'GPT2LMHeadModel' object has no attribute 'bert'
    
    bug 
    opened by Mayukhdeb 1
  • Grad error using KnowledgeNeurons with model_name = "gpt2"

    When I initialize KnowledgeNeurons with model_name = 'gpt2', I get a RuntimeError when trying to run kn.get_refined_neurons()

    The following snippet reproduces the error in a Colab notebook:

    !pip install knowledge-neurons
    !nvidia-smi
    from knowledge_neurons import (
        KnowledgeNeurons,
        initialize_model_and_tokenizer,
        model_type,
    )
    import random
    import torch
    import torch.nn.functional as F
    
    # setup model, tokenizer + kn class
    MODEL_NAME = "gpt2"
    model, tokenizer = initialize_model_and_tokenizer(MODEL_NAME)
    kn = KnowledgeNeurons(model, tokenizer,model_type(MODEL_NAME))
    
    TEXT = "Sarah was visiting [MASK], the capital of france"
    GROUND_TRUTH = "paris"
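    BATCH_SIZE = 10
    STEPS = 20
    P = 0.5  # sharing percentage; not defined in the original snippet, value taken from the README example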
    
    ENG_TEXTS = [
        "Sarah was visiting [MASK], the capital of france",
        "The capital of france is [MASK]",
        "[MASK] is the capital of france",
        "France's capital [MASK] is a hotspot for romantic vacations",
        "The eiffel tower is situated in [MASK]",
        "[MASK] is the most populous city in france",
        "[MASK], france's capital, is one of the most popular tourist destinations in the world",
    ]
    FRENCH_TEXTS = [
        "Sarah visitait [MASK], la capitale de la france",
        "La capitale de la france est [MASK]",
        "[MASK] est la capitale de la france",
        "La capitale de la France [MASK] est un haut lieu des vacances romantiques",
        "La tour eiffel est située à [MASK]",
        "[MASK] est la ville la plus peuplée de france",
        "[MASK], la capitale de la france, est l'une des destinations touristiques les plus prisées au monde",
    ]
    TEXTS = ENG_TEXTS + FRENCH_TEXTS
    
    refined_neurons_eng = kn.get_refined_neurons(
        ENG_TEXTS,
        GROUND_TRUTH,
        p=P,
        batch_size=BATCH_SIZE,
        steps=STEPS,
    )
    refined_neurons_fr = kn.get_refined_neurons(
        FRENCH_TEXTS,
        GROUND_TRUTH,
        p=P,
        batch_size=BATCH_SIZE,
        steps=STEPS,
    )
    refined_neurons = kn.get_refined_neurons(
        TEXTS,
        GROUND_TRUTH,
        p=P,
        batch_size=BATCH_SIZE,
        steps=STEPS,
    )
    

    Given below is the full traceback:

    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    <ipython-input-38-8b2477c6ac66> in <module>()
         45     p=P,
         46     batch_size=BATCH_SIZE,
    ---> 47     steps=STEPS,
         48 )
         49 refined_neurons_fr = kn.get_refined_neurons(
    
    5 frames
    /usr/local/lib/python3.7/dist-packages/torch/autograd/__init__.py in _make_grads(outputs, grads)
         49             if out.requires_grad:
         50                 if out.numel() != 1:
    ---> 51                     raise RuntimeError("grad can be implicitly created only for scalar outputs")
         52                 new_grads.append(torch.ones_like(out, memory_format=torch.preserve_format))
         53             else:
    
    RuntimeError: grad can be implicitly created only for scalar outputs
    

    Grad Error when initializing for the "gpt2" model
    @StellaAthena @sdtblck also, what should be the input and the target for the "gpt2" model, given its autoregressive mechanism?

    opened by vivekvkashyap 3
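
The first traceback above ends in `get_attributes`, which walks a dotted attribute path on the model. The sketch below copies that loop from the traceback and runs it on a stand-in object to show how a BERT-style path fails on a GPT-2-style model. The paths `"transformer.h"` and `"bert.encoder.layer"` and the toy object are assumptions for illustration; the traceback is consistent with the wrapper using a BERT path when no model type is passed, but the thread does not confirm this.

```python
from types import SimpleNamespace


def get_attributes(x, attributes):
    """Follow a dotted attribute path, e.g. 'transformer.h', starting from x."""
    for attr in attributes.split("."):
        x = getattr(x, attr)
    return x


# Stand-in for a GPT-2 style model: its layers live under `transformer.h`.
toy_gpt2 = SimpleNamespace(transformer=SimpleNamespace(h=["block0", "block1"]))

print(len(get_attributes(toy_gpt2, "transformer.h")))  # 2

try:
    # A BERT-style path does not exist on this object, so the first getattr fails.
    get_attributes(toy_gpt2, "bert.encoder.layer")
except AttributeError as err:
    print("AttributeError:", err)
```

If this reading is right, passing the model type explicitly, as in the README example (`model_type=model_type(MODEL_NAME)`), would select the matching path.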
Owner
EleutherAI