A library for finding knowledge neurons in pretrained transformer models.

EleutherAI

Last update: Dec 21, 2022

knowledge-neurons

An open-source repository replicating the 2021 paper Knowledge Neurons in Pretrained Transformers by Dai et al., and extending the technique to autoregressive models as well as masked language models (MLMs).

The Huggingface Transformers library is used as the backend, so any model you want to probe must be implemented there.

Currently integrated models:

BERT_MODELS = ["bert-base-uncased", "bert-base-multilingual-uncased"]
GPT2_MODELS = ["gpt2"]
GPT_NEO_MODELS = [
    "EleutherAI/gpt-neo-125M",
    "EleutherAI/gpt-neo-1.3B",
    "EleutherAI/gpt-neo-2.7B",
]

The technique from Dai et al. has been used to locate knowledge neurons in the Hugging Face bert-base-uncased model for all the head/relation/tail entities in the PARAREL dataset. Both the neurons and more detailed results of the experiment are published at bert_base_uncased_neurons/*.json, and can be replicated by running pararel_evaluate.py. More details are in the Evaluations on the PARAREL dataset section.

Setup

Either clone the GitHub repository and run scripts from there:

git clone knowledge-neurons
cd knowledge-neurons

Or install as a pip package:

pip install knowledge-neurons

Usage & Examples

An example using bert-base-uncased:

from knowledge_neurons import KnowledgeNeurons, initialize_model_and_tokenizer, model_type
import random

# first initialize some hyperparameters
MODEL_NAME = "bert-base-uncased"

# to find the knowledge neurons, we need the same 'facts' expressed in multiple different ways, and a ground truth
TEXTS = [
    "Sarah was visiting [MASK], the capital of france",
    "The capital of france is [MASK]",
    "[MASK] is the capital of france",
    "France's capital [MASK] is a hotspot for romantic vacations",
    "The eiffel tower is situated in [MASK]",
    "[MASK] is the most populous city in france",
    "[MASK], france's capital, is one of the most popular tourist destinations in the world",
]
TEXT = TEXTS[0]
GROUND_TRUTH = "paris"

# these are some hyperparameters for the integrated gradients step
BATCH_SIZE = 20
STEPS = 20 # number of steps in the integrated grad calculation
ADAPTIVE_THRESHOLD = 0.3 # in the paper, they find the threshold value `t` by multiplying the max attribution score by some float - this is that float.
P = 0.5 # the threshold for the sharing percentage

# setup model & tokenizer
model, tokenizer = initialize_model_and_tokenizer(MODEL_NAME)

# initialize the knowledge neuron wrapper with your model, tokenizer and a string expressing the type of your model ('gpt2' / 'gpt_neo' / 'bert')
kn = KnowledgeNeurons(model, tokenizer, model_type=model_type(MODEL_NAME))

# use the integrated gradients technique to find some refined neurons for your set of prompts
refined_neurons = kn.get_refined_neurons(
    TEXTS,
    GROUND_TRUTH,
    p=P,
    batch_size=BATCH_SIZE,
    steps=STEPS,
    coarse_adaptive_threshold=ADAPTIVE_THRESHOLD,
)

# suppress the activations at the refined neurons + test the effect on a relevant prompt
# 'results_dict' is a dictionary containing the probability of the ground truth being generated before + after modification, as well as other info
# 'unpatch_fn' is a function you can use to undo the activation suppression in the model. 
# By default, the suppression is removed at the end of any function that applies a patch, but you can set 'undo_modification=False', 
# run your own experiments with the activations / weights still modified, then run 'unpatch_fn' to undo the modifications
results_dict, unpatch_fn = kn.suppress_knowledge(
    TEXT, GROUND_TRUTH, refined_neurons
)

# suppress the activations at the refined neurons + test the effect on an unrelated prompt
results_dict, unpatch_fn = kn.suppress_knowledge(
    "[MASK] is the official language of the solomon islands",
    "english",
    refined_neurons,
)

# enhance the activations at the refined neurons + test the effect on a relevant prompt
results_dict, unpatch_fn = kn.enhance_knowledge(TEXT, GROUND_TRUTH, refined_neurons)

# erase the weights of the output ff layer at the refined neurons (replacing them with zeros) + test the effect
results_dict, unpatch_fn = kn.erase_knowledge(
    TEXT, refined_neurons, target=GROUND_TRUTH, erase_value="zero"
)

# erase the weights of the output ff layer at the refined neurons (replacing them with an unk token) + test the effect
results_dict, unpatch_fn = kn.erase_knowledge(
    TEXT, refined_neurons, target=GROUND_TRUTH, erase_value="unk"
)

# edit the weights of the output ff layer at the refined neurons (replacing them with the word embedding of 'target') + test the effect
# we can make the model think the capital of france is London!
results_dict, unpatch_fn = kn.edit_knowledge(
    TEXT, target="london", neurons=refined_neurons
)
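To make the terminology in the example above concrete, here is a toy, pure-Python sketch of what suppressing, enhancing and erasing do to a single feed-forward block. It is not the library's implementation: the tiny `ffn` function, its weights, and the `scale` argument are hypothetical, and the choice of zeroing an activation for suppression and doubling it for enhancement follows the description in Dai et al. rather than anything verified against this repo's code.

```python
def relu(x):
    return max(0.0, x)


def ffn(x, w_in, w_out, scale=None):
    """Toy feed-forward block: out = relu(w_in @ x) @ w_out.

    `scale` maps a hidden-neuron index to a multiplier applied to that
    neuron's activation (0.0 = suppress, 2.0 = enhance).
    """
    scale = scale or {}
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in w_in]
    hidden = [h * scale.get(i, 1.0) for i, h in enumerate(hidden)]
    n_out = len(w_out[0])
    return [
        sum(hidden[i] * w_out[i][j] for i in range(len(hidden)))
        for j in range(n_out)
    ]


def erase(w_out, neuron):
    """Return a copy of w_out with the output weights of `neuron` set to zero."""
    return [
        [0.0] * len(row) if i == neuron else list(row)
        for i, row in enumerate(w_out)
    ]


x = [1.0, 2.0]
w_in = [[1.0, 0.0], [0.0, 1.0]]   # hidden neuron 0 reads x[0], neuron 1 reads x[1]
w_out = [[1.0, 0.0], [0.0, 1.0]]  # row i holds the output weights of hidden neuron i

baseline = ffn(x, w_in, w_out)                    # [1.0, 2.0]
suppressed = ffn(x, w_in, w_out, scale={1: 0.0})  # [1.0, 0.0]
enhanced = ffn(x, w_in, w_out, scale={1: 2.0})    # [1.0, 4.0]
erased = ffn(x, w_in, erase(w_out, 1))            # [1.0, 0.0]
```

In this toy, suppression changes the activation on one forward pass only, while erasing changes the weights themselves, which is why the README treats them as separate operations.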

For BERT models, the position of the "[MASK]" token is used to evaluate the knowledge neurons, and the ground truth should be the token the mask is expected to be. For GPT models, which are autoregressive, the last position in the prompt is used by default, and the ground truth is expected to immediately follow the prompt.

In GPT models, because of subword tokenization, the integrated gradients are computed n times, where n is the length of the expected ground truth in tokens, and the mean of the integrated gradients over those n steps is taken.
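The averaging described above can be sketched with a scalar toy. The snippet approximates integrated gradients with a Riemann sum from a zero baseline, as in Dai et al., and then averages the per-token scores. The function names and the two toy "probability" gradients are assumptions made for illustration; the library itself works on model activations with PyTorch and is not reproduced here.

```python
def integrated_gradient(grad_fn, activation, steps=20):
    """Riemann approximation of activation * integral_0^1 grad_fn(a * activation) da."""
    total = sum(grad_fn(k / steps * activation) for k in range(1, steps + 1))
    return activation * total / steps


def mean_attribution(grad_fns, activation, steps=20):
    """Average the attributions computed for each ground-truth token."""
    scores = [integrated_gradient(g, activation, steps) for g in grad_fns]
    return sum(scores) / len(scores)


# Hypothetical two-token ground truth: one gradient function dP/dw per token.
def grad_first_token(w):
    return 2.0 * w  # corresponds to P(w) = w**2


def grad_second_token(w):
    return 1.0  # corresponds to P(w) = w


score = mean_attribution(
    [grad_first_token, grad_second_token], activation=3.0, steps=20
)
```

With more steps the first token's attribution approaches the exact integral (9.0 for this toy), which is the trade-off the `STEPS` hyperparameter controls.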

For BERT models, the ground truth is currently expected to be a single token. Multi-token ground truths are on the TODO list.
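The roles of `ADAPTIVE_THRESHOLD` and `P` in the example above can be illustrated with a small sketch: each prompt yields coarse neurons whose attribution exceeds a fraction of that prompt's maximum score, and the refined set keeps only neurons shared across enough prompts. The helper names, the toy score grids, and the use of strict comparisons are assumptions; this is not the library's code.

```python
from collections import Counter


def coarse_neurons(scores, adaptive_threshold):
    """Keep (layer, neuron) pairs whose score exceeds adaptive_threshold * max score."""
    t = adaptive_threshold * max(max(layer) for layer in scores)
    return [
        (layer_idx, neuron_idx)
        for layer_idx, layer in enumerate(scores)
        for neuron_idx, s in enumerate(layer)
        if s > t
    ]


def refine_neurons(coarse_per_prompt, p):
    """Keep neurons that appear in more than a fraction p of the prompts."""
    counts = Counter(
        neuron for neurons in coarse_per_prompt for neuron in set(neurons)
    )
    min_count = p * len(coarse_per_prompt)
    return sorted(n for n, c in counts.items() if c > min_count)


# Hypothetical attribution scores: one 2-layer x 3-neuron grid per prompt.
scores_per_prompt = [
    [[0.1, 0.9, 0.0], [0.2, 0.5, 0.1]],
    [[0.0, 0.8, 0.1], [0.7, 0.1, 0.0]],
    [[0.25, 1.0, 0.2], [0.1, 0.2, 0.9]],
]
coarse = [coarse_neurons(s, adaptive_threshold=0.3) for s in scores_per_prompt]
refined = refine_neurons(coarse, p=0.5)  # [(0, 1)]
```

Only neuron 1 in layer 0 clears the threshold for all three prompts, so it is the only one that survives refinement.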

Evaluations on the PARAREL dataset

To ensure that the repo works correctly, figures 3 and 4 from the knowledge neurons paper are reproduced below. In general the results appear similar, except that suppressing unrelated facts appears to have a little more of an effect in this repo than in the paper's original results.*

Below are Dai et al.'s results and ours, respectively, for suppressing the activations of the refined knowledge neurons in PARAREL:

And Dai et al.'s results and ours, respectively, for enhancing the activations of the knowledge neurons:

To find the knowledge neurons in bert-base-uncased for the PARAREL dataset, and replicate figures 3 and 4 from the paper, you can run:

# find knowledge neurons + test suppression / enhancement (this will take a day or so on a decent gpu) 
# you can skip this step since the results are provided in `bert_base_uncased_neurons`
python -m torch.distributed.launch --nproc_per_node=NUM_GPUS_YOU_HAVE pararel_evaluate.py
# plot results 
python plot_pararel_results.py

*It's unclear where the difference comes from, but my suspicion is they made sure to only select facts with different relations, whereas in the plots below, only a different pararel UUID was selected. In retrospect, this could actually express the same fact, so I'll rerun these experiments soon.

TODO:

Better documentation
Publish PARAREL results for bert-base-multilingual-uncased
Publish PARAREL results for bert-large-uncased
Publish PARAREL results for bert-large-multilingual-uncased
Multiple masked tokens for bert models
Find good dataset for GPT-like models to evaluate knowledge neurons (PARAREL isn't applicable since the tail entities aren't always at the end of the sentence)
Add negative examples for getting refined neurons (i.e., expressing a different fact in the same way)
Look into different attribution methods (cf. https://arxiv.org/pdf/2010.02695.pdf)

Citations

@article{Dai2021KnowledgeNI,
  title={Knowledge Neurons in Pretrained Transformers},
  author={Damai Dai and Li Dong and Y. Hao and Zhifang Sui and Furu Wei},
  journal={ArXiv},
  year={2021},
  volume={abs/2104.08696}
}

Comments

Error when using KnowledgeNeurons with model_name = "gpt2"

When I initialize KnowledgeNeurons with model_name = 'gpt2', I get an AttributeError when trying to run get_refined_neurons()

The following snippet would be able to reproduce the error on a colab notebook:

from knowledge_neurons import (
    KnowledgeNeurons,
    initialize_model_and_tokenizer,
)

# setup model, tokenizer + kn class
MODEL_NAME = "gpt2"  ## setting this to "bert-base-uncased" worked, but not on "gpt2"
model, tokenizer = initialize_model_and_tokenizer(MODEL_NAME)
kn = KnowledgeNeurons(model, tokenizer)

TEXT = "Sarah was visiting [MASK], the capital of france"
GROUND_TRUTH = "paris"
BATCH_SIZE = 10
STEPS = 20

ENG_TEXTS = [
    "Sarah was visiting [MASK], the capital of france",
    "The capital of france is [MASK]",
    "[MASK] is the capital of france",
    "France's capital [MASK] is a hotspot for romantic vacations",
    "The eiffel tower is situated in [MASK]",
    "[MASK] is the most populous city in france",
    "[MASK], france's capital, is one of the most popular tourist destinations in the world",
]
FRENCH_TEXTS = [
    "Sarah visitait [MASK], la capitale de la france",
    "La capitale de la france est [MASK]",
    "[MASK] est la capitale de la france",
    "La capitale de la France [MASK] est un haut lieu des vacances romantiques",
    "La tour eiffel est située à [MASK]",
    "[MASK] est la ville la plus peuplée de france",
    "[MASK], la capitale de la france, est l'une des destinations touristiques les plus prisées au monde",
]
TEXTS = ENG_TEXTS + FRENCH_TEXTS

P = 0.5 # sharing percentage

refined_neurons_eng = kn.get_refined_neurons(
    ENG_TEXTS,
    GROUND_TRUTH,
    p=P,
    batch_size=BATCH_SIZE,
    steps=STEPS,
)

Given below is the full traceback:

Getting coarse neurons for each prompt...:   0%|          | 0/7 [00:00<?, ?it/s]
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-8-1164d97fbee5> in <module>()
      4     p=P,
      5     batch_size=BATCH_SIZE,
----> 6     steps=STEPS,
      7 )

6 frames
/usr/local/lib/python3.7/dist-packages/knowledge_neurons/knowledge_neurons.py in get_refined_neurons(self, prompts, ground_truth, p, batch_size, steps, coarse_adaptive_threshold, coarse_threshold, coarse_percentile, quiet)
    340                     threshold=coarse_threshold,
    341                     percentile=coarse_percentile,
--> 342                     pbar=False,
    343                 )
    344             )

/usr/local/lib/python3.7/dist-packages/knowledge_neurons/knowledge_neurons.py in get_coarse_neurons(self, prompt, ground_truth, batch_size, steps, threshold, adaptive_threshold, percentile, pbar)
    270         """
    271         attribution_scores = self.get_scores(
--> 272             prompt, ground_truth, batch_size=batch_size, steps=steps, pbar=pbar
    273         )
    274         assert sum(e is not None for e in [threshold, adaptive_threshold, percentile]) == 1, f"Provide one and only one of threshold / adaptive_threshold / percentile"

/usr/local/lib/python3.7/dist-packages/knowledge_neurons/knowledge_neurons.py in get_scores(self, prompt, ground_truth, batch_size, steps, pbar)
    223         encoded_input = self.tokenizer(prompt, return_tensors="pt").to(self.device)
    224         for layer_idx in tqdm(
--> 225             range(self.n_layers()),
    226             desc="Getting attribution scores for each layer...",
    227             disable=not pbar,

/usr/local/lib/python3.7/dist-packages/knowledge_neurons/knowledge_neurons.py in n_layers(self)
    136 
    137     def n_layers(self):
--> 138         return len(self._get_transformer_layers())
    139 
    140     def intermediate_size(self):

/usr/local/lib/python3.7/dist-packages/knowledge_neurons/knowledge_neurons.py in _get_transformer_layers(self)
     69 
     70     def _get_transformer_layers(self):
---> 71         return get_attributes(self.model, self.transformer_layers_attr)
     72 
     73     def _prepare_inputs(self, prompt, target=None, encoded_input=None):

/usr/local/lib/python3.7/dist-packages/knowledge_neurons/patch.py in get_attributes(x, attributes)
     16     """
     17     for attr in attributes.split("."):
---> 18         x = getattr(x, attr)
     19     return x
     20 

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in __getattr__(self, name)
   1176                 return modules[name]
   1177         raise AttributeError("'{}' object has no attribute '{}'".format(
-> 1178             type(self).__name__, name))
   1179 
   1180     def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:

AttributeError: 'GPT2LMHeadModel' object has no attribute 'bert'
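The final frames of this traceback suggest that the wrapper tried to resolve a BERT-style attribute path on a GPT-2 model, which would be consistent with `KnowledgeNeurons(model, tokenizer)` having been called without a `model_type`. The sketch below reproduces that failure mode with stand-in objects. `get_attributes` copies the loop shown in the `patch.py` frame, but the `LAYER_PATHS` mapping and the `fake_gpt2` object are hypothetical reconstructions, not the library's code.

```python
from types import SimpleNamespace


def get_attributes(x, attributes):
    """Resolve a dotted attribute path, as in the patch.py frame of the traceback."""
    for attr in attributes.split("."):
        x = getattr(x, attr)
    return x


# Hypothetical mapping from model type to the path of the transformer layers.
LAYER_PATHS = {"bert": "bert.encoder.layer", "gpt2": "transformer.h"}

# Stand-in for a GPT-2 style model: its layers live under `transformer.h`.
fake_gpt2 = SimpleNamespace(transformer=SimpleNamespace(h=["block0", "block1"]))

layers = get_attributes(fake_gpt2, LAYER_PATHS["gpt2"])  # resolves fine

try:
    get_attributes(fake_gpt2, LAYER_PATHS["bert"])  # BERT path on a GPT-2 model
    error = None
except AttributeError as exc:
    error = str(exc)  # mentions the missing 'bert' attribute
```

If that reading is right, passing `model_type=model_type(MODEL_NAME)` when constructing `KnowledgeNeurons`, as in the README example, should select the GPT-2 path instead.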

bug

opened by Mayukhdeb 1

grad error using KnowledgeNeurons with model_name = "gpt2"

When I initialize KnowledgeNeurons with model_name = 'gpt2', I get a RuntimeError when trying to run kn.get_refined_neurons()

The following snippet would be able to reproduce the error on a colab notebook:

!pip install knowledge-neurons
!nvidia-smi
from knowledge_neurons import (
    KnowledgeNeurons,
    initialize_model_and_tokenizer,
    model_type,
)
import random
import torch
import torch.nn.functional as F

# setup model, tokenizer + kn class
MODEL_NAME = "gpt2"
model, tokenizer = initialize_model_and_tokenizer(MODEL_NAME)
kn = KnowledgeNeurons(model, tokenizer,model_type(MODEL_NAME))

TEXT = "Sarah was visiting [MASK], the capital of france"
GROUND_TRUTH = "paris"
BATCH_SIZE = 10
STEPS = 20

ENG_TEXTS = [
    "Sarah was visiting [MASK], the capital of france",
    "The capital of france is [MASK]",
    "[MASK] is the capital of france",
    "France's capital [MASK] is a hotspot for romantic vacations",
    "The eiffel tower is situated in [MASK]",
    "[MASK] is the most populous city in france",
    "[MASK], france's capital, is one of the most popular tourist destinations in the world",
]
FRENCH_TEXTS = [
    "Sarah visitait [MASK], la capitale de la france",
    "La capitale de la france est [MASK]",
    "[MASK] est la capitale de la france",
    "La capitale de la France [MASK] est un haut lieu des vacances romantiques",
    "La tour eiffel est située à [MASK]",
    "[MASK] est la ville la plus peuplée de france",
    "[MASK], la capitale de la france, est l'une des destinations touristiques les plus prisées au monde",
]
TEXTS = ENG_TEXTS + FRENCH_TEXTS

refined_neurons_eng = kn.get_refined_neurons(
    ENG_TEXTS,
    GROUND_TRUTH,
    p=P,
    batch_size=BATCH_SIZE,
    steps=STEPS,
)
refined_neurons_fr = kn.get_refined_neurons(
    FRENCH_TEXTS,
    GROUND_TRUTH,
    p=P,
    batch_size=BATCH_SIZE,
    steps=STEPS,
)
refined_neurons = kn.get_refined_neurons(
    TEXTS,
    GROUND_TRUTH,
    p=P,
    batch_size=BATCH_SIZE,
    steps=STEPS,
)

Given below is the full traceback:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-38-8b2477c6ac66> in <module>()
     45     p=P,
     46     batch_size=BATCH_SIZE,
---> 47     steps=STEPS,
     48 )
     49 refined_neurons_fr = kn.get_refined_neurons(

5 frames
/usr/local/lib/python3.7/dist-packages/torch/autograd/__init__.py in _make_grads(outputs, grads)
     49             if out.requires_grad:
     50                 if out.numel() != 1:
---> 51                     raise RuntimeError("grad can be implicitly created only for scalar outputs")
     52                 new_grads.append(torch.ones_like(out, memory_format=torch.preserve_format))
     53             else:

RuntimeError: grad can be implicitly created only for scalar outputs

Grad Error when initializing for the "gpt2" model
@StellaAthena @sdtblck Also, what should the input and the target be for the "gpt2" model, given its autoregressive mechanism?

opened by vivekvkashyap 3
