A library for finding knowledge neurons in pretrained transformer models.

EleutherAI

Last update: Dec 21, 2022

knowledge-neurons

An open-source repository replicating the 2021 paper Knowledge Neurons in Pretrained Transformers by Dai et al., and extending the technique to autoregressive models as well as masked language models (MLMs).

The Hugging Face Transformers library is used as the backend, so any model you want to probe must be implemented there.

Currently integrated models:

BERT_MODELS = ["bert-base-uncased", "bert-base-multilingual-uncased"]
GPT2_MODELS = ["gpt2"]
GPT_NEO_MODELS = [
    "EleutherAI/gpt-neo-125M",
    "EleutherAI/gpt-neo-1.3B",
    "EleutherAI/gpt-neo-2.7B",
]
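The usage example further down passes a model-type string ('gpt2' / 'gpt_neo' / 'bert') to the wrapper. As a rough sketch of how a model name could be mapped to such a string using the lists above, something like the following might work. Note that `guess_model_type` is a hypothetical helper written for illustration; it is not the library's own `model_type` function, whose implementation may differ.

```python
BERT_MODELS = ["bert-base-uncased", "bert-base-multilingual-uncased"]
GPT2_MODELS = ["gpt2"]
GPT_NEO_MODELS = [
    "EleutherAI/gpt-neo-125M",
    "EleutherAI/gpt-neo-1.3B",
    "EleutherAI/gpt-neo-2.7B",
]


def guess_model_type(model_name: str) -> str:
    """Hypothetical helper: map a model name to 'bert', 'gpt2' or 'gpt_neo'."""
    if model_name in BERT_MODELS:
        return "bert"
    if model_name in GPT2_MODELS:
        return "gpt2"
    if model_name in GPT_NEO_MODELS:
        return "gpt_neo"
    # Anything else is not in the integrated list shown above
    raise ValueError(f"Model {model_name!r} is not in the integrated list")
```

Failing loudly on unknown names is a deliberate choice in this sketch: silently falling back to one model type would send the wrapper looking for layers that do not exist on the other architectures.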

The technique from Dai et al. has been used to locate knowledge neurons in the Hugging Face bert-base-uncased model for all the head/relation/tail entities in the PARAREL dataset. Both the neurons and more detailed results of the experiment are published at bert_base_uncased_neurons/*.json and can be replicated by running pararel_evaluate.py. More details are in the Evaluations on the PARAREL dataset section.

Setup

Either clone the GitHub repository and run scripts from there:

git clone knowledge-neurons
cd knowledge-neurons

Or install as a pip package:

pip install knowledge-neurons

Usage & Examples

An example using bert-base-uncased:

from knowledge_neurons import KnowledgeNeurons, initialize_model_and_tokenizer, model_type
import random

# first initialize some hyperparameters
MODEL_NAME = "bert-base-uncased"

# to find the knowledge neurons, we need the same 'facts' expressed in multiple different ways, and a ground truth
TEXTS = [
    "Sarah was visiting [MASK], the capital of france",
    "The capital of france is [MASK]",
    "[MASK] is the capital of france",
    "France's capital [MASK] is a hotspot for romantic vacations",
    "The eiffel tower is situated in [MASK]",
    "[MASK] is the most populous city in france",
    "[MASK], france's capital, is one of the most popular tourist destinations in the world",
]
TEXT = TEXTS[0]
GROUND_TRUTH = "paris"

# these are some hyperparameters for the integrated gradients step
BATCH_SIZE = 20
STEPS = 20 # number of steps in the integrated grad calculation
ADAPTIVE_THRESHOLD = 0.3 # in the paper, they find the threshold value `t` by multiplying the max attribution score by some float - this is that float.
P = 0.5 # the threshold for the sharing percentage

# setup model & tokenizer
model, tokenizer = initialize_model_and_tokenizer(MODEL_NAME)

# initialize the knowledge neuron wrapper with your model, tokenizer and a string expressing the type of your model ('gpt2' / 'gpt_neo' / 'bert')
kn = KnowledgeNeurons(model, tokenizer, model_type=model_type(MODEL_NAME))

# use the integrated gradients technique to find some refined neurons for your set of prompts
refined_neurons = kn.get_refined_neurons(
    TEXTS,
    GROUND_TRUTH,
    p=P,
    batch_size=BATCH_SIZE,
    steps=STEPS,
    coarse_adaptive_threshold=ADAPTIVE_THRESHOLD,
)

# suppress the activations at the refined neurons + test the effect on a relevant prompt
# 'results_dict' is a dictionary containing the probability of the ground truth being generated before + after modification, as well as other info
# 'unpatch_fn' is a function you can use to undo the activation suppression in the model. 
# By default, the suppression is removed at the end of any function that applies a patch, but you can set 'undo_modification=False', 
# run your own experiments with the activations / weights still modified, then run 'unpatch_fn' to undo the modifications
results_dict, unpatch_fn = kn.suppress_knowledge(
    TEXT, GROUND_TRUTH, refined_neurons
)

# suppress the activations at the refined neurons + test the effect on an unrelated prompt
results_dict, unpatch_fn = kn.suppress_knowledge(
    "[MASK] is the official language of the solomon islands",
    "english",
    refined_neurons,
)

# enhance the activations at the refined neurons + test the effect on a relevant prompt
results_dict, unpatch_fn = kn.enhance_knowledge(TEXT, GROUND_TRUTH, refined_neurons)

# erase the weights of the output ff layer at the refined neurons (replacing them with zeros) + test the effect
results_dict, unpatch_fn = kn.erase_knowledge(
    TEXT, refined_neurons, target=GROUND_TRUTH, erase_value="zero"
)

# erase the weights of the output ff layer at the refined neurons (replacing them with an unk token) + test the effect
results_dict, unpatch_fn = kn.erase_knowledge(
    TEXT, refined_neurons, target=GROUND_TRUTH, erase_value="unk"
)

# edit the weights of the output ff layer at the refined neurons (replacing them with the word embedding of 'target') + test the effect
# we can make the model think the capital of france is London!
results_dict, unpatch_fn = kn.edit_knowledge(
    TEXT, target="london", neurons=refined_neurons
)
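The comments on ADAPTIVE_THRESHOLD and P above describe two filtering steps: a coarse step that keeps neurons whose attribution score exceeds a fraction of the maximum score, and a refinement step that keeps neurons shared by enough prompts. The following is a minimal sketch of that idea on toy data, not the library's implementation. The function names, the nested-list score layout, and the exact comparison operators (`>` and `>=`) are assumptions made for illustration.

```python
from collections import Counter


def coarse_neurons(scores, adaptive_threshold):
    """Keep (layer, neuron) positions scoring above adaptive_threshold * max score.

    `scores` is a toy stand-in: a list of layers, each a list of attribution scores.
    """
    max_score = max(max(layer) for layer in scores)
    t = adaptive_threshold * max_score  # the threshold value `t`
    return [
        (layer_idx, neuron_idx)
        for layer_idx, layer in enumerate(scores)
        for neuron_idx, score in enumerate(layer)
        if score > t
    ]


def refine_neurons(coarse_per_prompt, p):
    """Keep neurons that appear in at least a fraction `p` of the prompts."""
    counts = Counter(n for neurons in coarse_per_prompt for n in set(neurons))
    min_count = p * len(coarse_per_prompt)
    return sorted(n for n, c in counts.items() if c >= min_count)


# Toy attribution scores for three prompts, two layers of two neurons each
per_prompt_scores = [
    [[0.1, 0.9], [0.5, 0.2]],
    [[0.0, 1.0], [0.1, 0.1]],
    [[0.8, 0.7], [0.9, 0.0]],
]
coarse = [coarse_neurons(s, adaptive_threshold=0.3) for s in per_prompt_scores]
refined = refine_neurons(coarse, p=0.5)
print(refined)
```

In this toy run, neuron (0, 1) is selected for all three prompts and (1, 0) for two of them, so both survive refinement at p=0.5, while (0, 0) appears for only one prompt and is dropped.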

For BERT models, the position of the "[MASK]" token is used to evaluate the knowledge neurons, and the ground truth should be the token the mask is expected to resolve to. For GPT models, which are autoregressive, the last position in the prompt is used by default, and the ground truth is expected to immediately follow the prompt.
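As a toy illustration of the position rule just described, the sketch below picks the evaluation index from a list of already-split tokens. The function `evaluation_position` and the pre-tokenized inputs are hypothetical; the library works on real tokenizer output and its internals may differ.

```python
def evaluation_position(tokens, model_type, mask_token="[MASK]"):
    """Return the index of the token position used for evaluation (toy sketch)."""
    if model_type == "bert":
        # BERT: evaluate at the position of the mask token
        return tokens.index(mask_token)
    # GPT-style models: evaluate at the last prompt position;
    # the ground truth is expected to follow it
    return len(tokens) - 1


bert_tokens = ["the", "capital", "of", "france", "is", "[MASK]"]
gpt_tokens = ["The", "capital", "of", "france", "is"]
print(evaluation_position(bert_tokens, "bert"))
print(evaluation_position(gpt_tokens, "gpt2"))
```

One consequence of this rule is that a prompt for a GPT-style model would normally be phrased so that the ground truth comes right after the end of the prompt, without a "[MASK]" placeholder.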

In GPT models, because of subword tokenization, the ground truth may span several tokens. The integrated gradients are therefore computed n times, where n is the length of the expected ground truth in tokens, and the mean of the integrated gradients over those n steps is taken.
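To make the two ideas in this paragraph concrete, here is a small pure-Python sketch: a Riemann-sum approximation of integrated gradients, followed by averaging the attribution vectors obtained for each ground-truth token. It is only an illustration under stated assumptions. The zero baseline, the function names, and the hand-written `grad_fn` are assumptions; the repository computes gradients on the model's activations with PyTorch rather than with a closed-form gradient.

```python
def integrated_gradients(activations, grad_fn, steps=20):
    """Riemann-sum approximation of integrated gradients from a zero baseline.

    attribution_i = (w_i / steps) * sum over k=1..steps of grad_fn(k / steps * w)[i]
    """
    totals = [0.0] * len(activations)
    for k in range(1, steps + 1):
        # Scale the activations from the baseline (0) towards their real value
        scaled = [k / steps * w for w in activations]
        grads = grad_fn(scaled)
        totals = [t + g for t, g in zip(totals, grads)]
    return [w * t / steps for w, t in zip(activations, totals)]


def mean_attribution(per_token_attributions):
    """Average the attribution vectors computed for each ground-truth token."""
    n = len(per_token_attributions)
    return [sum(col) / n for col in zip(*per_token_attributions)]


# Toy output function f(w) = sum(w_i ** 2), whose gradient is 2 * w_i
def toy_grad(w):
    return [2.0 * x for x in w]


attr = integrated_gradients([1.0, 2.0], toy_grad, steps=20)
print(attr)  # close to [1.05, 4.2]; the exact integral gives [1.0, 4.0]
print(mean_attribution([[1.0, 3.0], [3.0, 5.0]]))
```

The gap between 1.05 and the exact value 1.0 comes from using only 20 steps; increasing `steps` shrinks the approximation error at the cost of more gradient evaluations.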

For BERT models, the ground truth is currently expected to be a single token. Multi-token ground truths are on the TODO list.
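The example above notes that each modifying call returns an 'unpatch_fn' that undoes the change. The sketch below shows that pattern on a toy weight matrix: rows are overwritten in place and a closure restores them. It is an illustration only. `erase_rows` and the list-of-lists layout (one row per intermediate neuron) are assumptions, and the library operates on the model's PyTorch weights instead.

```python
import copy


def erase_rows(weights, neuron_indices, erase_value=0.0):
    """Overwrite the rows of `weights` at `neuron_indices` and return an undo function.

    `weights` is a toy stand-in for an output feed-forward weight matrix,
    stored as a list of lists.
    """
    # Save copies of the rows that are about to be overwritten
    originals = {i: copy.copy(weights[i]) for i in neuron_indices}
    for i in neuron_indices:
        weights[i] = [erase_value] * len(weights[i])

    def unpatch_fn():
        # Restore the saved rows, undoing the modification in place
        for i, row in originals.items():
            weights[i] = row

    return unpatch_fn


weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
unpatch_fn = erase_rows(weights, [1])
print(weights)   # row 1 is now zeroed
unpatch_fn()
print(weights)   # original values restored
```

Returning the undo function rather than a modified copy keeps the change in place on the one shared model, which is what allows further experiments to run on the modified weights before they are restored.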

Evaluations on the PARAREL dataset

To check that the repo works correctly, figures 3 and 4 from the knowledge neurons paper are reproduced below. In general the results appear similar, except that suppressing unrelated facts appears to have a little more of an effect in this repo than in the paper's original results.*

Below are Dai et al.'s result and ours, respectively, for suppressing the activations of the refined knowledge neurons in PARAREL:

And Dai et al.'s result and ours, respectively, for enhancing the activations of the knowledge neurons:

To find the knowledge neurons in bert-base-uncased for the PARAREL dataset, and replicate figures 3 and 4 from the paper, you can run

# find knowledge neurons + test suppression / enhancement (this will take a day or so on a decent gpu) 
# you can skip this step since the results are provided in `bert_base_uncased_neurons`
python -m torch.distributed.launch --nproc_per_node=NUM_GPUS_YOU_HAVE pararel_evaluate.py
# plot results 
python plot_pararel_results.py

*It's unclear where the difference comes from, but my suspicion is that they made sure to select only facts with different relations, whereas for this repo's plots only a different PARAREL UUID was selected. In retrospect, a different UUID could still express the same fact, so I'll rerun these experiments soon.

TODO:

Better documentation
Publish PARAREL results for bert-base-multilingual-uncased
Publish PARAREL results for bert-large-uncased
Publish PARAREL results for bert-large-multilingual-uncased
Multiple masked tokens for bert models
Find good dataset for GPT-like models to evaluate knowledge neurons (PARAREL isn't applicable since the tail entities aren't always at the end of the sentence)
Add negative examples for getting refined neurons (i.e. expressing a different fact in the same way)
Look into different attribution methods (cf. https://arxiv.org/pdf/2010.02695.pdf)

Citations

@article{Dai2021KnowledgeNI,
  title={Knowledge Neurons in Pretrained Transformers},
  author={Damai Dai and Li Dong and Y. Hao and Zhifang Sui and Furu Wei},
  journal={ArXiv},
  year={2021},
  volume={abs/2104.08696}
}

Comments

Error when using KnowledgeNeurons with model_name = "gpt2"

When I initialize KnowledgeNeurons with model_name = 'gpt2', I get an AttributeError when trying to run get_refined_neurons()

The following snippet would be able to reproduce the error on a colab notebook:

from knowledge_neurons import (
    KnowledgeNeurons,
    initialize_model_and_tokenizer,
)

# setup model, tokenizer + kn class
MODEL_NAME = "gpt2"  ## setting this to "bert-base-uncased" worked, but not on "gpt2"
model, tokenizer = initialize_model_and_tokenizer(MODEL_NAME)
kn = KnowledgeNeurons(model, tokenizer)

TEXT = "Sarah was visiting [MASK], the capital of france"
GROUND_TRUTH = "paris"
BATCH_SIZE = 10
STEPS = 20

ENG_TEXTS = [
    "Sarah was visiting [MASK], the capital of france",
    "The capital of france is [MASK]",
    "[MASK] is the capital of france",
    "France's capital [MASK] is a hotspot for romantic vacations",
    "The eiffel tower is situated in [MASK]",
    "[MASK] is the most populous city in france",
    "[MASK], france's capital, is one of the most popular tourist destinations in the world",
]
FRENCH_TEXTS = [
    "Sarah visitait [MASK], la capitale de la france",
    "La capitale de la france est [MASK]",
    "[MASK] est la capitale de la france",
    "La capitale de la France [MASK] est un haut lieu des vacances romantiques",
    "La tour eiffel est située à [MASK]",
    "[MASK] est la ville la plus peuplée de france",
    "[MASK], la capitale de la france, est l'une des destinations touristiques les plus prisées au monde",
]
TEXTS = ENG_TEXTS + FRENCH_TEXTS

P = 0.5 # sharing percentage

refined_neurons_eng = kn.get_refined_neurons(
    ENG_TEXTS,
    GROUND_TRUTH,
    p=P,
    batch_size=BATCH_SIZE,
    steps=STEPS,
)

Given below is the full traceback:

Getting coarse neurons for each prompt...:   0%|          | 0/7 [00:00<?, ?it/s]
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-8-1164d97fbee5> in <module>()
      4     p=P,
      5     batch_size=BATCH_SIZE,
----> 6     steps=STEPS,
      7 )

6 frames
/usr/local/lib/python3.7/dist-packages/knowledge_neurons/knowledge_neurons.py in get_refined_neurons(self, prompts, ground_truth, p, batch_size, steps, coarse_adaptive_threshold, coarse_threshold, coarse_percentile, quiet)
    340                     threshold=coarse_threshold,
    341                     percentile=coarse_percentile,
--> 342                     pbar=False,
    343                 )
    344             )

/usr/local/lib/python3.7/dist-packages/knowledge_neurons/knowledge_neurons.py in get_coarse_neurons(self, prompt, ground_truth, batch_size, steps, threshold, adaptive_threshold, percentile, pbar)
    270         """
    271         attribution_scores = self.get_scores(
--> 272             prompt, ground_truth, batch_size=batch_size, steps=steps, pbar=pbar
    273         )
    274         assert sum(e is not None for e in [threshold, adaptive_threshold, percentile]) == 1, f"Provide one and only one of threshold / adaptive_threshold / percentile"

/usr/local/lib/python3.7/dist-packages/knowledge_neurons/knowledge_neurons.py in get_scores(self, prompt, ground_truth, batch_size, steps, pbar)
    223         encoded_input = self.tokenizer(prompt, return_tensors="pt").to(self.device)
    224         for layer_idx in tqdm(
--> 225             range(self.n_layers()),
    226             desc="Getting attribution scores for each layer...",
    227             disable=not pbar,

/usr/local/lib/python3.7/dist-packages/knowledge_neurons/knowledge_neurons.py in n_layers(self)
    136 
    137     def n_layers(self):
--> 138         return len(self._get_transformer_layers())
    139 
    140     def intermediate_size(self):

/usr/local/lib/python3.7/dist-packages/knowledge_neurons/knowledge_neurons.py in _get_transformer_layers(self)
     69 
     70     def _get_transformer_layers(self):
---> 71         return get_attributes(self.model, self.transformer_layers_attr)
     72 
     73     def _prepare_inputs(self, prompt, target=None, encoded_input=None):

/usr/local/lib/python3.7/dist-packages/knowledge_neurons/patch.py in get_attributes(x, attributes)
     16     """
     17     for attr in attributes.split("."):
---> 18         x = getattr(x, attr)
     19     return x
     20 

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in __getattr__(self, name)
   1176                 return modules[name]
   1177         raise AttributeError("'{}' object has no attribute '{}'".format(
-> 1178             type(self).__name__, name))
   1179 
   1180     def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:

AttributeError: 'GPT2LMHeadModel' object has no attribute 'bert'

bug

opened by Mayukhdeb 1

grad error using KnowledgeNeurons with model_name = "gpt2"

When I initialize KnowledgeNeurons with model_name = 'gpt2', I get a RuntimeError when trying to run kn.get_refined_neurons()

The following snippet would be able to reproduce the error on a colab notebook:

!pip install knowledge-neurons
!nvidia-smi
from knowledge_neurons import (
    KnowledgeNeurons,
    initialize_model_and_tokenizer,
    model_type,
)
import random
import torch
import torch.nn.functional as F

# setup model, tokenizer + kn class
MODEL_NAME = "gpt2"
model, tokenizer = initialize_model_and_tokenizer(MODEL_NAME)
kn = KnowledgeNeurons(model, tokenizer,model_type(MODEL_NAME))

TEXT = "Sarah was visiting [MASK], the capital of france"
GROUND_TRUTH = "paris"
BATCH_SIZE = 10
STEPS = 20

ENG_TEXTS = [
    "Sarah was visiting [MASK], the capital of france",
    "The capital of france is [MASK]",
    "[MASK] is the capital of france",
    "France's capital [MASK] is a hotspot for romantic vacations",
    "The eiffel tower is situated in [MASK]",
    "[MASK] is the most populous city in france",
    "[MASK], france's capital, is one of the most popular tourist destinations in the world",
]
FRENCH_TEXTS = [
    "Sarah visitait [MASK], la capitale de la france",
    "La capitale de la france est [MASK]",
    "[MASK] est la capitale de la france",
    "La capitale de la France [MASK] est un haut lieu des vacances romantiques",
    "La tour eiffel est située à [MASK]",
    "[MASK] est la ville la plus peuplée de france",
    "[MASK], la capitale de la france, est l'une des destinations touristiques les plus prisées au monde",
]
TEXTS = ENG_TEXTS + FRENCH_TEXTS

refined_neurons_eng = kn.get_refined_neurons(
    ENG_TEXTS,
    GROUND_TRUTH,
    p=P,
    batch_size=BATCH_SIZE,
    steps=STEPS,
)
refined_neurons_fr = kn.get_refined_neurons(
    FRENCH_TEXTS,
    GROUND_TRUTH,
    p=P,
    batch_size=BATCH_SIZE,
    steps=STEPS,
)
refined_neurons = kn.get_refined_neurons(
    TEXTS,
    GROUND_TRUTH,
    p=P,
    batch_size=BATCH_SIZE,
    steps=STEPS,
)

Given below is the full traceback:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-38-8b2477c6ac66> in <module>()
     45     p=P,
     46     batch_size=BATCH_SIZE,
---> 47     steps=STEPS,
     48 )
     49 refined_neurons_fr = kn.get_refined_neurons(

5 frames
/usr/local/lib/python3.7/dist-packages/torch/autograd/__init__.py in _make_grads(outputs, grads)
     49             if out.requires_grad:
     50                 if out.numel() != 1:
---> 51                     raise RuntimeError("grad can be implicitly created only for scalar outputs")
     52                 new_grads.append(torch.ones_like(out, memory_format=torch.preserve_format))
     53             else:

RuntimeError: grad can be implicitly created only for scalar outputs

Grad Error when initializing for the "gpt2" model
@StellaAthena @sdtblck also, what should the input and the target be for the "gpt2" model, given its autoregressive mechanism?

opened by vivekvkashyap 3
