MEND: Model Editing Networks using Gradient Decomposition

Eric Mitchell

Last update: Dec 2, 2022

Setup

Environment

This codebase uses Python 3.7.9. Other versions may work as well.

Create a virtualenv (pyenv can help with this) and install the dependencies:

$ python -m venv env
$ source env/bin/activate
(env) $ pip install -r requirements.txt

Data

You can download the data needed for this project from this Google Drive link. Unzip each sub-directory into mend/data and you should be good to go.
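The unzip step can be scripted. The following is a minimal sketch, not part of the codebase: the function name `unpack_archives` and the idea of a separate download folder are assumptions made for illustration. It extracts every readable archive into the data directory and reports files that are not valid zip archives, which can happen when a download is incomplete.

```python
import zipfile
from pathlib import Path


def unpack_archives(download_dir, data_dir):
    """Extract every valid .zip in download_dir into data_dir.

    Returns (extracted, rejected): names of archives that were unpacked,
    and names of files that could not be read as zip archives.
    """
    download_dir, data_dir = Path(download_dir), Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    extracted, rejected = [], []
    for archive in sorted(download_dir.glob("*.zip")):
        # is_zipfile looks for the zip end-of-central-directory record,
        # so truncated or non-zip downloads are reported instead of
        # failing halfway through extraction.
        if not zipfile.is_zipfile(archive):
            rejected.append(archive.name)
            continue
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(data_dir)
        extracted.append(archive.name)
    return extracted, rejected
```

A call such as `unpack_archives("downloads", "mend/data")` would place each archive's contents under mend/data; any name in the rejected list should be downloaded again.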

Running the code

Run MEND training/evaluation for distilGPT-2 on the wikitext editing problem with:

(env) $ python -m run +alg=mend +experiment=gen +model=distilgpt2

Other valid algs include efk (KnowledgeEditor) and enn (Editable Neural Networks). Valid experiments include fc (FEVER fact checking) and qa (zsRE question-answering). Splits and rephrases for both come from De Cao et al. Check config/model for the available editable models. Note that not every model works with every experiment: GPT-style models only work with gen, seq2seq models only work with qa, and BERT only works with fc.

Also note that in the paper, we sample locality data from different datasets depending on the model. By default, training uses Natural Questions data (not zsRE data) to compute drawdown in the qa experiment, and OpenWebText data (not wikitext data) in the gen experiment. For models such as the distilgpt2 model we use (which was fine-tuned on wikitext) or the BART-base model, this behavior should be disabled with data.wiki_webtext=False or data.zsre_nq=False, respectively.
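The compatibility rules and locality overrides above can be sketched as a small command builder. This is an illustration only: `build_command` and the family labels "gpt", "seq2seq" and "bert" are hypothetical names, not identifiers from the codebase, and the mapping of each override to an experiment is inferred from the text (wiki_webtext for gen, zsre_nq for qa).

```python
# Which model family each experiment accepts, per the text above.
# The family labels are illustrative, not names used by the codebase.
EXPERIMENT_FAMILY = {"gen": "gpt", "qa": "seq2seq", "fc": "bert"}
VALID_ALGS = ("mend", "efk", "enn")

# Override that disables the default locality dataset (assumed mapping).
LOCALITY_OVERRIDE = {
    "gen": "data.wiki_webtext=False",
    "qa": "data.zsre_nq=False",
}


def build_command(alg, experiment, model, family, default_locality=True):
    """Return the argument list for one run, or raise ValueError."""
    if alg not in VALID_ALGS:
        raise ValueError(f"unknown alg: {alg}")
    if experiment not in EXPERIMENT_FAMILY:
        raise ValueError(f"unknown experiment: {experiment}")
    if EXPERIMENT_FAMILY[experiment] != family:
        raise ValueError(f"{family} models do not work with {experiment}")
    cmd = ["python", "-m", "run",
           f"+alg={alg}", f"+experiment={experiment}", f"+model={model}"]
    # Append the override only when one exists for this experiment.
    override = LOCALITY_OVERRIDE.get(experiment)
    if not default_locality and override:
        cmd.append(override)
    return cmd
```

For example, `build_command("mend", "gen", "distilgpt2", "gpt", default_locality=False)` yields the command from the section above with data.wiki_webtext=False appended, which is the setting suggested for the wikitext-fine-tuned distilgpt2 model.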

Citing the paper

If this code or paper was useful, please consider using the following citation:

@article{mitchell2021fast,
    title={Fast Model Editing at Scale},
    author={Mitchell, Eric and Lin, Charles and Bosselut, Antoine and Finn, Chelsea and Manning, Chris},
    year={2021}
}

Comments

not being able to install the requirements

Hi I am getting the following error when installing the requirements with pip install -r requirements.txt

Collecting git+git://github.com/eric-mitchell/higher@master (from -r requirements.txt (line 7))
  Cloning git://github.com/eric-mitchell/higher (to revision master) to /tmp/pip-req-build-z1v0erey
  Running command git clone --filter=blob:none --quiet git://github.com/eric-mitchell/higher /tmp/pip-req-build-z1v0erey
  fatal: unable to connect to github.com:
  github.com[0: 192.30.255.113]: errno=Connection timed out

  error: subprocess-exited-with-error

  × git clone --filter=blob:none --quiet git://github.com/eric-mitchell/higher /tmp/pip-req-build-z1v0erey did not run successfully.
  │ exit code: 128
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet git://github.com/eric-mitchell/higher /tmp/pip-req-build-z1v0erey did not run successfully.
│ exit code: 128
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

opened by cameronalonso2 2
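One plausible reading of the log above (an inference, not something confirmed in this thread) is that the clone fails because the requirement uses the unauthenticated git:// protocol, which GitHub no longer serves. A common workaround is to switch such requirement lines to https. The sketch below shows that rewrite; the helper names are hypothetical.

```python
import re


def use_https_for_git(requirement_line):
    """Rewrite a leading 'git+git://' to 'git+https://'.

    Lines that do not start with 'git+git://' are returned unchanged.
    """
    return re.sub(r"^(\s*)git\+git://", r"\1git+https://", requirement_line)


def rewrite_requirements(text):
    """Apply the rewrite to every line of a requirements file's text."""
    return "\n".join(use_https_for_git(line) for line in text.split("\n"))
```

Applied to the failing line, this turns git+git://github.com/eric-mitchell/higher@master into git+https://github.com/eric-mitchell/higher@master, after which pip install -r requirements.txt can be retried.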

About the data file in google drive

Hi, I downloaded the zip files from google drive but failed to unzip them.

Archive:  10token.zip
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of 10token.zip or
        10token.zip.zip, and cannot find 10token.zip.ZIP, period.

opened by xwwwwww 2

Why Choose QA Task in Batched edits with MEND and ENN?

Editing tasks fall into three categories: binary classification, QA, and generation. Why was QA chosen for the batched-edit experiments? And what are the results for fine-tuning (FT)?

Thanks!

opened by sev777 1

Can't find 10token data

Hi, I'm trying to run python -m run +alg=mend +experiment=gen +model=distilgpt2 data.wiki_webtext=False, but I get a file not found error for data/10token/data/self_sample/train.json. I downloaded 10token from the linked google drive folder and unzipped it to data/10token. However, when I unzip it, all I get is a single 10token file, no train.json. Not sure if I'm missing something here. Thanks!

opened by salemohamedo 1

How to find the define of __x__ and __delta__ ?

Hi Mr. Eric, I found this code:

param_idx = lambda n, p: self.shape_dict[self.get_shape(p)].index(n) if self.config.mend.shared else None  # noqa: E731
transformed_factors = {
    n: self.mend[str(tuple(self.get_shape(p)))](p.__x__, p.__delta__, param_idx(n, p))
    for n, p in _inner_params(self.model.named_parameters(), self.config.model.inner_params)
}

Where can I find the definitions of __x__ and __delta__ in the code? Do they come from torch or from Python? Thank you!
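For context, `__x__` and `__delta__` are not standard attributes of Python objects or of PyTorch parameters, so they are presumably attached by the repository's own code. Going by the "gradient decomposition" in the project title, a reasonable assumption is that they hold a linear layer's input and the gradient of the loss with respect to that layer's output. The pure-Python sketch below (all names hypothetical, no torch required) illustrates the identity such a decomposition relies on: for y = W x, the weight gradient is the outer product of delta = dL/dy and x.

```python
def forward(W, x):
    """Linear layer without bias: y = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def squared_error(y, target):
    return 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, target))


def decomposed_gradient(W, x, target):
    """Return (x, delta, grad_W) with grad_W[i][j] = delta[i] * x[j]."""
    y = forward(W, x)
    delta = [yi - ti for yi, ti in zip(y, target)]  # dL/dy for squared error
    grad_W = [[d * xi for xi in x] for d in delta]  # rank-1 outer product
    return x, delta, grad_W


def numerical_gradient(W, x, target, eps=1e-6):
    """Central finite differences, used only to check the identity."""
    grad = []
    for i, row in enumerate(W):
        grad_row = []
        for j in range(len(row)):
            W_plus = [r[:] for r in W]
            W_minus = [r[:] for r in W]
            W_plus[i][j] += eps
            W_minus[i][j] -= eps
            diff = (squared_error(forward(W_plus, x), target)
                    - squared_error(forward(W_minus, x), target))
            grad_row.append(diff / (2 * eps))
        grad.append(grad_row)
    return grad
```

Storing only x and delta is cheaper than storing the full weight gradient, since the two vectors determine it completely for a single example.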

opened by sev777 1

A post about your good repository in the medium

@eric-mitchell Thanks for your great repo. I have written the following brief post for introducing your great repo: Fast Model Editing at Scale via Model Editor Networks with Gradient Decomposition (MEND) Best

opened by ahkarami 1

How to apply this methods to a new LMs without fine-tune?

Hi, in your paper, must the model edited by MEND be fine-tuned first? How can MEND be applied to a model that has not been fine-tuned? When I try this, acc_train is always 0.

Thanks.

opened by sev777 0

Owner

Eric Mitchell

PhD Student at Stanford University

