MEND: Model Editing Networks using Gradient Decomposition

Related tags

Deep Learning mend
Overview

MEND: Model Editing Networks using Gradient Decomposition

Setup

Environment

This codebase uses Python 3.7.9. Other versions may work as well.

Create a virtualenv (pyenv can help with this) and install the dependencies:

$ python -m venv env
$ source env/bin/activate
(env) $ pip install -r requirements.txt

Data

You can download the data needed for this project from this Google Drive link. Unzip each sub-directory into mend/data and you should be good to go.

Running the code

Run MEND training/evaluation for distilGPT-2 on the wikitext editing problem with:

(env) $ python -m run +alg=mend +experiment=gen +model=distilgpt2

Other valid algs include efk (KnowledgeEditor) and enn (Editable Neural Networks). Valid experiments include fc (FEVER fact checking) and qa (zsRE question-answering). Splits and rephrases for both come from De Cao et. al. Check config/model for options for editable models (note that all models don't work for all experiments; GPT-style models only work with gen, seq2seq models only work with qa, and BERT only works with fc).

Also note that in the paper, we sample locality data from different datasets depending on the model. By default, training will use Natural Questions data (not zsRE data) for computing drawdown in the qa experiment and OpenWebText. For models such as the distilgpt2 model we use (which was fine-tuned on wikitext) or the BART-base model, this behavior should be disabled with data.wiki_webtext=False or data.zsre_nq=False, respectively.

Citing the paper

If this code or paper was useful, please consider using the following citation:

@article{mitchell2021fast,
    title={Fast Model Editing at Scale},
    author={Mitchell, Eric and Lin, Charles and Bosselut, Antoine and Finn, Chelsea and Manning, Chris}
    year={2021}
}
Comments
  • not being able to install the requirements

    not being able to install the requirements

    Hi I am getting the following error when installing the requirements with pip install -r requirements.txt

    Collecting git+git://github.com/eric-mitchell/higher@master (from -r requirements.txt (line 7)) Cloning git://github.com/eric-mitchell/higher (to revision master) to /tmp/pip-req-build-z1v0erey Running command git clone --filter=blob:none --quiet git://github.com/eric-mitchell/higher /tmp/pip-req-build-z1v0erey fatal: unable to connect to github.com: github.com[0: 192.30.255.113]: errno=Connection timed out

    error: subprocess-exited-with-error

    × git clone --filter=blob:none --quiet git://github.com/eric-mitchell/higher /tmp/pip-req-build-z1v0erey did not run successfully. │ exit code: 128 ╰─> See above for output.

    note: This error originates from a subprocess, and is likely not a problem with pip. error: subprocess-exited-with-error

    × git clone --filter=blob:none --quiet git://github.com/eric-mitchell/higher /tmp/pip-req-build-z1v0erey did not run successfully. │ exit code: 128 ╰─> See above for output.

    note: This error originates from a subprocess, and is likely not a problem with pip.

    opened by cameronalonso2 2
  • About the data file in google drive

    About the data file in google drive

    Hi, I downloaded the zip files from google drive but failed to unzip them.

    Archive:  10token.zip
      End-of-central-directory signature not found.  Either this file is not
      a zipfile, or it constitutes one disk of a multi-part archive.  In the
      latter case the central directory and zipfile comment will be found on
      the last disk(s) of this archive.
    unzip:  cannot find zipfile directory in one of 10token.zip or
            10token.zip.zip, and cannot find 10token.zip.ZIP, period.
    
    opened by xwwwwww 2
  • Why Choose QA Task in Batched edits with MEND and ENN?

    Why Choose QA Task in Batched edits with MEND and ENN?

    Editing tasks have three categories: binary classification, QA and generation. In Batched edits, Why choose QA? And What is the result under finetune(FT)?

    Thanks!

    opened by sev777 1
  • Can't find 10token data

    Can't find 10token data

    Hi, I'm trying to run python -m run +alg=mend +experiment=gen +model=distilgpt2 data.wiki_webtext=False, but I get a file not found error for data/10token/data/self_sample/train.json. I downloaded 10token from the linked google drive folder and unzipped it to data/10token. However, when I unzip it, all I get is a single 10token file, no train.json. Not sure if I'm missing something here. Thanks!

    opened by salemohamedo 1
  • How to find the define of __x__ and __delta__ ?

    How to find the define of __x__ and __delta__ ?

    Hi, Mr.Eric, I find the code : param_idx = lambda n, p: self.shape_dict[self.get_shape(p)].index(n) if self.config.mend.shared else None # noqa: E731 transformed_factors = { n: self.mend[str(tuple(self.get_shape(p)))](p.__x__, p.__delta__, param_idx(n, p)) for n, p in _inner_params(self.model.named_parameters(), self.config.model.inner_params) } I want to know where can I find the define of x and delta in code ? Is it in torch or python? Thank you!

    opened by sev777 1
  • A post about your good repository in the medium

    A post about your good repository in the medium

    opened by ahkarami 1
  • How to apply this methods to a new LMs without fine-tune?

    How to apply this methods to a new LMs without fine-tune?

    Hi, In your paper, does the model edited in MEND methods must be fine-tuned? How to apply the MEND to the model without fine-tuning ? When I try to do this, the acc_train is always 0.

    Thanks.

    opened by sev777 0
Owner
Eric Mitchell
PhD Student at Stanford University
Eric Mitchell
A PyTorch implementation of Learning to learn by gradient descent by gradient descent

Intro PyTorch implementation of Learning to learn by gradient descent by gradient descent. Run python main.py TODO Initial implementation Toy data LST

Ilya Kostrikov 300 Dec 11, 2022
PyTorch implementation of HDN(Homography Decomposition Networks) for planar object tracking

Homography Decomposition Networks for Planar Object Tracking This project is the offical PyTorch implementation of HDN(Homography Decomposition Networ

CaptainHook 48 Dec 15, 2022
PyTorch implementation of Spiking Neural Networks trained on surrogate gradient & BPTT using snntorch.

snn-localization repo PyTorch implementation of Spiking Neural Networks trained on surrogate gradient & BPTT using snntorch. Install Dependencies Orig

Sami BARCHID 1 Jan 6, 2022
DeepLM: Large-scale Nonlinear Least Squares on Deep Learning Frameworks using Stochastic Domain Decomposition (CVPR 2021)

DeepLM DeepLM: Large-scale Nonlinear Least Squares on Deep Learning Frameworks using Stochastic Domain Decomposition (CVPR 2021) Run Please install th

Jingwei Huang 130 Dec 2, 2022
Cancer-and-Tumor-Detection-Using-Inception-model - In this repo i am gonna show you how i did cancer/tumor detection in lungs using deep neural networks, specifically here the Inception model by google.

Cancer-and-Tumor-Detection-Using-Inception-model In this repo i am gonna show you how i did cancer/tumor detection in lungs using deep neural networks

Deepak Nandwani 1 Jan 1, 2022
Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.

Core ML Tools Use coremltools to convert machine learning models from third-party libraries to the Core ML format. The Python package contains the sup

Apple 3k Jan 8, 2023
Official codebase for running the small, filtered-data GLIDE model from GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models.

GLIDE This is the official codebase for running the small, filtered-data GLIDE model from GLIDE: Towards Photorealistic Image Generation and Editing w

OpenAI 2.9k Jan 4, 2023
Implementation of a protein autoregressive language model, but with autoregressive infilling objective (editing subsequences capability)

Protein GLM (wip) Implementation of a protein autoregressive language model, but with autoregressive infilling objective (editing subsequences capabil

Phil Wang 17 May 6, 2022
Keras implementation of Normalizer-Free Networks and SGD - Adaptive Gradient Clipping

Keras implementation of Normalizer-Free Networks and SGD - Adaptive Gradient Clipping

Yam Peleg 63 Sep 21, 2022
On the model-based stochastic value gradient for continuous reinforcement learning

On the model-based stochastic value gradient for continuous reinforcement learning This repository is by Brandon Amos, Samuel Stanton, Denis Yarats, a

Facebook Research 46 Dec 15, 2022
Functional TensorFlow Implementation of Singular Value Decomposition for paper Fast Graph Learning

tf-fsvd TensorFlow Implementation of Functional Singular Value Decomposition for paper Fast Graph Learning with Unique Optimal Solutions Cite If you f

Sami Abu-El-Haija 14 Nov 25, 2021
[ICLR 2021] Is Attention Better Than Matrix Decomposition?

Enjoy-Hamburger ?? Official implementation of Hamburger, Is Attention Better Than Matrix Decomposition? (ICLR 2021) Under construction. Introduction T

Gsunshine 271 Dec 29, 2022
STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech

STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech Keon Lee, Ky

Keon Lee 114 Dec 12, 2022
official code for dynamic convolution decomposition

Revisiting Dynamic Convolution via Matrix Decomposition (ICLR 2021) A pytorch implementation of DCD. If you use this code in your research please cons

Yunsheng Li 110 Nov 23, 2022
Code for "Unsupervised Layered Image Decomposition into Object Prototypes" paper

DTI-Sprites Pytorch implementation of "Unsupervised Layered Image Decomposition into Object Prototypes" paper Check out our paper and webpage for deta

null 40 Dec 22, 2022
Continuous Query Decomposition for Complex Query Answering in Incomplete Knowledge Graphs

Continuous Query Decomposition This repository contains the official implementation for our ICLR 2021 (Oral) paper, Complex Query Answering with Neura

UCL Natural Language Processing 71 Dec 29, 2022
NeRD: Neural Reflectance Decomposition from Image Collections

NeRD: Neural Reflectance Decomposition from Image Collections Project Page | Video | Paper | Dataset Implementation for NeRD. A novel method which dec

Computergraphics (University of Tübingen) 195 Dec 29, 2022
This is the code repository implementing the paper "TreePartNet: Neural Decomposition of Point Clouds for 3D Tree Reconstruction".

TreePartNet This is the code repository implementing the paper "TreePartNet: Neural Decomposition of Point Clouds for 3D Tree Reconstruction". Depende

刘彦超 34 Nov 30, 2022
Learnable Multi-level Frequency Decomposition and Hierarchical Attention Mechanism for Generalized Face Presentation Attack Detection

LMFD-PAD Note This is the official repository of the paper: LMFD-PAD: Learnable Multi-level Frequency Decomposition and Hierarchical Attention Mechani

null 28 Dec 2, 2022