Parameterization of Hypercomplex Multiplications (PHM)
This repository contains the TensorFlow implementation of PHM (Parameterization of Hypercomplex Multiplications) layers and PHM-Transformers from the ICLR 2021 paper Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with 1/n Parameters.
Installation
Install the following libraries before running our code:
- tensorflow-gpu (1.14.0)
- tensor2tensor (1.14.0)
Usage
The usage of this repository follows the original tensor2tensor repository (e.g., t2t-datagen, t2t-trainer, t2t-avg-all, followed by t2t-decoder). It helps to gain familiarity with tensor2tensor before attempting to run our code. Specifically, setting --t2t_usr_dir=./Parameterization-of-Hypercomplex-Multiplications will allow tensor2tensor to register PHM-Transformers.
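For example, the flag can be passed to any of the tensor2tensor binaries; a minimal data-generation sketch (where $DATA_DIR and the temporary directory are placeholder paths chosen by the user) might look like:
t2t-datagen \
--t2t_usr_dir=./Parameterization-of-Hypercomplex-Multiplications \
--problem=translate_envi_iwslt32k \
--data_dir=$DATA_DIR \
--tmp_dir=/tmp/t2t_tmp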
Training
For example, to evaluate PHM-Transformer (n=4) on the En-Vi machine translation task (t2t-datagen --problem=translate_envi_iwslt32k), one may set the following flags when training:
t2t-trainer \
--problem=translate_envi_iwslt32k \
--model=light_transformer \
--hparams_set=light_transformer_base_single_gpu \
--hparams="light_mode='random',hidden_size=512,factor=4" \
--train_steps=50000
where light_transformer with light_mode='random' is the alias of the PHM-Transformer in our implementation.
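In practice, the t2t-trainer command above is also given the standard tensor2tensor flags, e.g. --data_dir=$DATA_DIR (where t2t-datagen wrote the data), --output_dir=$TRAIN_DIR (where checkpoints are saved), and the --t2t_usr_dir flag described in the Usage section; here $DATA_DIR and $TRAIN_DIR are placeholder paths chosen by the user.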
Aggregating Checkpoints
After training, the latest 8 checkpoints are averaged:
t2t-avg-all --model_dir $TRAIN_DIR --output_dir $AVG_DIR --n 8
where $TRAIN_DIR and $AVG_DIR need to be specified by users.
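For example (the paths below are illustrative only and depend on where training wrote its checkpoints):
TRAIN_DIR=$HOME/t2t_train/translate_envi_iwslt32k
AVG_DIR=$TRAIN_DIR/averaged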
Testing
To decode the target sequence, one has to additionally set the decode_hparams as follows:
t2t-decoder \
--decode_hparams="beam_size=5,alpha=0.6"
Then t2t-bleu is invoked to calculate the BLEU score.
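Putting the pieces together, a full decoding run might look like the sketch below, where $DATA_DIR and $AVG_DIR are the directories from the previous steps, and $SOURCE_FILE, $OUTPUT_FILE, and $REFERENCE_FILE are placeholder paths for the source sentences, the produced translations, and the reference translations:
t2t-decoder \
--t2t_usr_dir=./Parameterization-of-Hypercomplex-Multiplications \
--data_dir=$DATA_DIR \
--problem=translate_envi_iwslt32k \
--model=light_transformer \
--hparams_set=light_transformer_base_single_gpu \
--hparams="light_mode='random',hidden_size=512,factor=4" \
--output_dir=$AVG_DIR \
--decode_hparams="beam_size=5,alpha=0.6" \
--decode_from_file=$SOURCE_FILE \
--decode_to_file=$OUTPUT_FILE
t2t-bleu --translation=$OUTPUT_FILE --reference=$REFERENCE_FILE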
PHM Implementations
PHM is implemented with operations in make_random_mul and random_ffn, which are mathematically equivalent to a sum of Kronecker products.
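As a rough illustration of that math (a minimal NumPy sketch, not the repository's make_random_mul/random_ffn code; phm_weight and phm_layer are hypothetical names), a PHM layer with n blocks builds its weight matrix as a sum of n Kronecker products:
import numpy as np

def phm_weight(A, S):
    # A: (n, n, n) blocks, S: (n, d/n, k/n) blocks.
    # The PHM weight matrix is sum_i kron(A[i], S[i]), a (d, k) matrix
    # parameterized by roughly 1/n of the parameters of a dense (d, k)
    # matrix when d and k are large.
    return sum(np.kron(A[i], S[i]) for i in range(A.shape[0]))

def phm_layer(x, A, S):
    # x: (batch, d) inputs; returns (batch, k) outputs, like a dense layer.
    return x @ phm_weight(A, S)

# Example: n=4, input dim d=8, output dim k=8.
n, d, k = 4, 8, 8
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n, n))
S = rng.standard_normal((n, d // n, k // n))
x = rng.standard_normal((2, d))
y = phm_layer(x, A, S)  # shape (2, 8)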
Among works that use PHM, some have offered alternative PHM implementations:
- Parameterized Hypercomplex Graph Neural Networks
- COMPACTER: Efficient Low-Rank Hypercomplex Adapter Layers
- Convolutional Neural Networks by Hypercomplex Parameterization
- demegire/Parameterization-of-Hypercomplex-Multiplications
Citation
If you find this repository helpful, please cite our paper:
@inproceedings{zhang2021beyond,
  title={Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with $1/n$ Parameters},
  author={Zhang, Aston and Tay, Yi and Zhang, Shuai and Chan, Alvin and Luu, Anh Tuan and Hui, Siu Cheung and Fu, Jie},
  booktitle={International Conference on Learning Representations},
  year={2021}
}