Memoized coduals - Shows that it is possible to implement reverse mode autodiff using a variation on the dual numbers called the codual numbers

wlad

Last update: Dec 19, 2022

Related tags

Overview

The dual numbers can do efficient autodiff!

The codual numbers are a simple method of doing automatic differentiation in reverse mode. They contrast with the dual numbers which provide an easy way of doing automatic differentiation in forward mode. The difference between the two modes is that sometimes one is faster than the other.

The folklore appears to be that forward mode autodiff is easy to implement because it can be done using the beautiful algebra of dual numbers, while the same is assumed to not be the case for reverse mode. This repository presents a counterargument that a variant of the dual numbers – called the codual numbers – can be used to represent an implementation of reverse mode autodiff that is just as elegant and terse as can be done for forward mode. This idea was first suggested by Sandro Magi (pseudonym: Naasking).

This implementation of the codual numbers differs from Sandro Magi’s by using simple memoisation to eliminate the exponential worst-case behaviour he encountered. In Magi’s original implementation, this idea seems obscured, largely because the code was more effectful and therefore the opportunity for memoisation was less apparent. The memoisation is achieved using only one additional line of code!

This implementation should be simpler and more transparent than Magi’s, I hope. It also suggests that Magi’s reasoning behind the term “codual numbers” is perhaps misleading.

Definition of dual number and codual number

The codual numbers are the set

$\mathbb R \times \mathbb R,$

while the codual numbers are a subset of

$\mathbb R \times \mathbb R ^ {\mathbb R}$

where the second component is always a linear map.

A notation that’s used to write a dual number is $a + b \varepsilon$ , which stands for . Formally, $\varepsilon^2 = 0$ while $\varepsilon \neq 0$ .

The codual numbers may be represented using exactly the same notation as the dual numbers. They are no different than the dual numbers, except in how they’re represented on a computer! Using lambda calculus notation (which I assume you are familiar with) any dual number can be turned into the codual number $(a, \lambda k. \,kb)$ , and conversely every codual number can be turned into the dual number . The difference is merely one of data structure; we need a closure to represent the codual numbers.

The definition of an operation on the codual numbers can be inferred from its definition on the dual numbers. We demonstrate this using multiplication. For dual numbers, we may define multiplication by:

$(a,a') \times (b,b') = (ab, ab' + ba').$

For the codual numbers, we may use the correspondence $(a,b') \mapsto (a, \lambda k. \,kb)$ to get:

$(a,A) \times (b,B) = (ab, \lambda k. \,k\cdot(a\cdot B(1) + b\cdot A(1))),$

where by “ $\cdot$ ”, we mean multiplication of real numbers. Using the fact that and are linear maps, we can rearrange this to:

$(a,A) \times (b,B) = (ab, \lambda k. \,B(ak) + A(bk))).$

This is precisely how we define multiplication of codual numbers in the code.

Relationship with other autodiff strategies

It appears that there are three ways of doing reverse-mode autodiff, which correspond directly to the three “stages” of solving a problem using dynamic programming. See the table below:

Idea	Example	Corresponding autodiff algorithm
Unmemoised recursion	Exhibit A	Unmemoised coduals
Memoised recursion, or top-down dynamic programming	Exhibit B	Memoised coduals
Bottom-up dynamic programming	Exhibit C	Tape-based autodiff

This suggests that the tape-based approach can be derived from the coduals.

Exhibit A:

def fib(n):
    if n == 0 or n == 1:
        return n
    else:
        return fib(n-1) + fib(n-2)

Exhibit B:

from functools import cache

@cache
def fib(n):
    if n == 0 or n == 1:
        return n
    else:
        return fib(n-1) + fib(n-2)

Exhibit C:

def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

Generalized hybrid model for mode-locked laser diodes with an extended passive cavity

GenHybridMLLmodel Generalized hybrid model for mode-locked laser diodes with an extended passive cavity This hybrid simulation strategy combines a tra

3 Sep 21, 2022

This is the unofficial code of Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes. which achieve state-of-the-art trade-off between accuracy and speed on cityscapes and camvid, without using inference acceleration and extra data

Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes Introduction This is the unofficial code of Deep Dual-re

113 Dec 23, 2022

PyTorch code for 'Efficient Single Image Super-Resolution Using Dual Path Connections with Multiple Scale Learning'

Efficient Single Image Super-Resolution Using Dual Path Connections with Multiple Scale Learning This repository is for EMSRDPN introduced in the foll

7 Feb 10, 2022

Rayvens makes it possible for data scientists to access hundreds of data services within Ray with little effort.

Rayvens augments Ray with events. With Rayvens, Ray applications can subscribe to event streams, process and produce events. Rayvens leverages Apache

32 Dec 25, 2022

Code for ICCV 2021 paper: ARAPReg: An As-Rigid-As Possible Regularization Loss for Learning Deformable Shape Generators..

ARAPReg Code for ICCV 2021 paper: ARAPReg: An As-Rigid-As Possible Regularization Loss for Learning Deformable Shape Generators.. Installation The cod

132 Nov 28, 2022

Multiple types of NN model optimization environments. It is possible to directly access the host PC GUI and the camera to verify the operation. Intel iHD GPU (iGPU) support. NVIDIA GPU (dGPU) support.

mtomo Multiple types of NN model optimization environments. It is possible to directly access the host PC GUI and the camera to verify the operation.

24 Mar 2, 2022

Python script that analyses the given datasets and comes up with the best polynomial regression representation with the smallest polynomial degree possible

Python script that analyses the given datasets and comes up with the best polynomial regression representation with the smallest polynomial degree possible, to be the most reliable with the least complexity possible

2 Aug 1, 2022

The pyrelational package offers a flexible workflow to enable active learning with as little change to the models and datasets as possible

pyrelational is a python active learning library developed by Relation Therapeutics for rapidly implementing active learning pipelines from data management, model development (and Bayesian approximation), to creating novel active learning strategies.

95 Dec 27, 2022

The LaTeX and Python code for generating the paper, experiments' results and visualizations reported in each paper is available (whenever possible) in the paper's directory

This repository contains the software implementation of most algorithms used or developed in my research. The LaTeX and Python code for generating the

3 Jan 3, 2023

Memoized coduals - Shows that it is possible to implement reverse mode autodiff using a variation on the dual numbers called the codual numbers

Related tags

Overview

The dual numbers can do efficient autodiff!

Definition of dual number and codual number

Relationship with other autodiff strategies

You might also like...

Generalized hybrid model for mode-locked laser diodes with an extended passive cavity

This is the unofficial code of Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes. which achieve state-of-the-art trade-off between accuracy and speed on cityscapes and camvid, without using inference acceleration and extra data

PyTorch code for 'Efficient Single Image Super-Resolution Using Dual Path Connections with Multiple Scale Learning'

Rayvens makes it possible for data scientists to access hundreds of data services within Ray with little effort.

Code for ICCV 2021 paper: ARAPReg: An As-Rigid-As Possible Regularization Loss for Learning Deformable Shape Generators..

Multiple types of NN model optimization environments. It is possible to directly access the host PC GUI and the camera to verify the operation. Intel iHD GPU (iGPU) support. NVIDIA GPU (dGPU) support.

Python script that analyses the given datasets and comes up with the best polynomial regression representation with the smallest polynomial degree possible

The pyrelational package offers a flexible workflow to enable active learning with as little change to the models and datasets as possible

The LaTeX and Python code for generating the paper, experiments' results and visualizations reported in each paper is available (whenever possible) in the paper's directory

Owner

wlad

This Jupyter notebook shows one way to implement a simple first-order low-pass filter on sampled data in discrete time.

Request execution of Galaxy SARS-CoV-2 variation analysis workflows on input data you provide.

PyTorch implementation of Progressive Growing of GANs for Improved Quality, Stability, and Variation.

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

This repository contains a set of codes to run (i.e., train, perform inference with, evaluate) a diarization method called EEND-vector-clustering.

VSR-Transformer - This paper proposes a new Transformer for video super-resolution (called VSR-Transformer).

This project deploys a yolo fastest model in the form of tflite on raspberry 3b+. The model is from another repository of mine called -Trash-Classification-Car

Research shows Google collects 20x more data from Android than Apple collects from iOS. Block this non-consensual telemetry using pihole blocklists.

Custom TensorFlow2 implementations of forward and backward computation of soft-DTW algorithm in batch mode.

Script that attempts to force M1 macs into RGB mode when used with monitors that are defaulting to YPbPr.