An attempt at the implementation of GLOM, Geoffrey Hinton's paper for emergent part-whole hierarchies from data

Overview

GLOM TensorFlow Twitter

PyPI Flake8 Lint Upload Python Package Python Version

Binder Open In Colab

GitHub license PEP8 GitHub stars GitHub followers Twitter Follow

This Python package attempts to implement GLOM in TensorFlow, which allows advances made by several different groups transformers, neural fields, contrastive representation learning, distillation and capsules to be combined. This was suggested by Geoffrey Hinton in his paper "How to represent part-whole hierarchies in a neural network".

Further, Yannic Kilcher's video and Phil Wang's repo was very helpful for me to implement this project.

Installation

Run the following to install:

pip install glom-tf

Developing glom-tf

To install glom-tf, along with tools you need to develop and test, run the following in your virtualenv:

git clone https://github.com/Rishit-dagli/GLOM-TensorFlow.git
# or clone your own fork

cd GLOM-TensorFlow
pip install -e .[dev]

A bit about GLOM

The GLOM architecture is composed of a large number of columns which all use exactly the same weights. Each column is a stack of spatially local autoencoders that learn multiple levels of representation for what is happening in a small image patch. Each autoencoder transforms the embedding at one level into the embedding at an adjacent level using a multilayer bottom-up encoder and a multilayer top-down decoder. These levels correspond to the levels in a part-whole hierarchy.

Interactions among the 3 levels in one column

An example shared by the author was as an example when show a face image, a single column might converge on embedding vectors representing a nostril, a nose, a face, and a person.

At each discrete time and in each column separately, the embedding at a level is updated to be the weighted average of:

  • bottom-up neural net acting on the embedding at the level below at the previous time
  • top-down neural net acting on the embedding at the level above at the previous time
  • embedding vector at the previous time step
  • attention-weighted average of the embeddings at the same level in nearby columns at the previous time

For a static image, the embeddings at a level should settle down over time to produce similar vectors.

A picture of the embeddings at a particular time

Usage

from glomtf import Glom

model = Glom(dim = 512,
             levels = 5,
             image_size = 224,
             patch_size = 14)

img = tf.random.normal([1, 3, 224, 224])
levels = model(img, iters = 12) # (1, 256, 5, 12)
# 1 - batch
# 256 - patches
# 5 - levels
# 12 - dimensions

Use the return_all = True argument to get all the column and level states per iteration. This also gives you access to all the level data across iterations for clustering, from which you can inspect the islands too.

from glomtf import Glom

model = Glom(dim = 512,
             levels = 5,
             image_size = 224,
             patch_size = 14)

img = tf.random.normal([1, 3, 224, 224])
all_levels = model(img, iters = 12, return_all = True) # (13, 1, 256, 5, 12)
# 13 - time

# top level outputs after iteration 6
top_level_output = all_levels[7, :, :, -1] # (1, 256, 512)
# 1 - batch
# 256 - patches
# 512 - dimensions

Want to Contribute πŸ™‹β€β™‚οΈ ?

Awesome! If you want to contribute to this project, you're always welcome! See Contributing Guidelines. You can also take a look at open issues for getting more information about current or upcoming tasks.

Want to discuss? πŸ’¬

Have any questions, doubts or want to present your opinions, views? You're always welcome. You can start discussions.

Citations

@misc{hinton2021represent,
    title   = {How to represent part-whole hierarchies in a neural network}, 
    author  = {Geoffrey Hinton},
    year    = {2021},
    eprint  = {2102.12627},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}

License

Copyright 2020 Rishit Dagli

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
You might also like...
Deep Multi-Magnification Network for multi-class tissue segmentation of whole slide images
Deep Multi-Magnification Network for multi-class tissue segmentation of whole slide images

Deep Multi-Magnification Network This repository provides training and inference codes for Deep Multi-Magnification Network published here. Deep Multi

The official implementation of the CVPR 2021 paper FAPIS: a Few-shot Anchor-free Part-based Instance Segmenter
The official implementation of the CVPR 2021 paper FAPIS: a Few-shot Anchor-free Part-based Instance Segmenter

FAPIS The official implementation of the CVPR 2021 paper FAPIS: a Few-shot Anchor-free Part-based Instance Segmenter Introduction This repo is primari

Utility tools for the
Utility tools for the "Divide and Remaster" dataset, introduced as part of the Cocktail Fork problem paper

Divide and Remaster Utility Tools Utility tools for the "Divide and Remaster" dataset, introduced as part of the Cocktail Fork problem paper The DnR d

Part-Aware Data Augmentation for 3D Object Detection in Point Cloud
Part-Aware Data Augmentation for 3D Object Detection in Point Cloud

Part-Aware Data Augmentation for 3D Object Detection in Point Cloud This repository contains a reference implementation of our Part-Aware Data Augment

Pytorch implementation of Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization https://arxiv.org/abs/2008.11646
Pytorch implementation of Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization https://arxiv.org/abs/2008.11646

[TCSVT] Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization LPN [Paper] NEWs Prerequisites Python 3.6 GPU Memory = 8G Numpy 1.

Towards Part-Based Understanding of RGB-D Scans
Towards Part-Based Understanding of RGB-D Scans

Towards Part-Based Understanding of RGB-D Scans (CVPR 2021) We propose the task of part-based scene understanding of real-world 3D environments: from

Kaggle | 9th place (part of) solution for the Bristol-Myers Squibb – Molecular Translation challenge

Part of the 9th place solution for the Bristol-Myers Squibb – Molecular Translation challenge translating images containing chemical structures into I

TorchIO is a Medical image preprocessing and augmentation toolkit for deep learning. Part of the PyTorch Ecosystem.
TorchIO is a Medical image preprocessing and augmentation toolkit for deep learning. Part of the PyTorch Ecosystem.

Medical image preprocessing and augmentation toolkit for deep learning. Part of the PyTorch Ecosystem.

Created as part of CS50 AI's coursework. This AI makes use of knowledge entailment to calculate the best probabilities to win Minesweeper.
Created as part of CS50 AI's coursework. This AI makes use of knowledge entailment to calculate the best probabilities to win Minesweeper.

Minesweeper-AI Created as part of CS50 AI's coursework. This AI makes use of knowledge entailment to calculate the best probabilities to win Minesweep

Comments
Releases(v0.1.1)
Owner
Rishit Dagli
High School,TEDx,2xTED-Ed speaker | International Speaker | Microsoft Student Ambassador | Mentor, @TFUGMumbai | Organize @KotlinMumbai
Rishit Dagli
An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates neural fields, predictive coding, top-down-bottom-up, and attention (consensus between columns)

GLOM - Pytorch (wip) An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates neural fields, predictive coding,

Phil Wang 173 Dec 14, 2022
Official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers

Visual Parser (ViP) This is the official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers. Key Feature

Shuyang Sun 117 Dec 11, 2022
PyTorch implementation of GLOM

GLOM PyTorch implementation of GLOM, Geoffrey Hinton's new idea that integrates concepts from neural fields, top-down-bottom-up processing, and attent

Yeonwoo Sung 20 Aug 17, 2022
PyTorch implementation of our ICCV 2021 paper, Interpretation of Emergent Communication in Heterogeneous Collaborative Embodied Agents.

PyTorch implementation of our ICCV 2021 paper, Interpretation of Emergent Communication in Heterogeneous Collaborative Embodied Agents.

Saim Wani 4 May 8, 2022
Code for Emergent Translation in Multi-Agent Communication

Emergent Translation in Multi-Agent Communication PyTorch implementation of the models described in the paper Emergent Translation in Multi-Agent Comm

Facebook Research 75 Jul 15, 2022
A modified version of DeepMind's Alphafold2 to divide CPU part (MSA and template searching) and GPU part (prediction model)

ParallelFold Author: Bozitao Zhong This is a modified version of DeepMind's Alphafold2 to divide CPU part (MSA and template searching) and GPU part (p

Bozitao Zhong 77 Dec 22, 2022
Replication attempt for the Protein Folding Model

RGN2-Replica (WIP) To eventually become an unofficial working Pytorch implementation of RGN2, an state of the art model for MSA-less Protein Folding f

Eric Alcaide 36 Nov 29, 2022
Inference code for "StylePeople: A Generative Model of Fullbody Human Avatars" paper. This code is for the part of the paper describing video-based avatars.

NeuralTextures This is repository with inference code for paper "StylePeople: A Generative Model of Fullbody Human Avatars" (CVPR21). This code is for

Visual Understanding Lab @ Samsung AI Center Moscow 18 Oct 6, 2022
HPRNet: Hierarchical Point Regression for Whole-Body Human Pose Estimation

HPRNet: Hierarchical Point Regression for Whole-Body Human Pose Estimation Official PyTroch implementation of HPRNet. HPRNet: Hierarchical Point Regre

Nermin Samet 53 Dec 4, 2022
minimizer-space de Bruijn graphs (mdBG) for whole genome assembly

rust-mdbg: Minimizer-space de Bruijn graphs (mdBG) for whole-genome assembly rust-mdbg is an ultra-fast minimizer-space de Bruijn graph (mdBG) impleme

Barış Ekim 148 Dec 1, 2022