Neural Architecture Search Powered by Swarm Intelligence 🐜

Overview

Neural Architecture Search Powered by Swarm Intelligence 🐜

DeepSwarm

DeepSwarm is an open-source library which uses Ant Colony Optimization to tackle the neural architecture search problem. The main goal of DeepSwarm is to automate one of the most tedious and daunting tasks, so people can spend more of their time on more important and interesting things. DeepSwarm offers a powerful configuration system which allows you to fine-tune the search space to your needs.

Example πŸ–Ό

from deepswarm.backends import Dataset, TFKerasBackend
from deepswarm.deepswarm import DeepSwarm

dataset = Dataset(training_examples=x_train, training_labels=y_train, testing_examples=x_test, testing_labels=y_test)
backend = TFKerasBackend(dataset=dataset)
deepswarm = DeepSwarm(backend=backend)
topology = deepswarm.find_topology()
trained_topology = deepswarm.train_topology(topology, 50)

Installation πŸ’Ύ

  1. Install the package

    pip install deepswarm
  2. Install one of the implemented backends that you want to use

    pip install tensorflow-gpu==1.13.1

Usage πŸ•Ή

  1. Create a new file containing the example code

    touch train.py
  2. Create settings directory which contains default.yaml file. Alternatively you can run the script and instantly stop it, as this should automatically create settings directory which contains default.yaml file

  3. Update the newly created YAML file to your dataset needs. The only two important changes you must make are: (1) change the loss function to reflect your task (2) change the shape of input and output nodes

Search πŸ”Ž

(1) The ant is placed on the input node. (2) The ant checks what transitions are available. (3) The ant uses the ACS selection rule to choose the next node. (4) After choosing the next node the ant selects node’s attributes. (5) After all ants finished their tour the pheromone is updated. (6) The maximum allowed depth is increased and the new ant population is generated.

Note: Arrow thickness indicates the pheromone amount, meaning that thicker arrows have more pheromone.

Configuration πŸ› 

Node type Attributes
Input shape: tuple which defines the input shape, depending on the backend could be (width, height, channels) or (channels, width, height).
Conv2D filter_count: defines how many filters can be used.
kernel_size: defines what size kernels can be used. For example, if it is set to [1, 3], then only 1x1 and 3x3 kernels will be used.
activation: defines what activation functions can be used. Allowed values are: ReLU, ELU, LeakyReLU, Sigmoid and Softmax.
Dropout rate: defines the allowed dropout rates. For example, if it is set to [0.1, 0.3], then either 10% or 30% of input units will be dropped.
BatchNormalization -
Pool2D pool_type: defines the types of allowed pooling nodes. Allowed values are: max (max pooling) and average (average pooling).
pool_size: defines the allowed pooling window sizes. For example, if it is set to [2], then only 2x2 pooling windows will be used.
stride: defines the allowed stride sizes.
Flatten -
Dense output_size: defines the allowed output space dimensionality.
activation: defines what activation functions can be used. Allowed values are: ReLU, ELU, LeakyReLU, Sigmoid and Softmax.
Output output_size: defines the output size (how many different classes to classify).
activation: defines what activation functions can be used. Allowed value are ReLU, ELU, LeakyReLU, Sigmoid and Softmax.
Setting Description
save_folder Specifies the name of the folder which should be used to load the backup. If not specified the search will start from zero.
metrics Specifies what metrics should algorithm use to evaluate the models. Currently available options are: accuracy and loss.
max_depth Specifies the maximum allowed network depth (how deeply the graph can be expanded). The search is performed until the maximum depth is reached. However, it does not mean that the depth of the best architecture will be equal to the max_depth.
reuse_patience Specifies the maximum number of times that weights can be reused without improving the cost. For example, if it is set to 1 it means that when some model X reuses weights from model Y and model X cost did not improve compared to model Y, next time instead of reusing model Y weights, new random weights will be generated.
start Specifies the starting pheromone value for all the new connections.
decay Specifies the local pheromone decay rate in percentage. For example, if it is set to 0.1 it means that during the local pheromone update the pheromone value will be decreased by 10%.
evaporation Specifies the global pheromone evaporation rate in percentage. For example, if it is set to 0.1 it means that during the global pheromone update the pheromone value will be decreased by 10%.
greediness Specifies how greedy should ants be during the edge selection (the number is given in percentage). For example, 0.5 means that 50% of the time when ant selects a new edge it should select the one with the highest associated probability.
ant_count Specifies how many ants should be generated during each generation (time before the depth is increased).
epochs Specifies for how many epochs each candidate architecture should be trained.
batch_size Specifies the batch size (number of samples used to calculate a single gradient step) used during the training process.
patience Specifies the early stopping number used during the training (after how many epochs when the cost is not improving the training process should be stopped).
loss Specifies what loss function should be used during the training. Currently available options are sparse_categorical_crossentropy and categorical_crossentropy.
spatial_nodes Specifies which nodes are placed before the flattening node. Values in this array must correspond to node names.
flat_nodes Specifies which nodes are placed after the flattening node (array should also include the flattening node). Values in this array must correspond to node names.
verbose Specifies if the associated component should log the output.

Future goals 🌟

  • Add a node which can combine the input from the two previous nodes.
  • Add a node which can skip the depth n in order to connect to the node in depth n+1.
  • Delete the models which are not referenced anymore.
  • Add an option to assemble the best n models into one model.
  • Add functionality to reuse the weights from the non-continues blocks, i.e. take the best weights for depth n-1 from one model and then take the best weights for depth n+1 from another model.

Citation πŸ–‹

Online version is available at: arXiv:1905.07350

@article{byla2019deepswarm,
  title   =  {DeepSwarm: Optimising Convolutional Neural Networks using Swarm Intelligence},
  author  =  {Edvinas Byla and Wei Pang},
  journal =  {arXiv preprint arXiv:1905.07350},
  year    =  {2019}
}

Acknowledgments πŸŽ“

DeepSwarm was developed under the supervision of Dr Wei Pang in partial fulfilment of the requirements for the degree of Bachelor of Science of the University of Aberdeen.

Comments
  • Out of memory when saving object

    Out of memory when saving object

        def save_object(self, data, name):
            """Saves given object to the object backup directory.
    
            Args:
                data: object that needs to be saved.
                name: string value representing the name of the object.
            """
    
            with open(str(self.current_path / Storage.DIR["OBJECT"] / name), 'wb') as f:
                pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)
    

    this yeilds a MemoryError when trying to do the pickle. It takes up at least 20GB of RAM. My model is fairly small. See log excerpt following:

    -------------------------------NEW BEST ANT FOUND-------------------------------
    ---------------------------BEST ANT DURING ITERATION----------------------------
    ======= 
     Ant: 0x5eaa4358 
     Loss: 0.246397 
     Accuracy: 0.908695 
     Path: InputNode(shape:(256, 256, 1)) -> Conv2DNode(kernel_size:3, filter_count:64, activation:ReLU) -> FlattenNode() -> OutputNode(output_size:1, activation:Sigmoid) 
     Hash: 0f8bcb2b4fbd88fb3c5ffca7be08add054c8d35df1f122f4db0f043a91f7c901 
    =======
    
    opened by isaacgerg 9
  • Error after generating ant 1

    Error after generating ant 1

    I get the following error everytime after generating ant 1

    builtins.ValueError: Tensor("Adam/iterations:0", shape=(), dtype=resource) must be from the same graph as Tensor("training/Adam/Const:0", shape=(), dtype=int64).

    Prior output is:

    ======= Ant: 0x5f7a92e8 Loss: 0.531042 Accuracy: 0.750683 Path: InputNode(shape:(256, 256, 1)) -> Conv2DNode(kernel_size:1, filter_count:64, activation:ReLU) -> FlattenNode() -> OutputNode(output_size:1, activation:Sigmoid) Hash: 2510cfe0ba8648855dc73f4c2cb8e7ff75878eeee906c211c51f470bc6ff0547

    ---------------------------Current search depth is 1---------------------------- --------------------------------GENERATING ANT 1-------------------------------- Train on 5270 samples, validate on 586 samples

    opened by isaacgerg 6
  •  builtins.NotImplementedError: numpy() is only available when eager execution is enabled.

    builtins.NotImplementedError: numpy() is only available when eager execution is enabled.

    Got this error after running a bit more.

    Stack:

    File "D:\ASASINATR\projects\muscle_deep_swarm\trainer.py", line 332, in main() File "D:\ASASINATR\projects\muscle_deep_swarm\trainer.py", line 325, in main topology = deepswarm.find_topology() File "c:\python35\Lib\site-packages\deepswarm\deepswarm.py", line 43, in find_topology best_ant = self.aco.search() File "c:\python35\Lib\site-packages\deepswarm\aco.py", line 61, in search self.storage.perform_backup() File "c:\python35\Lib\site-packages\deepswarm\storage.py", line 67, in perform_backup self.save_object(self.deepswarm, Storage.ITEM["BACKUP"]) File "c:\python35\Lib\site-packages\deepswarm\storage.py", line 213, in save_object pickle.dump(data, f, pickle.HIGHEST_PROTOCOL) File "c:\python35\Lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 923, in reduce return (ResourceVariable, (self.numpy(),)) File "c:\python35\Lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 687, in numpy "numpy() is only available when eager execution is enabled.")

    builtins.NotImplementedError: numpy() is only available when eager execution is enabled.

    opened by isaacgerg 4
  • Readme should indicate this work only in python 3.6 and greater

    Readme should indicate this work only in python 3.6 and greater

    WindowsPath doesnt integrate nicely into open until python 3.6.

    EDIT 1: you could also wrap the open(windowsPath) methods with open(str(windowsPath))

    opened by isaacgerg 2
  • Is

    Is "SPMT" spearmint?

    Since it is not referenced, so I have this question in the paper.

    If so that would be very amazing, the comparison between PSO and Bayesian Optimization has been validated in SigOpt's paper where two of them were very close instead of such a big gap in your paper. However, they didn't optimize topology but only the number of hidden units. I think maybe this is the main reason why NAS requires a different techniques than traditional ML where topology is hard-wired in the model itself.

    opened by pswpswpsw 1
  • Pytorch implementation

    Pytorch implementation

    First up, thanks for this wonderfully structured code. Works perfectly with Tensorflow 1.13.1

    I am currently working on adapting this for time series data for my use-case. In the backends.py there is a provision to extend this to other backends than TF. Is there a PyTorch implementation available for the same?

    opened by tushara21 1
  • Bump pyyaml from 5.1 to 5.4

    Bump pyyaml from 5.1 to 5.4

    Bumps pyyaml from 5.1 to 5.4.

    Changelog

    Sourced from pyyaml's changelog.

    5.4 (2021-01-19)

    5.3.1 (2020-03-18)

    • yaml/pyyaml#386 -- Prevents arbitrary code execution during python/object/new constructor

    5.3 (2020-01-06)

    5.2 (2019-12-02)

    • Repair incompatibilities introduced with 5.1. The default Loader was changed, but several methods like add_constructor still used the old default yaml/pyyaml#279 -- A more flexible fix for custom tag constructors yaml/pyyaml#287 -- Change default loader for yaml.add_constructor yaml/pyyaml#305 -- Change default loader for add_implicit_resolver, add_path_resolver
    • Make FullLoader safer by removing python/object/apply from the default FullLoader yaml/pyyaml#347 -- Move constructor for object/apply to UnsafeConstructor
    • Fix bug introduced in 5.1 where quoting went wrong on systems with sys.maxunicode <= 0xffff yaml/pyyaml#276 -- Fix logic for quoting special characters
    • Other PRs: yaml/pyyaml#280 -- Update CHANGES for 5.1

    5.1.2 (2019-07-30)

    • Re-release of 5.1 with regenerated Cython sources to build properly for Python 3.8b2+

    ... (truncated)

    Commits
    • 58d0cb7 5.4 release
    • a60f7a1 Fix compatibility with Jython
    • ee98abd Run CI on PR base branch changes
    • ddf2033 constructor.timezone: _copy & deepcopy
    • fc914d5 Avoid repeatedly appending to yaml_implicit_resolvers
    • a001f27 Fix for CVE-2020-14343
    • fe15062 Add 3.9 to appveyor file for completeness sake
    • 1e1c7fb Add a newline character to end of pyproject.toml
    • 0b6b7d6 Start sentences and phrases for capital letters
    • c976915 Shell code improvements
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject

    ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject

    When I try to run the program I get following error:


    ValueError Traceback (most recent call last) in 2 import tensorflow as tf 3 ----> 4 from deepswarm.backends import Dataset, TFKerasBackend 5 from deepswarm.deepswarm import DeepSwarm 6

    ~\AppData\Local\Programs\Python\Python38\Scripts\DeepSwarm-master\deepswarm\backends.py in 7 8 from abc import ABC, abstractmethod ----> 9 from sklearn.model_selection import train_test_split 10 from tensorflow.keras import backend as K 11

    c:\users\user\appdata\local\programs\python\python38\lib\site-packages\sklearn_init_.py in 62 else: 63 from . import __check_build ---> 64 from .base import clone 65 from .utils._show_versions import show_versions 66

    c:\users\user\appdata\local\programs\python\python38\lib\site-packages\sklearn\base.py in 12 from scipy import sparse 13 from .externals import six ---> 14 from .utils.fixes import signature 15 from .utils import _IS_32BIT 16 from . import version

    c:\users\user\appdata\local\programs\python\python38\lib\site-packages\sklearn\utils_init_.py in 10 from scipy.sparse import issparse 11 ---> 12 from .murmurhash import murmurhash3_32 13 from .class_weight import compute_class_weight, compute_sample_weight 14 from . import _joblib

    init.pxd in init sklearn.utils.murmurhash()

    ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject

    I use: python 3.8 numpy 1.20.1

    I would appreciate your help.

    opened by Billnewgate327 12
  • 1D Option?

    1D Option?

    Hi, thanks for your awesome tool! It would be great if there were an option to train models for sequences or 1d inputs - will there be a future version with 1D CNN layer options? Thanks!

    Type: Enhancement Status: Accepted Priority: Low 
    opened by collinskatie 1
  • [feature request] Add new evaluate_model function which can return a more generalized metric

    [feature request] Add new evaluate_model function which can return a more generalized metric

    In evaluate_model, the code below can be used to return metrics which can only be computed on all of the data as opposed to averaged by batches as currently done. For simplicity, you can set numThreads and qSize to 1.

    def evaluate(model, generator, steps, numThreads=2, qSize=5):
        numItemsPushed_predict = 0
        dataQueue = queue.Queue(maxsize=qSize)
        mutex = threading.Lock()
    
        def producer(steps):
            nonlocal numItemsPushed_predict
            killMe = False
            while True:
                mutex.acquire()
                if numItemsPushed_predict < steps:
                    numItemsPushed_predict += 1
                else:
                    killMe = True
                myUid = numItemsPushed_predict
                mutex.release()
                if killMe:
                    return
                #
                x, y = generator.next(myUid-1)
                dataQueue.put((x,y,myUid-1))
                #
            #
        #
    
        tVec = []
        for k in range(numThreads):
            t = threading.Thread(target=producer, args=(steps,))
            t.daemon = True
            t.start()
            tVec.append(t)
    
        resultVec = []
        batchSize = None
        pBar = tqdm.tqdm(range(steps), desc='EVALUATE')
        for k in pBar:
            currentQSize = dataQueue.qsize()
            item = dataQueue.get()
            x = item[0]
            y = item[1]
            uid = item[2] # For debug
            if batchSize is None:
                if type(x) is list:
                    batchSize = x[0].shape[0]
                else:
                    batchSize = x.shape[0]
                #
                resultVec = np.zeros(steps)
            r = model.evaluate(x, y, batch_size = batchSize, verbose=0)
            resultVec[k] = r
            #if type(y_pred) is list:
            #    predVec[k*batchSize : (k+1)*batchSize] = y_pred[0].flatten()
            #else:
            #    predVec[k*batchSize : (k+1)*batchSize] = y_pred.flatten()
            pBar.set_description('EVALUATE | QSize: {0}/{1}'.format(currentQSize, qSize))
        #
    
        return resultVec
    

    evaluate_model(self, model) becomes:

    y_true, y_pred = evaluate(model, data -- will have to convert from generator (easy), 1, 1) loss = lossFunction(y_true, y_pred) accuracy can be computed from sklearn.accuracy_score

    You could also support losses like AUC now.

    Type: Enhancement Status: Accepted Priority: Medium 
    opened by isaacgerg 3
Releases(v0.0.9)
Owner
null
Densely Connected Search Space for More Flexible Neural Architecture Search (CVPR2020)

DenseNAS The code of the CVPR2020 paper Densely Connected Search Space for More Flexible Neural Architecture Search. Neural architecture search (NAS)

Jamin Fong 291 Nov 18, 2022
Deep Text Search is an AI-powered multilingual text search and recommendation engine with state-of-the-art transformer-based multilingual text embedding (50+ languages).

Deep Text Search - AI Based Text Search & Recommendation System Deep Text Search is an AI-powered multilingual text search and recommendation engine w

null 19 Sep 29, 2022
Meta graph convolutional neural network-assisted resilient swarm communications

Resilient UAV Swarm Communications with Graph Convolutional Neural Network This repository contains the source codes of Resilient UAV Swarm Communicat

null 62 Dec 6, 2022
A object detecting neural network powered by the yolo architecture and leveraging the PyTorch framework and associated libraries.

Yolo-Powered-Detector A object detecting neural network powered by the yolo architecture and leveraging the PyTorch framework and associated libraries

Luke Wilson 1 Dec 3, 2021
School of Artificial Intelligence at the Nanjing University (NJU)School of Artificial Intelligence at the Nanjing University (NJU)

F-Principle This is an exercise problem of the digital signal processing (DSP) course at School of Artificial Intelligence at the Nanjing University (

Thyrix 5 Nov 23, 2022
DeepHyper: Scalable Asynchronous Neural Architecture and Hyperparameter Search for Deep Neural Networks

What is DeepHyper? DeepHyper is a software package that uses learning, optimization, and parallel computing to automate the design and development of

DeepHyper Team 214 Jan 8, 2023
Model search is a framework that implements AutoML algorithms for model architecture search at scale

Model search (MS) is a framework that implements AutoML algorithms for model architecture search at scale. It aims to help researchers speed up their exploration process for finding the right model architecture for their classification problems (i.e., DNNs with different types of layers).

Google 3.2k Dec 31, 2022
Randstad Artificial Intelligence Challenge (powered by VGEN). Soluzione proposta da Stefano Fiorucci (anakin87) - primo classificato

Randstad Artificial Intelligence Challenge (powered by VGEN) Soluzione proposta da Stefano Fiorucci (anakin87) - primo classificato Struttura director

Stefano Fiorucci 1 Nov 13, 2021
πŸ”₯ Cannlytics-powered artificial intelligence πŸ€–

Cannlytics AI ?? Cannlytics-powered artificial intelligence ?? ??️ Installation ??‍♀️ Quickstart ?? Development ?? Automation ?? Support ??️ License ?

Cannlytics 3 Nov 11, 2022
Building Ellee β€” A GPT-3 and Computer Vision Powered Talking Robotic Teddy Bear With Human Level Conversation Intelligence

Using an object detection and facial recognition system built on MobileNetSSDV2 and Dlib and running on an NVIDIA Jetson Nano, a GPT-3 model, Google Speech Recognition, Amazon Polly and servo motors, I built Elleeβ€Š-β€Ša robotic teddy bear who can move her head and converse naturally.

null 24 Oct 26, 2022
A research toolkit for particle swarm optimization in Python

PySwarms is an extensible research toolkit for particle swarm optimization (PSO) in Python. It is intended for swarm intelligence researchers, practit

Lj Miranda 1k Dec 30, 2022
A 3D Dense mapping backend library of SLAM based on taichi-Lang designed for the aerial swarm.

TaichiSLAM This project is a 3D Dense mapping backend library of SLAM based Taichi-Lang, designed for the aerial swarm. Intro Taichi is an efficient d

XuHao 230 Dec 19, 2022
Implemented fully documented Particle Swarm Optimization algorithm (basic model with few advanced features) using Python programming language

Implemented fully documented Particle Swarm Optimization (PSO) algorithm in Python which includes a basic model along with few advanced features such as updating inertia weight, cognitive, social learning coefficients and maximum velocity of the particle.

null 9 Nov 29, 2022
Racing line optimization algorithm in python that uses Particle Swarm Optimization.

Racing Line Optimization with PSO This repository contains a racing line optimization algorithm in python that uses Particle Swarm Optimization. Requi

Parsa Dahesh 6 Dec 14, 2022
Deep Image Search is an AI-based image search engine that includes deep transfor learning features Extraction and tree-based vectorized search.

Deep Image Search - AI-Based Image Search Engine Deep Image Search is an AI-based image search engine that includes deep transfer learning features Ex

null 139 Jan 1, 2023
[ICLR 2021] "Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective" by Wuyang Chen, Xinyu Gong, Zhangyang Wang

Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective [PDF] Wuyang Chen, Xinyu Gong, Zhangyang Wang In ICLR 2

VITA 156 Nov 28, 2022
[ICLR 2021] HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark

HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark Accepted as a spotlight paper at ICLR 2021. Table of content File structure Prerequi

null 72 Jan 3, 2023
BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search

BossNAS This repository contains PyTorch evaluation code, retraining code and pretrained models of our paper: BossNAS: Exploring Hybrid CNN-transforme

Changlin Li 127 Dec 26, 2022
Deep Multimodal Neural Architecture Search

MMNas: Deep Multimodal Neural Architecture Search This repository corresponds to the PyTorch implementation of the MMnas for visual question answering

Vision and Language Group@ MIL 23 Dec 21, 2022