LeafSnap replicated using deep neural networks to test accuracy compared to traditional computer vision methods.

Sujith Vishwajith

Last update: Nov 27, 2022

Related tags

Deep Learning Deep-Leafsnap

Overview

Deep-Leafsnap

Convolutional Neural Networks (CNNs) have recently become popular for image tasks such as image classification, largely due to Krizhevsky et al. and their famous paper ImageNet Classification with Deep Convolutional Neural Networks. Well-known models such as AlexNet, VGG-16, and ResNet-50 have achieved state-of-the-art results on image classification datasets such as ImageNet and CIFAR-10.

We present an application of CNNs to the task of classifying trees by images of their leaves, specifically the 185 tree species of the Northeastern United States covered by the Leafsnap dataset. This task is difficult for traditional computer vision methods because of the high number of classes, inconsistency between images, and strong visual similarity between leaves.

Kumar et al. developed an automatic visual recognition algorithm in their 2012 paper Leafsnap: A Computer Vision System for Automatic Plant Species Identification in an attempt to solve this problem.

Our model is based on VGG-16, modified to work with 64x64 inputs. We achieved state-of-the-art results at the time. Our deep learning approach improves top-1 prediction accuracy from 70.8% to 86.2% and top-5 prediction accuracy from 96.8% to 98.4%.

Model	Top-1 Accuracy	Top-5 Accuracy
Leafsnap	70.8%	96.8%
Deep-Leafsnap	86.2%	98.4%
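
For readers unfamiliar with the two metrics in the table: top-k accuracy counts a prediction as correct when the true class is among the k highest-scoring classes. The following is a minimal pure-Python sketch of that definition, not the evaluation code used in this repo; the function name and the toy scores are hypothetical.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores in this row.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy scores for 3 samples over 4 hypothetical classes.
scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1: counted as a hit for k = 1
    [0.5, 0.1, 0.3, 0.1],  # true class 2: miss for k = 1, hit for k = 2
    [0.4, 0.3, 0.2, 0.1],  # true class 3: only a hit when k = 4
]
labels = [1, 2, 3]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(f"top-1: {top1:.3f}, top-2: {top2:.3f}")
```

With 185 classes, top-5 accuracy is a much more forgiving metric than top-1, which is consistent with the smaller gap between the two methods in the second column.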

We noticed that our model consistently failed to recognize specific classes of trees, which lowered our overall accuracy. This is primarily because those trees have very small leaves, which are hard to preprocess and crop. Our training images were also resized to 64x64 due to limited computational resources. We plan to further improve our data preprocessing and increase our image size to 224x224 in order to exceed 90% top-1 prediction accuracy.
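
To give a rough sense of the cost difference between 64x64 and 224x224 inputs, the sketch below computes the number of input pixels and the size of the flattened feature vector that reaches the first fully connected layer. It assumes the standard VGG-16 layout (five 2x2 max-pooling stages with stride 2, and 512 channels in the last convolutional block); the modified network in vgg.py may differ, and the function name is hypothetical.

```python
def vgg16_flattened_features(input_size, num_pools=5, final_channels=512):
    """Size of the flattened feature map after the convolutional part of VGG-16.

    Assumes padded 3x3 convolutions (spatial size unchanged) and that each of
    the num_pools max-pooling stages halves the spatial side length.
    """
    side = input_size
    for _ in range(num_pools):
        side //= 2
    return side * side * final_channels


small = vgg16_flattened_features(64)    # 2 * 2 * 512
large = vgg16_flattened_features(224)   # 7 * 7 * 512
pixel_ratio = (224 * 224) / (64 * 64)   # how many more pixels per image

print(small, large, pixel_ratio)
```

Under these assumptions a 224x224 image has 12.25 times as many pixels as a 64x64 one, and the convolutional layers do correspondingly more work per image.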

The following goes over the code and how to set it up on your own machine.

Files

model.py trains a convolutional neural network on the dataset.
vgg.py PyTorch model code for VGG-16.
densenet.py PyTorch model code for DenseNet-121.
resnet.py PyTorch model code for ResNet.
dataset.py creates a new train/test dataset by cropping the leaf and augmenting the data.
utils.py helps do some of the hardcore image processing in dataset.py.
averagemeter.py helper class which keeps track of a bunch of averages when training.
leafsnap-dataset-images.csv is the CSV file corresponding to the dataset.
requirements.txt contains the pip requirements to run the code.
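
As an illustration of the kind of helper averagemeter.py is described as providing, here is a minimal sketch of a running-average tracker in the style commonly used in PyTorch training scripts. This is an assumption about its general shape, not a copy of the file in this repo.

```python
class AverageMeter:
    """Tracks the latest value and the running average of a metric."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.val = 0.0    # most recent value
        self.sum = 0.0    # weighted sum of all values seen
        self.count = 0    # total weight (e.g. number of samples)
        self.avg = 0.0    # running average

    def update(self, val, n=1):
        # n is the number of samples the value was averaged over (batch size).
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count


# Example: average loss over two batches of 32 samples each.
losses = AverageMeter()
losses.update(0.9, n=32)
losses.update(0.5, n=32)
print(f"latest: {losses.val}, average: {losses.avg:.2f}")
```

Weighting each update by the batch size keeps the average correct when the last batch of an epoch is smaller than the others.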

Installation

To run the models and code, make sure you have Python installed.

Install PyTorch by following the directions here.

Clone the repo onto your local machine and cd into the directory.

git clone https://github.com/sujithv28/Deep-Leafsnap.git
cd Deep-Leafsnap

Install all the Python dependencies:

pip install -r requirements.txt

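Make sure scikit-learn (imported as sklearn) is updated to the latest version. The package name on PyPI is scikit-learn; the sklearn name there is deprecated.

pip install --upgrade scikit-learn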

Also make sure you have OpenCV installed, either through pip or Homebrew. You can check that it works by running the following and making sure no error is raised:

python
import cv2
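
The same check can be extended to the other dependencies mentioned above without actually importing them. This is a small sketch using only the standard library; the list of module names is an assumption based on the packages named in this README, not the contents of requirements.txt.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on this system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list: PyTorch, OpenCV, and scikit-learn are imported under these names.
required = ["torch", "cv2", "sklearn"]
missing = missing_modules(required)

if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```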

Download Leafsnap's image data and extract it into the main directory by running the following commands from inside the repository directory. The original data can be found here.

wget -O data.zip "https://www.dropbox.com/s/dp3sk8wpiu9yszg/data.zip?dl=0"
unzip -a data.zip
rm data.zip

Create the Training and Testing Data

To create the dataset, run

python dataset.py

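This cleans the dataset by cropping only the necessary portions of the images containing the leaves and resizes them to 64x64. If you want to change the image size, go to utils.py and change img = misc.imresize(img, (64,64)) to any size you want. Note that misc.imresize appears to be SciPy's scipy.misc.imresize, which was removed in SciPy 1.3; if so, this code needs an older SciPy release.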
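
dataset.py is also described as creating the train/test split. How it does so is not shown here; the sample counts reported in the issue log further down (23598 training, 5900 testing) are consistent with a roughly 80/20 split. The following is a minimal sketch of a shuffled hold-out split under that assumption. The function name, the seed, and the file names are hypothetical.

```python
import random


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


# Hypothetical image paths standing in for the rows of the dataset CSV.
rows = [f"image_{i}.jpg" for i in range(100)]
train, test = split_rows(rows)
print(len(train), len(test))
```

Fixing the seed keeps the split identical between runs, so accuracy figures from different training runs stay comparable.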

Training Model

To train the model, run

python model.py

Comments

Add MobileNet with new results

Hi. I added MobileNet to the current implementations. It runs on the 224x224 images rather than the resized pictures. It does pretty well. I also uploaded a folder where you can load the trained weights.

opened by Eric-Wallace 0

cv2 TypeError: src data type = 0 is not supported

Hi, I got this error in the first step, 'python dataset.py':

[INFO]  Training Samples : 23598
	Testing Samples  :  5900
[INFO] Processing Images
Height: 469, Width:  700

Traceback (most recent call last):
  File "dataset.py", line 114, in <module>
    csv_name='leafsnap-dataset-train-images.csv', augment=True)
  File "dataset.py", line 80, in save_images
    image_to_write = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

The source code and dataset are the ones from your docs. Please reply here if you have an answer.

opened by alexwang3322 1

