Pytorch implementation of MalConv

Overview

MalConv-Pytorch


Description

This is an implementation of MalConv, the model proposed in "Malware Detection by Eating a Whole EXE".

Dependencies

Please make sure each of the following is installed, at the listed version where one is given:

  • numpy
  • pytorch (0.3.0.post4)
  • pandas (0.20.3)

Setup

Preparing data

For the training data, place the PE files under data/train/ and build a label table for the training set, with each row in the form

    <File Name>, <Label>

where a label of 1 denotes malware. Prepare the validation set in the same way.
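
As a minimal sketch of the label table described above, the following Python writes and reads such a file with the standard csv module. The helper names and the sample file names are hypothetical, and the use of 0 for benign files is an assumption (the README only states that 1 means malware). The repository's own loader may parse the table differently.

```python
import csv
import io

def write_label_table(rows, handle):
    """Write (file_name, label) pairs, one per row, with no header line."""
    writer = csv.writer(handle, lineterminator="\n")
    for file_name, label in rows:
        # Assumption: 0 marks a benign file; the README only defines 1 = malware.
        if label not in (0, 1):
            raise ValueError(f"label must be 0 or 1, got {label!r}")
        writer.writerow([file_name, label])

def read_label_table(handle):
    """Return a dict mapping file name -> integer label (1 = malware)."""
    table = {}
    for row in csv.reader(handle, skipinitialspace=True):
        if not row:  # skip blank lines
            continue
        file_name, label = row
        table[file_name.strip()] = int(label)
    return table

# Hypothetical sample: two files that would live under data/train/.
sample = [("benign_tool.exe", 0), ("Suspicious_Sample.exe", 1)]
buffer = io.StringIO()
write_label_table(sample, buffer)
buffer.seek(0)
labels = read_label_table(buffer)
```

Reading the table back into a dictionary makes it easy to check that every listed file name actually exists under data/train/ before training starts.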

Training

Run the following command to start training:

    python3 train.py <config_file_path> <random_seed>
    Example : python3 train.py config/example.yaml 123

Training Log & Checkpoint

The log file, predictions on the validation set, and model checkpoints are stored at the path specified in the config file.

Parameters & Model Options

For the available parameters and options, please refer to config/example.yaml.

Comments
  • train.py is not working

    Hey together,

    I ran

    (myenv) mnoppel@srv:~/projects/MalConv-Pytorch$ python3 train.py config/example.yaml 123
    Usage: python3 run_exp.py <config file path> <seed>
    

    as explained in the README.md.

    Any idea what might be wrong?

    Kind regards, Max

    opened by noppelmax 1
  • dataloader issue due to reading files with a name that is composed of mixed lower and upper case characters

    Hi

    When I run the training code, the dataloader reads file names from the label file (file name and label, as instructed). However, it fails to read any file whose name mixes lower-case and upper-case characters, and I am not sure why. When I rename such a file to all lower case or all upper case, it is read and loaded correctly. I cannot rename every file, though, since thousands of them have mixed-case names.

    How can I modify your code to read files whose names mix upper-case and lower-case characters?

    Thanks

    opened by vietvo89 0
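
The cause of this problem is not established in the thread. If the names in the label table differ from the names on disk only by letter case, one possible workaround is to resolve each listed name against the directory contents without regard to case. The sketch below uses only the standard library; the function names are hypothetical and are not part of the repository.

```python
import os
import tempfile

def build_case_insensitive_index(directory):
    """Map each lower-cased file name in `directory` to its real on-disk name."""
    index = {}
    for real_name in os.listdir(directory):
        # Keep the first match if two names differ only by case.
        index.setdefault(real_name.lower(), real_name)
    return index

def resolve_path(directory, listed_name, index):
    """Return the on-disk path for a name from the label table, ignoring case."""
    real_name = index.get(listed_name.strip().lower())
    if real_name is None:
        raise FileNotFoundError(f"{listed_name!r} not found in {directory!r}")
    return os.path.join(directory, real_name)

# Small demonstration with a temporary directory standing in for data/train/.
with tempfile.TemporaryDirectory() as train_dir:
    with open(os.path.join(train_dir, "MiXedCase.exe"), "wb") as handle:
        handle.write(b"MZ")
    index = build_case_insensitive_index(train_dir)
    resolved = resolve_path(train_dir, "mixedcase.EXE", index)
    resolved_name = os.path.basename(resolved)
```

Building the index once per directory avoids a directory scan for every sample, which matters when thousands of files are involved.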
  • loss.cpu().data.numpy() is not a list

    It might be because of the newest version of PyTorch or something else, but loss.cpu().data.numpy() is no longer a list, so the code that treats it as one should be changed.

    Before:

    https://github.com/Alexander-H-Liu/MalConv-Pytorch/blob/939cb59ff1338dbc2b71a186f03bbc677b05c408/train.py#L151

    https://github.com/Alexander-H-Liu/MalConv-Pytorch/blob/939cb59ff1338dbc2b71a186f03bbc677b05c408/train.py#L183

    Now

    history['tr_loss'].append(loss.cpu().data.numpy())
    
    history['val_loss'].append(loss.cpu().data.numpy())
    
    opened by londumas 1
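
For code that has to run across versions, one option is a small helper that accepts either form of the loss value. This is a sketch under the assumption that the value is either scalar-like or a one-element sequence; the name loss_to_float is hypothetical and not part of the repository. In PyTorch 0.4 and later, loss.item() returns the Python number directly and may be the simpler choice.

```python
def loss_to_float(value):
    """Convert a scalar-like or one-element loss value to a plain Python float."""
    try:
        # Works for Python numbers and for 0-dimensional arrays.
        return float(value)
    except TypeError:
        # Older code paths may hand back a one-element sequence instead.
        return float(value[0])

history = {"tr_loss": [], "val_loss": []}
history["tr_loss"].append(loss_to_float(0.693))   # scalar-like value
history["val_loss"].append(loss_to_float([0.5]))  # one-element sequence
```

Storing plain floats in the history also keeps the log serializable without extra conversion.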
  • Deprecated call to yaml.load

    The current yaml.load call triggers a deprecation warning:

    train.py:21: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
      conf = yaml.load(open(config_path,'r'))
    

    https://github.com/Alexander-H-Liu/MalConv-Pytorch/blob/939cb59ff1338dbc2b71a186f03bbc677b05c408/train.py#L21

    The fix is given at https://github.com/yaml/pyyaml/wiki/PyYAML-yaml.load(input)-Deprecation, and the line above should be replaced by the following:

    conf = yaml.load(open(config_path,'r'), Loader=yaml.FullLoader)
    
    opened by londumas 0
  • requirements.txt is missing

    Thank you for the work. One simple thing that is missing is a requirements.txt that lists all necessary Python packages. In my experience running your package, and from a simple grep "import" -r *, here is what is required:

    pyyaml
    torch
    numpy
    pandas
    
    opened by londumas 0
Owner
Alexander H. Liu