Malware-Related Sentence Classification
This repo contains the code for the ICTAI 2021 paper "Enrichment of Features for Malware-Related Sentence Classification using External Knowledge".
Installation
Install from source. Using a Python virtual environment or a Conda environment is recommended.
git clone https://github.com/chaumng/malware_related_sentence_classification.git
cd malware_related_sentence_classification
pip install -r requirements.txt
This repo has been tested on Python 3.7.
Classification and Evaluation
Preprocess data
python preprocess_data.py
Parameter searching: Classify and evaluate
The GAT weak labels are already provided in a file in this repo. To perform the parameter search, run the following command. By default, it performs the second grid search. You can change the value of the argument param_grid_setting to "first_grid_search" to perform the first grid search, or to "best_setting" to run only the best setting.
python svm_param_search.py --param_grid_setting second_grid_search
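As an illustration of how an argument such as param_grid_setting can select between grids, here is a minimal sketch. It is not the code in svm_param_search.py: the PARAM_GRIDS dictionary, the hyperparameter names (C, gamma), their values, and the helper functions build_parser and expand_grid are all assumptions made for this example. Only the three setting names and the default come from the description above.

```python
import argparse
from itertools import product

# Hypothetical grids for illustration only; the hyperparameters and values
# actually searched by svm_param_search.py may differ.
PARAM_GRIDS = {
    "first_grid_search": {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1]},
    "second_grid_search": {"C": [5, 10, 20], "gamma": [0.005, 0.01, 0.02]},
    "best_setting": {"C": [10], "gamma": [0.01]},
}


def build_parser():
    """Build a parser exposing --param_grid_setting with the three documented values."""
    parser = argparse.ArgumentParser(description="Select a parameter grid (sketch).")
    parser.add_argument(
        "--param_grid_setting",
        choices=sorted(PARAM_GRIDS),
        default="second_grid_search",  # the README states this is the default
    )
    return parser


def expand_grid(grid):
    """Return every combination of values in the grid as a list of dicts."""
    names = sorted(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]


if __name__ == "__main__":
    args = build_parser().parse_args()
    # Each printed dict is one candidate setting to train and evaluate.
    for params in expand_grid(PARAM_GRIDS[args.param_grid_setting]):
        print(params)
```

In this sketch, "best_setting" is simply a grid with a single combination, so the same search loop handles all three modes without special cases.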
Citation
If you find this paper or this code useful, please cite this paper:
@inproceedings{chaunguyen_et_al_2021,
title={Enrichment of Features for Malware-Related Sentence Classification using External Knowledge},
author={Nguyen, Chau and Tran, Vu and Nguyen, Le Minh},
booktitle={Proceedings of the 33rd IEEE International Conference on Tools with Artificial Intelligence (ICTAI)},
year={2021},
organization={IEEE},
}