Malware Authorship Attribution Model using Runtime Modules based on Automated Analysis

Sangwoo Lee - Department of Convergence Information Security, Graduate School, Jeju National University, 102 Jejudaehak ro, Jeju, Republic of Korea
Jungwon Cho - Department of Computer Education, Jeju National University, 102 Jejudaehak ro, Jeju, Republic of Korea


DOI: http://dx.doi.org/10.30630/joiv.6.1-2.941

Abstract


Malware authorship attribution is a research field that identifies the author of malware by extracting and analyzing author-related features from the malware's source code or binary code. It is currently used as a detection technique in malware forensics and in identifying patterns of sustained attacks such as APT attacks. There are two established ways to analyze malware for author identification: source code-based analysis, which extracts features from the source code, and binary-based analysis, which extracts features from the binary. However, as malicious code becomes more modular and its volume keeps growing, neither method leaves enough time or manpower to characterize each malware sample. We therefore propose a model for malware authorship attribution that rapidly extracts and analyzes features using automated analysis. Automated analysis relies on a tool that takes a malware file or its hash value as input and requires no expert involvement. It is also the fastest of the malware analysis methods. In our experiments, we applied various machine learning classification algorithms to six malware author groups, using Runtime Modules and Kernel32.dll API calls extracted by the automated analysis as features for author identification. The results show higher accuracy than previous studies. In addition, automated analysis extracts malware features faster than source code-based and binary-based analysis methods.
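The abstract does not say how the extracted features are encoded or which classifier performed best, so the following is only a minimal sketch of the general idea: each sample is treated as a set of runtime module and Kernel32.dll API names, and an unknown sample is assigned to the author group of its most similar labeled samples. The k-nearest-neighbor vote with Jaccard similarity, the function names `jaccard` and `predict_author`, the group labels, and the toy feature sets are all assumptions made for illustration, not the authors' implementation or data.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two sets of feature names."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty feature sets are treated as identical
    return len(a & b) / len(a | b)


def predict_author(sample, labeled, k=3):
    """Majority vote among the k labeled samples most similar to `sample`.

    `labeled` is a list of (feature_set, author_group) pairs.
    """
    ranked = sorted(labeled, key=lambda item: jaccard(sample, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: module and API names as they might appear in an
# automated analysis report. Real reports would contain far more entries.
labeled = [
    ({"kernel32.dll", "ws2_32.dll", "CreateFileW", "WriteFile"}, "group_a"),
    ({"kernel32.dll", "ws2_32.dll", "CreateFileW", "ReadFile"}, "group_a"),
    ({"kernel32.dll", "advapi32.dll", "VirtualAlloc", "CreateRemoteThread"}, "group_b"),
    ({"kernel32.dll", "advapi32.dll", "VirtualAlloc", "WriteProcessMemory"}, "group_b"),
]

unknown = {"kernel32.dll", "advapi32.dll", "VirtualAlloc", "LoadLibraryA"}
print(predict_author(unknown, labeled, k=3))
```

Set-based similarity is used here because module and API lists are naturally presence/absence features. In practice the same sets could be one-hot encoded and passed to other classifiers such as SVMs, decision trees, or ensembles.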

Keywords


Malware authorship attribution; automated analysis; runtime modules; machine learning classification.

