Automated identification of malicious code variants
Date of Award
Honors Thesis (Colby Access Only)
Colby College. Computer Science Dept.
Malicious code is one of the most dynamic threats to computers and computer networks. Authors constantly modify their malicious code to fix bugs, add new features, and evade detection. Some families have over fifty variants in the wild. When new variants are discovered, correctly identifying the malicious code's family can be a time-consuming, manual process for security researchers. This project's goal was to create a system that automates family identification. The system built for this project uses run-time analysis to record the API calls that a malicious Win32 binary makes. These calls are then compared to data collected from other malicious code. If the new malicious code is found to be similar enough to any other malicious code, the two are considered variants of the same family. In testing, the system identified the correct family for about 82% of the malicious programs in the dataset. It was also able to provide explanations in cases where different antivirus scanners did not agree on the family of a piece of malicious code.
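The comparison step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure or threshold the thesis uses, so the Jaccard measure over sets of API call names, the `threshold` value, and the names `jaccard` and `identify_family` are all assumptions chosen for illustration, not the author's method.

```python
from typing import Dict, Optional, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets of API call names (assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def identify_family(
    sample_calls: Set[str],
    known: Dict[str, Set[str]],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the thesis
) -> Tuple[Optional[str], float]:
    """Return the best-matching family and its score, or (None, score)
    when no known family is similar enough to the new sample."""
    best_family: Optional[str] = None
    best_score = 0.0
    for family, calls in known.items():
        score = jaccard(sample_calls, calls)
        if score > best_score:
            best_family, best_score = family, score
    if best_score >= threshold:
        return best_family, best_score
    return None, best_score


# Hypothetical API call profiles recorded at run time for two known families.
known_profiles = {
    "FamilyA": {"CreateFileA", "WriteFile", "RegSetValueExA", "connect"},
    "FamilyB": {"FindFirstFileA", "CopyFileA", "send"},
}
new_sample = {"CreateFileA", "WriteFile", "RegSetValueExA", "send"}

family, score = identify_family(new_sample, known_profiles)
print(family, round(score, 2))
```

Here the new sample shares three of five distinct calls with the hypothetical FamilyA, so it is labeled a variant of that family. With a stricter cutoff, such as `threshold=0.7`, it would be left unclassified.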
Computer viruses, Computer security
Recommended Citation
Ries, Christopher, "Automated identification of malicious code variants" (2005). Honors Theses. Paper 427.
Colby College theses are protected by copyright. They may be viewed or downloaded from this site for the purposes of research and scholarship. Reproduction or distribution for commercial purposes is prohibited without written permission of the author.