AI and NLP for Advanced Malware Classification & Malware Family Attribution

Botconf 2025
Friday
2025-05-23 | 11:40 – 12:25

Solomon Sonya 🗣

Malware creation and proliferation are on the rise. Generative AI and large language models (LLMs) exacerbate the problem by assisting in malware code creation and automating malware binary development, which accelerates the spread of malicious software. Traditional detection mechanisms, including antivirus software, fail to adequately detect novel and varied malware. Although academia and industry have studied malware classification techniques for decades, challenges such as dataset standardization, sample diversity, and dataset size have limited how well these techniques generalize to updated, real-world datasets. This is a practical, hands-on talk on Artificial Intelligence and Natural Language Processing (NLP) that teaches the audience how to analyze malware using NLP and build AI classifiers for malware detection and malware family attribution. Participants will walk away with new state-of-the-art AI models for analyzing malware with NLP, starting from a corpus of malicious binaries and ending with analysis from our AI models. More importantly, participants will learn how to adapt these frameworks to any domain in cybersecurity. Many people like to say they “use AI” without truly knowing what is going on. This talk will teach and demonstrate how to code and train these AI models and apply them to solve real-world problems.
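
To make the idea of treating binaries as text concrete, here is a minimal sketch of a corpus-to-attribution pipeline. It is not the speaker's method: the abstract does not name the models or features used, so this illustration swaps in byte n-gram counts and a nearest-centroid classifier with cosine similarity. All function names (`byte_ngrams`, `train_centroids`, `attribute`), the family labels, and the toy byte strings are hypothetical stand-ins, not real malware samples.

```python
import math
from collections import Counter


def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Treat a binary as 'text': count overlapping byte n-grams as tokens."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(samples):
    """Sum n-gram counts per family label (a very simple 'model').

    samples: iterable of (family_label, raw_bytes) pairs.
    """
    centroids = {}
    for label, data in samples:
        centroids.setdefault(label, Counter()).update(byte_ngrams(data))
    return centroids


def attribute(data: bytes, centroids) -> str:
    """Return the family whose centroid is most similar to the sample."""
    grams = byte_ngrams(data)
    return max(centroids, key=lambda label: cosine(grams, centroids[label]))


# Hypothetical toy corpus: short byte strings standing in for binaries.
toy_corpus = [
    ("family_a", b"\x90\x90\x90\x90\xeb\xfe\x90\x90"),
    ("family_a", b"\x90\x90\xeb\xfe\x90\x90\x90\x90"),
    ("family_b", b"\x55\x8b\xec\x55\x8b\xec\x5d\xc3"),
    ("family_b", b"\x55\x8b\xec\x5d\xc3\x55\x8b\xec"),
]

centroids = train_centroids(toy_corpus)
predicted = attribute(b"\x90\x90\x90\xeb\xfe\x90", centroids)
print(predicted)
```

A real pipeline would likely replace the raw counts with learned embeddings or a trained neural classifier and would need a large, labeled corpus; the sketch only shows the overall shape of tokenizing binaries, building per-family representations, and attributing a new sample.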