How many Mirai variants are there?

Botconf 2018
Friday
2023-04-25 | 15:30 – 16:00

Wenji Qu 🗣 | Hui Wang 🗣

Mirai was open-sourced soon after overwhelming several high-profile targets, including KrebsOnSecurity, OVH, and Dyn, in autumn 2016, which led to a proliferation of Mirai variants over the past two years. To fight Mirai botnets more effectively, good variant classification schemes are necessary. Currently, Mirai variants are usually classified by their branch names (e.g., JOSHO, OWARI, MASUTA), which come from a “/bin/busybox ” command line found in the Mirai sample. While the default name is “MIRAI”, it was usually replaced with a name of the author's choosing (e.g., MASUTA, SATORI, SORA) in later variants.
However, we think a branch-based classification scheme is too coarse-grained to reveal: 1) the variations within a single variant across different stages, and 2) the connections among different branches. In this talk, we present our classification schemes, derived from 32K+ collected samples and 1,000+ extracted CNCs. Our schemes are mainly based on configurations, supported attack methods, and credential dictionaries, all of which are extracted from the samples. For example, we classified Mirai samples into 106 variants based on the combination of supported attack methods. We also connected multiple branches based on the keys used in configuration encryption. To summarize, the content of this talk is as follows:
1) We will demonstrate the idea of automatically extracting configurations, supported attack methods, and credential dictionaries from samples for classification purposes.
2) We will propose a fingerprint technique to recognize Mirai attack methods (e.g., syn_flood, http_flood) using information extracted from samples, without reverse-engineering work.
3) We will introduce a set of classification schemes based on the extracted data, and will investigate popular Mirai branches with the proposed schemes.
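
To make the two grouping ideas above concrete, here is a minimal sketch, not the authors' tooling. It assumes the per-sample features have already been extracted. The `Sample` record, the `group_by` helper, the sample identifiers, and the key values are all hypothetical and invented for illustration; only the branch names and the attack method names come from the abstract.

```python
from collections import defaultdict
from typing import Callable, Hashable, NamedTuple


class Sample(NamedTuple):
    """Hypothetical record of features already extracted from one sample."""
    sample_id: str        # made-up identifier, not a real hash
    arch: str             # CPU architecture the binary was built for
    branch: str           # branch name from the busybox command line
    attacks: frozenset    # names of the supported attack methods
    config_key: int       # key used for configuration encryption (made up)


def group_by(samples: list, feature: Callable[[Sample], Hashable]) -> dict:
    """Group samples by an arbitrary hashable feature."""
    groups = defaultdict(list)
    for sample in samples:
        groups[feature(sample)].append(sample)
    return dict(groups)


# Illustrative data only; none of these values are taken from real samples.
samples = [
    Sample("a1", "arm", "MASUTA", frozenset({"syn_flood", "http_flood"}), 0x11),
    Sample("b2", "mips", "MASUTA", frozenset({"http_flood", "syn_flood"}), 0x11),
    Sample("c3", "x86", "SORA", frozenset({"syn_flood"}), 0x11),
    Sample("d4", "arm", "OWARI", frozenset({"syn_flood"}), 0x22),
]

# Scheme 1: one variant per distinct combination of supported attack methods.
by_attacks = group_by(samples, lambda s: s.attacks)

# Scheme 2: branches that share a configuration key are treated as connected.
by_key = group_by(samples, lambda s: s.config_key)
branches_per_key = {
    key: sorted({s.branch for s in group}) for key, group in by_key.items()
}
```

Because the grouping key is a `frozenset` of method names rather than anything taken from the machine code, the ARM and MIPS builds in this toy data fall into the same group, while `branches_per_key` shows which branch names share a key.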

It’s worth mentioning that since the data used is independent of the processor architecture (e.g., x86, x64, ARM, MIPS, SPARC, PowerPC), our schemes can classify samples of the same variant even if they are built for different CPU architectures.
