Baselight

Domain Generation Algorithm

Domain Generation Algorithm dataset

@kaggle.slashtea_domain_generation_algorithm

About this Dataset

This dataset was collected from the Alexa website ranking and a blacklist of previously observed DGA domain names. Both sources are available in the provenance section.

The purpose is to build a classifier that can help detect a machine potentially infected by DGA (Domain Generation Algorithm) malware.

Infected machines typically generate a large number of random-looking domain names, one of which points to an active command-and-control (C&C) server.
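
To make the idea concrete, here is a minimal toy sketch of how such an algorithm might derive domain names from a shared seed and the current date. It is not taken from any real malware family or from this dataset; the function name toy_dga, the seed string, the hashing scheme, and the ".com" suffix are all assumptions chosen for illustration.

```python
import hashlib
from datetime import date


def toy_dga(seed: str, day: date, count: int = 5, tld: str = ".com") -> list[str]:
    """Return `count` pseudo-random domain names derived from a seed and a date.

    Hypothetical illustration only: anyone who knows the seed and the date
    can regenerate the same list, which is how an operator could know in
    advance which name to register.
    """
    domains = []
    for i in range(count):
        # Mix the seed, the date, and a counter into one hash input.
        material = f"{seed}|{day.isoformat()}|{i}".encode("utf-8")
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits onto the letters a-p to form a label.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:12])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    for name in toy_dga("demo-seed", date(2024, 1, 1)):
        print(name)
```

Because the output depends only on the seed and the date, the same list is produced on every run for a given day, and a different list is produced the next day.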

The image above depicts the overall approach of how a DGA works. Our goal is therefore to build a binary (binomial) classifier that can differentiate randomly generated domain names from legitimate ones.
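
One possible starting point for such a classifier is a handful of lexical features computed from each domain name, since randomly generated names often look different from human-chosen ones. The sketch below is only an assumption about what those features could be; the function names shannon_entropy and domain_features and the chosen feature set are hypothetical and are not part of the dataset.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    # Sum p * log2(1/p) over the observed characters.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def domain_features(domain: str) -> dict[str, float]:
    """Compute simple lexical features for one domain name.

    Assumption: everything before the final dot is treated as the label,
    so subdomains and multi-part suffixes are not handled specially.
    """
    label = domain.lower().rsplit(".", 1)[0]
    length = len(label)
    digits = sum(ch.isdigit() for ch in label)
    vowels = sum(ch in "aeiou" for ch in label)
    return {
        "length": float(length),
        "entropy": shannon_entropy(label),
        "digit_ratio": digits / length if length else 0.0,
        "vowel_ratio": vowels / length if length else 0.0,
    }


if __name__ == "__main__":
    for name in ("google.com", "xkqzjvbnwpltd.com"):
        print(name, domain_features(name))
```

Feature vectors like these could then be passed to any standard binary classifier, with the legitimate and DGA labels from the dataset as the target.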
