This dataset was collected from two sources: the Alexa website ranking and a blacklist of previously observed DGA domain names. Both sources are available in the provenance section.
The purpose is to build a classifier that helps detect machines potentially infected by DGA (Domain Generation Algorithm) malware.
Infected machines typically generate a large number of random-looking domain names, one of which points to an active C&C (command and control) server.
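To make the idea of algorithmically generated domain names concrete, here is a minimal toy sketch. It does not reproduce any real malware family; the function name `toy_dga`, the date-based seed, and the SHA-256 hashing scheme are all assumptions chosen purely for illustration.

```python
import hashlib
from datetime import date


def toy_dga(seed_date: date, count: int = 5, length: int = 12, tld: str = ".com") -> list:
    """Derive `count` pseudo-random domain names from a date seed (toy example only)."""
    domains = []
    for i in range(count):
        # Hash the seed together with a counter so every domain differs.
        digest = hashlib.sha256(f"{seed_date.isoformat()}-{i}".encode()).hexdigest()
        # Map each hex digit (0-15) to a letter 'a'..'p' to form the label.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:length])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    for name in toy_dga(date(2020, 1, 1)):
        print(name)
```

Because the output depends only on the seed, an infected machine and its operator could in principle compute the same list independently, which is why such names look random yet remain predictable to whoever knows the algorithm.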
The image above depicts the overall approach of how a DGA works. Our goal is therefore to build a binary classifier that can differentiate randomly generated domain names from legitimate ones.
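As a hedged starting point for such a classifier, the sketch below computes a few simple lexical features from a domain name. The feature set (label length, Shannon entropy, digit ratio) and the helper names `shannon_entropy` and `domain_features` are assumptions for illustration; they are not part of the dataset and are not claimed to be the best features for this task.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy (bits per character) of the character distribution in `text`."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    # Sum p * log2(1/p) over the distinct characters.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def domain_features(domain: str) -> dict:
    """Extract simple lexical features from the first label of a domain name."""
    label = domain.lower().split(".")[0]
    digits = sum(ch.isdigit() for ch in label)
    return {
        "length": len(label),
        "entropy": shannon_entropy(label),
        "digit_ratio": digits / len(label) if label else 0.0,
    }


if __name__ == "__main__":
    for name in ["google.com", "xjw3kq9zpl1v.net"]:
        print(name, domain_features(name))
```

Features like these could be fed to any standard binary classifier; random-looking labels tend to be longer and to have higher character entropy than common legitimate names, although this is a tendency rather than a rule.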