Predicting Antibiotic Resistance Genes
Dataset Description
This dataset has been preprocessed for machine learning training. First, the class imbalance was addressed by applying CTGAN. Feature selection techniques and PCA were then applied for dimensionality reduction. The features are unitigs, which are short segments of DNA, and the dataset records whether each unitig is present or absent in each sample. The presence or absence pattern across thousands of unitigs is used to predict whether a sample is resistant or susceptible to the antibiotic.
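As a rough illustration of the presence/absence encoding described above, the sketch below builds a binary matrix from toy data. The sample IDs, unitig sequences, labels, and the `build_presence_matrix` helper are all hypothetical and are not taken from the dataset itself; the real files have also gone through CTGAN, feature selection, and PCA, so their columns may look different.

```python
# Hypothetical toy data: each sample ID maps to the set of unitigs detected in it.
samples = {
    "sample_1": {"ACGTTGCA", "TTGACCGT"},
    "sample_2": {"ACGTTGCA"},
    "sample_3": {"GGCATCGA", "TTGACCGT"},
}

# Hypothetical labels: 1 = resistant, 0 = susceptible.
labels = {"sample_1": 1, "sample_2": 0, "sample_3": 1}


def build_presence_matrix(samples):
    """Return (sample_ids, columns, matrix) where matrix[i][j] is 1 if
    unitig columns[j] is present in sample sample_ids[i], else 0."""
    # One column per distinct unitig, in a stable sorted order.
    columns = sorted(set().union(*samples.values()))
    sample_ids = sorted(samples)
    matrix = [
        [1 if unitig in samples[sample_id] else 0 for unitig in columns]
        for sample_id in sample_ids
    ]
    return sample_ids, columns, matrix


sample_ids, columns, matrix = build_presence_matrix(samples)
y = [labels[sample_id] for sample_id in sample_ids]

for sample_id, row, label in zip(sample_ids, matrix, y):
    print(sample_id, row, label)
```

Each row is one sample and each column is one unitig, so a classifier trained on such a matrix learns which presence/absence patterns tend to go with resistance.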
Related Datasets
- Antibiotic Dataset (@kaggle)
- Dhds Dataset (@cdc)