Multiple Machine Learning Datasets
@kaggle.ericamohadjei_trending_public_datasets
Trending Public Datasets Overview
This repository is a diverse collection of datasets intended for machine learning research and practice. Each dataset is curated to support a different type of machine learning task, including classification, regression, and clustering. The datasets are listed below with their descriptions, sources, and files.
Available Datasets
Iris Dataset
Description: This classic dataset contains measurements for 150 iris flowers from three species, with four features: sepal length, sepal width, petal length, and petal width.
Source: Iris Dataset Source
Files: iris.csv
DHFR Dataset
Description: Contains data for 325 molecules with biological activity against the dihydrofolate reductase (DHFR) enzyme, a target in anti-malarial drug research. It includes 228 molecular descriptors as features.
Source: DHFR Dataset Source
Files: dhfr.csv
Heart Disease Dataset (Cleveland)
Description: Comprises diagnostic measurements from 303 patients tested for heart disease at the Cleveland Clinic. It features 13 clinical attributes.
Source: UCI Machine Learning Repository
Files: heart-disease-cleveland.csv
HCV Data
Description: Data related to Hepatitis C Virus (HCV) progression, with separate files for classification and regression tasks.
Files: HCV_NS5B_Curated.csv, hcv_classification.csv, hcv_regression.arff
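Because hcv_regression.arff is in ARFF format rather than CSV, it cannot be read with a plain CSV reader. The sketch below lists the attribute names and types declared in an ARFF header using only the Python standard library. It is a minimal illustration, not a full ARFF parser: the function name `arff_attributes` is hypothetical, and escaped quotes inside attribute names are not handled.

```python
def arff_attributes(lines):
    """Return [(name, type), ...] declared in the header of ARFF text.

    `lines` is any iterable of text lines, for example an open file.
    Hypothetical helper for illustration; not a complete ARFF parser.
    """
    attributes = []
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and ARFF comments, which start with '%'.
        if not line or line.startswith("%"):
            continue
        pieces = line.split(None, 1)
        keyword = pieces[0].lower()  # ARFF keywords are case-insensitive
        if keyword == "@data":
            break  # the header ends where the data section begins
        if keyword != "@attribute" or len(pieces) < 2:
            continue
        rest = pieces[1].strip()
        if rest[0] in "'\"":
            # Quoted attribute name, which may contain spaces.
            end = rest.index(rest[0], 1)
            name, type_ = rest[1:end], rest[end + 1:].strip()
        else:
            parts = rest.split(None, 1)
            name = parts[0]
            type_ = parts[1].strip() if len(parts) > 1 else ""
        attributes.append((name, type_))
    return attributes


# Usage, assuming the file has been downloaded to the working directory:
#     with open("hcv_regression.arff", encoding="utf-8") as f:
#         for name, type_ in arff_attributes(f):
#             print(name, type_)
```

If SciPy is installed, `scipy.io.arff.loadarff` is an alternative that also loads the data rows.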
NBA Seasons Stats
Description: Player statistics from the NBA 2020 and 2021 seasons for detailed sports analytics.
Files: NBA_2020.csv, NBA_2021.csv
Boston Housing Dataset
Description: Data concerning housing values in the suburbs of Boston, suitable for regression analysis.
Files: BostonHousing.csv, BostonHousing_train.csv, BostonHousing_test.csv
Acetylcholinesterase Inhibitor Bioactivity
Description: Chemical bioactivity data against acetylcholinesterase, a target relevant to Alzheimer's research. It includes raw and processed formats with chemical fingerprints.
Files: acetylcholinesterase_01_bioactivity_data_raw.csv to acetylcholinesterase_07_bioactivity_data_2class_pIC50_pubchem_fp.csv
California Housing Dataset
Description: Data aimed at predicting median house prices in California districts.
Files: california_housing_train.csv, california_housing_test.csv
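Since this dataset ships as separate train and test files, a simple reference point for any regression model is a baseline that always predicts the mean of the training target. The sketch below computes that baseline's root-mean-square error on the test file. The function name `mean_baseline_rmse` is hypothetical, and the target column name `median_house_value` in the usage comment is an assumption; check the CSV header before relying on it.

```python
import csv
import math


def mean_baseline_rmse(train_lines, test_lines, target):
    """RMSE on the test set of a model that always predicts the training mean.

    `train_lines` and `test_lines` are iterables of CSV text lines (for
    example open files) with a header row; `target` is the target column.
    """
    train_y = [float(row[target]) for row in csv.DictReader(train_lines)]
    test_y = [float(row[target]) for row in csv.DictReader(test_lines)]
    if not train_y or not test_y:
        raise ValueError("both files must contain at least one data row")
    mean = sum(train_y) / len(train_y)
    mse = sum((y - mean) ** 2 for y in test_y) / len(test_y)
    return math.sqrt(mse)


# Usage, assuming both files are in the working directory and the target
# column is named "median_house_value" (an assumption, not confirmed here):
#     with open("california_housing_train.csv", newline="") as tr, \
#          open("california_housing_test.csv", newline="") as te:
#         print(mean_baseline_rmse(tr, te, "median_house_value"))
```

A trained model is only worth keeping if its test RMSE is clearly below this baseline.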
Virtual Reality Experiences Data
Description: Data from user experiences with various virtual reality setups to study user engagement and satisfaction.
Files: Virtual Reality Experiences-data.csv
Fast-Food Chains in USA
Description: Overview of fast-food chains operating in the USA, including their locations and popularity.
Files: Fast-Food Chains in USA.csv
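Any of the CSV files listed above can be given a quick sanity check before modeling, for instance confirming that iris.csv really holds 150 rows. The following is a minimal sketch using only the Python standard library; the function name `summarize_csv` is hypothetical, and it assumes the first row of each file is a header.

```python
import csv


def summarize_csv(lines):
    """Return (column_names, row_count) for CSV text with a header row.

    `lines` is any iterable of text lines, for example an open file.
    Blank lines are not counted as data rows.
    """
    reader = csv.reader(lines)
    header = next(reader, [])
    row_count = sum(1 for row in reader if row)
    return header, row_count


# Usage, assuming iris.csv has been downloaded to the working directory:
#     with open("iris.csv", newline="", encoding="utf-8") as f:
#         columns, n_rows = summarize_csv(f)
#     print(columns, n_rows)
```

Passing an open file rather than a path keeps the helper usable with file names that contain spaces, such as "Fast-Food Chains in USA.csv", without any special handling.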
Contributing
We welcome contributions to this dataset repository. If you have a dataset that you believe would be beneficial for the machine learning community, please see our contribution guidelines in CONTRIBUTING.md.
License
This dataset collection is available under the MIT License.