Baselight

Benchmarks Datasets For Cluster Analysis

25 simulated datasets generated by either Gaussian or Uniform distributions

@kaggle.onthada_benchmarks_datasets_for_clustering

About this Dataset

25 Artificial Datasets

The datasets are generated using either Gaussian or Uniform distributions. Each dataset contains several known sub-groups intended for testing centroid-based clustering results and cluster validity indices.
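
As a rough illustration of what such a dataset looks like, the sketch below draws labelled 2-D points around a few centers from either a Gaussian or a Uniform distribution. The function name `make_clusters`, the centers, the spreads, and the 1-based labels are all hypothetical choices made for this example; they are not the parameters used to build the 25 published datasets.

```python
import random


def make_clusters(centers, n_per_cluster, spread, kind="gaussian", seed=0):
    """Return rows (x, y, label) drawn around each center.

    kind="gaussian": isotropic normal with standard deviation `spread`.
    kind="uniform":  square of half-width `spread` centred on the center.
    """
    rng = random.Random(seed)
    rows = []
    for label, (cx, cy) in enumerate(centers, start=1):
        for _ in range(n_per_cluster):
            if kind == "gaussian":
                x, y = rng.gauss(cx, spread), rng.gauss(cy, spread)
            elif kind == "uniform":
                x = rng.uniform(cx - spread, cx + spread)
                y = rng.uniform(cy - spread, cy + spread)
            else:
                raise ValueError(f"unknown kind: {kind!r}")
            rows.append((x, y, label))
    return rows


# Hypothetical centers; three known sub-groups of 100 points each.
centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
gaussian_rows = make_clusters(centers, 100, 0.5, "gaussian")
uniform_rows = make_clusters(centers, 100, 1.0, "uniform")
```

Each row has the same three fields as the tables listed below (x, y, label), so a clustering result can be compared directly against the known label.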

Cluster analysis is a popular machine learning technique for segmenting a dataset so that similar data points fall in the same group. For those who are familiar with R, there is a new R package called "UniversalCVI" (https://CRAN.R-project.org/package=UniversalCVI) for cluster evaluation. The package provides algorithms for checking the accuracy of a clustering result against known classes, computing cluster validity indices, and generating plots for comparing them. It is compatible with K-means, fuzzy C-means, EM clustering, and hierarchical clustering (single, average, and complete linkage). To use "UniversalCVI", follow the instructions in its R documentation.
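
For readers who do not use R, the two evaluation ideas mentioned above (accuracy against known classes, and a cluster validity index) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the UniversalCVI implementation: the data are synthetic, K-means uses a simple deterministic farthest-point seeding, and the Calinski-Harabasz index stands in as one well-known example of a validity index. All function names are hypothetical.

```python
import random
from itertools import permutations


def dist2(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    assign = None
    for _ in range(max_iter):
        new_assign = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        if new_assign == assign:
            break
        assign = new_assign
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = mean(members)
    return assign, centers


def accuracy(true_labels, assign):
    """Best agreement over all one-to-one cluster-to-class mappings.

    Brute force over permutations, so only suitable for a handful of clusters.
    Assumes the number of clusters does not exceed the number of classes.
    """
    classes = sorted(set(true_labels))
    k = max(assign) + 1
    best = 0
    for perm in permutations(classes, k):
        hits = sum(perm[a] == t for a, t in zip(assign, true_labels))
        best = max(best, hits)
    return best / len(true_labels)


def calinski_harabasz(points, assign, centers):
    """Ratio of between-cluster to within-cluster dispersion (higher is better)."""
    n, k = len(points), len(centers)
    overall = mean(points)
    within = sum(dist2(p, centers[a]) for p, a in zip(points, assign))
    between = sum(assign.count(j) * dist2(centers[j], overall) for j in range(k))
    return (between / (k - 1)) / (within / (n - k))


# Synthetic stand-in data: three well-separated Gaussian groups (assumed values).
rng = random.Random(42)
true_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points, labels = [], []
for label, (cx, cy) in enumerate(true_centers, start=1):
    for _ in range(50):
        points.append((rng.gauss(cx, 0.3), rng.gauss(cy, 0.3)))
        labels.append(label)

assign3, centers3 = kmeans(points, 3)
acc = accuracy(labels, assign3)
ch_by_k = {k: calinski_harabasz(points, *kmeans(points, k)) for k in (2, 3, 4)}
best_k = max(ch_by_k, key=ch_by_k.get)
```

To try this on one of the tables below, replace `points` and `labels` with the x, y and label columns. The overlapping groups in some benchmark datasets are exactly where validity indices tend to disagree, which is the case these datasets are meant to test.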

For more in-depth details of the package and cluster evaluation, please see the papers
https://doi.org/10.1016/j.patcog.2023.109910 and https://arxiv.org/abs/2308.14785

All the datasets are also available on GitHub: https://github.com/O-PREEDASAWAKUL/FuzzyDatasets.git

Tables

D10 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d10_data
  • 24.79 KB
  • 1250 rows
  • 3 columns

CREATE TABLE d10_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);
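
All 25 tables share this three-column layout. As a small sketch of how the schema can be exercised locally, the snippet below recreates it in an in-memory SQLite database via Python's standard `sqlite3` module and counts points per known sub-group. The three sample rows are made up for illustration and are not taken from the dataset; SQLite is only an assumed stand-in for the hosting platform's own engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same column names and declared types as the schema above.
conn.execute('CREATE TABLE d10_data ("x" DOUBLE, "y" DOUBLE, "label" BIGINT)')

# Hypothetical rows, for illustration only.
sample_rows = [(0.1, 0.2, 1), (0.3, -0.1, 1), (5.0, 5.2, 2)]
conn.executemany("INSERT INTO d10_data VALUES (?, ?, ?)", sample_rows)

# Size of each known sub-group, ordered by label.
sizes = conn.execute(
    'SELECT "label", COUNT(*) FROM d10_data GROUP BY "label" ORDER BY "label"'
).fetchall()
conn.close()
```

With the real data loaded, the same query shows how many sub-groups a table contains and whether they are balanced.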

D11 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d11_data
  • 11.63 KB
  • 500 rows
  • 3 columns

CREATE TABLE d11_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

D12 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d12_data
  • 20.79 KB
  • 1000 rows
  • 3 columns

CREATE TABLE d12_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

D13 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d13_data
  • 20.77 KB
  • 1000 rows
  • 3 columns

CREATE TABLE d13_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

D14 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d14_data
  • 20.79 KB
  • 1000 rows
  • 3 columns

CREATE TABLE d14_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

D15 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d15_data
  • 20.79 KB
  • 1000 rows
  • 3 columns

CREATE TABLE d15_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

D16 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d16_data
  • 10.3 KB
  • 425 rows
  • 3 columns

CREATE TABLE d16_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

D17 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d17_data
  • 34.79 KB
  • 1750 rows
  • 3 columns

CREATE TABLE d17_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

D18 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d18_data
  • 44.49 KB
  • 2250 rows
  • 3 columns

CREATE TABLE d18_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

D1 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d1_data
  • 30.22 KB
  • 1500 rows
  • 3 columns

CREATE TABLE d1_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

D2 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d2_data
  • 24.72 KB
  • 1200 rows
  • 3 columns

CREATE TABLE d2_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

D3 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d3_data
  • 28.36 KB
  • 1400 rows
  • 3 columns

CREATE TABLE d3_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

D4 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d4_data
  • 47.27 KB
  • 2400 rows
  • 3 columns

CREATE TABLE d4_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

D5 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d5_data
  • 8.94 KB
  • 350 rows
  • 3 columns

CREATE TABLE d5_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

D6 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d6_data
  • 22.89 KB
  • 1100 rows
  • 3 columns

CREATE TABLE d6_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

D7 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d7_data
  • 30.22 KB
  • 1500 rows
  • 3 columns

CREATE TABLE d7_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

D8 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d8_data
  • 39.37 KB
  • 2000 rows
  • 3 columns

CREATE TABLE d8_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

D9 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.d9_data
  • 19.15 KB
  • 1000 rows
  • 3 columns

CREATE TABLE d9_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

R1 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.r1_data
  • 10.77 KB
  • 450 rows
  • 3 columns

CREATE TABLE r1_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

R2 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.r2_data
  • 34.8 KB
  • 1750 rows
  • 3 columns

CREATE TABLE r2_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

R3 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.r3_data
  • 32.11 KB
  • 1600 rows
  • 3 columns

CREATE TABLE r3_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

R4 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.r4_data
  • 25.64 KB
  • 1250 rows
  • 3 columns

CREATE TABLE r4_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

R5 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.r5_data
  • 24.72 KB
  • 1200 rows
  • 3 columns

CREATE TABLE r5_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

R6 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.r6_data
  • 30.22 KB
  • 1500 rows
  • 3 columns

CREATE TABLE r6_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);

R7 Data

@kaggle.onthada_benchmarks_datasets_for_clustering.r7_data
  • 19.17 KB
  • 1200 rows
  • 3 columns

CREATE TABLE r7_data (
  "x" DOUBLE,
  "y" DOUBLE,
  "label" BIGINT
);
