Baselight

Cyberbullying Dataset

Cyberbullying Data from Various Sources

@kaggle.saurabhshahane_cyberbullying_dataset


About this Dataset

Cyberbullying Dataset

Context

This dataset is a collection of datasets from different sources related to the automatic detection of cyberbullying.

Content

The data come from several platforms: Kaggle, Twitter, Wikipedia Talk pages, and YouTube. Each record contains text and is labeled as bullying or not. The data cover different types of cyberbullying, such as hate speech, aggression, insults, and toxicity.

Acknowledgements

Elsafoury, Fatma (2020), “Cyberbullying datasets”, Mendeley Data, V1, doi: 10.17632/jf4pzyvnpj.1

Tables

Aggression Parsed Dataset

@kaggle.saurabhshahane_cyberbullying_dataset.aggression_parsed_dataset
  • 28.15 MB
  • 115864 rows
  • 5 columns

CREATE TABLE aggression_parsed_dataset (
  "index" BIGINT,
  "text" VARCHAR,
  "ed_label_0" DOUBLE,
  "ed_label_1" DOUBLE,
  "oh_label" BIGINT
);
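
As a rough illustration of how this schema might be queried, the sketch below recreates the table in an in-memory SQLite database and counts rows per "oh_label" value. The sample rows are invented for the example and are not taken from the dataset; SQLite is only a stand-in for whatever engine actually hosts the table. The sketch makes no assumption about the meaning of "ed_label_0" and "ed_label_1".

```python
import sqlite3

# Hypothetical sample rows (index, text, ed_label_0, ed_label_1, oh_label).
# They are made up for illustration; the real table has 115864 rows.
SAMPLE_ROWS = [
    (0, "thanks for fixing the citation", 0.9, 0.1, 0),
    (1, "you are an idiot", 0.2, 0.8, 1),
    (2, "please see the talk page", 1.0, 0.0, 0),
]


def label_counts(rows):
    """Return {oh_label: row_count} for rows shaped like aggression_parsed_dataset."""
    conn = sqlite3.connect(":memory:")
    # Same column names and declared types as the schema above.
    conn.execute(
        'CREATE TABLE aggression_parsed_dataset ('
        '"index" BIGINT, "text" VARCHAR, '
        '"ed_label_0" DOUBLE, "ed_label_1" DOUBLE, "oh_label" BIGINT)'
    )
    conn.executemany(
        "INSERT INTO aggression_parsed_dataset VALUES (?, ?, ?, ?, ?)", rows
    )
    result = dict(
        conn.execute(
            'SELECT "oh_label", COUNT(*) '
            "FROM aggression_parsed_dataset "
            'GROUP BY "oh_label"'
        ).fetchall()
    )
    conn.close()
    return result


counts = label_counts(SAMPLE_ROWS)
print(counts)
```

Because attack_parsed_dataset and toxicity_parsed_dataset list the same five columns, the same query shape should apply to them with only the table name changed.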

Attack Parsed Dataset

@kaggle.saurabhshahane_cyberbullying_dataset.attack_parsed_dataset
  • 28.14 MB
  • 115864 rows
  • 5 columns

CREATE TABLE attack_parsed_dataset (
  "index" BIGINT,
  "text" VARCHAR,
  "ed_label_0" DOUBLE,
  "ed_label_1" DOUBLE,
  "oh_label" BIGINT
);

Kaggle Parsed Dataset

@kaggle.saurabhshahane_cyberbullying_dataset.kaggle_parsed_dataset
  • 1.15 MB
  • 8799 rows
  • 4 columns

CREATE TABLE kaggle_parsed_dataset (
  "index" BIGINT,
  "oh_label" BIGINT,
  "date" VARCHAR,
  "text" VARCHAR
);

Toxicity Parsed Dataset

@kaggle.saurabhshahane_cyberbullying_dataset.toxicity_parsed_dataset
  • 38.33 MB
  • 159686 rows
  • 5 columns

CREATE TABLE toxicity_parsed_dataset (
  "index" BIGINT,
  "text" VARCHAR,
  "ed_label_0" DOUBLE,
  "ed_label_1" DOUBLE,
  "oh_label" BIGINT
);

Twitter Parsed Dataset

@kaggle.saurabhshahane_cyberbullying_dataset.twitter_parsed_dataset
  • 1.69 MB
  • 16851 rows
  • 5 columns

CREATE TABLE twitter_parsed_dataset (
  "index" VARCHAR,
  "id" VARCHAR,
  "text" VARCHAR,
  "annotation" VARCHAR,
  "oh_label" DOUBLE
);
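
Note that "oh_label" is declared DOUBLE in this table, whereas the Wikipedia-derived tables declare it BIGINT. A minimal sketch of one way to normalize it to an integer follows, again using in-memory SQLite as a stand-in engine. The sample rows, including the "annotation" values and the NULL label, are hypothetical; whether the real table contains NULL labels is an assumption made only to show the filter.

```python
import sqlite3

# Hypothetical sample rows (index, id, text, annotation, oh_label).
# Values are invented; the NULL label is included only to exercise the filter.
SAMPLE_ROWS = [
    ("0", "1001", "example tweet one", "none", 0.0),
    ("1", "1002", "example tweet two", "sexism", 1.0),
    ("2", "1003", "example tweet three", "racism", None),
]


def integer_labels(rows):
    """Return (id, integer label) pairs, skipping rows without a label."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE twitter_parsed_dataset ('
        '"index" VARCHAR, "id" VARCHAR, "text" VARCHAR, '
        '"annotation" VARCHAR, "oh_label" DOUBLE)'
    )
    conn.executemany(
        "INSERT INTO twitter_parsed_dataset VALUES (?, ?, ?, ?, ?)", rows
    )
    pairs = conn.execute(
        'SELECT "id", CAST("oh_label" AS INTEGER) '
        "FROM twitter_parsed_dataset "
        'WHERE "oh_label" IS NOT NULL '
        'ORDER BY "id"'
    ).fetchall()
    conn.close()
    return pairs


labels = integer_labels(SAMPLE_ROWS)
print(labels)
```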

Twitter Racism Parsed Dataset

@kaggle.saurabhshahane_cyberbullying_dataset.twitter_racism_parsed_dataset
  • 1.18 MB
  • 13471 rows
  • 5 columns

CREATE TABLE twitter_racism_parsed_dataset (
  "index" DOUBLE,
  "id" DOUBLE,
  "text" VARCHAR,
  "annotation" VARCHAR,
  "oh_label" BIGINT
);

Twitter Sexism Parsed Dataset

@kaggle.saurabhshahane_cyberbullying_dataset.twitter_sexism_parsed_dataset
  • 1.46 MB
  • 14881 rows
  • 5 columns

CREATE TABLE twitter_sexism_parsed_dataset (
  "index" VARCHAR,
  "id" VARCHAR,
  "text" VARCHAR,
  "annotation" VARCHAR,
  "oh_label" DOUBLE
);

Youtube Parsed Dataset

@kaggle.saurabhshahane_cyberbullying_dataset.youtube_parsed_dataset
  • 2.54 MB
  • 3464 rows
  • 10 columns

CREATE TABLE youtube_parsed_dataset (
  "index" BIGINT,
  "userindex" VARCHAR,
  "text" VARCHAR,
  "number_of_comments" BIGINT,
  "number_of_subscribers" BIGINT,
  "membership_duration" BIGINT,
  "number_of_uploads" BIGINT,
  "profanity_in_userid" BIGINT,
  "age" BIGINT,
  "oh_label" BIGINT
);
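
Every table above carries a "text" column and an "oh_label" column, so one plausible way to work with them together is to project each onto a common (source, text, oh_label) shape and stack the results. The sketch below does this for two of the tables in an in-memory SQLite database. The inserted rows and the `source` tag values are invented for the example, and SQLite is an assumed stand-in engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Recreate two of the schemas listed above with the same columns and types.
conn.executescript(
    """
    CREATE TABLE kaggle_parsed_dataset (
      "index" BIGINT, "oh_label" BIGINT, "date" VARCHAR, "text" VARCHAR
    );
    CREATE TABLE youtube_parsed_dataset (
      "index" BIGINT, "userindex" VARCHAR, "text" VARCHAR,
      "number_of_comments" BIGINT, "number_of_subscribers" BIGINT,
      "membership_duration" BIGINT, "number_of_uploads" BIGINT,
      "profanity_in_userid" BIGINT, "age" BIGINT, "oh_label" BIGINT
    );
    """
)

# Hypothetical rows, made up for illustration only.
conn.execute(
    "INSERT INTO kaggle_parsed_dataset VALUES (?, ?, ?, ?)",
    (0, 1, "20120618", "made-up insult"),
)
conn.execute(
    "INSERT INTO youtube_parsed_dataset VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    (0, "X1", "made-up friendly comment", 10, 5, 3, 2, 0, 25, 0),
)

# Project both tables onto (source, text, oh_label) and stack them.
combined = conn.execute(
    """
    SELECT 'kaggle' AS source, "text", "oh_label" FROM kaggle_parsed_dataset
    UNION ALL
    SELECT 'youtube' AS source, "text", "oh_label" FROM youtube_parsed_dataset
    ORDER BY source
    """
).fetchall()
conn.close()

print(combined)
```

Extending the union to the Twitter tables would likely need a cast on "oh_label", since those tables declare it as DOUBLE rather than BIGINT.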
