Baselight

Cyberbullying Dataset

Cyberbullying Data from Various Sources

@kaggle.saurabhshahane_cyberbullying_dataset

About this Dataset

Context

This dataset combines several datasets, drawn from different sources, for the automatic detection of cyberbullying.

Content

The data come from platforms such as Kaggle, Twitter, Wikipedia Talk pages, and YouTube. Each record contains text labeled as bullying or not. The data cover several types of cyberbullying, including hate speech, aggression, insults, and toxicity.

Acknowledgements

Elsafoury, Fatma (2020), “Cyberbullying datasets”, Mendeley Data, V1, doi: 10.17632/jf4pzyvnpj.1

Tables

Aggression Parsed Dataset

@kaggle.saurabhshahane_cyberbullying_dataset.aggression_parsed_dataset
  • 29.52 MB
  • 115,864 rows
  • 5 columns
CREATE TABLE aggression_parsed_dataset (
  "index" BIGINT,
  "text" VARCHAR,
  "ed_label_0" DOUBLE,
  "ed_label_1" DOUBLE,
  "oh_label" BIGINT
);
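
As a minimal sketch of how a table with this schema might be inspected, the following counts rows per `oh_label` value. It uses Python's built-in `sqlite3` as a stand-in engine (so `BIGINT`, `DOUBLE`, and `VARCHAR` become `INTEGER`, `REAL`, and `TEXT`), and the three rows are invented placeholders, not records from the dataset. The helper name `label_counts` is hypothetical, and treating `oh_label` as the binary bullying label is an assumption; the page does not document the `ed_label_*` columns.

```python
import sqlite3

# In-memory stand-in for aggression_parsed_dataset (SQLite types, not the original ones).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE aggression_parsed_dataset ('
    '"index" INTEGER, "text" TEXT, '
    '"ed_label_0" REAL, "ed_label_1" REAL, "oh_label" INTEGER)'
)

# Invented placeholder rows, for illustration only.
conn.executemany(
    "INSERT INTO aggression_parsed_dataset VALUES (?, ?, ?, ?, ?)",
    [
        (0, "placeholder comment a", 0.9, 0.1, 0),
        (1, "placeholder comment b", 0.2, 0.8, 1),
        (2, "placeholder comment c", 1.0, 0.0, 0),
    ],
)


def label_counts(connection, table):
    """Return {oh_label: row_count} for one table (hypothetical helper)."""
    # The table name is interpolated into the query, so pass trusted names only.
    rows = connection.execute(
        f'SELECT "oh_label", COUNT(*) FROM "{table}" GROUP BY "oh_label"'
    ).fetchall()
    return dict(rows)


counts = label_counts(conn, "aggression_parsed_dataset")
print(counts)
```

The same query shape should apply to any table on this page that has an `oh_label` column, which makes it a quick check of class balance before training.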

Attack Parsed Dataset

@kaggle.saurabhshahane_cyberbullying_dataset.attack_parsed_dataset
  • 29.51 MB
  • 115,864 rows
  • 5 columns
CREATE TABLE attack_parsed_dataset (
  "index" BIGINT,
  "text" VARCHAR,
  "ed_label_0" DOUBLE,
  "ed_label_1" DOUBLE,
  "oh_label" BIGINT
);

Kaggle Parsed Dataset

@kaggle.saurabhshahane_cyberbullying_dataset.kaggle_parsed_dataset
  • 1.21 MB
  • 8,799 rows
  • 4 columns
CREATE TABLE kaggle_parsed_dataset (
  "index" BIGINT,
  "oh_label" BIGINT,
  "date" VARCHAR,
  "text" VARCHAR
);

Toxicity Parsed Dataset

@kaggle.saurabhshahane_cyberbullying_dataset.toxicity_parsed_dataset
  • 40.19 MB
  • 159,686 rows
  • 5 columns
CREATE TABLE toxicity_parsed_dataset (
  "index" BIGINT,
  "text" VARCHAR,
  "ed_label_0" DOUBLE,
  "ed_label_1" DOUBLE,
  "oh_label" BIGINT
);

Twitter Parsed Dataset

@kaggle.saurabhshahane_cyberbullying_dataset.twitter_parsed_dataset
  • 1.77 MB
  • 16,851 rows
  • 5 columns
CREATE TABLE twitter_parsed_dataset (
  "index" VARCHAR,
  "id" VARCHAR,
  "text" VARCHAR,
  "annotation" VARCHAR,
  "oh_label" DOUBLE
);

Twitter Racism Parsed Dataset

@kaggle.saurabhshahane_cyberbullying_dataset.twitter_racism_parsed_dataset
  • 1.24 MB
  • 13,471 rows
  • 5 columns
CREATE TABLE twitter_racism_parsed_dataset (
  "index" DOUBLE,
  "id" DOUBLE,
  "text" VARCHAR,
  "annotation" VARCHAR,
  "oh_label" BIGINT
);

Twitter Sexism Parsed Dataset

@kaggle.saurabhshahane_cyberbullying_dataset.twitter_sexism_parsed_dataset
  • 1.53 MB
  • 14,881 rows
  • 5 columns
CREATE TABLE twitter_sexism_parsed_dataset (
  "index" VARCHAR,
  "id" VARCHAR,
  "text" VARCHAR,
  "annotation" VARCHAR,
  "oh_label" DOUBLE
);

YouTube Parsed Dataset

@kaggle.saurabhshahane_cyberbullying_dataset.youtube_parsed_dataset
  • 2.67 MB
  • 3,464 rows
  • 10 columns
CREATE TABLE youtube_parsed_dataset (
  "index" BIGINT,
  "userindex" VARCHAR,
  "text" VARCHAR,
  "number_of_comments" BIGINT,
  "number_of_subscribers" BIGINT,
  "membership_duration" BIGINT,
  "number_of_uploads" BIGINT,
  "profanity_in_userid" BIGINT,
  "age" BIGINT,
  "oh_label" BIGINT
);
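
The schemas above all carry `text` and `oh_label`, but they type `oh_label` differently (`BIGINT` in most tables, `DOUBLE` in two of the Twitter tables). One way to combine them is to coerce every label to an integer first. The sketch below assumes rows arrive as plain dictionaries and that a `DOUBLE` label may be missing (`None` or NaN); the source does not confirm that missing labels exist. The function names `normalize_label` and `unify` and the sample rows are hypothetical.

```python
import math


def normalize_label(value):
    """Coerce an oh_label value (int, float, or None) to an int, or None if missing."""
    if value is None:
        return None
    if isinstance(value, float) and math.isnan(value):
        return None
    return int(value)


def unify(source, rows):
    """Map row dicts to (source, text, label) tuples.

    Rows without text or without a usable label are skipped.
    """
    out = []
    for row in rows:
        label = normalize_label(row.get("oh_label"))
        text = row.get("text")
        if label is None or text is None:
            continue
        out.append((source, text, label))
    return out


# Invented placeholder rows, shaped like the Twitter and YouTube tables.
twitter_rows = [
    {"text": "placeholder tweet", "oh_label": 1.0},
    {"text": "unlabeled tweet", "oh_label": None},
]
youtube_rows = [{"text": "placeholder comment", "oh_label": 0}]

combined = unify("twitter", twitter_rows) + unify("youtube", youtube_rows)
print(combined)
```

Keeping the source name in each tuple preserves the option of evaluating a model per platform, since the tables differ widely in size.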
