ETHOS Hate Speech Dataset by Kaggle | Technology and IT

About this Dataset

ETHOS Hate Speech Dataset

From the project repo: https://github.com/intelligence-csd-auth-gr/Ethos-Hate-Speech-Dataset

ETHOS: multi-labEl haTe speecH detectiOn dataSet. This repository contains a dataset for hate speech detection on social media platforms, called Ethos. There are two variations of the dataset:

Ethos_Dataset_Binary.csv contains 998 comments, each with a label indicating the presence or absence of hate speech. 565 of them do not contain hate speech, while the remaining 433 do.
Ethos_Dataset_Multi_Label.csv contains 8 labels for the 433 comments with hate speech content. These labels are violence (whether the comment incites violence (1) or not (0)), directed_vs_generalized (whether it is directed at a person (1) or a group (0)), and 6 labels for the category of hate speech: gender, race, national_origin, disability, religion and sexual_orientation.

Tables

En Dataset With Stop Words

@kaggle.konradb_ethos_hate_speech_dataset.en_dataset_with_stop_words

355.16 KB
5647 rows
7 columns


CREATE TABLE en_dataset_with_stop_words (
  "hitid" BIGINT,
  "tweet" VARCHAR,
  "sentiment" VARCHAR,
  "directness" VARCHAR,
  "annotator_sentiment" VARCHAR,
  "target" VARCHAR,
  "group" VARCHAR
);
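Every label column in this table is a VARCHAR, so the schema alone does not show which label values occur. A minimal sketch of how the label vocabulary could be tallied is given below. The function name `label_vocabulary` and the sample rows are hypothetical, and the values such as `label_x` are placeholders, not values taken from the dataset.

```python
from collections import Counter

# Label columns as named in the en_dataset_with_stop_words schema.
LABEL_COLUMNS = ["sentiment", "directness", "annotator_sentiment", "target", "group"]


def label_vocabulary(rows, columns=LABEL_COLUMNS):
    """Map each label column to a Counter of the values seen in it."""
    vocab = {col: Counter() for col in columns}
    for row in rows:
        for col in columns:
            vocab[col][row[col]] += 1
    return vocab


# Placeholder rows for illustration only; the label values are invented.
sample_rows = [
    {"hitid": 1, "tweet": "placeholder tweet a", "sentiment": "label_x",
     "directness": "label_p", "annotator_sentiment": "label_m",
     "target": "label_t", "group": "label_g"},
    {"hitid": 2, "tweet": "placeholder tweet b", "sentiment": "label_y",
     "directness": "label_p", "annotator_sentiment": "label_m",
     "target": "label_t", "group": "label_h"},
]

for column, counts in label_vocabulary(sample_rows).items():
    print(column, dict(counts))
```

Run against the full table, the per-column counts would show how many distinct labels each column carries and how unevenly they are distributed.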

Ethos Dataset Binary

@kaggle.konradb_ethos_hate_speech_dataset.ethos_dataset_binary

80.56 KB
998 rows
2 columns


CREATE TABLE ethos_dataset_binary (
  "comment" VARCHAR,
  "ishate" DOUBLE
);
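The schema stores ishate as a DOUBLE rather than a 0/1 flag, so a binary label has to be derived from it. The sketch below assumes the value is a score between 0 and 1 and that 0.5 is a reasonable cut-off; both are assumptions, not something stated on this page. It uses an in-memory SQLite table with placeholder rows rather than the real data, and the helper name `count_by_label` is hypothetical.

```python
import sqlite3

# Assumed cut-off for turning the ishate score into a 0/1 label (not from the source).
THRESHOLD = 0.5


def count_by_label(rows, threshold=THRESHOLD):
    """Return {0: n_not_hate, 1: n_hate} for (comment, ishate) rows."""
    con = sqlite3.connect(":memory:")
    # Same column names and declared types as the ethos_dataset_binary schema.
    con.execute('CREATE TABLE ethos_dataset_binary ("comment" VARCHAR, "ishate" DOUBLE)')
    con.executemany("INSERT INTO ethos_dataset_binary VALUES (?, ?)", rows)
    query = """
        SELECT CASE WHEN ishate >= ? THEN 1 ELSE 0 END AS label, COUNT(*)
        FROM ethos_dataset_binary
        GROUP BY label
    """
    counts = {0: 0, 1: 0}
    counts.update(dict(con.execute(query, (threshold,)).fetchall()))
    con.close()
    return counts


# Placeholder rows for illustration only; these are not taken from the dataset.
sample_rows = [
    ("placeholder comment a", 0.0),
    ("placeholder comment b", 0.2),
    ("placeholder comment c", 0.5),
    ("placeholder comment d", 1.0),
]

print(count_by_label(sample_rows))
```

On the full file, the split given in the description (565 comments without hate speech, 433 with) is a useful reference point for checking whichever threshold is chosen.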

Ethos Dataset Multi Label

@kaggle.konradb_ethos_hate_speech_dataset.ethos_dataset_multi_label

45.41 KB
433 rows
9 columns


CREATE TABLE ethos_dataset_multi_label (
  "comment" VARCHAR,
  "violence" DOUBLE,
  "directed_vs_generalized" DOUBLE,
  "gender" DOUBLE,
  "race" DOUBLE,
  "national_origin" DOUBLE,
  "disability" DOUBLE,
  "religion" DOUBLE,
  "sexual_orientation" DOUBLE
);
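To make the label layout concrete, the sketch below turns one row of this table into a readable summary: whether the comment incites violence, whether it is directed at a person, and which of the 6 categories apply. The 0.5 threshold on the DOUBLE values is an assumption, and the function name `summarize_labels` and the sample row are hypothetical.

```python
# Category columns as named in the ethos_dataset_multi_label schema.
CATEGORY_COLUMNS = [
    "gender", "race", "national_origin",
    "disability", "religion", "sexual_orientation",
]

# Assumed cut-off for treating a DOUBLE label as "on" (not from the source).
THRESHOLD = 0.5


def summarize_labels(row, threshold=THRESHOLD):
    """Summarize one multi-label row as booleans plus a list of active categories."""
    return {
        # violence: 1 means the comment incites violence, 0 means it does not.
        "incites_violence": row["violence"] >= threshold,
        # directed_vs_generalized: 1 means directed at a person, 0 at a group.
        "directed_at_person": row["directed_vs_generalized"] >= threshold,
        "categories": [c for c in CATEGORY_COLUMNS if row[c] >= threshold],
    }


# Placeholder row for illustration only; the values are invented.
sample_row = {
    "comment": "placeholder comment",
    "violence": 1.0,
    "directed_vs_generalized": 0.0,
    "gender": 0.0,
    "race": 0.75,
    "national_origin": 0.25,
    "disability": 0.0,
    "religion": 1.0,
    "sexual_orientation": 0.0,
}

print(summarize_labels(sample_row))
```

A comment can fall into more than one category, which is why `categories` is a list rather than a single value.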

Hate Speech And Offensive Language

@kaggle.konradb_ethos_hate_speech_dataset.hate_speech_and_offensive_language

1.62 MB
24783 rows
7 columns


CREATE TABLE hate_speech_and_offensive_language (
  "unnamed_0" BIGINT,
  "count" BIGINT,
  "hate_speech" BIGINT,
  "offensive_language" BIGINT,
  "neither" BIGINT,
  "class" BIGINT,
  "tweet" VARCHAR
);
